Introduction To UNIX and Linux - Lecture Three
Introduction to UNIX:
Lecture Three
3.1 Objectives
This lecture covers:
File and directory permissions in more detail and how these can be changed.
Ways to examine the contents of files.
How to find files when you don't know their exact location.
Ways of searching files for text patterns.
How to sort files.
Tools for compressing files and making backups.
Accessing floppy disks and other removable media.
As we have seen in the previous chapter, every file or directory on a UNIX system has
three types of permissions, describing what operations can be performed on it by
various categories of users. The permissions are read (r), write (w) and execute (x),
and the three categories of users are user/owner (u), group (g) and others (o). Because
files and directories are different entities, the interpretation of the permissions
assigned to each differs slightly, as shown in Fig 3.1.
File and directory permissions can only be modified by their owners, or by the
superuser (root).
1 of 10 10/17/2009 2:14 PM
Introduction to UNIX and Linux: Lecture 3 http://www.doc.ic.ac.uk/~wjk/UnixIntro/Lecture3.html
--- 0
--x 1
-w- 2
-wx 3
r-- 4
r-x 5
rw- 6
rwx 7
$ chmod 600 private.txt
sets the permissions on private.txt to rw------- (i.e. only the owner can
read and write to the file).
$ chmod 660 *.txt
sets the permissions on all files ending in .txt to rw-rw---- (i.e. the owner
and users in the file's group can read and write to the file, while the general
public do not have any sort of access).
chmod also supports a -R option which can be used to recursively modify file
permissions, e.g.
$ chmod -R go+r play
will grant group and other read rights to the directory play and all of the files
and directories within play.
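The octal and symbolic forms of chmod can be tried out safely in a scratch directory. The following is a minimal sketch, not part of the original notes: the use of mktemp and the file names notes.txt, sub and game.txt are illustrative assumptions. It reads the resulting permission strings back with ls -l.

```shell
# Work in a throwaway directory so no real files are touched.
sandbox=$(mktemp -d)
cd "$sandbox" || exit 1

touch private.txt notes.txt
mkdir -p play/sub
touch play/sub/game.txt

# Octal form: 6 = rw-, 0 = ---
chmod 600 private.txt          # owner read/write only
chmod 660 notes.txt            # owner and group read/write

# Put play/ into a known state, then add read access recursively.
chmod 700 play play/sub
chmod 600 play/sub/game.txt
chmod -R go+r play

# The first ten characters of "ls -l" output are the file type and permissions.
mode_private=$(ls -ld private.txt | cut -c1-10)
mode_notes=$(ls -ld notes.txt | cut -c1-10)
mode_play=$(ls -ld play | cut -c1-10)
mode_game=$(ls -ld play/sub/game.txt | cut -c1-10)
echo "$mode_private $mode_notes $mode_play $mode_game"
```

Note that go+r gives group and others read permission on the directories but not execute permission, so they can list the names in play but cannot enter it or open the files inside.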
chgrp can be used to change the group that a file or directory belongs to. It also
supports a -R option.
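A recursive group change might look like the sketch below. The directory and file names are hypothetical, and the group used is simply the caller's own primary group (taken from id -g) so that the command needs no special privileges. An ordinary user can normally only assign groups that they themselves belong to.

```shell
sandbox=$(mktemp -d)
mkdir -p "$sandbox/shared/docs"
touch "$sandbox/shared/docs/report.txt"

# Numeric ID of the caller's primary group.
my_gid=$(id -g)

# Change the group of shared/ and everything below it.
chgrp -R "$my_gid" "$sandbox/shared"

# "ls -n" prints numeric IDs; the fourth field is the group.
file_gid=$(ls -ldn "$sandbox/shared/docs/report.txt" | awk '{print $4}')
echo "group of report.txt: $file_gid (expected $my_gid)"
```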
file filename(s)
file analyzes a file's contents for you and reports a high-level description of
what type of file it appears to be:
$ file myprog.c letter.txt webpage.html
myprog.c: C program text
letter.txt: English text
webpage.html: HTML document text
file can identify a wide range of files but sometimes gets understandably
confused (e.g. when trying to automatically detect the difference between C++
and Java code).
head and tail display the first and last few lines in a file respectively. You can
specify the number of lines as an option, e.g. -n 5 for five lines.
tail includes a useful -f option that can be used to continuously monitor the
last few lines of a (possibly changing) file. This can be used to monitor log files,
for example:
$ tail -f /var/log/messages
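To illustrate the line-count option, here is a small sketch that uses a generated ten-line file. The file name sample.log and its contents are made up for the example.

```shell
sandbox=$(mktemp -d)
log="$sandbox/sample.log"

# Build a ten-line file: "line 1" ... "line 10".
for i in 1 2 3 4 5 6 7 8 9 10; do
    echo "line $i"
done > "$log"

first_three=$(head -n 3 "$log")   # lines 1 to 3
last_two=$(tail -n 2 "$log")      # lines 9 and 10

echo "$first_three"
echo "$last_two"
```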
objdump can be used to disassemble binary files - that is it can show the
machine language instructions which make up compiled application programs
and system utilities.
There are also several other useful content inspectors that are non-standard (in terms
of availability on UNIX systems) but are nevertheless in widespread use. They are
summarised in Fig. 3.2.
If you have a rough idea of the directory tree the file might be in (or even if you
don't and you're prepared to wait a while) you can use find:
$ find directory -name targetfile -print
find will look for a file called targetfile in any part of the directory tree rooted
at directory. targetfile can include wildcard characters. For example:
$ find /home -name "*.txt" -print 2>/dev/null
will search all user directories for any file ending in ".txt" and output any
matching files (with a full absolute or relative path). Here the quotes (") are
necessary to avoid filename expansion, while the 2>/dev/null suppresses
error messages (arising from errors such as not being able to read the contents of
directories for which the user does not have the right permissions).
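The same search can be tried on a small scale without touching /home. In this sketch the directory layout (alice, bob and their files) is invented purely for illustration.

```shell
sandbox=$(mktemp -d)
mkdir -p "$sandbox/alice/docs" "$sandbox/bob"
touch "$sandbox/alice/notes.txt" "$sandbox/alice/docs/todo.txt" "$sandbox/bob/pic.png"

# Quoting "*.txt" stops the shell from expanding it before find sees it.
matches=$(find "$sandbox" -name "*.txt" -print 2>/dev/null | sort)

# wc pads its output on some systems, so strip the spaces.
count=$(echo "$matches" | wc -l | tr -d ' ')

echo "$matches"
echo "found $count files"
```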
find can in fact do a lot more than just find files by name. It can find files by
type (e.g. -type f for files, -type d for directories), by permissions (e.g.
-perm -o=r for all files and directories that can be read by others), by size
(-size) etc. You can also execute commands on the files you find. For
example,
$ find . -name "*.txt" -exec wc -l '{}' ';'
counts the number of lines in every text file in and below the current directory.
The '{}' is replaced by the name of each file found and the ';' ends the
-exec clause.
For more information about find and its abilities, use man find and/or info
find.
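The tests described above can be combined in a short experiment. This is only a sketch under assumed names (reports, summary.txt and draft.txt are hypothetical). The awk step that adds up the per-file counts is an addition for the example, not something find does itself.

```shell
sandbox=$(mktemp -d)
cd "$sandbox" || exit 1
mkdir -p reports/old
printf 'a\nb\nc\n' > reports/summary.txt      # 3 lines
printf 'x\ny\n'    > reports/old/draft.txt    # 2 lines
chmod 644 reports/summary.txt                 # readable by others
chmod 600 reports/old/draft.txt               # owner only

# Directories only (this includes "." itself).
dir_count=$(find . -type d | wc -l | tr -d ' ')

# Regular files that others may read; the leading "-" means
# "at least these permission bits are set".
world_readable=$(find . -type f -perm -o=r)

# Run wc -l once per text file, then add up the per-file counts.
total_lines=$(find . -name "*.txt" -exec wc -l '{}' ';' | awk '{sum += $1} END {print sum}')

echo "directories: $dir_count"
echo "readable by others: $world_readable"
echo "total lines: $total_lines"
```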
If you can execute an application program or system utility by typing its name at
the shell prompt, you can use which to find out where it is stored on disk. For
example:
$ which ls
/bin/ls
locate string
find can take a long time to execute if you are searching a large filespace (e.g.
searching from / downwards). The locate command provides a much faster
way of locating all files whose names match a particular search string. For
example:
$ locate ".txt"
will find all filenames in the filesystem that contain ".txt" anywhere in their
full paths.
Unlike find, locate cannot track down files on the basis of their permissions,
size and so on.
grep searches the named files (or standard input if no files are named) for lines
that match a given pattern. The default behaviour of grep is to print out the
matching lines. For example:
$ grep hello *.txt
searches all text files in the current directory for lines containing "hello". Some
of the more useful options that grep provides are:
-c (print a count of the number of lines that match), -i (ignore case), -v (print
out the lines that don't match the pattern) and -n (print out the line number
before printing the matching line). So
$ grep -vi hello *.txt
searches all text files in the current directory for lines that do not contain any
form of the word hello (e.g. Hello, HELLO, or hELlO).
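The effect of each option is easiest to see on a small file. The following sketch uses a made-up four-line file called greet.txt.

```shell
sandbox=$(mktemp -d)
cd "$sandbox" || exit 1
printf 'hello there\nHello again\ngoodbye\nsay HELLO\n' > greet.txt

match_count=$(grep -c hello greet.txt)        # case-sensitive count of matching lines
any_case_count=$(grep -ci hello greet.txt)    # -i also counts Hello and HELLO
non_matching=$(grep -vi hello greet.txt)      # lines with no form of hello
numbered=$(grep -n goodbye greet.txt)         # matching line prefixed by its line number

echo "$match_count $any_case_count"
echo "$non_matching"
echo "$numbered"
```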
If you want to search all files in an entire directory tree for a particular pattern,
you can combine grep with find using backquotes (`) to pass the
output from find into grep. So
$ grep hello `find . -name "*.txt" -print`
will search all text files in the directory tree rooted at the current directory for
lines containing the word "hello".
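A sketch of the backquote technique on a small, invented directory tree follows. The -l option (print only the names of files containing a match) is standard grep but is not covered in the notes above. It is used here just to keep the output short.

```shell
sandbox=$(mktemp -d)
cd "$sandbox" || exit 1
mkdir -p notes/archive
echo "hello from the top"   > notes/a.txt
echo "nothing to see"       > notes/b.txt
echo "hello from below"     > notes/archive/c.txt
echo "hello but not a .txt" > notes/d.log

# The backquoted find runs first; its output becomes grep's file arguments.
hits=$(grep -l hello `find . -name "*.txt" -print` | LC_ALL=C sort)
echo "$hits"
```

Because the shell splits the output of find on whitespace, this approach breaks on file names that contain spaces. Using find's own -exec clause to run grep on each file avoids that problem.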
The patterns that grep uses are actually a special type of pattern known as
regular expressions. Just like arithmetic expressions, regular expressions are
made up of basic subexpressions combined by operators.
The caret '^' and the dollar sign '$' are special characters that
match the beginning and end of a line respectively. The dot '.' matches any
character. So
$ grep '..[l-z]' hello.txt
matches any line in hello.txt that contains a three character sequence that
ends with a lowercase letter from l to z.
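The anchors, the dot and a character range can be checked against a tiny test file. The contents of hello.txt below are invented. The locale is forced to C as a precaution, on the assumption that in other locales a range such as [l-z] may not mean exactly the lowercase letters l to z.

```shell
sandbox=$(mktemp -d)
cd "$sandbox" || exit 1
printf 'hello world\nsay hello\nHELLO\nabc\n' > hello.txt

# Use the C locale so that [l-z] covers exactly the lowercase letters l..z.
LC_ALL=C
export LC_ALL

starts=$(grep -c '^hello' hello.txt)      # lines beginning with hello
ends=$(grep -c 'hello$' hello.txt)        # lines ending with hello
five=$(grep '^.....$' hello.txt)          # lines of exactly five characters
three=$(grep -c '..[l-z]' hello.txt)      # any two characters, then one of l..z

echo "$starts $ends $five $three"
```

Quoting the pattern keeps the shell from treating characters such as $ and [ specially before grep sees them.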
Note that UNIX systems also usually support another grep variant called fgrep
(fixed grep) which simply looks for a fixed string inside a file (but this facility is
largely redundant).
sort filenames
sort sorts lines contained in a group of files alphabetically (or, if the -n option
is specified, numerically). The sorted output is displayed on the screen, and may
be stored in another file by redirecting the output. So
$ sort input1.txt input2.txt > output.txt
writes the sorted lines of input1.txt and input2.txt to the file output.txt.
uniq filename
uniq removes duplicate adjacent lines from a file. This facility is most useful
when combined with sort:
$ sort input.txt | uniq > output.txt
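The difference between alphabetical and numerical sorting, and the reason uniq is usually paired with sort, can be seen in the following sketch. The input values are made up, and paste is used only to join the output lines onto one line for display.

```shell
sandbox=$(mktemp -d)
cd "$sandbox" || exit 1
printf '10\n9\n100\n9\n' > input.txt

# Default sort compares character by character, so "100" comes before "9".
alphabetical=$(sort input.txt | paste -s -d ' ' -)

# -n compares numeric values.
numerical=$(sort -n input.txt | paste -s -d ' ' -)

# After sorting, duplicates are adjacent and uniq can remove them.
unique_sorted=$(sort -n input.txt | uniq | paste -s -d ' ' -)

# Without sorting, the two 9s are not adjacent, so both survive.
unsorted_uniq=$(uniq input.txt | paste -s -d ' ' -)

echo "alphabetical:  $alphabetical"
echo "numerical:     $numerical"
echo "sort | uniq:   $unique_sorted"
echo "uniq only:     $unsorted_uniq"
```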
tar backs up entire directories and files onto a tape device or (more commonly)
into a single disk file known as an archive. An archive is a file that contains
other files plus information about them, such as their filename, owner,
timestamps, and access permissions. tar does not perform any compression by
default.
To create a disk file tar archive, use
$ tar -cvf archivename filenames
where archivename will usually have a .tar extension. Here the c option
means create, v means verbose (output filenames as they are archived), and f
means file. To list the contents of a tar archive, use
$ tar -tvf archivename
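A complete round trip with tar might look like the sketch below. The directory names project and restore are hypothetical, and the x (extract) option is standard tar usage that is assumed here rather than taken from the text above.

```shell
sandbox=$(mktemp -d)
cd "$sandbox" || exit 1
mkdir -p project/src
echo "int main(void) { return 0; }" > project/src/main.c
echo "notes" > project/README

# c = create, f = archive file name (v is left out to keep the output short).
tar -cf project.tar project

# t = list the contents of the archive.
listing=$(tar -tf project.tar)

# x = extract; do it in a separate directory to show the round trip.
mkdir restore
(cd restore && tar -xf ../project.tar)
restored=$(cat restore/project/README)

echo "$listing"
echo "restored README contains: $restored"
```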
cpio
cpio is another facility for creating and reading archives. Unlike tar, cpio
doesn't automatically archive the contents of directories, so it's common to
combine cpio with find when creating an archive:
$ find . -depth -print | cpio -ov -H tar > archivename
This will take all the files in the current directory and the
directories below and place them in an archive called archivename. The -depth
option controls the order in which the filenames are produced and is
recommended to prevent problems with directory permissions when doing a
restore. The -o option creates the archive, the -v option prints the names of the
files archived as they are added and the -H option specifies an archive format
type (in this case it creates a tar archive). Another common archive type is crc,
a portable format with a checksum for error control.
To restore files from a cpio archive, use
$ cpio -idv < archivename
Here the -i option extracts files and the -d option will create directories as
necessary. To force cpio to extract files on top of files of the same name that
already exist (and have the same or later modification time), use the -u option.
compress, gzip
compress and gzip are utilities for compressing and decompressing individual
files (which may or may not be archive files). To compress files, use:
$ compress filename
or
$ gzip filename
In each case, filename will be deleted and replaced by a compressed file called
filename.Z or filename.gz. To reverse the compression process, use:
$ compress -d filename
or
$ gzip -d filename
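The replace-in-place behaviour can be confirmed with a short sketch. It uses gzip only, on the assumption that compress may not be installed on every system. The file name data.txt is made up.

```shell
sandbox=$(mktemp -d)
cd "$sandbox" || exit 1
printf 'some text to compress\n' > data.txt
original=$(cat data.txt)

gzip data.txt

# data.txt should now be gone, replaced by data.txt.gz.
if [ -f data.txt ]; then plain_exists=yes; else plain_exists=no; fi
if [ -f data.txt.gz ]; then gz_exists=yes; else gz_exists=no; fi

# Decompress again; data.txt.gz is replaced by data.txt.
gzip -d data.txt.gz
roundtrip=$(cat data.txt)

echo "after gzip: data.txt present=$plain_exists, data.txt.gz present=$gz_exists"
echo "after gzip -d: $roundtrip"
```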
The mount command serves to attach the filesystem found on some device to
the filesystem tree. Conversely, the umount command will detach it again. The
file /etc/fstab lists devices and the mount points at which they are attached:
$ cat /etc/fstab
/dev/fd0 /mnt/floppy auto rw,user,noauto 0 0
/dev/hdc /mnt/cdrom iso9660 ro,user,noauto 0 0
In this case, the mount point for the floppy drive is /mnt/floppy and the
mount point for the CDROM is /mnt/cdrom. To access a floppy we can use:
$ mount /mnt/floppy
$ cd /mnt/floppy
$ ls (etc...)
To force all changed data to be written back to the floppy and to detach the
floppy disk from the filesystem, we use:
$ umount /mnt/floppy
mtools