Unix (officially trademarked as UNIX)
The Open Group, an industry standards consortium, owns the “Unix” trademark. Only
systems fully compliant with and certified according to the Single UNIX Specification
are qualified to use the trademark; others may be called "Unix system-like" or "Unix-
like" (though the Open Group disapproves of this term). However, the term "Unix" is
often used informally to denote any operating system that closely resembles the
trademarked system.
During the late 1970s and early 1980s, the influence of Unix in academic circles led to
large-scale adoption of Unix (particularly of the BSD variant, originating from the
University of California, Berkeley) by commercial startups, and to commercial Unix
variants such as Solaris, HP-UX and AIX. Today, in addition to certified Unix systems such as those
already mentioned, Unix-like operating systems such as Linux and BSD descendants
(FreeBSD, NetBSD, and OpenBSD) are commonly encountered. The term "traditional
Unix" may be used to describe a Unix or an operating system that has the characteristics
of either Version 7 Unix or UNIX System V.
Unix operating systems are widely used in both servers and workstations. The Unix
environment and the client–server program model were essential elements in the
development of the Internet and the reshaping of computing as centered in networks
rather than in individual computers.
Both Unix and the C programming language were developed by AT&T and distributed to
government and academic institutions, which led to both being ported to a wider variety
of machine families than any other operating system. As a result, Unix became
synonymous with "open systems".
Under Unix, the "operating system" consists of many utilities along with the
master control program, the kernel. The kernel provides services to start and stop
programs, handles the file system and other common "low level" tasks that most
programs share, and, perhaps most importantly, schedules access to hardware to avoid
conflicts if two programs try to access the same resource or device simultaneously. To
mediate such access, the kernel was given special rights on the system, leading to the
division between user-space and kernel-space.
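To make the division concrete, here is a minimal sketch in C of a user-space program asking the kernel for a service: the write system call traps into kernel space, and the kernel mediates access to whatever device or file sits behind the descriptor.

    /* user_write.c - a user-space program requesting kernel-mediated I/O.
     * write() traps into the kernel, which decides how the bytes reach the
     * terminal, file, or pipe behind file descriptor 1. */
    #include <string.h>
    #include <unistd.h>

    int main(void)
    {
        const char msg[] = "hello from user space\n";
        ssize_t n = write(STDOUT_FILENO, msg, strlen(msg));
        return (n == (ssize_t)strlen(msg)) ? 0 : 1;
    }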
The microkernel concept was introduced in an effort to reverse the trend towards larger
kernels and return to a system in which most tasks were completed by smaller utilities. In
an era when a "normal" computer consisted of a hard disk for storage and a data terminal
for input and output (I/O), the Unix file model worked quite well as most I/O was
"linear". However, modern systems include networking and other new devices. As
graphical user interfaces developed, the file model proved inadequate to the task of
handling asynchronous events such as those generated by a mouse, and in the 1980s non-
blocking I/O was added and the set of inter-process communication mechanisms was augmented
(sockets, shared memory, message queues, semaphores), and functionalities such as
network protocols were moved out of the kernel.
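As a rough illustration of the non-blocking I/O mentioned above, the sketch below marks standard input non-blocking with fcntl; a read that would otherwise wait returns immediately with EAGAIN so the program can attend to other events. The choice of standard input is only for illustration.

    /* nonblock.c - marking a descriptor non-blocking with fcntl(). */
    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        int flags = fcntl(STDIN_FILENO, F_GETFL, 0);
        if (flags == -1 || fcntl(STDIN_FILENO, F_SETFL, flags | O_NONBLOCK) == -1) {
            perror("fcntl");
            return 1;
        }

        char buf[256];
        ssize_t n = read(STDIN_FILENO, buf, sizeof buf);
        if (n >= 0)
            printf("read %zd bytes\n", n);
        else if (errno == EAGAIN || errno == EWOULDBLOCK)
            printf("no input ready; free to handle other events\n");
        else
            perror("read");
        return 0;
    }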
Components
See also: List of Unix programs
The Unix system is composed of several components that are normally packed together.
By including — in addition to the kernel of an operating system — the development
environment, libraries, documents, and the portable, modifiable source-code for all of
these components, Unix was a self-contained software system. This was one of the key
reasons it emerged as an important teaching and learning tool and has had such a broad
influence.
The inclusion of these components did not make the system large — the original V7
UNIX distribution, consisting of copies of all of the compiled binaries plus all of the
source code and documentation, occupied less than 10 MB and arrived on a single 9-track
magnetic tape. The printed documentation, typeset from the on-line sources, was
contained in two volumes.
The names and filesystem locations of the Unix components have changed substantially
across the history of the system. Nonetheless, the V7 implementation is considered by
many to have the canonical early structure:
The 'man' command can display a manual page for any command on the system,
including itself.
• Documentation — Unix was the first operating system to include all of its
documentation online in machine-readable form. The documentation included:
o man — manual pages for each command, library component, system call,
header file, etc.
o doc — longer documents detailing major subsystems, such as the C
language and troff
Impact
See also: Unix-like
The Unix system had a significant impact on other operating systems. It owed its success
to several factors:
• Direct, interactive use.
• Moving away from the total control of vendors like IBM and DEC.
• AT&T being willing to give the software away for free.
• Running on cheap hardware.
• Being easy to adapt and move to different machines.
It was written in a high-level language rather than assembly language (which had been
thought necessary for systems implementation on early computers). Although this
followed the lead of Multics and Burroughs, it was Unix that popularized the idea.
Unix had a drastically simplified file model compared to many contemporary operating
systems, treating all kinds of files as simple byte arrays. The file system hierarchy
contained machine services and devices (such as printers, terminals, or disk drives),
providing a uniform interface, but at the expense of occasionally requiring additional
mechanisms such as ioctl and mode flags to access features of the hardware that did not
fit the simple "stream of bytes" model. The Plan 9 operating system pushed this model
even further and eliminated the need for additional mechanisms.
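For example, a terminal's window size is a property of the device rather than part of its byte stream, so on most Unix systems it is queried with an ioctl request (TIOCGWINSZ) instead of read or write. A minimal sketch:

    /* winsize.c - using ioctl() where the "stream of bytes" model falls short. */
    #include <stdio.h>
    #include <sys/ioctl.h>
    #include <unistd.h>

    int main(void)
    {
        struct winsize ws;
        if (ioctl(STDOUT_FILENO, TIOCGWINSZ, &ws) == -1) {
            perror("ioctl");    /* e.g. standard output is not a terminal */
            return 1;
        }
        printf("terminal is %u rows by %u columns\n",
               (unsigned)ws.ws_row, (unsigned)ws.ws_col);
        return 0;
    }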
Unix also popularized the hierarchical file system with arbitrarily nested subdirectories,
originally introduced by Multics. Other common operating systems of the era had ways to
divide a storage device into multiple directories or sections, but they had a fixed number
of levels, often only one level. Several major proprietary operating systems eventually
added recursive subdirectory capabilities also patterned after Multics. DEC's RSX-11M's
"group, user" hierarchy evolved into VMS directories, CP/M's volumes evolved into MS-
DOS 2.0+ subdirectories, and HP's MPE group.account hierarchy and IBM's SSP and
OS/400 library systems were folded into broader POSIX file systems.
A fundamental simplifying assumption of Unix was its focus on ASCII text for nearly all
file formats. There were no "binary" editors in the original version of Unix — the entire
system was configured using textual shell command scripts. The common denominator in
the I/O system was the byte — unlike "record-based" file systems. The focus on text for
representing nearly everything made Unix pipes especially useful, and encouraged the
development of simple, general tools that could be easily combined to perform more
complicated ad hoc tasks. The focus on text and bytes made the system far more scalable
and portable than other systems. Over time, text-based applications have also proven
popular in application areas, such as printing languages (PostScript, ODF), and at the
application layer of the Internet protocols, e.g., FTP, SMTP, HTTP, SOAP and SIP.
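The sketch below shows, in C, roughly what the shell does to combine two simple tools, here the equivalent of piping ls into wc -l: a pipe carries the text stream from one program's standard output to the other's standard input.

    /* pipeline.c - wiring two text tools together with a pipe. */
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        int fd[2];
        if (pipe(fd) == -1) { perror("pipe"); return 1; }

        pid_t pid = fork();
        if (pid == -1) { perror("fork"); return 1; }

        if (pid == 0) {                     /* child: runs "ls" */
            dup2(fd[1], STDOUT_FILENO);     /* stdout -> pipe write end */
            close(fd[0]);
            close(fd[1]);
            execlp("ls", "ls", (char *)NULL);
            perror("execlp ls");
            _exit(127);
        }

        /* parent: runs "wc -l" reading from the pipe */
        dup2(fd[0], STDIN_FILENO);          /* stdin <- pipe read end */
        close(fd[0]);
        close(fd[1]);
        execlp("wc", "wc", "-l", (char *)NULL);
        perror("execlp wc");
        return 127;
    }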
Unix popularized a syntax for regular expressions that found widespread use. The Unix
programming interface became the basis for a widely implemented operating system
interface standard (POSIX).
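A brief sketch of the POSIX regular expression interface (regcomp and regexec); the pattern and sample strings are arbitrary examples.

    /* regex_demo.c - matching with POSIX extended regular expressions. */
    #include <regex.h>
    #include <stdio.h>

    int main(void)
    {
        regex_t re;
        /* One or more digits followed by an optional KB or MB suffix. */
        if (regcomp(&re, "^[0-9]+(KB|MB)?$", REG_EXTENDED | REG_NOSUB) != 0) {
            fprintf(stderr, "bad pattern\n");
            return 1;
        }

        const char *samples[] = { "4096", "10MB", "ten" };
        for (int i = 0; i < 3; i++)
            printf("%-6s %s\n", samples[i],
                   regexec(&re, samples[i], 0, NULL, 0) == 0 ? "matches" : "no match");

        regfree(&re);
        return 0;
    }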
The C programming language soon spread beyond Unix, and is now ubiquitous in
systems and applications programming.
Early Unix developers were important in bringing the concepts of modularity and
reusability into software engineering practice, spawning a "software tools" movement.
The Unix policy of extensive on-line documentation and (for many years) ready access to
all system source code raised programmer expectations, and contributed to the 1983
launch of the free software movement.
Over time, the leading developers of Unix (and programs that ran on it) established a set
of cultural norms for developing software, norms which became as important and
influential as the technology of Unix itself; this has been termed the Unix philosophy.
Kinds of Memory:
• Main - The physical Random Access Memory located on the CPU motherboard
that most people think of when they talk about RAM. Also called Real Memory.
This does not include processor caches, video memory, or other peripheral
memory.
• File System - Disk memory accessible via pathnames. This does not include raw
devices, tape drives, swap space, or other storage not addressable via normal
pathnames. It does include all network file systems.
• Swap Space - Disk memory used to hold data that is not in Real or File System
memory. Swap space is most efficient when it is on a separate disk or partition,
but sometimes it is just a large file in the File System.
OS Memory Uses:
• Data - Memory allocated and used by the program (usually via malloc, new, or
similar runtime calls).
• Stack - The program's execution stack (managed by the OS).
• Mapped - File contents addressable within the process memory space.
The amount of memory available for processes is at least the size of Swap, minus
Kernel. On more modern systems (since around 1994) it is at least Main plus Swap
minus Kernel and may also include any files via mapping.
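The sketch below touches all three uses from a single C program: heap data obtained with malloc, an automatic array on the stack, and a mapped region obtained with mmap (an anonymous mapping here; some older systems spell the flag MAP_ANON rather than MAP_ANONYMOUS).

    /* memory_uses.c - data, stack, and mapped memory in one process. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/mman.h>

    int main(void)
    {
        /* Data: allocated from the heap at run time. */
        char *heap = malloc(4096);
        if (heap == NULL) return 1;
        strcpy(heap, "heap data");

        /* Stack: automatic storage managed by the OS as the stack grows. */
        char stack[64] = "stack data";

        /* Mapped: pages obtained directly from the kernel; mapping a file
         * would pass a file descriptor instead of MAP_ANONYMOUS and -1. */
        char *mapped = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (mapped == MAP_FAILED) return 1;
        strcpy(mapped, "mapped data");

        printf("%s | %s | %s\n", heap, stack, mapped);

        munmap(mapped, 4096);
        free(heap);
        return 0;
    }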
Swapping
Virtual memory is divided up into pages, chunks that are usually either 4096 or 8192
bytes in size. The memory manager considers pages to be the atomic (indivisible) unit of
memory. For the best performance, we want each page to be accessible in Main memory
as it is needed by the CPU. When a page is not needed, it does not matter where it is
located.
The collection of pages which a process is expected to use in the very near future (usually
those pages it has used in the very near past, see the madvise call) is called its resident
set. (Some OSs consider all the pages currently located in main memory to be the
resident set, even if they aren't being used.) The process of moving some pages out of
main memory and moving others in, is called swapping. (For the purposes of this
discussion, disk caching activity is included in this notion of swapping, even though it is
generally considered a separate activity.)
A page fault occurs when the CPU tries to access a page that is not in main memory,
thus forcing the CPU to wait for the page to be swapped in. Since moving data to and
from disks takes a significant amount of time, the goal of the memory manager is to
minimize the number of page faults.
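One way to observe page faults from within a program is the getrusage call, which reports minor faults (serviced without disk I/O) and major faults (which had to wait for a page to be read from disk). A minimal sketch; the 8 MB allocation is an arbitrary size chosen only to force some faults.

    /* faults.c - reporting a process's own page-fault counts. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/resource.h>

    int main(void)
    {
        /* Touch a few megabytes so some faults are sure to occur. */
        size_t len = 8 * 1024 * 1024;
        char *p = malloc(len);
        if (p == NULL) return 1;
        memset(p, 0, len);

        struct rusage ru;
        if (getrusage(RUSAGE_SELF, &ru) == -1) return 1;
        printf("minor faults: %ld, major faults: %ld\n",
               ru.ru_minflt, ru.ru_majflt);

        free(p);
        return 0;
    }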
What happens to each kind of page when it is swapped out:
• Kernel - Never swapped out.
• Cache - Page is discarded.
• Data - Moved to swap space.
• Stack - Moved to swap space.
• Mapped - Moved to originating file if changed and shared; moved to swap space if
changed and private.
It is important to note that swapping itself does not necessarily slow down the computer.
Performance is only impeded when a page fault occurs. At that time, if memory is
scarce, a page of main memory must be freed for every page that is needed. If a page that
is being swapped out has changed since it was last written to disk, it can't be freed from
main memory until the changes have been recorded (either in swap space or a mapped
file).
Writing a page to disk need not wait until a page fault occurs. Most modern UNIX
systems implement preemptive swapping, in which the contents of changed pages are
copied to disk during times when the disk is otherwise idle. The page is also kept in main
memory so that it can be accessed if necessary. But, if a page fault occurs, the system
can instantly reclaim the preemptively swapped pages in only the time needed to read in
the new page. This saves a tremendous amount of time since writing to disk usually
takes two to four times longer than reading. Thus preemptive swapping may occur even
when main memory is plentiful, as a hedge against future shortages.
Since it is extremely rare for all (or even most) of the processes on a UNIX system to be
in use at once, most of virtual memory may be swapped out at any given time without
significantly impeding performance. If the activation of one process occurs at a time
when another is idle, they simply trade places with minimum impact. Performance is only
significantly affected when more memory is needed at once than is available. This is
discussed more below.
Mapped Files
The subject of file mapping deserves special attention simply because most people, even
experienced programmers, never have direct experience with it and yet it is integral to the
function of modern operating systems. When a process maps a file, a segment of its
virtual memory is designated as corresponding to the contents of the given file.
Retrieving data from those memory addresses actually retrieves the data from the file.
Because the retrieval is handled transparently by the OS, it is typically much faster and
more efficient than the standard file access methods. (See the manual page mmap and its
associated system calls.)
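A minimal sketch of such a mapping in C: the file is opened, mapped read-only, and printed by touching the mapped addresses rather than by calling read. The pathname /etc/hosts is just an example of a small file present on most Unix systems.

    /* map_file.c - reading a file through a memory mapping. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("/etc/hosts", O_RDONLY);
        if (fd == -1) { perror("open"); return 1; }

        struct stat st;
        if (fstat(fd, &st) == -1 || st.st_size == 0) { close(fd); return 1; }

        /* The file's contents now appear at addresses starting at p;
         * touching them triggers page faults that pull data from the file. */
        char *p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p == MAP_FAILED) { perror("mmap"); close(fd); return 1; }

        fwrite(p, 1, st.st_size, stdout);   /* print the file via the mapping */

        munmap(p, st.st_size);
        close(fd);
        return 0;
    }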
In general, if multiple processes map and access the same file, the same real memory and
swap pages will be shared among all the processes. This allows multiple programs to
share data without having to maintain multiple copies in memory.
The primary use for file mapping is for the loading of executable code. When a program
is executed, one of the first actions is to map the program executable and all of its shared
libraries into the newly created virtual memory space. (Some systems let you see this
effect by using trace, ktrace, truss, or strace, depending on which UNIX you have. Try
this on a simple command like ls and notice the multiple calls to mmap.)
As the program begins execution, it generates page faults, forcing the machine instructions to be
loaded into memory as they are needed. Multiple invocations of the same executable, or
programs which use the same code libraries, will share the same pages of real memory.
What happens when a process attempts to change a mapped page depends upon the
particular OS and the parameters of the mapping. Executable pages are usually mapped
"read-only" so that a segmentation fault occurs if they are written to. Pages mapped as
"shared" will have changes marked in the shared real memory pages and eventually will
be written back to the file. Those marked as "private" will have private copies created in
swap space as they are changed and will not be written back to the originating file.
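The sketch below illustrates the "private" case: a file is mapped with MAP_PRIVATE, a write to the mapping triggers a copy into a private page, and the file itself is left unchanged. The file name demo.txt is a hypothetical scratch file used only for the demonstration.

    /* private_map.c - copy-on-write behavior of a MAP_PRIVATE mapping. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("demo.txt", O_RDWR | O_CREAT | O_TRUNC, 0600);
        if (fd == -1) return 1;
        if (write(fd, "original", 8) != 8) return 1;

        char *p = mmap(NULL, 8, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
        if (p == MAP_FAILED) return 1;

        p[0] = 'O';                       /* forces a private copy of the page */

        char buf[8];
        if (pread(fd, buf, 8, 0) != 8) return 1;
        printf("mapping sees: %.8s, file still holds: %.8s\n", p, buf);

        munmap(p, 8);
        close(fd);
        return 0;
    }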
Some older operating systems (such as SunOS 4) always copy the contents of a mapped
file into the swap space. This limits the quantity of mapped files to the size of swap
space, but it means that if a mapped file is deleted or changed (by means other than a
mapping), the mapping processes will still have a clean copy. Other operating systems
(such as SunOS 5) only copy privately changed pages into swap space. This allows an
arbitrary quantity of mapped files, but means that deleting or changing such a file may
cause a bus error in the processes using it.
The total virtual memory depends on the degree to which processes use mapped files.
For data and stack space, the limitation is the amount of real and swap memory. On
some systems it is simply the amount of swap space, on others it is the sum of the two. If
mapped files are automatically copied into swap space, then they must also fit into swap
memory making that amount the limiting factor. But if mapped files act as their own
swap area, or if swap space is just a growable file in the file system, then the limit to the
amount of virtual memory that could be mapped onto them is the amount of hard drive
space available.
In practice, it is easy and cheap to add arbitrary amounts of swap space and thus virtual
memory. The real limiting factor on performance will be the amount of real memory.
Thus on most OSs there is no easy way to calculate how much memory is being used for
what. Some utility programs, like top or yamm, may be able to give you an idea, but their
numbers can be misleading since it's not possible to distinguish different types of use. Just
because top says you have very little memory free doesn't necessarily mean you need to
buy more. A better indication is to look for swap activity numbers. These may be
reported as "swap-outs", or "pageouts", and can usually be found using utilities like
"vm_stat". If this number is frequently increasing, then you may need more RAM.
One of the best and simplest indications of memory problems is to simply listen to the
hard drive. If there is less memory available than the total resident sets of all running
processes (after accounting for sharing), then the computer will need to be continuously
swapping. This non-stop disk activity is called thrashing and is an indication that there
are too many active processes or that more memory is needed. If there is just barely
enough memory for the resident sets but not enough for all virtual memory, then
thrashing will occur only when new programs are run or patterns of user interaction
change. If there is more real memory than virtual memory, then there will be plenty of
extra for disk caching and repeated launching of applications or access to files will
produce little or no disk sounds.
Conclusion
These days you can find high-quality RAM online from a number of inexpensive
sources. So unless you are strapped for cash (or configuring a lot of machines), when in
doubt favor the upgrade. Just make sure the RAM you buy is from a reputable vendor and
includes a lifetime warranty, as bad RAM can cause a mess of sometimes subtle system
stability problems.
If you are really interested in figuring out exactly how much RAM a system needs, run a
monitoring program like the ones described above and listen for swapping. If the
monitoring programs show very little free memory, and the pageout count is rising, and
you hear swapping when you switch focus between large applications, then you probably
need more RAM.
Finally, be aware that some operating systems are much better than others at managing
memory. Closed systems like Windows or Solaris are generally much less efficient than
the more modern open source systems such as Linux, *BSD, or Darwin.
Process Management describes how to control programs in UNIX
including how to start a job (program) and how to kill it.
A multitasking operating system may just switch between processes to give the
appearance of many processes executing concurrently or simultaneously, though in fact
only one process can be executing at any one time on a single-core CPU (unless using
multi-threading or other similar technology).[3]
It is usual to associate a single process with a main program, and 'daughter' ('child')
processes with any spin-off, parallel processes, which behave like asynchronous
subroutines. A process is said to own resources, of which an image of its program (in
memory) is one such resource. (Note, however, that in multiprocessing systems, many
processes may run off of, or share, the same reentrant program at the same location in
memory— but each process is said to own its own image of the program.)
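A minimal sketch of creating such a child process with fork: parent and child continue from the same point in the same program image, and the value fork returns tells each one which role it plays.

    /* fork_demo.c - creating a child ('daughter') process. */
    #include <stdio.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void)
    {
        pid_t pid = fork();
        if (pid == -1) { perror("fork"); return 1; }

        if (pid == 0) {
            printf("child:  pid %d, parent %d\n", (int)getpid(), (int)getppid());
        } else {
            printf("parent: pid %d, child  %d\n", (int)getpid(), (int)pid);
            wait(NULL);                   /* reap the child when it exits */
        }
        return 0;
    }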
Processes are often called tasks in embedded operating systems. The sense of 'process' (or
task) is 'something that takes up time', as opposed to 'memory', which is 'something that
takes up space'. (Historically, the terms 'task' and 'process' were used interchangeably, but
the term 'task' seems to be dropping from the computer lexicon.)
The above description applies to both processes managed by an operating system, and
processes as defined by process calculi.
If a process requests something for which it must wait, it will be blocked. When the
process is in the Blocked State, it is eligible for swapping to disk, but this is transparent
in a virtual memory system, where blocks of memory values may be really on disk and
not in main memory at any time. Note that even unused portions of active processes/tasks
(executing programs) are eligible for swapping to disk. All parts of an executing program
and its data do not have to be in physical memory for the associated process to be active.