
Unix (officially trademarked as UNIX, sometimes also written as UNIX with small caps) is a
computer operating system originally developed in 1969 by a group of AT&T employees at
Bell Labs, including Ken Thompson, Dennis Ritchie, Brian Kernighan, Douglas McIlroy, and
Joe Ossanna. Today's Unix systems are split into various branches, developed over time by
AT&T as well as various commercial vendors and non-profit organizations.

The Open Group, an industry standards consortium, owns the “Unix” trademark. Only
systems fully compliant with and certified according to the Single UNIX Specification
are qualified to use the trademark; others may be called "Unix system-like" or "Unix-
like" (though the Open Group disapproves of this term). However, the term "Unix" is
often used informally to denote any operating system that closely resembles the
trademarked system.

During the late 1970s and early 1980s, the influence of Unix in academic circles led to
large-scale adoption of Unix (particularly of the BSD variant, originating from the
University of California, Berkeley) by commercial startups, and to commercial Unix
versions such as Solaris, HP-UX, and AIX. Today, in addition to certified Unix systems
such as those already mentioned, Unix-like operating systems such as Linux and the BSD
descendants (FreeBSD, NetBSD, and OpenBSD) are commonly encountered. The term
"traditional Unix" may be used to describe an operating system that has the characteristics
of either Version 7 Unix or UNIX System V.

Unix operating systems are widely used in both servers and workstations. The Unix
environment and the client–server program model were essential elements in the
development of the Internet and the reshaping of computing as centered in networks
rather than in individual computers.

Both Unix and the C programming language were developed by AT&T and distributed to
government and academic institutions, which led to both being ported to a wider variety
of machine families than any other operating system. As a result, Unix became
synonymous with "open systems".

Unix was designed to be portable, multi-tasking and multi-user in a time-sharing
configuration. Unix systems are characterized by various concepts: the use of plain text
for storing data; a hierarchical file system; treating devices and certain types of inter-
process communication (IPC) as files; and the use of a large number of software tools,
small programs that can be strung together through a command line interpreter using
pipes, as opposed to using a single monolithic program that includes all of the same
functionality. These concepts are collectively known as the Unix philosophy.
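
As a hedged illustration of this philosophy, the following C sketch (assuming a POSIX
system; the programs ls and grep and the pattern "c" are only examples) does what a shell
does for a two-stage pipeline: it connects two small programs with a pipe rather than
building one monolithic program.

/* Sketch of what a shell does for "ls | grep c": two small programs
 * connected by a pipe.  Assumes a POSIX system. */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void) {
    int fd[2];
    if (pipe(fd) == -1) { perror("pipe"); exit(1); }

    if (fork() == 0) {              /* first child: producer */
        dup2(fd[1], STDOUT_FILENO); /* its stdout becomes the pipe's write end */
        close(fd[0]); close(fd[1]);
        execlp("ls", "ls", (char *)NULL);
        perror("execlp ls"); _exit(127);
    }
    if (fork() == 0) {              /* second child: consumer */
        dup2(fd[0], STDIN_FILENO);  /* its stdin becomes the pipe's read end */
        close(fd[0]); close(fd[1]);
        execlp("grep", "grep", "c", (char *)NULL);
        perror("execlp grep"); _exit(127);
    }
    close(fd[0]); close(fd[1]);     /* parent keeps neither end of the pipe */
    while (wait(NULL) > 0)          /* reap both children */
        ;
    return 0;
}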

Under Unix, the "operating system" consists of many of these utilities along with the
master control program, the kernel. The kernel provides services to start and stop
programs, handles the file system and other common "low level" tasks that most
programs share, and, perhaps most importantly, schedules access to hardware to avoid
conflicts if two programs try to access the same resource or device simultaneously. To
mediate such access, the kernel was given special rights on the system, leading to the
division between user-space and kernel-space.
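
A minimal sketch of that division, assuming a POSIX system: the program below runs
entirely in user space and asks the kernel, through the write() system call, to perform
the actual device access on its behalf.

/* User space asking the kernel for a service: write() traps into the
 * kernel, which mediates access to the underlying device (here, the
 * terminal attached to standard output). */
#include <unistd.h>

int main(void) {
    const char msg[] = "hello from user space\n";
    write(STDOUT_FILENO, msg, sizeof msg - 1);  /* fd 1 = standard output */
    return 0;
}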

The microkernel concept was introduced in an effort to reverse the trend towards larger
kernels and return to a system in which most tasks were completed by smaller utilities. In
an era when a "normal" computer consisted of a hard disk for storage and a data terminal
for input and output (I/O), the Unix file model worked quite well as most I/O was
"linear". However, modern systems include networking and other new devices. As
graphical user interfaces developed, the file model proved inadequate to the task of
handling asynchronous events such as those generated by a mouse, and in the 1980s non-
blocking I/O and the set of inter-process communication mechanisms was augmented
(sockets, shared memory, message queues, semaphores), and functionalities such as
network protocols were moved out of the kernel.
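
A hedged sketch of the non-blocking style added in that era, assuming a POSIX system
(the five-second timeout and the use of standard input are arbitrary choices): the
descriptor is marked non-blocking and select() is used to wait for it to become readable.

/* Non-blocking I/O with select(): wait up to 5 seconds for input on
 * standard input instead of blocking indefinitely in read(). */
#include <fcntl.h>
#include <stdio.h>
#include <sys/select.h>
#include <unistd.h>

int main(void) {
    int fd = STDIN_FILENO;
    fcntl(fd, F_SETFL, fcntl(fd, F_GETFL) | O_NONBLOCK);

    fd_set readfds;
    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);
    struct timeval tv = { .tv_sec = 5, .tv_usec = 0 };

    int ready = select(fd + 1, &readfds, NULL, NULL, &tv);
    if (ready > 0) {
        char buf[256];
        ssize_t n = read(fd, buf, sizeof buf);
        if (n > 0)
            printf("read %zd bytes\n", n);
    } else if (ready == 0) {
        printf("no input within 5 seconds\n");
    }
    return 0;
}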

Components
See also: List of Unix programs

The Unix system is composed of several components that are normally packed together.
By including — in addition to the kernel of an operating system — the development
environment, libraries, documents, and the portable, modifiable source-code for all of
these components, Unix was a self-contained software system. This was one of the key
reasons it emerged as an important teaching and learning tool and has had such a broad
influence.

The inclusion of these components did not make the system large — the original V7
UNIX distribution, consisting of copies of all of the compiled binaries plus all of the
source code and documentation, occupied less than 10 MB and arrived on a single 9-track
magnetic tape. The printed documentation, typeset from the on-line sources, was
contained in two volumes.

The names and filesystem locations of the Unix components have changed substantially
across the history of the system. Nonetheless, the V7 implementation is considered by
many to have the canonical early structure:

• Kernel — source code in /usr/sys, composed of several sub-components:


o conf — configuration and machine-dependent parts, including boot code
o dev — device drivers for control of hardware (and some pseudo-hardware)
o sys — operating system "kernel", handling memory management, process
scheduling, system calls, etc.
o h — header files, defining key structures within the system and important
system-specific invariables
• Development Environment — Early versions of Unix contained a development
environment sufficient to recreate the entire system from source code:
o cc — C language compiler (first appeared in V3 Unix)
o as — machine-language assembler for the machine
o ld — linker, for combining object files
o lib — object-code libraries (installed in /lib or /usr/lib). libc, the system
library with C run-time support, was the primary library, but there have
always been additional libraries for such things as mathematical functions
(libm) or database access. V7 Unix introduced the first version of the
modern "Standard I/O" library stdio as part of the system library. Later
implementations increased the number of libraries significantly.
o make — build manager (introduced in PWB/UNIX), for effectively
automating the build process
o include — header files for software development, defining standard
interfaces and system invariants
o Other languages — V7 Unix contained a Fortran-77 compiler, a
programmable arbitrary-precision calculator (bc, dc), and the awk
scripting language, and later versions and implementations contain many
other language compilers and toolsets. Early BSD releases included Pascal
tools, and many modern Unix systems also include the GNU Compiler
Collection as well as or instead of a proprietary compiler system.
o Other tools — including an object-code archive manager (ar), symbol-
table lister (nm), compiler-development tools (e.g. lex & yacc), and
debugging tools.
• Commands — Unix makes little distinction between commands (user-level
programs) for system operation and maintenance (e.g. cron), commands of
general utility (e.g. grep), and more general-purpose applications such as the text
formatting and typesetting package. Nonetheless, some major categories are:
o sh — The "shell" programmable command line interpreter, the primary
user interface on Unix before window systems appeared, and even
afterward (within a "command window").
o Utilities — the core tool kit of the Unix command set, including cp, ls,
grep, find and many others. Subcategories include:
▪ System utilities — administrative tools such as mkfs, fsck, and
many others.
▪ User utilities — environment management tools such as passwd,
kill, and others.
o Document formatting — Unix systems were used from the outset for
document preparation and typesetting systems, and included many related
programs such as nroff, troff, tbl, eqn, refer, and pic. Some modern Unix
systems also include packages such as TeX and Ghostscript.
o Graphics — The plot subsystem provided facilities for producing simple
vector plots in a device-independent format, with device-specific
interpreters to display such files. Modern Unix systems also generally
include X11 as a standard windowing system and GUI, and many support
OpenGL.
o Communications — Early Unix systems contained no inter-system
communication, but did include the inter-user communication programs
mail and write. V7 introduced the early inter-system communication
system UUCP, and systems beginning with BSD release 4.1c included
TCP/IP utilities.

The 'man' command can display a manual page for any command on the system,
including itself.

• Documentation — Unix was the first operating system to include all of its
documentation online in machine-readable form. The documentation included:
o man — manual pages for each command, library component, system call,
header file, etc.
o doc — longer documents detailing major subsystems, such as the C
language and troff

Impact
See also: Unix-like

The Unix system had a significant impact on other operating systems. It owed its success
to:

• Direct interaction.
• Moving away from the total control of businesses like IBM and DEC.
• AT&T's willingness to give the software away for free.
• Running on cheap hardware.
• Being easy to adapt and move to different machines.

It was written in a high-level language rather than assembly language (which had been
thought necessary for systems implementation on early computers). Although this
followed the lead of Multics and Burroughs, it was Unix that popularized the idea.

Unix had a drastically simplified file model compared to many contemporary operating
systems, treating all kinds of files as simple byte arrays. The file system hierarchy
contained machine services and devices (such as printers, terminals, or disk drives),
providing a uniform interface, but at the expense of occasionally requiring additional
mechanisms such as ioctl and mode flags to access features of the hardware that did not
fit the simple "stream of bytes" model. The Plan 9 operating system pushed this model
even further and eliminated the need for additional mechanisms.
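
A small hedged example of such an escape hatch, assuming a POSIX-style system whose
terminal driver supports the common TIOCGWINSZ request: ioctl() is used to ask the device
a question that has no natural expression as a stream of bytes.

/* ioctl() as the escape hatch for device features outside the
 * byte-stream model: ask the terminal driver for its window size.
 * TIOCGWINSZ is widely available but still system-specific. */
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

int main(void) {
    struct winsize ws;
    if (ioctl(STDOUT_FILENO, TIOCGWINSZ, &ws) == 0)
        printf("terminal is %d rows by %d columns\n", ws.ws_row, ws.ws_col);
    else
        perror("ioctl TIOCGWINSZ");  /* not a terminal, or unsupported */
    return 0;
}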

Unix also popularized the hierarchical file system with arbitrarily nested subdirectories,
originally introduced by Multics. Other common operating systems of the era had ways to
divide a storage device into multiple directories or sections, but they had a fixed number
of levels, often only one level. Several major proprietary operating systems eventually
added recursive subdirectory capabilities also patterned after Multics. DEC's RSX-11M's
"group, user" hierarchy evolved into VMS directories, CP/M's volumes evolved into MS-
DOS 2.0+ subdirectories, and HP's MPE group.account hierarchy and IBM's SSP and
OS/400 library systems were folded into broader POSIX file systems.

Making the command interpreter an ordinary user-level program, with additional
commands provided as separate programs, was another Multics innovation popularized
by Unix. The Unix shell used the same language for interactive commands as for
scripting (shell scripts — there was no separate job control language like IBM's JCL).
Since the shell and OS commands were "just another program", the user could choose (or
even write) his own shell. New commands could be added without changing the shell
itself. Unix's innovative command-line syntax for creating chains of producer-consumer
processes (pipelines) made a powerful programming paradigm (coroutines) widely
available. Many later command-line interpreters have been inspired by the Unix shell.

A fundamental simplifying assumption of Unix was its focus on ASCII text for nearly all
file formats. There were no "binary" editors in the original version of Unix — the entire
system was configured using textual shell command scripts. The common denominator in
the I/O system was the byte — unlike "record-based" file systems. The focus on text for
representing nearly everything made Unix pipes especially useful, and encouraged the
development of simple, general tools that could be easily combined to perform more
complicated ad hoc tasks. The focus on text and bytes made the system far more scalable
and portable than other systems. Over time, text-based applications have also proven
popular in application areas, such as printing languages (PostScript, ODF), and at the
application layer of the Internet protocols, e.g., FTP, SMTP, HTTP, SOAP and SIP.

Unix popularized a syntax for regular expressions that found widespread use. The Unix
programming interface became the basis for a widely implemented operating system
interface standard (POSIX, see above).
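
A minimal sketch of the POSIX regular-expression interface (regcomp/regexec), which grew
out of that tradition; the pattern and input string below are purely illustrative.

/* POSIX regular expressions in C: compile a pattern, test a string. */
#include <regex.h>
#include <stdio.h>

int main(void) {
    regex_t re;
    const char *pattern = "^[a-z]+[0-9]*$";   /* illustrative pattern */
    const char *input = "unix7";

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0) {
        fprintf(stderr, "bad pattern\n");
        return 1;
    }
    if (regexec(&re, input, 0, NULL, 0) == 0)
        printf("\"%s\" matches %s\n", input, pattern);
    else
        printf("\"%s\" does not match %s\n", input, pattern);
    regfree(&re);
    return 0;
}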

The C programming language soon spread beyond Unix, and is now ubiquitous in
systems and applications programming.

Early Unix developers were important in bringing the concepts of modularity and
reusability into software engineering practice, spawning a "software tools" movement.

Unix provided the TCP/IP networking protocol on relatively inexpensive computers,
which contributed to the Internet explosion of worldwide real-time connectivity, and
which formed the basis for implementations on many other platforms. This also exposed
numerous security holes in the networking implementations.

The Unix policy of extensive on-line documentation and (for many years) ready access to
all system source code raised programmer expectations, and contributed to the 1983
launch of the free software movement.

Over time, the leading developers of Unix (and programs that ran on it) established a set
of cultural norms for developing software, norms which became as important and
influential as the technology of Unix itself; this has been termed the Unix philosophy.

Memory Management

How Much Memory?
Unlike traditional PC operating systems, Unix related systems use very sophisticated
memory management algorithms to make efficient use of memory resources. This makes
the questions "How much memory do I have?" and "How much memory is being used?"
rather complicated to answer. First you must realize that there are three different kinds of
memory, three different ways they can be used by the operating system, and three
different ways they can be used by processes.

Kinds of Memory:

• Main - The physical Random Access Memory located on the CPU motherboard
that most people think of when they talk about RAM. Also called Real Memory.
This does not include processor caches, video memory, or other peripheral
memory.
• File System - Disk memory accessible via pathnames. This does not include raw
devices, tape drives, swap space, or other storage not addressable via normal
pathnames. It does include all network file systems.
• Swap Space - Disk memory used to hold data that is not in Real or File System
memory. Swap space is most efficient when it is on a separate disk or partition,
but sometimes it is just a large file in the File System.

OS Memory Uses:

• Kernel - The Operating System's own (semi-)private memory space. This is
always in Main memory.
• Cache - Main memory that is used to hold elements of the File System and other
I/O operations. Not to be confused with the CPU cache or disk drive cache,
which are not part of main memory.
• Virtual - The total addressable memory space of all processes running on the
given machine. The physical location of such data may be spread among any of
the three kinds of memory.

Process Memory Uses:

• Data - Memory allocated and used by the program (usually via malloc, new, or
similar runtime calls).
• Stack - The program's execution stack (managed by the OS).
• Mapped - File contents addressable within the process memory space.

The amount of memory available for processes is at least the size of Swap, minus
Kernel. On more modern systems (since around 1994) it is at least Main plus Swap
minus Kernel and may also include any files via mapping.
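
As a hypothetical illustration: on a machine with 512 MB of Main memory, 1 GB of Swap,
and about 64 MB held by the Kernel, the older rule gives roughly 1 GB minus 64 MB of
memory for processes, while the more modern rule gives roughly 512 MB plus 1 GB minus
64 MB, before counting anything that can be backed by mapped files.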

Swapping
Virtual memory is divided up into pages, chunks that are usually either 4096 or 8192
bytes in size. The memory manager considers pages to be the atomic (indivisible) unit of
memory. For the best performance, we want each page to be accessible in Main memory
as it is needed by the CPU. When a page is not needed, it does not matter where it is
located.

The collection of pages which a process is expected to use in the very near future (usually
those pages it has used in the very near past, see the madvise call) is called its resident
set. (Some OSs consider all the pages currently located in main memory to be the
resident set, even if they aren't being used.) The process of moving some pages out of
main memory and moving others in, is called swapping. (For the purposes of this
discussion, disk caching activity is included in this notion of swapping, even though it is
generally considered a separate activity.)

A page fault occurs when the CPU tries to access a page that is not in main memory,
thus forcing the CPU to wait for the page to be swapped in. Since moving data to and
from disks takes a significant amount of time, the goal of the memory manager is to
minimize the number of page faults.

Where a page will go when it is "swapped-out" depends on how it is being used. In
general, pages are swapped out as follows:

Kernel — Never swapped out.
Cache — Page is discarded.
Data — Moved to swap space.
Stack — Moved to swap space.
Mapped — Moved to the originating file if changed and shared; moved to swap space if
changed and private.

It is important to note that swapping itself does not necessarily slow down the computer.
Performance is only impeded when a page fault occurs. At that time, if memory is
scarce, a page of main memory must be freed for every page that is needed. If a page that
is being swapped out has changed since it was last written to disk, it can't be freed from
main memory until the changes have been recorded (either in swap space or a mapped
file).
Writing a page to disk need not wait until a page fault occurs. Most modern UNIX
systems implement preemptive swapping, in which the contents of changed pages are
copied to disk during times when the disk is otherwise idle. The page is also kept in main
memory so that it can be accessed if necessary. But, if a page fault occurs, the system
can instantly reclaim the preemptively swapped pages in only the time needed to read in
the new page. This saves a tremendous amount of time since writing to disk usually
takes two to four times longer than reading. Thus preemptive swapping may occur even
when main memory is plentiful, as a hedge against future shortages.

Since it is extremely rare for all (or even most) of the processes on a UNIX system to be
in use at once, most of virtual memory may be swapped out at any given time without
significantly impeding performance. If the activation of one process occurs at a time
when another is idle, they simply trade places with minimal impact. Performance is only
significantly affected when more memory is needed at once than is available. This is
discussed more below.

Mapped Files
The subject of file mapping deserves special attention simply because most people, even
experienced programmers, never have direct experience with it and yet it is integral to the
function of modern operating systems. When a process maps a file, a segment of its
virtual memory is designated as corresponding to the contents of the given file.
Retrieving data from those memory addresses actually retrieves the data from the file.
Because the retrieval is handled transparently by the OS, it is typically much faster and
more efficient than the standard file access methods. (See the manual page mmap and its
associated system calls.)
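
A hedged sketch of the idea, assuming a POSIX system (the file /etc/hosts is just a
convenient readable file): the file's contents are mapped into the address space and then
read as ordinary memory.

/* Map a file and treat its bytes as ordinary memory: count its lines. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void) {
    int fd = open("/etc/hosts", O_RDONLY);   /* any readable file will do */
    if (fd == -1) { perror("open"); return 1; }

    struct stat st;
    fstat(fd, &st);

    char *p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }

    long lines = 0;                          /* pages are faulted in on demand */
    for (off_t i = 0; i < st.st_size; i++)
        if (p[i] == '\n')
            lines++;
    printf("%ld lines\n", lines);

    munmap(p, st.st_size);
    close(fd);
    return 0;
}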

In general, if multiple processes map and access the same file, the same real memory and
swap pages will be shared among all the processes. This allows multiple programs to
share data without having to maintain multiple copies in memory.

The primary use for file mapping is for the loading of executable code. When a program
is executed, one of the first actions is to map the program executable and all of its shared
libraries into the newly created virtual memory space. (Some systems let you see this
effect by using trace, ktrace, truss, or strace, depending on which UNIX you have. Try
this on a simple command like ls and notice the multiple calls to mmap.)

As the program begins execution, it page faults, forcing the machine instructions to be
loaded into memory as they are needed. Multiple invocations of the same executable, or
programs which use the same code libraries, will share the same pages of real memory.

What happens when a process attempts to change a mapped page depends upon the
particular OS and the parameters of the mapping. Executable pages are usually mapped
"read-only" so that a segmentation fault occurs if they are written to. Pages mapped as
"shared" will have changes marked in the shared real memory pages and eventually will
be written back to the file. Those marked as "private" will have private copies created in
swap space as they are changed and will not be written back to the originating file.

Some older operating systems (such as SunOS 4) always copy the contents of a mapped
file into the swap space. This limits the quantity of mapped files to the size of swap
space, but it means that if a mapped file is deleted or changed (by means other than a
mapping), the mapping processes will still have a clean copy. Other operating systems
(such as SunOS 5) only copy privately changed pages into swap space. This allows an
arbitrary quantity of mapped files, but means that deleting or changing such a file may
cause a bus error in the processes using it.

So, how much memory is there?


The total real memory is calculated by subtracting the kernel memory from the amount of
RAM. (Utilities like top or yamm can show you the total real memory available.)
Some of the rest may be used for caching, but process needs usually take priority over
cached data.

The total virtual memory depends on the degree to which processes use mapped files.
For data and stack space, the limitation is the amount of real and swap memory. On
some systems it is simply the amount of swap space, on others it is the sum of the two. If
mapped files are automatically copied into swap space, then they must also fit into swap
memory, making that amount the limiting factor. But if mapped files act as their own
swap area, or if swap space is just a growable file in the file system, then the limit to the
amount of virtual memory that could be mapped onto them is the amount of hard drive
space available.

In practice, it is easy and cheap to add arbitrary amounts of swap space and thus virtual
memory. The real limiting factor on performance will be the amount of real memory.

Okay then, how much memory is being used?


If no programs were sharing memory or mapping files, you could just add up their
resident sets to get the amount of real memory in use and their virtual memories to get the
amount of swap space in use. But shared memory means that the resident sets of multiple
processes may be counting the same real memory pages more than once. Likewise,
mapped files (on OSs that use them for swapping) will count toward a process' virtual
memory use but won't consume swap space. The issue is further confused by the fact that
the system will use any available RAM for disk caching. Since cache memory is low
priority and can be freed at any time, it doesn't impede process performance, but is
usually counted as "used".

Thus on most OSs there is no easy way to calculate how much memory is being used for
what. Some utility programs, like top or yamm, may be able to give you an idea, but their
numbers can be misleading since it is not possible to distinguish different types of use. Just
because top says you have very little memory free doesn't necessarily mean you need to
buy more. A better indication is to look for swap activity numbers. These may be
reported as "swap-outs", or "pageouts", and can usually be found using utilities like
"vm_stat". If this number is frequently increasing, then you may need more RAM.

One of the best and simplest indications of memory problems is to simply listen to the
hard drive. If there is less memory available than the total resident sets of all running
processes (after accounting for sharing), then the computer will need to be continuously
swapping. This non-stop disk activity is called thrashing and is an indication that there
are too many active processes or that more memory is needed. If there is just barely
enough memory for the resident sets but not enough for all virtual memory, then
thrashing will occur only when new programs are run or patterns of user interaction
change. If there is more real memory than virtual memory, then there will be plenty of
extra for disk caching and repeated launching of applications or access to files will
produce little or no disk sounds.

Not all use is good use


Before you go out and spend money on more memory, make sure to check that the
processes you are running aren't consuming more than their share of resources. Programs
like "top" can show you if an application or other process is using increasing amounts of
RAM over a long period of time. If this is occurring, especially while the program is
idle, then your problem may be a memory leak in the program, and more RAM would
just be filled up by the misbehaving application. Proprietary X servers are notorious for
consuming arbitrary amounts of virtual memory over time and should be restarted daily
(or replaced with an open source version). Also, clueless users or programmers may be
leaving large numbers of background processes lying around.

Conclusion
These days you can find high-quality RAM online from a number of inexpensive
sources. So unless you are strapped for cash (or configuring a lot of machines), when in
doubt, favor the upgrade. Just make sure the RAM you buy is from a reputable vendor and
includes a lifetime warranty, as bad RAM can cause a mess of sometimes subtle system
stability problems.

If you are really interested in figuring out exactly how much RAM a system needs, run a
monitoring program like the ones described above and listen for swapping. If the
monitoring programs show very little free memory, and the pageout count is rising, and
you hear swapping when you switch focus between large applications, then you probably
need more RAM.

Finally, be aware that some operating systems are much better than others at managing
memory. Closed systems like Windows or Solaris are generally much less efficient than
the more modern open source systems such as Linux, *BSD, or Darwin.

Process Management describes how to control programs in UNIX, including how to start a
job (program) and how to kill it.
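
A hedged C sketch of those two operations, assuming a POSIX system (the child program
sleep 60 and the two-second delay are arbitrary): fork() starts the job, exec() loads its
program, kill() terminates it, and waitpid() collects it.

/* Start a job, let it run briefly, then kill it and reap it. */
#include <signal.h>
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    pid_t pid = fork();
    if (pid == 0) {                      /* child: become "sleep 60" */
        execlp("sleep", "sleep", "60", (char *)NULL);
        _exit(127);                      /* reached only if exec fails */
    }
    printf("started job with pid %d\n", (int)pid);

    sleep(2);                            /* let the job run for a moment */
    kill(pid, SIGTERM);                  /* ask it to terminate */

    int status;
    waitpid(pid, &status, 0);            /* reap the child, collect its status */
    printf("job %d finished\n", (int)pid);
    return 0;
}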

A multitasking operating system may just switch between processes to give the
appearance of many processes executing concurrently or simultaneously, though in fact
only one process can be executing at any one time on a single-core CPU (unless using
multi-threading or other similar technology).[3]

It is usual to associate a single process with a main program, and 'daughter' ('child')
processes with any spin-off, parallel processes, which behave like asynchronous
subroutines. A process is said to own resources, of which an image of its program (in
memory) is one such resource. (Note, however, that in multiprocessing systems, many
processes may run off of, or share, the same reentrant program at the same location in
memory— but each process is said to own its own image of the program.)

Processes are often called tasks in embedded operating systems. The sense of 'process' (or
task) is 'something that takes up time', as opposed to 'memory', which is 'something that
takes up space'. (Historically, the terms 'task' and 'process' were used interchangeably, but
the term 'task' seems to be dropping from the computer lexicon.)

The above description applies to both processes managed by an operating system, and
processes as defined by process calculi.

If a process requests something for which it must wait, it will be blocked. When the
process is in the Blocked State, it is eligible for swapping to disk, but this is transparent
in a virtual memory system, where blocks of memory values may really be on disk and
not in main memory at any time. Note that even unused portions of active processes/tasks
(executing programs) are eligible for swapping to disk. Not all parts of an executing
program and its data have to be in physical memory for the associated process to be active.

______________________________

UNIX Command Summary


ls ................. show directory, in alphabetical order
logout ............. logs off system
mkdir .............. make a directory
rmdir .............. remove directory (rm -r to delete folders with files)
rm ................. remove files
cd ................. change current directory
man (command) ...... shows help on a specific command
talk (user) ........ pages user for chat - (user) is an email address
write (user) ....... write a user on the local system (control-c to end)
pico (filename) .... easy to use text editor to edit files
pine ............... easy to use mailer
more (file) ........ views a file, pausing every screenful

sz ................. send a file (to you) using zmodem


rz ................. receive a file (to the unix system) using zmodem

telnet (host) ...... connect to another Internet site


ftp (host) ......... connects to an FTP site
archie (filename) .. search the Archie database for a file on a FTP site
irc ................ connect to Internet Relay Chat
lynx ............... a textual World Wide Web browser
gopher ............. a Gopher database browser
tin, trn ........... read Usenet newsgroups

passwd ............. change your password


chfn ............... change your "Real Name" as seen on finger
chsh ............... change the shell you log into

grep ............... search for a string in a file


tail ............... show the last few lines of a file
who ................ shows who is logged into the local system
w .................. shows who is logged on and what they're doing
finger (emailaddr).. shows more information about a user
df ................. shows disk space available on the system
du ................. shows how much disk space is being used up by folders
chmod .............. changes permissions on a file
bc ................. a simple calculator

make ............... compiles source code


gcc (file.c) ....... compiles C source into a file named 'a.out'

gzip ............... best compression for UNIX files


zip ................ zip compression, compatible with DOS/Windows zip files
tar ................ combines multiple files into one or vice-versa
lharc, lzh, lha .... un-arc'ers, may not be on your system

dos2unix (file) (new) - strips CR's out of dos text files


unix2dos (file) (new) - adds CR's to unix text files
