OS Questions and Answer
1. Define Process.
A process is defined as:
A program in execution
An asynchronous activity
The "animated spirit" of a procedure
The "locus of control" of a procedure in execution
That which is manifested by the existence of a "process control block" in the operating system
That entity to which processors are assigned; the dispatchable unit
4. What is a PCB?
A PCB (process control block) is a data structure containing certain important information about the
process, including the following:
Current state of the process
Unique identification of the process
A pointer to the process's parent
A pointer to the process's child
The process's priority
Pointers to locate the process's memory and to allocated resources.
22. What are the conditions that must hold for a deadlock to occur (deadlock prevention ensures that at least one of them cannot hold)?
Mutual Exclusion Condition
Hold and Wait Condition
No Pre-emption Condition
Circular Wait Condition.
19. What are the different methods for allocation in a File System?
Contiguous Allocation
Linked Allocation
Indexed Allocation
44. Define Throughput.
It is defined as the number of requests serviced per unit time.
o Operating system – controls and coordinates use of hardware among various applications and
users
o Application programs – define the ways in which the system resources are used
to solve the computing problems of the users
Word processors, compilers, web browsers, database systems, video
games
o Users
People, machines, other computers
Depends on the point of view
Users want convenience, ease of use and good performance
o Don’t care about resource utilization
But shared computer such as mainframe or minicomputer must keep all users happy
Users of dedicated systems such as workstations have dedicated resources but frequently
use shared resources from servers
Handheld computers are resource poor, optimized for usability and battery life
Some computers have little or no user interface, such as embedded computers in
devices and automobiles
OS is a resource allocator
o Decides between conflicting requests for efficient and fair resource use
OS is a control program
"The one program running at all times on the computer" is the kernel.
Everything else is either
o a system program (ships with the operating system) , or
o an application program.
COMPUTER STARTUP
COMPUTER-SYSTEM OPERATION
One or more CPUs, device controllers connect through common bus providing access
to shared memory
Device controller informs CPU that it has finished its operation by causing
an interrupt
Interrupt transfers control to the interrupt service routine generally, through
the interrupt vector, which contains the addresses of all the service routines
Interrupt architecture must save the address of the interrupted instruction
A trap or exception is a software-generated interrupt caused either by an error or a
user request
An operating system is interrupt driven
INTERRUPT HANDLING
The OS preserves the state of the CPU by storing registers and the program counter
Determines which type of interrupt has occurred:
o polling
The interrupt controller polls (sends a signal out to) each device to
determine which one made the request
Separate segments of code determine what action should be taken for each type
of interrupt
INTERRUPT TIMELINE
I/O STRUCTURE
Synchronous (blocking) I/O
o System call – request to the OS to allow user to wait for I/O completion (polling
periodically to check busy/done)
o Device-status table contains entry for each I/O device indicating its
type, address, and state
STORAGE HIERARCHY
The basic unit of computer storage is the bit. A bit can contain one of two values, 0 and 1. All other
storage in a computer is based on collections of bits. Given enough bits, it is amazing how many
things a computer can represent: numbers, letters, images, movies, sounds, documents, and
programs, to name a few. A byte is 8 bits, and on most computers it is the smallest convenient chunk
of storage. For example, most computers don’t have an instruction to move a bit but do have one to
move a byte. A less common term is word, which is a given computer architecture’s native unit of
data. A word is made up of one or more bytes. For example, a computer that has
64-bit registers and 64-bit memory addressing typically has 64-bit (8-byte) words. A
computer executes many operations in its native word size rather than a byte at a time.
Computer storage, along with most computer throughput, is generally measured and
manipulated in bytes and collections of bytes.
STORAGE STRUCTURE
Main memory – only large storage media that the CPU can access directly
o Random access
o Typically volatile
Secondary storage – extension of main memory that provides large nonvolatile storage
capacity
Hard disks – rigid metal or glass platters covered with magnetic recording material
o Disk surface is logically divided into tracks, which are subdivided into sectors
o The disk controller determines the logical interaction between the device and
the computer
o Volatility
Important principle
o in hardware,
o operating system,
o software
o Efficiency
Faster storage (cache) checked first to determine if information is there
o If it is, information used directly from the cache (fast)
Typically used for I/O devices that generate data in blocks, or generate data fast
Device controller transfers blocks of data from buffer storage directly to
main memory without CPU intervention
Only one interrupt is generated per block, rather than the one interrupt per byte
HOW A MODERN COMPUTER SYSTEM WORKS
TYPES OF SYSTEMS
o Advantages include:
1. Increased throughput
2. Economy of scale
o Two types:
Asymmetric Multiprocessing – each processor is assigned a specific task
Symmetric Multiprocessing – each processor performs all tasks
SYMMETRIC MULTIPROCESSING ARCHITECTURE
Multicore
Software error (e.g., division by zero)
Request for operating system service
Other process problems include infinite loop, processes modifying each
other or the operating system
PROCESS MANAGEMENT
o Initialization data
Typically system has many processes, some user, some operating system
running concurrently on one or more CPUs
o Concurrency by multiplexing the CPUs among the processes / threads
ACTIVITIES
MEMORY MANAGEMENT
o Deciding which processes (or parts thereof) and data to move into and out
of memory
STORAGE MANAGEMENT
Varying properties include access speed, capacity, data-transfer rate,
access method (sequential or random)
File-System management
o Files usually organized into directories
o OS activities include
Creating and deleting files and directories
Primitives to manipulate files and directories
Mapping files onto secondary storage
Backup files onto stable (non-volatile) storage media
MASS STORAGE MANAGEMENT
Usually disks used to store
data that does not fit in main memory, or
data that must be kept for a "long" period of time
Proper management is of central importance
Entire speed of computer operation hinges on disk subsystem and its algorithms
Disk is slow, its I/O is often a bottleneck
OS activities
Free-space management
Storage allocation
Disk scheduling
Multitasking environments must be careful to use most recent value, no matter where it
is stored in the storage hierarchy
Multiprocessor environment must provide cache coherency in hardware such that all
CPUs have the most recent value in their cache
Distributed environment situation even more complex
Several copies of a datum can exist
I/O SUBSYSTEM
Systems generally first distinguish among users, to determine who can do what
o Access control for users and groups
COMPUTING ENVIRONMENTS
TRADITIONAL
MOBILE
Broadcast request for service and respond to requests for service via
discovery protocol
o Examples include Napster and Gnutella, Voice over IP (VoIP) such as Skype
Virtualization
Host OS, natively compiled for CPU
VMM - virtual machine manager
Creates and runs virtual machines
VMM runs guest OSes, also natively compiled for CPU
Applications run within these guest OSes
Example: Parallels for OS X running Win and/or Linux and their apps
Some VMMs run within a host OS
But, some act as a specialized OS
Example. VMware ESX: installed on hardware, runs when hardware boots,
provides services to apps, runs guest OSes
Vast and growing industry
Use cases
Developing apps for multiple different OSes on 1 PC
Very important for cloud computing
Executing and managing compute environments in data centers
Operating systems made available in source-code format rather than just
binary closed-source
Counter to the copy protection and Digital Rights Management (DRM) movement
Started by Free Software Foundation (FSF), which has the "copyleft" GNU General Public
License (GPL)
Examples include GNU/Linux and BSD UNIX (including core of Mac OS X), and
many more
Can use VMM like VMware Player (free on Windows), VirtualBox (open source and
free on many platforms - http://www.virtualbox.com)
Use to run guest operating systems for exploration
User services:
o User interface
No UI, Command-Line (CLI), Graphical User Interface (GUI), Batch
o Program execution - Loading a program into memory and running it, end
execution, either normally or abnormally (indicating error)
o I/O operations - A running program may require I/O, which may involve a file
or an I/O device
Communications may be via shared memory or through message passing
(packets moved by the OS)
May occur in the CPU and memory hardware, in I/O devices, in user
program
For each type of error, OS should take the appropriate action to ensure
correct and consistent computing
Debugging facilities can greatly enhance the user’s and programmer’s
abilities to efficiently use the system
System services:
o For ensuring the efficient operation of the system itself via resource sharing
o Accounting - To keep track of which users use how much and what kinds
of computer resources
Protection involves ensuring that all access to system resources is
controlled
Security of the system from outsiders requires user authentication,
extends to defending external I/O devices from invalid access attempts
CLI or command interpreter allows direct command entry
Typically, a number associated with each system call
The system call interface invokes the intended system call in OS kernel and returns status
of the system call and any return values
The caller need know nothing about how the system call is implemented
o Just needs to obey API and understand what OS will do as a result of the call
Managed by run-time support library (set of functions built into libraries
included with compiler)
o Parameters placed, or pushed, onto the stack by the program and popped off the
stack by the operating system
o Block and stack methods do not limit the number or length of parameters being
passed
TYPES OF SYSTEM CALLS
Process control
o end, abort
o load, execute
File management
o create file, delete file
Device management
o request device, release device
Information maintenance
o get time or date, set time or date
Communications
o create, delete communication connection
o send, receive messages if message passing model to host name or process name
From client to server
Protection
o Control access to resources
SYSTEM PROGRAMS
System programs provide a convenient environment for program development
and execution.
Most users’ view of the operating system is defined by system programs, not the
actual system calls
They can be divided into:
o File manipulation
rm, ls, cp, mv, etc in Unix
o Communications
o Background services
o Application programs
o File management - Create, delete, copy, rename, print, dump, list, and
generally manipulate files and directories
Status information
o Some ask the system for info - date, time, amount of available memory,
disk space, number of users
o Typically, these programs format and print the output to the terminal or
other output devices
File modification
o Text editors to create and modify files
o Special commands to search contents of files or perform transformations of the text
Program loading and execution- Absolute loaders, relocatable loaders, linkage editors,
and overlay-loaders, debugging systems for higher-level and machine language
o Allow users to send messages to one another’s screens, browse web pages, send
electronic-mail messages, log in remotely, transfer files from one machine to
another
Background Services
Some for system startup, then terminate
Some from system boot to shutdown
o Provide facilities like disk checking, process scheduling, error logging, printing
Application programs
o Don’t pertain to system
o Run by users
Much variation
o Now C, C++
o Main body in C
o Systems programs in C, C++, scripting languages like PERL, Python, shell scripts
o Disadvantages:
Efficiency can decrease (vs monolithic approach)
OPERATING SYSTEM STRUCTURES
SIMPLE STRUCTURE
MS-DOS was created to provide the most functionality in the least space
Not divided into modules
MS-DOS has some structure
o But its interfaces and levels of functionality are not well separated
o Systems programs
o Kernel
UNIX Kernel
o Consists of everything that is
below the system-call interface and
above the physical hardware
o Kernel provides
File system, CPU scheduling, memory management, and other
operating-system functions
This is a lot of functionality for just 1 layer
Rather monolithic
Disadvantages:
o can be hard to decide how to split functionality into layers
Kernighan’s Law: "Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are, by definition, not
smart enough to debug it."
PERFORMANCE TUNING
OS SYSGEN
SYSTEM BOOT
How is the OS loaded?
When power is initialized on system, execution starts at a predefined memory location
o Firmware ROM is used to hold initial bootstrap program (=bootstrap loader)
Bootstrap loader
Accounting information – CPU used, clock time elapsed since start, time limits
I/O status information – I/O devices allocated to process, list of open files
PROCESS SCHEDULING
Goal of multiprogramming:
o Ready queue – set of all processes residing in main memory, ready and
waiting to execute
Queuing diagram
o a common representation of process scheduling
SCHEDULERS
Scheduler – component that decides how processes are selected from these queues for
scheduling purposes
Long-term scheduler (or job scheduler)
On this slide - "LTS" (LTS is not a common notation)
In a batch system, more processes are submitted than can be executed in memory
They are spooled to disk
LTS selects which processes should be brought into the ready queue
LTS is invoked infrequently
(seconds, minutes) (may be slow, hence can use advanced algorithms)
LTS controls the degree of multiprogramming
The number of processes in memory
Processes can be described as either:
I/O-bound process
Spends more time doing I/O than computations, many short CPU bursts
CPU-bound process
Spends more time doing computations; few very long CPU bursts
LTS strives for good process mix
Short-term scheduler (or CPU scheduler)
o Selects 1 process to be executed next
Among ready-to-execute processes
Too many programs -> poor performance -> users quit
Key idea:
o Reduce the degree of multiprogramming by swapping
CONTEXT SWITCH
o The more complex the OS and the PCB, the longer the context switch
switch requires only changing pointer to the right set
OPERATIONS ON PROCESSES
process creation,
process termination,
PROCESS CREATION
UNIX examples
o fork() system call creates new process
Child is a copy of parent’s address space
PROCESS TERMINATION
Process executes last statement and then asks the operating system to delete it using the
exit() system call.
Parent may terminate the execution of children processes using the abort() system
call. Some reasons for doing so:
o Child has exceeded allocated resources
o The parent is exiting and the operating system does not allow a child to
continue if its parent terminates
o Some OSes don’t allow a child to exist if its parent has terminated
The parent process may wait for termination of a child process by using the wait() system
call.
o The call returns status information and the pid of the terminated process
o pid = wait(&status);
o Cooperating process can affect (or be affected by) the results of another process
o Computation speed-up
o Modularity
o Convenience
INTERPROCESS COMMUNICATION
o Shared memory
o Message passing
o Very important!
Producer process
o produces some information
o incrementally
Consumer process
o consumes this information
o as it becomes available
Challenge:
o Producer and consumer should run concurrently and efficiently
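#define BUFFER_SIZE 10
typedef struct {
...
} item;
item buffer[BUFFER_SIZE];
int in = 0;   /* index of the next free slot */
int out = 0;  /* index of the first full slot */
item next_produced;
while (true) {
    next_produced = ProduceItem();
    while (((in + 1) % BUFFER_SIZE) == out)
        ; /* buffer is full: do nothing */
    buffer[in] = next_produced;
    in = (in + 1) % BUFFER_SIZE;
}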
MESSAGE PASSING
o send(message)
o receive(message)
o Implementation issues:
o How many links can there be between every pair of communicating processes?
o What is the capacity (buffer size) of a link?
o Is the size of a message that the link can accommodate fixed or variable?
o Direct or indirect
o Synchronous or asynchronous
DIRECT COMMUNICATION
INDIRECT COMMUNICATION
Messages are directed and received from mailboxes (also referred to as ports)
Each mailbox has a unique id
Processes can communicate only if they share a mailbox
Properties of an indirect communication link
Link established only if processes share a common mailbox
A link may be associated with many processes
Each pair of processes may share several communication links
Link may be unidirectional or bi-directional
Operations
o create a new mailbox (port)
o destroy a mailbox
o Primitives are defined as:
SYNCHRONIZATION
message next_produced;
while (true) {
    ProduceItem(&next_produced);  /* fill in the next message */
    send(next_produced);
}
BUFFERING
2. Bounded capacity
3. Unbounded capacity
– infinite length
– Sender never waits
o Process may be changing common variables, updating table, writing file, etc
Assume that each process executes at a nonzero speed
No assumption concerning relative speed of the n processes
1. Mutual Exclusion
2. Progress
o Formal: If no process is executing in its critical section and there exist some
processes that wish to enter their critical section, then the selection of the processes
that will enter the critical section next cannot be postponed indefinitely
3. Bounded Waiting
o Formal: A bound must exist on the number of times that other processes are
allowed to enter their critical sections after a process has made a request to
enter its critical section and before that request is granted
Previous solutions are complicated and generally inaccessible to
application programmers
OS designers build software tools to solve critical section problem
Simplest is mutex lock
Protect a critical section by:
Semaphore: synchronization tool that provides more sophisticated ways (than mutex locks) for
processes to synchronize their activities.
wait(S) {
    while (S <= 0)
        ; // busy wait
    S--;
}
P1:
S1;
signal(synch);
P2:
wait(synch);
S2;
Little busy waiting
o But, applications may spend lots of time in critical sections
Hence, the busy-waiting approach is not a good solution
A waiting queue is associated with each semaphore
typedef struct {
int value;
struct process *list;  /* processes waiting on this semaphore */
} semaphore;
Two operations:
o block – place the process invoking the operation on the appropriate waiting queue
o wakeup – remove one of processes in the waiting queue and place it in the ready
queue
wait(semaphore *S) {
    S->value--;
    if (S->value < 0) {
        add this process to S->list;
        block();
    }
}
signal(semaphore *S) {
    S->value++;
    if (S->value <= 0) {
        remove a process P from S->list;
        wakeup(P);
    }
}
P0 P1
wait(S); wait(Q);
wait(Q); wait(S);
... ...
signal(S); signal(Q);
signal(Q); signal(S);
Priority Inversion
o A situation when a higher-priority process needs to wait for a lower-priority
process that holds a lock
o Bounded-Buffer Problem
o Dining-Philosophers Problem
MONITORS
monitor monitor-name
{
// shared variable declarations
function P1 (…) { … }
…
function Pn (…) { … }
initialization_code (…) { … }
}
o Encapsulation
Local variables accessed only via local functions
Local functions access only local vars and params
condition x, y;
o x.wait()
A process that invokes the operation is suspended (sleeps)
o x.signal()
Resumes one of processes (if any) that invoked x.wait()
If no x.wait() was called, then x.signal() has no effect
Issues with monitors: assume process P invokes x.signal() while process Q is suspended in x.wait()
Both Q and P cannot execute in parallel
Because they are within a monitor
Options include
o Signal and wait – P waits until Q either leaves the monitor or it waits for another
condition
o Signal and continue – Q waits until P either leaves the monitor or it waits for
another condition
CPU SCHEDULING
SCHEDULING LEVELS
High-Level Scheduling
Intermediate-Level Scheduling
See Medium-Term Scheduling from Chapter 3
Selects which jobs to temporarily suspend/resume to smooth fluctuations
in system load.
Dispatcher
o a module that gives control of the CPU to the process selected by the
short-term scheduler; this involves:
switching context
switching to user mode
jumping to the proper location in the user program to restart
that program
Dispatch latency
o Time it takes for the dispatcher to stop one process and start
another running
P1 24
P2 3
P3 3
P1 P2 P3
0 24 27 30
Waiting time for P1 = 0; P2 = 24; P3 = 27
FCFS
P2, P3, P1
P2 P3 P1
0 3 6 30
Convoy effect – when several short processes wait for a long process to get off the CPU
Assume
Execution:
o The long one occupies CPU
The short ones wait for it: no I/O is done at this stage
No overlap of I/O with CPU utilizations
SJF
Associate with each process the length of its next CPU burst
o SJF uses these lengths to schedule the process with the shortest time
Advantage:
o SJF is optimal in terms of the average waiting time
Challenge of SJF:
o Hinges on knowing the length of the next CPU burst
But how can we know it?
Solutions: ask user or estimate it
o In a short-term scheduling
Use estimation
Process  Arrival Time  Burst Time
P1       0.0           6
P2       2.0           8
P3       4.0           7
P4       5.0           3
P4 P1 P3 P2
0 3 9 16 24
Average waiting time = (3 + 16 + 9 + 0) / 4 = 7
PRIORITY
o Preemptive
o Nonpreemptive
ROUND ROBIN
If there are n processes in the ready queue and the time quantum is q,
then
o "Each process gets 1/n of the CPU time"
Incorrect statement from the textbook
P1 24
P2 3
P3 3
P1 P2 P3 P1 P1 P1 P1 P1
0 4 7 10 14 18 22 26 30
DEADLOCK
Deadlock – two or more processes are waiting indefinitely for an event that can be
caused by only one of the waiting processes
P0 P1
wait(S); wait(Q);
wait(Q); wait(S);
... ...
signal(S); signal(Q);
signal(Q); signal(S);
o Type: CPU
2 instances - CPU1, CPU2
o Type: Printer
3 instances - printer1, printer2, printer3
o Use resource
Operate on the resource, e.g. print on the printer
o Release resource
Mutexes and Semaphores
o Special case:
Each mutex or semaphore is treated as a separate resource type
Because a process would want to get not just "any" lock among a group of
locks, but a specific lock that guards a specific shared data type
o P0 -> P1 -> P2 -> … -> Pn–1 -> Pn -> P0
MAIN MEMORY
Program must be brought (from disk) into memory and placed within a process for it
to be run
Main memory and registers are only storage CPU can access directly
Memory unit only sees a stream of addresses + read requests, or address + data
and write requests
Register access in one CPU clock (or less)
Main memory can take many cycles, causing a stall
Cache sits between main memory and CPU registers
Protection of memory required to ensure correct operation
A pair of base and limit registers define the logical address space
CPU must check every memory access generated in user mode to be sure it is
between base and limit for that user
HARDWARE PROTECTION
ADDRESS BINDING
Programs on disk, ready to be brought into memory to execute form an input queue
o Without support, must be loaded into address 0000
Inconvenient to have first user process physical address always at 0000
o Execution time: Binding delayed until run time if the process can be moved
during its execution from one memory segment to another
Need hardware support for address maps (e.g., base and limit registers)
MULTISTEP PROCESSING
The concept of a logical address space that is bound to a separate physical address space
is central to proper memory management
Logical and physical addresses are the same in compile-time and load-time
address-binding schemes; logical (virtual) and physical addresses differ in execution-time
address-binding scheme
Logical address space is the set of all logical addresses generated by a program
Physical address space is the set of all physical addresses corresponding to these logical addresses
MMU
The user program deals with logical addresses; it never sees the real physical addresses
o Execution-time binding occurs when reference is made to location in memory
DYNAMIC ALLOCATION
DYNAMIC LINKING
Static linking – system libraries and program code combined by the loader into the
binary program image
SWAPPING
Backing store – fast disk large enough to accommodate copies of all memory images
for all users; must provide direct access to these memory images
Roll out, roll in – swapping variant used for priority-based scheduling algorithms;
lower-priority process is swapped out so higher-priority process can be loaded and executed
Major part of swap time is transfer time; total transfer time is directly proportional to the
amount of memory swapped
System maintains a ready queue of ready-to-run processes which have memory
images on disk
Does the swapped out process need to swap back in to same physical addresses?
Depends on address binding method
o Plus consider pending I/O to / from process memory space
Modified versions of swapping are found on many systems (e.g., UNIX, Linux,
and Windows)
o Swapping normally disabled
o Resident operating system, usually held in low memory with interrupt vector
Relocation registers used to protect user processes from each other, and from changing
operating-system code and data
o Base register contains value of smallest physical address
o Limit register contains range of logical addresses – each logical address must
be less than the limit register
o Can then allow actions such as kernel code being transient and kernel
changing size
FRAGMENTATION
External Fragmentation – total memory space exists to satisfy a request, but it is not
contiguous
Internal Fragmentation – allocated memory may be slightly larger than requested
memory; this size difference is memory internal to a partition, but not being used
First fit analysis reveals that given N blocks allocated, 0.5 N blocks lost to fragmentation
o 1/3 may be unusable -> 50-percent rule
Reduce external fragmentation by compaction
o Shuffle memory contents to place all free memory together in one large block
o I/O problem
Latch job in memory while it is involved in I/O
Do I/O only into OS buffers
function
method
object
common block
stack
symbol table
arrays
Segment table – maps two-dimensional logical addresses to one-dimensional physical addresses; each table entry has:
o base – contains the starting physical address where the segments reside
in memory
Protection
PAGING
o Page number (p) – used as an index into a page table which contains
base address of each page in physical memory
o Page offset (d) – combined with base address to define the physical memory
address that is sent to the memory unit
Solaris supports two page sizes – 8 KB and 4 MB
ASSOCIATIVE MEMORY
Associative memory – parallel search
Address translation (p, d)
Page # Frame #
Hit ratio = α
o Hit ratio – percentage of times that a page number is found in the
associative registers; ratio related to number of associative registers
Consider α = 80%, ε = 20ns for TLB search, 100ns for memory access
Effective Access Time (EAT), with ε the TLB search time and one memory access taken as 1 time unit
EAT = (1 + ε)α + (2 + ε)(1 – α)
= 2 + ε – α
MEMORY PROTECTION
Memory protection implemented by associating protection bit with each frame to indicate
if read-only or read-write access is allowed
o Can also add more bits to indicate page execute-only, and so on
o "invalid" indicates that the page is not in the process’ logical address space
Memory structures for paging can get huge using straight-forward methods
o If each entry is 4 bytes -> 4 MB of physical address space / memory for page
table alone
That amount of memory used to cost a lot
Don’t want to allocate that contiguously in main memory
Hierarchical Paging
Hashed Page Tables
Inverted Page Tables
64 BIT ARCHITECTURE
o If two level scheme, inner page tables could be 2^10 4-byte entries
o Address would look like
o But in the following example the 2nd outer page table is still 2^34 bytes in size
And possibly 4 memory access to get to one physical memory location
INTEL 32 BIT ARCHITECTURE
VIRTUAL MEMORY
VIRTUAL ADDRESS SPACE
Usually design logical address space for stack to start at Max logical address and grow
"down" while heap grows "up"
No physical memory needed until heap or stack grows to a given new page
Enables sparse address spaces with holes left for growth, dynamically linked
libraries, etc
System libraries shared via mapping into virtual address space
Shared memory by mapping pages read-write into virtual address space
Pages can be shared during fork(), speeding process creation
SHARED LIBRARY USING VIRTUAL MEMORY
DEMAND PAGING
Lazy swapper – never swaps a page into memory unless page will be needed
o Swapper that deals with pages is a pager
With swapping, pager guesses which pages will be used before swapping out again
Instead, pager brings in only those pages into memory
How to determine that set of pages?
During MMU address translation, if valid–invalid bit in page table entry is i -> page fault
If there is a reference to a page, first reference to that page will trap to operating system:
PAGE FAULT
7. Actually, a given instruction could access multiple pages -> multiple page faults
o Consider fetch and decode of instruction which adds 2 numbers from memory
and stores result back to memory
o Instruction restart
INSTRUCTION RESTART
block move
auto increment/decrement location
4. Check that the page reference was legal and determine the location of the page on
the disk
1. Wait in a queue for this device until the read request is serviced
8. Save the registers and process state for the other user
10. Correct the page table and other tables to show page is now in memory
12. Restore the user registers, process state, and new page table, and then resume
the interrupted instruction
PAGE REPLACEMENT
NEED
BASIC PAGE REPLACEMENT
3. Bring the desired page into the (newly) free frame; update the page and frame tables
4. Continue the process by restarting the instruction that caused the trap
ALGORITHMS
Page-replacement algorithm
o Want lowest page-fault rate on both first access and re-access
o Repeated access to the same page does not cause a page fault
In all our examples, the reference string of referenced page numbers is
7,0,1,2,0,3,0,4,2,3,0,3,0,3,2,1,2,0,1,7,0,1
FIFO
OPTIMAL ALGORITHM
LRU
o Read page into free frame and select victim to evict and add to free pool
Possibly, keep free frame contents intact and note what is in them
o If referenced again before reused, no need to load contents again from disk
o Generally useful to reduce penalty if wrong victim frame selected
Each process needs minimum number of frames
Example: IBM 370 – 6 pages to handle SS MOVE instruction:
o instruction is 6 bytes, might span 2 pages
o 2 pages to handle from
o fixed allocation
o priority allocation
Many variations
THRASHING
If a process does not have "enough" pages, the page-fault rate is very high
o Page fault to get page
Operating system thinking that it needs to increase the degree of
multiprogramming
Another process added to the system
FILE
o Data
numeric
character
binary
o Program
FILE ATTRIBUTES
OPERATIONS
Create
Write – at write pointer location
Read – at read pointer location
Reposition within file - seek
Delete
Truncate
Open(Fi) – search the directory structure on disk for entry Fi, and move the
content of entry to memory
Close (Fi) – move the content of entry Fi in memory to directory structure on disk
Several pieces of data are needed to manage open files:
o File pointer: pointer to last read/write location, per process that has the file
open
o Shared lock similar to reader lock – several processes can acquire concurrently
o Lines
o Fixed length
o Variable length
Complex Structures
o Formatted document
Can simulate last two with first method by inserting appropriate control characters
Who decides:
o Operating system
o Program
SEQUENTIAL
FILE ACCESS
ACCESS METHODS
Sequential Access
read next
write next
reset
(rewrite)
read n
write n
position to n
read next
write next
rewrite n
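The operations listed above can be sketched in Python on top of an in-memory byte stream. The class name DirectFile, the method names and the tiny block size are assumptions chosen only to make the example easy to follow. The sketch shows that direct access ("read n") amounts to "position to n" followed by the sequential "read next".

```python
import io

BLOCK_SIZE = 4  # hypothetical block size, kept tiny for readability


class DirectFile:
    """Illustrative file supporting sequential and direct (relative) access."""

    def __init__(self, data):
        self._stream = io.BytesIO(data)

    def read_next(self):
        """Sequential access: read the block at the current position."""
        return self._stream.read(BLOCK_SIZE)

    def reset(self):
        """Sequential access: rewind to the start of the file."""
        self._stream.seek(0)

    def position_to(self, n):
        """Direct access: move the file pointer to relative block n."""
        self._stream.seek(n * BLOCK_SIZE)

    def read(self, n):
        """Direct access: position to block n, then read it."""
        self.position_to(n)
        return self.read_next()


f = DirectFile(b"AAAABBBBCCCCDDDD")
print(f.read(2))      # b'CCCC'
print(f.read_next())  # b'DDDD'
f.reset()
print(f.read_next())  # b'AAAA'
```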
VMS operating system provides index and relative files as another example
DIRECTORY STRUCTURE
A collection of nodes containing information about all files
DISK STRUCTURE
TYPES
o objfs – interface into kernel memory to get kernel symbols for debugging
Grouping – logical grouping of files by properties, (e.g., all Java programs, all games, …)
Path name
Can have the same file name for different user
Efficient searching
No grouping capability
TREE STRUCTURED DIRECTORIES
Efficient searching
Grouping Capability
Current directory (working directory)
o cd /spell/mail/prog
o type list
o Solutions:
o Entry-hold-count solution
MOUNT POINT
FILE SHARING
Client-server model allows clients to mount remote file systems from servers
o Server can serve multiple clients
o Standard operating system file calls are translated into remote calls
Remote file systems add new failure modes, due to network failure, server failure
Recovery from failure can involve state information about status of each remote request
Stateless protocols such as NFS v3 include all information in each request, allowing
easy recovery but less security
PROTECTION
o Read
o Write
o Execute
o Append
o Delete
o List
RWX
RWX
c) public access: 1 = 001
Ask manager to create a group (unique name), say G, and add some users to the group.
For a particular file (say game) or subdirectory, define an appropriate access.
File structure
o Given commands like "read drive1, cylinder 72, track 2, sector 10, into
memory location 1060" outputs low-level hardware specific commands to
hardware controller
Basic file system given command like "retrieve block 123" translates to device driver
Also manages memory buffers and caches (allocation, freeing, replacement)
o Buffers hold data in transit
o Protection
Layering useful for reducing complexity and redundancy, but adds overhead and
can decrease performance
Translates file name into file number, file handle, location
by maintaining file control blocks (inodes in UNIX)
o Logical layers can be implemented by any coding method according to OS
designer
We have system calls at the API level, but how do we implement their
functions?
o On-disk and in-memory structures
Boot control block contains info needed by system to boot OS from that
volume
o Needed if volume contains OS, usually first block of volume
Volume control block (superblock, master file table) contains volume details
o Total # of blocks, # of free blocks, block size, free block pointers or array
Directory structure organizes the files
Per-file File Control Block (FCB) contains many details about the file
o inode number, permissions, size, dates
o Implementation can be one of many file systems types, or network file system
Implements vnodes which hold inodes or network file details
DIRECTORY IMPLEMENTATION
o Time-consuming to execute
Linear search time
Could keep ordered alphabetically via linked list or use B+ tree
• Hash Table – linear list with hash data structure
o Collisions – situations where two file names hash to the same location
• An allocation method refers to how disk blocks are allocated for files:
o No external fragmentation
• Logical view
• index table
• Need index table
• Random access
• Dynamic access without external fragmentation, but have overhead of index block
• Mapping from logical to physical in a file of maximum size of 256K bytes and block
size of 512 bytes. We need only 1 block for index table
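The arithmetic behind this mapping can be sketched as follows. A maximum file size of 256K and a block size of 512 give 512 index entries. The claim that one index block is enough holds only if a block can store 512 pointers, for example when sizes are counted in words and a pointer takes one word, which is an assumption here. The function name map_logical is hypothetical.

```python
BLOCK_SIZE = 512
MAX_FILE_SIZE = 256 * 1024


def map_logical(address):
    """Split a logical address into (index-table entry, offset within block)."""
    if not 0 <= address < MAX_FILE_SIZE:
        raise ValueError("address lies outside the file")
    return divmod(address, BLOCK_SIZE)


print(MAX_FILE_SIZE // BLOCK_SIZE)  # 512 index entries
print(map_logical(1060))            # (2, 36)
```

The quotient selects the entry in the index table, which holds the physical block number, and the remainder is the displacement inside that block.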
(number of bits per word) *(number of 0-value words) +offset of first 1 bit
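A minimal sketch of this block-number calculation for a bit-vector free list is given below. It assumes that a 1 bit marks a free block, that words are 8 bits wide, and that the most significant bit of each word corresponds to the lowest block number. None of these choices is stated in the notes, and the function name first_free_block is hypothetical.

```python
BITS_PER_WORD = 8  # assumed word size


def first_free_block(words):
    """Return the number of the first free block, or None if none is free.

    Each word is an int of BITS_PER_WORD bits; a 1 bit means "free".
    """
    for zero_words, word in enumerate(words):
        if word != 0:
            # Offset of the first 1 bit, counted from the most significant bit.
            offset = BITS_PER_WORD - word.bit_length()
            return BITS_PER_WORD * zero_words + offset
    return None


# Two all-zero words (16 allocated blocks), then 00100000.
print(first_free_block([0b00000000, 0b00000000, 0b00100000]))  # 18
```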
o Example:
• No waste of space
o Modify linked list to store address of next n-1 free blocks in first free block, plus
a pointer to next block that contains free-block-pointers (like this one)
• Counting
o Used in ZFS
Full
data structures like bit maps couldn’t fit in memory -> thousands of
I/Os
Replay log into that structure
Combine contiguous free blocks into single entry
• Efficiency dependent on:
PAGE CACHE
• A page cache caches pages rather than disk blocks using virtual memory
techniques and addresses
• Routine I/O through the file system uses the buffer (disk) cache
Contains processor, microcode, private memory, bus controller, etc
o Memory-mapped I/O
Device data and command registers mapped to processor address space
Especially for large address spaces (graphics)
UNIT V
Linodes run Linux. Linux is an operating system, just like Windows and Mac OS X. As an
operating system, Linux manages your Linode’s hardware and provides services your other
software needs to run.
Linux is a very hands-on operating system. If running Windows is like driving an automatic,
then running Linux is like driving a stick. It can take some work, but once you know your way
around Linux, you’ll be using the command line and installing packages like a pro. This
article aims to ease you into the world of Linux.
Everything on a Linux system is case-sensitive. That means that photo.jpg, photo.JPG, and
Photo.jpg are all different files. Usernames and passwords are also case-sensitive.
Linux, like Mac OS X, is based on the Unix operating system. A research team at AT&T’s
Bell Labs developed Unix in the late 1960s and early 1970s with a focus on creating an
operating system that would be accessible and secure for multiple users.
Corporations started licensing Unix in the 1980s and 1990s. By the late 1980s, there was interest
in building a free operating system that would be similar to Unix, but that could be tinkered with
and redistributed. In 1991, Linus Torvalds released the Linux kernel as free, open-source
software. Open source means that the code is fully visible, and can be modified and redistributed.
Strictly speaking, Linux is the kernel, not the entire operating system. The kernel provides an
interface between your Linode’s hardware and the input/output requests from applications. The
rest of the operating system usually includes many GNU libraries, utilities, and other software,
from the Free Software Foundation. The operating system as a whole is known as GNU/Linux.
Your Linode is a type of server. What’s a server? A server is a type of computer that provides
services over a network, or connected group of computers. When people think about servers,
they’re usually thinking of a computer that is:
Since a server is a type of computer, there are a lot of similarities between a Linode and
your home computer. Some important similarities include:
The physical machine: Your Linode is hosted on a physical machine. It’s sitting in one
of our data centers.
The operating system: As we mentioned in the introduction, Linodes use the Linux
operating system. It’s just another type of operating system like Windows or Mac OS X.
Applications: Just like you can install applications on your home computer or
smartphone, you can install applications on your Linode. These applications help
your Linode do things like host a website. WordPress is a popular website
application, for example. Applications are also known as software and programs.
Files and directories: In the end, whether it’s an application or a photo, everything on
your Linode is a file. You can create new files, edit and delete old ones, and navigate
through directories just like you would on your home computer. In Linux, folders are
called directories.
Internet access: Your Linode is connected to the Internet. That’s how you connect to it to
get everything set up, and how your users connect to it to view your website or download
your app.
SYSTEM ADMINISTRATION
Basic Configuration
These tips cover some of the basic steps and issues encountered during the beginning of system
configuration. We provide a general getting started guide for your convenience if you’re new
to Linode and basic Linux system administration. Additionally, you may find our
Introduction to Linux Concepts guide useful.
Please follow our instructions for setting your hostname. Issue the following commands to
make sure it is set properly:
hostname
hostname -f
The first command should show your short hostname, and the second should show your fully
qualified domain name (FQDN).
When setting the timezone of your server, it may be best to set it to the timezone of the bulk of
your users. If you’re unsure which timezone would be best, consider using Coordinated
Universal Time or UTC (roughly equivalent to Greenwich Mean Time).
By default, Linode base installs are set to Eastern Standard Time. The following process will
set the timezone manually, though many operating systems provide a more elegant method for
changing timezones. To change the time zone manually, you must find the proper zone file in
/usr/share/zoneinfo/ and link that file to /etc/localtime. See the example below for
common possibilities. Please note that all contents following the double hashes (i.e. ##) are
comments and should not be copied into your terminal.
ln -sf /usr/share/zoneinfo/UTC /etc/localtime ## for Universal Coordinated Time
ln -sf /usr/share/zoneinfo/EST /etc/localtime ## for Eastern Standard Time
ln -sf /usr/share/zoneinfo/US/Central /etc/localtime ## for American Central time (including DST)
To change the time zone in Debian and Ubuntu systems, issue the following command
and answer the questions as prompted by the utility:
dpkg-reconfigure tzdata
In Arch Linux, set the timezone in the /etc/rc.conf file by configuring the TIMEZONE= setting
in the "Localization" section. This line will resemble the following:
/etc/rc.conf
TIMEZONE="America/New_York"
Note that the string specified in TIMEZONE refers to the "zoneinfo" file located in or below
the /usr/share/zoneinfo/ directory.
The /etc/hosts file provides a list of IP addresses with corresponding hostnames. This allows
you to specify hostnames for an IP address once on the local machine, and then have multiple
applications connect to external resources via their hostnames. The system of hosts files predates
DNS, and by default hosts files are checked before DNS is queried. As a result, /etc/hosts can be
useful for maintaining small "internal" networks, for development purposes, and for managing
clusters.
Some applications require that the machine properly identify itself in the /etc/hosts file. As
a result, we recommend configuring the /etc/hosts file shortly after deployment. Here is an
example file:
/etc/hosts
You can specify a number of hostnames on each line separated by spaces. Every line must
begin with one and only one IP address. In this case, replace 12.34.56.78 with your machine’s
IP address. Let us consider a few additional /etc/hosts entries:
/etc/hosts
The second entry tells the system to look to 192.168.1.1 for the domain stick.example.com.
These kinds of host entries are useful for using "private" or "back channel" networks to
access other servers in a cluster without needing to access the public network.
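To make the file format concrete, the following Python sketch parses hosts-style text into a lookup table. The sample entries are hypothetical and only reuse the addresses and names mentioned in the surrounding text. The function name parse_hosts is an assumption, and real resolvers apply more rules than this sketch does.

```python
def parse_hosts(text):
    """Map each hostname to the IP address that starts its line."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            table.setdefault(name, ip)  # keep the first address seen for a name
    return table


SAMPLE = """\
127.0.0.1    localhost.localdomain localhost
12.34.56.78  squire.example.com squire   # hypothetical public address
192.168.1.1  stick.example.com
"""

hosts = parse_hosts(SAMPLE)
print(hosts["stick.example.com"])  # 192.168.1.1
print(hosts["squire"])             # 12.34.56.78
```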
Network Diagnostics
The following tips address the basic usage and functionality of a number of tools that you can
use to assess and diagnose network problems. If you suspect connectivity issues, including
output of the relevant commands in your support ticket can help our staff diagnose your
issue. This is particularly helpful in cases where networking issues are intermittent.
The ping command tests the connection between the local machine and a remote address
or machine. The following commands "ping" google.com and 74.125.67.100:
These commands send a bit of data (i.e. an ICMP packet) to the remote host, and wait for a
response. If the system is able to make a connection, for every packet it will report on the
"round trip time." Here is the output of four pings of google.com:
In this case yx-in-f100.1e100.net is the reverse DNS for this IP address. The time field
specifies, in milliseconds, how long the round trip takes for an individual packet. When you’ve
gathered the amount of information you need, send Control+C to interrupt the process. At this
juncture, you’ll be presented with some statistics. This will resemble:
Packet Loss, or the discrepancy between the number of packets sent and the number of
packets that return successfully.
Round Trip Time statistics on the final line report important information about all the ping
responses. For this ping we see that the fastest packet round trip took 33.89 milliseconds. The
longest packet took 53.28 milliseconds. The average round trip took 40.175 milliseconds. A
single standard deviation unit for these four packets is 7.67 milliseconds.
Use the ping tool to contact a server and ensure that you are able to make a connection.
Furthermore, ping is useful as an informal diagnostic tool to measure point-to-point network
latency, and as a network connection testing tool.
The traceroute command expands on the functionality of the ping command. traceroute
provides a report on the path that the packets take to get from the local machine to the remote
machine. Route information is useful when troubleshooting a networking issue: if there is
packet loss in one of the first few "hops" the problem is often related to the user’s local area
network (LAN) or Internet service provider (ISP). By contrast, if there is packet loss near the
end of the route, the problem may be caused by an issue with the server’s connection.
Often the hostnames and IP addresses on either side of a failed jump are useful in determining
who operates the machine where the routing error occurs. Failed jumps are designated by a line
with three asterisks (e.g. * * *).
The "mtr" command, like the traceroute tool, provides information about the route that
Internet traffic takes between the local system and a remote host. However, mtr provides
additional information about the round trip time for the packet. In a way, you can think of mtr
as a combination of traceroute and ping.
htop
You can quit at any time by pressing the F10 or Q keys. There are a couple of htop behaviors that
may not be initially intuitive. Take note of the following:
The memory utilization graph displays used memory, buffered memory, and cached memory.
The numbers displayed at the end of this graph reflect the total amount of memory available
and the total amount of memory on the system as reported by the kernel.
The default configuration of htop presents all application threads as independent processes,
which is non-intuitive. You can disable this by selecting the “setup” option with F2, then
“Display Options,” and then toggling the “Hide userland threads” option.
You can toggle a “Tree” view with the F5 key that usefully displays the processes in a hierarchy
and shows which processes were spawned by which other processes. This is helpful in
diagnosing a problem when you’re having trouble figuring out what processes are what.
If you’re new to administering systems and the Linux world, you might consider our "Tools
& Reference" section and articles including: "installing and using WinSCP," "using rsync to
synchronize files," and "using SSH and the terminal."
As always, if you are giving other users access to upload files to your server, it would be wise to
consider the security implications of all additional access that you grant to third parties seriously.
If you’re used to using an FTP client, OpenSSH (which is included and active with all of the Linode
provided installation templates) allows you to use an FTP-like interface over the SSH protocol.
Known as "SFTP," many clients support this protocol, including: "WinSCP" for Windows,
"Cyberduck" for Mac OS X, and "Filezilla" for Linux, OS X, and Windows desktops.
If you are accustomed to FTP, SFTP is a great option. Do note that by default, whatever access
a user has to a file system at the command line, they will also have over SFTP. Consider file
permissions very carefully.
Alternatively, you can use Unix utilities including scp and rsync to securely transfer files to your
Linode. On your local machine, a command to copy team-info.tar.gz would look like:
The command, scp, is followed by the name of the file on the local file system to be transferred.
Next is the username and hostname of the remote machine, separated by an "at" sign (i.e. @).
Following the hostname, there is a colon (i.e. :) and the path on the remote server where the file
should be uploaded to. Taken another way, this command would be:
The syntax of scp follows the form scp [source] [destination]. You can copy files from a
remote host to the local machine by reversing the order of the paths in the above example.
Because Linode servers are network accessible and often have a number of distinct users,
maintaining the security of files is often an important concern. We recommend you
familiarize yourself with our basic security guide. Furthermore, our documentation of access
control with user accounts and permissions may provide additional insight.
Only give users the permission to do what they need to. This includes application specific users.
Only run services on public interfaces that you are actively using. One common source of
security vulnerabilities is daemons that are left running and unused. This includes
database servers, HTTP development servers, and FTP servers.
Use SSH connections whenever possible to secure and encrypt the transfer of sensitive
information.
"Symbolic Linking," colloquially "sym linking," allows you to create an object in your
filesystem that points to another object on your filesystem. This is useful when you need to
provide users and applications access to specific files and directories without reorganizing
your folders. This way you can provide restricted users access to your web-accessible
directories without moving your DocumentRoot into their home directories.
ln -s /home/squire/config-git/etc-hosts /etc/hosts
This creates a link of the file etc-hosts at the location of the system’s /etc/hosts file. More
generically, this command would read:
ln -s [/path/to/target/file] [/path/to/location/of/sym/link]
The final term, the location of the link, is optional. If you opt to omit the link destination, a link
will be created in the current directory with the same name as the file you’re linking to.
When specifying the location of the link, ensure that path does not have a final trailing slash.
You can create a sym link that targets a directory, but sym links cannot terminate with slashes.
You may remove a symbolic link without affecting the target file.
You can use relative or absolute paths when creating a link.
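The behaviour described above can be tried without touching system files. The following Python sketch creates a link inside a temporary directory, reads through it, and then removes it. The file names are hypothetical, and os.symlink plays the role of ln -s here.

```python
import os
import tempfile


def symlink_demo():
    """Create a symbolic link, read through it, then remove only the link."""
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "etc-hosts")  # hypothetical target file
        link = os.path.join(tmp, "hosts")        # hypothetical link location
        with open(target, "w") as f:
            f.write("127.0.0.1 localhost\n")

        os.symlink(target, link)  # comparable to: ln -s target link
        is_link = os.path.islink(link)
        with open(link) as f:     # reading the link reads the target
            content = f.read()

        os.remove(link)           # removes the link, not the target
        target_survives = os.path.exists(target)
    return is_link, content, target_survives


print(symlink_demo())  # (True, '127.0.0.1 localhost\n', True)
```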
How to Manage and Manipulate Files on a Linux System
If you’re new to using Linux and manipulating files on the terminal interface we encourage
you to consider our using the terminal document. This tip provides an overview of basic file
management operations.
cp /home/squire/todo.txt /home/squire/archive/todo.01.txt
This copies todo.txt to an archive folder, and adds a number to the file name. If you want to
recursively copy all of the files and subdirectories in a directory to another directory, use the -R
option. This command looks like:
cp -R /home/squire/archive/ /srv/backup/squire.01/
mv /home/squire/archive/ /srv/backup/squire.02/
rm scratch.txt
This will delete the scratch.txt file from the current directory.
For more information about file system navigation and manipulation, please consider our
documentation of file system navigation in the using the terminal document.
Package Management
Contemporary Linux systems use package management tools to facilitate the installation and
maintenance of all software on your system. For more in-depth coverage of this topic, please
reference our package management guide.
While package management provides a number of powerful features, it is easy to obviate the
benefits of package management. If you install software manually without package
management tools, it becomes very difficult to keep your system up to date and to manage
complex dependencies. For these reasons, we recommend installing all software through
package management tools unless other means are absolutely necessary. The following tips
outline a couple of basic package management tasks.
Because packages are so easy to install, and often pull in a number of dependencies, it can be
easy to lose track of what software is installed on your system. The following commands
provide a list of installed packages on your system.
The following example presents a few relevant lines of the output of this command:
CentOS and Fedora systems provide the name of the package (e.g. SysVinit), the architecture
it was compiled for (e.g. i386), and the version of the build installed on the system (e.g. 2.86-
15.el5).
pacman -Q
This command provides a total list of all packages installed on the system. Arch also allows you
to filter these results to display only packages that were explicitly installed (with the -Qe option)
or that were installed as dependencies (with the -Qd option). The above command is thus the
union of the output of the following commands:
pacman -Qe
pacman -Qd
perl-www-mechanize 1.60-
perl-yaml 0.70-1
pkgconfig 0.23-1
procmail 3.22-2
python 2.6.4-1
rsync 3.0.6-1
Because there are often a large number of packages installed on any given system, the output of
these commands is often quite large. As a result, it is often useful to use tools like grep and less
to make these results more useful. For example, the command:
will return a list of all packages with the word python in their name or description. Similarly, the
following command:
dpkg -l | less
will return the same list as the plain dpkg -l; however, the results will appear in the
less pager, which allows you to search and scroll more easily.
You can append | grep "[string]" to these commands to filter package list results, or | less
to display the results in a pager, regardless of distribution.
Sometimes the name of a package doesn’t correspond to the name that you may associate with a
given piece of software. As a result, most package management tools provide an interface
to search the package database. These search tools may be helpful if you’re looking for a specific
piece of software but don’t know what it’s called.
This will search the local package database for a given term and generate a list with
brief descriptions. An excerpt of the output for apt-cache search python follows:
This provides information regarding the maintainer, the dependencies, the size, the homepage
of the upstream project, and a description of the software. This command can be used to
provide additional information about a package from the command line.
This generates a list of all packages available in the package database that match the given term.
See the following excerpt for an example of the output of yum search wget:
You can use the package management tools to discover more information about a specific
package. Use the following command to get a full record from the package database:
This output presents more in-depth information concerning the package, its dependencies,
origins, and purpose.
This will perform a search of the local copy of the package database. Here is an excerpt of
results for a search for "python":
extra/twisted 8.2.0-1
Asynchronous networking framework written in Python.
community/emacs-python-mode 5.1.0-1
Python mode for Emacs
The terms "extra" and "community" refer to which repository the software is located in. This
level of specificity is unnecessary when specifying packages to install or display more
information about. To request more information about a specific package issue a command in
the following form:
pacman -Si [package-name]
Running pacman with the -Si option generates the package’s record from the database. This
record includes information about dependencies, optional dependencies, package size, and a
brief description.
The first command only searches the database for package names. The second command
searches through the database for package names and descriptions. These commands will
allow you to search your local package tree (i.e. portage) for the specific package name or
term. The output of either command is similar to the excerpt presented below.
Searching...
[ Results for search key : wget ]
[ Applications found : 4 ]

* app-emacs/emacs-wget
Latest version available: 0.5.0
Latest version installed: [ Not Installed ]
Size of files: 36 kB
Homepage: http://pop-club.hp.infoseek.co.jp/emacs/emacs-wget/
Description: Wget interface for Emacs
License: GPL-2
Because the output provided by the emerge --search command is rather verbose, there is
no "show more information" tool, unlike other distributions’ tools. The emerge --search
command accepts input in the form of a regular expression if you need to narrow results even
further.
Since there are often a large number of results for package searches, these commands output a
great quantity of text. As a result it is often useful to use tools like grep and less to make these
results easier to scroll. For example, the command:
will return the subset of the list of packages which matched for the search term "python,"
and that mention xml in their name or short description. Similarly, the following command:
will return the same list as the plain apt-cache search python but the results will appear
in the less pager. This allows you to search and scroll more conveniently.
You can append | grep "[string]" to any of these commands to filter package search
results, or | less to display the results in the less pager, regardless of distribution.
Text Manipulation
Among Linux and UNIX-like systems, nearly all system configuration information is stored
and manipulated in plain text form. These tips provide some basic information regarding the
manipulation of text files on your system.
How to Search for a String in Files with grep
The grep tool allows you to search a stream of text, such as a file or the output of a
command, for a term or pattern matching a regular expression.
This will search the mail spool for subject lines (i.e. lines that begin with "Subject:") that
contain, after any number of characters, the word "help" in upper case, followed by any
number of additional characters. grep would then print these results on the terminal.
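The matching rule just described can be illustrated in Python. The pattern below is reconstructed from the description and is an assumption, as are the sample spool lines and the helper name grep. Python's re module stands in for grep's regular expressions, which behave the same way for this simple pattern.

```python
import re

# Assumed pattern: line starts with "Subject:", then anything, then "HELP".
PATTERN = re.compile(r"^Subject:.*HELP.*")


def grep(pattern, lines):
    """Return the lines that match the compiled pattern, as grep would print them."""
    return [line for line in lines if pattern.search(line)]


spool = [
    "From: a@example.com",
    "Subject: HELP with my server",
    "Subject: help wanted",
    "Subject: please HELP",
    "Re: Subject: HELP",
]
for line in grep(PATTERN, spool):
    print(line)
```

Only the second and fourth lines are printed: the lower-case "help" does not match, and the last line does not begin with "Subject:".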
grep provides a number of additional options that, if specified, force the program to output the
context for each match (e.g. with -C 2 for two lines of context). With -n, grep outputs the line
number of the match. With -H, grep prints the file name for each match, which is useful when
you "grep" a group of files or "grep" recursively through a file system (e.g. with -r). Consider
the output of grep --help for more options.
To grep a group of files, you can specify the file with a wildcard, as in the following example:
This will find and match against every occurrence of the word "morris," while ignoring
case (because of the option for -i). grep will search all files in the ~/org/ directory with a
.txt extension.
You can use grep to filter the results of another command that sends output to standard output
(i.e. stdout). This is accomplished by "piping" the output of one command "into grep." For
instance:
In this example, we assume that the /home/squire/data directory contains a large number of
files that have a UNIX time stamp in their file name. The above command will filter the output
to only display those files that have the four digits "1257" in their file name. Note, in these
cases grep only filters the output of ls and does not look into file contents. For more
information regarding grep consider the full documentation of the grep command.
While the grep tool is quite powerful for filtering text on the basis of regular expressions, if you
need to edit a file or otherwise manipulate the text you may use the sed tool. sed, or the Stream
EDitor, allows you to search for a regular expression pattern and replace it with another string.
sed is extremely powerful, and we recommend that you back up your files and test your
sed commands thoroughly before running them, particularly if you’re new to using sed.
Here is a very simple sed one-liner, intended to illustrate its syntax.
's/[regex]/[replacement]/'
To match literal slashes (e.g. /), you must escape them with a backslash (e.g. \). As a result, to
match a / character you would use \/ in the sed expression. If you are searching for a string
that has a number of slashes, you can replace the delimiter slashes with another character. For instance:
's|r/e/g/e/x|regex|'
This would strip the slashes from the string r/e/g/e/x so that this string would be regex after
running the sed command on the file that contains the string.
The following example, from our migrating a server to your Linode document, searches and
replaces one IP address with another. In this case 98.76.54.32 is replaced with 12.34.56.78:
sed -i 's/98\.76\.54\.32/12.34.56.78/' [file]
In the above example, period characters in the pattern are escaped as \. because in regular
expressions an unescaped full stop (period) matches any character. Here [file] stands for the file to edit in place.
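The effect of escaping the periods can be seen in a small Python sketch. Python's re.sub is used here in place of sed, and the sample text is hypothetical. The treatment of an unescaped period is the same in both tools.

```python
import re

text = "old 98.76.54.32 and lookalike 98x76y54z32"

# Escaped periods match only literal dots.
escaped = re.sub(r"98\.76\.54\.32", "12.34.56.78", text)

# Unescaped periods match any character, so the lookalike is replaced too.
unescaped = re.sub(r"98.76.54.32", "12.34.56.78", text)

print(escaped)    # old 12.34.56.78 and lookalike 98x76y54z32
print(unescaped)  # old 12.34.56.78 and lookalike 12.34.56.78
```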
Once again, sed is a very powerful and useful tool; however, if you are unfamiliar with it,
we strongly recommend testing your search and replace patterns before making any edit of
consequence. For more information about sed consider the full documentation of text
manipulation with sed.
In many Linode Library documents, you may be instructed to edit the contents of a file. To do
this, you need to use a text editor. Most of the distribution templates that Linode provides
come with an implementation of the vi/vim text editor and the nano text editor. These are
small, lightweight, and very powerful text editors that allow you to manipulate the text of a file
from the terminal environment.
There are other options for text editors, notably emacs and "zile." Feel free to install these
programs using your operating system’s package manager. Make sure you search your package
database so that you can install a version compiled without GUI components (i.e. X11).
To open a file, simply issue a command beginning with the name of the editor you wish to run
followed by the name of the file you wish to edit. Here are a number of example commands
that open the /etc/hosts file:
nano /etc/hosts
vi /etc/hosts
emacs /etc/hosts
zile /etc/hosts
When you’ve made edits to a file, you can save and exit the editor to return to the prompt. This
procedure varies between different editors. In emacs and zile, the key sequence is the same:
hold Control and type x and then s to save. This operation is typically notated "C-x C-s" and
then "C-x C-c" to close the editor. In nano, press Control-O (notated ^O) and confirm the file
name to write the file, and type ^X to exit from the program.
Since vi and vim are modal editors, their operation is a bit more complex. After opening a file in vi,
you can enter "insert" mode by pressing the "i" key; this will let you edit text in the conventional
manner. To save the file, you must exit into "normal" mode by pressing the escape key (Control-[
also sends escape), and then type :wq to write the file and quit the program.
This provides only the most basic outline of how to use these text editors, and there are
numerous external resources which will provide a more thorough introduction for more advanced
use of this software.
The following tips cover a number of basic web serving tasks and functions, as well as
some guidance for users new to the world of web servers.
Web servers work by listening on a TCP port, typically port 80 for "http" and port 443 for
"https." When a visitor makes a request for content, the server responds by delivering the
requested resource. Typically resources are specified with a URL that contains the protocol,
http or https; a colon and two slashes, ://; a hostname or domain, such as www.example.com or
squire.example.com; and the path to a file, such as /images/avatar.jpg or /index.html. Thus a
full URL would resemble: http://www.example.com/images/avatar.jpg .
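To make the parts of a URL concrete, here is a minimal shell sketch that splits one into protocol, hostname, and path. The function name split_url is made up for this illustration, and the sketch assumes the simple protocol://host/path form described above (no port, user name, or query string).

```shell
#!/bin/sh
# Split a URL into protocol, hostname, and path using POSIX parameter expansion.
# Assumes the simple form protocol://host/path; split_url is a hypothetical helper.
split_url() {
    url=$1
    protocol=${url%%://*}          # text before "://"
    rest=${url#*://}               # text after "://"
    host=${rest%%/*}               # everything up to the first slash
    case $rest in
        */*) path=/${rest#*/} ;;   # everything from the first slash onward
        *)   path=/ ;;             # no path given: default to the root
    esac
    printf '%s %s %s\n' "$protocol" "$host" "$path"
}

split_url "http://www.example.com/images/avatar.jpg"
```

A real application would normally rely on a URL-parsing library rather than string splitting; the point here is only to show where each component begins and ends.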
In order to provide these resources to connected users, your Linode needs to be running a web
server. There are multiple different HTTP servers and countless configurations to provide
support for various web development frameworks. The three most popular general use web
servers are the Apache HTTP server, the Lighttpd server ("Lighty"), and the nginx server ("Engine
X"). Each server has its strengths and weaknesses, and your choice depends largely on your
experience and the nature of your needs.
Once you’ve chosen a web server, you need to decide what (if any) scripting support you need to
install. Scripting support allows you to run dynamic content with your web server and program
server side scripts in languages such as Python, PHP, Ruby, and Perl.
If you need a full web application stack, we encourage you to consider one of our more full-
featured LAMP stack guides. If you need support for a specific web development framework,
consult our tutorials for installing and using specific web development frameworks.
The Apache HTTP Server is considered by many to be the de facto standard web server. It is the
most widely deployed open source web server, its configuration interface has been stable for
many years, and its modular architecture allows it to function in many different types of
deployments. Apache forms the foundation of the LAMP stack, and contains superb support for
integrating dynamic server-side applications into the web server.
By contrast, web servers like Lighttpd and nginx are highly optimized for serving static
content in an efficient manner. If you have a deployment where server resources are limited
and are facing a great deal of demand, consider one of these servers. They are very functional
and run very well with minimal systems resources. Lighttpd and nginx can be more complex to
set up than Apache, and can be difficult to configure with regards to integration with dynamic
content interpreters. Furthermore, as these servers are more directed at niche use cases, there
are more situations and applications which remain undocumented.
Finally the Cherokee web server provides a general purpose web server with an easy to
configure interface. Cherokee might be a good option for some basic deployments.
Remember that the choice of web servers is often contextually determined. Specific choices
depend on factors like: the type of content you want to serve, the demand for that content,
and your comfort with that software as an administrator.
Often, when there is something wrong with an Apache web server configuration or site, it is
difficult to determine the cause of the error from the behavior of the web server. There
are a number of common issues with which you might begin your troubleshooting efforts.
However, when more complex issues arise it is important to review the Apache error logs.
By default, error logs are located in the /var/log/apache2/error.log file. You can track or
"tail" this log with the following command:
tail -F /var/log/apache2/error.log
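When a log is busy, it can help to show only the more severe entries. The following is a rough sketch, not part of the original guide: apache_errors is a hypothetical helper, and it assumes the usual Apache line formats in which the level appears in brackets, either as [error] (2.2 style) or as [module:error] (2.4 style).

```shell
#!/bin/sh
# Print only lines at level "error" or more severe from an Apache error log.
# Matches "[error]" as well as "[core:error]"-style level tags.
# apache_errors is a hypothetical helper name for this sketch.
apache_errors() {
    grep -E '\[([a-z_0-9]+:)?(emerg|alert|crit|error)\]' "$1"
}

# Usage:
#   apache_errors /var/log/apache2/error.log
```

If your server uses a custom ErrorLogFormat, the pattern would need to be adjusted to match it.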
In the default virtual host configurations suggested in our Apache installation and LAMP
guides, we suggest the following error logging setup:
Where bucknell.net represents the name of your virtual host, and the location of relevant files.
These configuration directives make Apache create two log files that contain logging
information specific to that virtual host. This allows you to easily troubleshoot errors on specific
virtual hosts. To track or tail the error log, issue the following command:
tail -F /srv/www/example.com/logs/error.log
This will allow you to see new error messages as they appear. Often problems can be
diagnosed by using specific parts of an error message from an Apache log as a search term in a web
search engine (e.g., Google). Common errors to look for include:
CNAME DNS records make it possible to redirect requests for one hostname or domain to
another hostname or domain. This is useful in situations where you want to direct requests
for one domain to another, but don’t want to set up the web-server to handle requests.
CNAMEs are only valid when pointing from one domain to another. If you need to redirect a
full URL, you will need to set up a web server and configure redirection and/or virtual hosting
on the server level. CNAMEs will allow you to redirect subdomains, such as
team.example.com, to other subdomains or domains, such as jack.example.org. CNAMEs
must point to a valid domain that has a valid A record, or to another CNAME.
Although limited in their capabilities, CNAMEs can be quite useful in some situations. In
particular, if you need to change the hostname of a machine, CNAMEs are quite useful. To
learn how to set up CNAME records with the Linode Manager, consult our documentation of
the Linode DNS Manager.
When reading domain names, we commonly refer to parts before the main or first-level domain
as "sub-domains." For example, in the domain team.example.com, team is a sub-domain of
the root domain example.com.
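As a small illustration of this naming, the sketch below splits a hostname at its first dot. The helper name split_host is invented for this example, and the sketch assumes the name has exactly one sub-domain label in front of the root domain; it does not know about public suffixes such as co.uk.

```shell
#!/bin/sh
# Split a hostname into its first label and the remainder.
# Assumes a single sub-domain label, as in team.example.com.
# split_host is a hypothetical helper name.
split_host() {
    host=$1
    printf '%s %s\n' "${host%%.*}" "${host#*.}"
}

split_host team.example.com
```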
If you want to create and host a sub-domain, consider the following process:
in the DNS zone for the domain. This is easily accomplished when using the Linode DNS
Manager. As always, you may host the DNS for your domain with any provider you choose.
In order for your server to respond to requests for this domain, you must set up a server to
respond to these requests. For web servers like Apache this requires configuring a new virtual
host. For XMPP Servers you must configure an additional host to receive the requests for this host.
For more information, consult the documentation for the specific server you wish to deploy.
Once configured, subdomains function identically to first-level domains on your server in almost
all respects. If you need to, you can set up HTTP redirection for the new sub domain.
There are two major components of the email stack that are typically required for basic email
functionality. The most important part of the tool chain is the SMTP server or "Mail
Transfer Agent." The MTA, as it is often called, sends mail from one server to another. The
second crucial part of an email system is a server that permits users to access and download
that mail from the server to their own machine. Typically these servers use a protocol such as
POP3 or IMAP to provide remote access to the mailbox.
There are additional components in the email server tool chain. These components may or may
not be optional depending on the requirements of your deployment. They include filtering and
delivery tools like procmail, anti-virus filters like ClamAV, mailing list managers like
MailMan, and spam filters like SpamAssassin. These components function independently of
which MTA and remote mailbox access server you choose to deploy.
The most prevalent SMTP servers or MTAs in the UNIX-like world are Postfix, Exim, and
Sendmail. Sendmail has the longest history and many systems administrators have extensive
experience with it. Postfix is robust and modern, and is compatible with many different
deployment types. Exim is the default MTA in Debian systems, and many consider it to be
easier to use for basic tasks. For remote mailbox access, servers like Courier and Dovecot are
widely deployed.
If you need an integrated and easy-to-install email solution, we encourage you to
consider the Citadel groupware server. Citadel provides an integrated "turnkey" solution that
includes an SMTP server, remote mailbox access, real-time collaboration tools including
XMPP, and a shared calendar interface. Along similar lines, we also provide documentation for
the installation of the Zimbra groupware server.
If, by contrast, you want a simpler and more modular email stack, we urge you to consider one
of our guides built around the Postfix SMTP server.
Finally, it’s possible to outsource email service to a third party provider, such as Google Apps
or FastMail.fm. This allows you to send and receive mail from your domain, without hosting
email services on your Linode. Consult our documentation for setting up Google Apps for your
domain.
In many cases, administrators have no need for a complete email stack like those documented in our
email guides. However, applications running on that server still need to be able to send mail
for notifications and other routine purposes.
Although configuring applications to send notifications and alerts is beyond the scope of this tip,
most applications rely on a simple "sendmail" interface. The modern MTA
Postfix provides a sendmail-compatible interface located at /usr/sbin/sendmail.
You can install postfix on Debian and Ubuntu systems with the following command:
On CentOS and Fedora systems you can install postfix by issuing the following command:
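The install commands themselves are not shown above. As a sketch, assuming the standard package managers of that era (apt-get on Debian and Ubuntu, yum on CentOS and Fedora) and the package name postfix, the helper below prints the command for each family. The function name and the family labels "debian" and "redhat" are made up for this illustration.

```shell
#!/bin/sh
# Print the Postfix install command for a distribution family.
# "debian" and "redhat" are labels invented for this sketch.
postfix_install_cmd() {
    case $1 in
        debian) echo "sudo apt-get install postfix" ;;   # Debian, Ubuntu
        redhat) echo "sudo yum install postfix" ;;       # CentOS, Fedora
        *) echo "unknown distribution family: $1" >&2; return 1 ;;
    esac
}

postfix_install_cmd debian
postfix_install_cmd redhat
```

The function only prints the command so that you can review it before running it yourself with administrative privileges.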
Once Postfix is installed, your applications should be able to access the sendmail interface,
located at /usr/sbin/sendmail. Most applications running on your Linode should be able to
send mail normally with this configuration.
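To illustrate what an application does with the sendmail interface, here is a minimal sketch that builds a message (headers, a blank line, then the body) and shows how it could be piped to /usr/sbin/sendmail. The helper name compose_message and the addresses are placeholders, not part of any real configuration.

```shell
#!/bin/sh
# Build a minimal mail message: To and Subject headers, a blank line, then the body.
# compose_message is a hypothetical helper name.
compose_message() {
    printf 'To: %s\nSubject: %s\n\n%s\n' "$1" "$2" "$3"
}

# Usage (needs a working MTA; -t makes sendmail read the recipients from the headers):
#   compose_message admin@example.com "Backup finished" "Nightly backup completed." \
#       | /usr/sbin/sendmail -t
```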
If you simply want to use your server to send email through an external SMTP server, you may
want to consider a simpler tool like msmtp. Since msmtp is packaged in most distributions,
you can install it using the command appropriate to your distribution:
Use the command "type msmtp" to find the location of msmtp on your system. Typically the
program is located at /usr/bin/msmtp. You can specify authentication credentials with
command line arguments or by declaring SMTP credentials in a configuration file. Here is
an example .msmtprc file.
.msmtprc example
The .msmtprc file needs to be set to mode 600, and owned by the user account that will be
sending mail. If the configuration file is located at /srv/smtp/msmtprc, you can call msmtp
with the following command:
/usr/bin/msmtp --file=/srv/smtp/msmtprc
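Since the example configuration file is not reproduced above, the following sketch shows one way such a file might be created with the required mode 600. The helper name write_msmtprc is invented here, and every value in the file (server, port, addresses, password) is a placeholder to be replaced with your provider's settings.

```shell
#!/bin/sh
# Write a minimal msmtp configuration to the given path and set it to mode 600.
# All values in the file are placeholders; write_msmtprc is a hypothetical helper.
write_msmtprc() {
    (
        umask 077    # create the file without group/other access
        cat > "$1" <<'EOF'
# Default account, used when no other account is selected
account default
host smtp.example.com
port 587
from user@example.com
auth on
user user@example.com
password change-me
tls on
EOF
    )
    chmod 600 "$1"   # mode 600, as the text above requires
}

# Usage:
#   write_msmtprc /srv/smtp/msmtprc
#   printf 'Subject: test\n\nhello\n' | /usr/bin/msmtp --file=/srv/smtp/msmtprc recipient@example.com
```

Depending on your msmtp version and provider, additional TLS settings (for example a trust file) may be needed; consult the msmtp documentation for those.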
VIRTUALIZATION
Virtualization projects are the focus of many IT professionals who are trying to consolidate
servers or data centers, decrease costs and launch successful "green" conservation initiatives.
Virtualizing IT resources can be thought of as squeezing an enterprise’s computer processing
power, memory, network bandwidth and storage capacity onto the smallest number of
hardware platforms possible and then apportioning those resources to operating systems and
applications on a time-sharing basis. This approach aims to make the most efficient possible
use of IT resources. It differs from historical computing and networking models, which have
typically involved inextricably binding a given software application or service to a specific
operating system (OS), which, in turn, has been developed
to run on a particular hardware platform. By contrast, virtualization decouples these
components, making them available from a common resource pool. In this respect,
virtualization frees IT departments from having to worry about the particular
hardware or software platforms installed as they deploy additional services. The decoupling and
optimization of these components is possible whether you are virtualizing servers, desktops,
applications, storage devices or networks. To virtualize some or all of a computing
infrastructure’s resources, IT departments require special virtualization software, firmware or a
third-party service that makes use of virtualization software or firmware. This
software/firmware component, called the hypervisor or the virtualization layer, performs the
mapping between virtual and physical resources. It is what enables the various resources to be
decoupled, then aggregated and dispensed, irrespective of the underlying hardware and, in some
cases, the software OS. In effect, the hypervisor takes over hardware management from the OS.
In addition to the hypervisor virtualization technology, the organization overseeing the
virtualization project requires a virtualization management tool – which might be procured from
the same or a different supplier – to set up and manage virtual devices and policies.
Why Virtualize?
One key reason why IT organizations are considering virtualization of some or all of their
computing infrastructures is that the technology helps them get the biggest bang for
their computing buck.
SETTING UP XEN
Xen is a type 1, bare-metal virtual machine monitor (or hypervisor), which provides the ability to
run one or more operating system instances on the same physical machine. Xen, like other types
of virtualization, is useful for many use cases such as server consolidation and isolation of
production and development environments (e.g., corporate and personal environments on the
same system).
As of Ubuntu 11.10 (Oneiric), the default kernel included in Ubuntu can be used directly
with the Xen hypervisor as the management (or control) domain (dom0 or "Domain0" in Xen
terminology).
Our example uses LVM for virtual disks and network bridging for virtual network cards. It
also assumes Xen 4.1 (the version available in 12.04). It assumes a familiarity with general
virtualization issues, as well as with the specific Xen terminology. Please see the Xen wiki
(http://wiki.xen.org/wiki/Xen_Overview) for more information.
During the installation of Ubuntu, for the partitioning method choose "Guided - use the entire disk
and setup LVM". Then, when prompted to enter "Amount of volume group to use for guided
partitioning", enter a value large enough for the Xen dom0 system, leaving the rest for virtual
disks. The value must be smaller than the size of your installation drive. For example, 100 GB should
be large enough for a minimal Xen dom0 system. Keep in mind that in our model all installation
media for guest OSs and other useful files are stored inside dom0, so dom0 must
have enough space for them.
After Installation of Ubuntu
Install GUI
sudo apt-get update
sudo apt-get install ubuntu-desktop
To skip the login screen completely, boot into the console and then start the GUI
sudo gedit /etc/default/grub
sudo update-grub
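The text does not say what to change in /etc/default/grub. One common approach, assumed here rather than taken from the original, is to set GRUB_CMDLINE_LINUX_DEFAULT to "text" so that the system boots to a console. The sketch below makes that edit with sed; the helper name set_text_boot is made up, and it takes the file path as an argument so you can try it on a copy first.

```shell
#!/bin/sh
# Set GRUB_CMDLINE_LINUX_DEFAULT="text" in a GRUB defaults file (GNU sed -i).
# Assumption: booting with "text" is the intended change; the source does not say.
# set_text_boot is a hypothetical helper name.
set_text_boot() {
    sed -i 's/^GRUB_CMDLINE_LINUX_DEFAULT=.*/GRUB_CMDLINE_LINUX_DEFAULT="text"/' "$1"
}

# Usage (as root), followed by regenerating the GRUB configuration:
#   set_text_boot /etc/default/grub && update-grub
```

After booting to the console, the GUI can be started by hand, for example with "sudo service lightdm start" on a stock Ubuntu 12.04 desktop install.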
To log off from "Unity" (the default Ubuntu desktop) from the command line, type "gnome-session-quit".
We consider "XRDP" the best approach for remote access to "Ubuntu Server 12.04
LTS" in a cross-platform environment. "XRDP" is an implementation of the "Remote
Desktop" standards from Microsoft and works the same way as on Windows. It allows
remote desktop access from native Windows client machines (or "RDESKTOP" on Ubuntu),
does not require loading the "Ubuntu Server 12.04 LTS" GUI (Graphical User Interface) on
boot, and allows multiple simultaneous sessions. To use "RDESKTOP" on Ubuntu 12.04 LTS
with "TSCLIENT" see my post on
http://superuser.com/questions/420291/ubuntu-12-04-how-to-get-tsclient-back.
With "XRDP" you can easily use Microsoft RDP to connect to Ubuntu without any
configuration. All you need to do is install the "xrdp" package, then open Remote
Desktop Connection from Windows and connect. That's it, nothing to configure.
Next, open Windows Remote Desktop Connection (RDP) and type the Ubuntu server’s hostname or
IP address.
As you may already know, SSH is a secure communication protocol that lets you remotely
access networked computers. It is known as a replacement for Telnet, which is
insecure: Telnet sends traffic in plain text, whereas SSH encrypts its
communication.
Run the commands below to install SSH Server.
ssh <remote_user>@<ip_or_name>
Project Kronos is an initiative to port the XAPI tool stack to Debian and Ubuntu. It is a
management stack implemented in OCaml that configures and controls Xen hosts,
attached storage, networking and virtual machine life cycle. It exposes an HTTP API and
provides a command line interface (xe) for resource management.
XenCenter is a Windows desktop application by Citrix that is distributed with XenServer for
managing servers running XenServer (the Linux equivalent is OpenXenManager). It uses
XAPI for talking to Xen resource pools. Since we are setting up XAPI, we can use XenCenter to
manage the server.
# Add apparmor=0 to the kernel command line (disables AppArmor at boot), then reboot
sudo sed -i 's/GRUB_CMDLINE_LINUX=.*\+/GRUB_CMDLINE_LINUX="apparmor=0"/' /etc/default/grub
sudo update-grub
sudo reboot
Install XCP-XAPI
sudo apt-get install xcp-xapi
Fix for "qemu", which emulates the console but does not have the keymaps in the correct
location
sudo mkdir /usr/share/qemu; sudo ln -s /usr/share/qemu-linaro/keymaps /usr/share/qemu/keymaps
Network configuration
This section describes how to set up Linux bridging in Xen. It assumes eth0 is
both your primary interface to dom0 and the interface you want your VMs to
use. It also assumes that you will use manual IP configuration.
sudo apt-get install bridge-utils
Setup bridge networking
sudo gedit /etc/network/interfaces
Create a bridge called xenbr0. The file should look like this for a
static network configuration:
# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).
# The loopback network interface
auto lo
iface lo inet loopback
# Xen network interface for "dom0"
auto xenbr0
iface xenbr0 inet static
# IP address
address 192.168.1.111
# Subnet mask
netmask 255.255.255.0
# Default Gateway
gateway 192.168.1.1
# DNS Server
dns-nameservers 192.168.1.1
bridge_ports eth0
iface eth0 inet manual
# The primary network interface
# auto eth0
# iface eth0 inet dhcp
Eg.:
sudo chmod ugo+rwx /dev/vmnet0
sudo chown eduardo /dev/vmnet0
sudo chown :eduardo /dev/vmnet0
All set! Ready to reboot and let the xcp-xapi toolstack take over:
sudo reboot
sudo xe vm-list
This should list the control domain:
uuid (RO) : dbcf74d2-ee50-edd5-d44d-b81fc8ba1777
name-label (RW): Control domain on host: ubuntu-xenserver-1
power-state (RO): running
sudo lvcreate -L <X>GB -n <StorageRepositoryName> /dev/<VG>
Eg1.: sudo lvcreate -L 25GB -n StorageRepository /dev/ubuntus1204
Eg2.: sudo lvcreate -l 100%FREE -n StorageRepository /dev/ubuntus1204
An ISO repository contains ISOs (disk images) of the operating systems used for
installations. The following example creates a storage repository called ISOs.
What do you need to get the most out of VMware Workstation 5? Take the following list of
requirements as a starting point. Like physical computers, the virtual machines running under
VMware Workstation generally perform better if they have faster processors and more memory.
PC Hardware
AMD™: Athlon™, Athlon MP, Athlon XP, Athlon 64, Duron™, Opteron™, Turion™
For additional information, including notes on processors that are not compatible, see the
VMware knowledge base at
www.vmware.com/support/kb/enduser/std_adp.php?p_faqid=967.
AMD Opteron, AMD Athlon 64, AMD Turion 64, AMD Sempron, Intel EM64T; support for
64-bit guest operating systems is available only on the following specific versions of
these processors:
Memory
128 MB minimum (256 MB recommended)
You must have enough memory to run the host operating system, plus the memory required for
each guest operating system and for applications on the host and guest. See your guest operating
system and application documentation for their memory requirements.
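As a worked example of this sizing rule, the sketch below adds the host operating system's memory, the host applications' memory, and each guest's requirement (guest OS plus its applications). The helper name and all figures are illustrative assumptions in MB, not VMware recommendations.

```shell
#!/bin/sh
# Estimate the RAM a host needs: host OS + host applications + one figure per guest.
# All figures are in MB and purely illustrative; required_memory_mb is a hypothetical helper.
required_memory_mb() {
    total=$(( $1 + $2 ))          # host OS + host applications
    shift 2
    for guest in "$@"; do         # each remaining argument is one guest's requirement
        total=$(( total + guest ))
    done
    echo "$total"
}

# Host OS 512 MB, host applications 256 MB, two guests needing 256 MB and 384 MB
required_memory_mb 512 256 256 384
```

With these sample figures the total is 512 + 256 + 256 + 384 = 1408 MB.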
Display
16-bit or 32-bit display adapter recommended
Disk Drives
Guest operating systems can reside on physical disk partitions or in virtual disk files.
Hard Disk
At least 1GB free disk space recommended for each guest operating system and the
application software used with it; if you use a default setup, the actual disk space needs
are approximately the same as those for installing and running the guest operating system
and applications on a physical computer.
For Installation — 80MB (Linux) or 250MB (Windows) free disk space required for
basic installation. You can delete the installer afterwards to reclaim disk space.
Non-Ethernet networks supported using built-in network address translation (NAT) or using
a combination of host-only networking plus routing software on the host operating system
VMware Workstation is available for both Windows and Linux host operating systems.