Buffer Cache Algorithms: Session No:5 Operating System Design @KL University, 2020
Recap of Session 4
▪ A file system is the UNIX way of organizing files on mass storage devices.
▪ All data in UNIX is organized into files. All files are organized into directories. These
directories are organized into a tree-like structure called the file system.
▪ When a process wants to access data, the kernel brings the data into main memory.
▪ To minimize the frequency of disk accesses, the kernel maintains a pool of internal data buffers
called the buffer cache.
▪ During system initialization, the kernel allocates space for a number of buffers, configurable
according to memory size and system performance constraints.
▪ Buffer allocation algorithms: getblk, brelse.
8/20/2020 2
Reading Disk Blocks
▪ To read a disk block, a process uses the algorithm getblk to search for it in the buffer
cache.
If the block is found in the buffer cache
    Return the data without a disk access
Else
    Call the disk driver to schedule a read request
    Sleep (awaiting the event that the I/O completes)
When the I/O completes, the disk controller interrupts the processor and the disk interrupt handler
awakens the sleeping process.
▪ Files are often read sequentially, so the kernel anticipates the need for the second disk block
and reads it ahead of time.
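The hit/miss logic above can be sketched in a few lines of C. This is only an illustration under stated assumptions: the disk is an in-memory array, replacement is simple round-robin rather than the kernel's free list, and the names bread_sketch, NBUF, NBLOCK, BSIZE and disk_reads are hypothetical. A real kernel would sleep until the I/O completion interrupt instead of copying synchronously.

```c
#include <assert.h>
#include <string.h>

#define NBUF   4      /* buffers in the pool (assumed value) */
#define NBLOCK 16     /* blocks on the simulated disk (assumed value) */
#define BSIZE  8      /* bytes per block (assumed value) */

struct buf {
    int  valid;            /* 1 if data holds the block's contents */
    int  blockno;          /* disk block cached in this buffer */
    char data[BSIZE];
};

static char disk[NBLOCK][BSIZE];   /* simulated disk */
static struct buf cache[NBUF];     /* the buffer pool */
static int next_victim;            /* round-robin replacement, for brevity */
static int disk_reads;             /* counts simulated disk accesses */

/* Return the buffer holding blockno, going to the disk only on a miss. */
struct buf *bread_sketch(int blockno)
{
    assert(blockno >= 0 && blockno < NBLOCK);

    for (int i = 0; i < NBUF; i++)          /* search the cache first */
        if (cache[i].valid && cache[i].blockno == blockno)
            return &cache[i];               /* hit: no disk access */

    struct buf *b = &cache[next_victim];    /* miss: reuse a buffer */
    next_victim = (next_victim + 1) % NBUF;
    memcpy(b->data, disk[blockno], BSIZE);  /* stands in for driver read + sleep */
    disk_reads++;
    b->blockno = blockno;
    b->valid = 1;
    return b;
}
```

Calling bread_sketch twice with the same block number returns the same buffer and increments disk_reads only once, which is the saving the buffer cache is meant to provide.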
Writing Disk Blocks
▪ The kernel informs the disk driver that it has a buffer whose contents are to be written to the disk.
➢ Synchronous write
▪ The calling process goes to sleep awaiting I/O completion and releases the buffer when it
awakens.
➢ Asynchronous write
▪ The kernel starts the disk write and releases the buffer when the I/O completes.
➢ Delayed write
▪ The kernel puts off the physical write to disk until the buffer is reallocated.
Release
▪ Use brelse()
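To make the difference between a synchronous and a delayed write concrete, here is a minimal C sketch. It assumes a simulated in-memory disk; the function names (bwrite_sync, bwrite_delayed, reassign) and the flag values are illustrative, not taken from any particular kernel. The asynchronous case is left out because it needs an interrupt or a second thread to show meaningfully.

```c
#include <assert.h>
#include <string.h>

#define BSIZE  8      /* bytes per block (assumed value) */
#define NBLOCK 16     /* blocks on the simulated disk (assumed value) */

enum { B_VALID = 1, B_DELWRI = 2 };   /* illustrative flag values */

struct buf {
    int  flags;
    int  blockno;
    char data[BSIZE];
};

static char disk[NBLOCK][BSIZE];   /* simulated disk */
static int disk_writes;            /* counts simulated physical writes */

static void disk_write(struct buf *b)   /* stands in for the disk driver */
{
    memcpy(disk[b->blockno], b->data, BSIZE);
    disk_writes++;
}

/* Synchronous write: control returns only after the data is on disk. */
void bwrite_sync(struct buf *b)
{
    disk_write(b);
    b->flags &= ~B_DELWRI;
}

/* Delayed write: only mark the buffer; no disk traffic yet. */
void bwrite_delayed(struct buf *b)
{
    b->flags |= B_DELWRI;
}

/* Reallocating a buffer to another block forces a pending delayed write out first. */
void reassign(struct buf *b, int new_blockno)
{
    assert(new_blockno >= 0 && new_blockno < NBLOCK);
    if (b->flags & B_DELWRI)
        bwrite_sync(b);
    b->blockno = new_blockno;
    b->flags = 0;     /* contents are not valid for the new block */
}
```

In this sketch a delayed write costs nothing until reassign is called, so several updates to the same block result in a single physical write. The price is that the data exists only in memory until then.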
DESIGN
Reading Disk Blocks
Read Ahead: Improving performance
• Read additional block before request
• Use breada()
Algorithm breada
Input: (1) file system block number for immediate read
       (2) file system block number for asynchronous read
Output: buffer containing data for immediate read
{
    if (first block not in cache)
    {
        get buffer for first block (algorithm getblk);
        if (buffer data not valid)
            initiate disk read;
    }
    if (second block not in cache)
    {
        get buffer for second block (algorithm getblk);
        if (buffer data valid)
            release buffer (algorithm brelse);
        else
            initiate disk read;
    }
    if (first block was originally in cache)
    {
        read first block (algorithm bread);
        return buffer;
    }
    sleep (event first buffer contains valid data);
    return buffer;
}
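The effect of read-ahead can be demonstrated with a small single-threaded C sketch. It is a simplification under stated assumptions, not the breada algorithm itself: the disk is an in-memory array, the "asynchronous" read completes immediately, and the helpers lookup and fetch as well as the name breada_sketch are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define NBUF   4      /* buffers in the pool (assumed value, must be >= 2) */
#define NBLOCK 16     /* blocks on the simulated disk (assumed value) */
#define BSIZE  8      /* bytes per block (assumed value) */

struct buf { int valid; int blockno; char data[BSIZE]; };

static char disk[NBLOCK][BSIZE];   /* simulated disk */
static struct buf cache[NBUF];     /* the buffer pool */
static int next_victim;            /* round-robin replacement, for brevity */
static int disk_reads;             /* counts simulated disk accesses */

static struct buf *lookup(int blockno)
{
    for (int i = 0; i < NBUF; i++)
        if (cache[i].valid && cache[i].blockno == blockno)
            return &cache[i];
    return 0;
}

/* Get a buffer for blockno, reading from disk on a miss; never evict `keep`. */
static struct buf *fetch(int blockno, const struct buf *keep)
{
    struct buf *b = lookup(blockno);
    if (b)
        return b;
    if (&cache[next_victim] == keep)        /* the caller is still using it */
        next_victim = (next_victim + 1) % NBUF;
    b = &cache[next_victim];
    next_victim = (next_victim + 1) % NBUF;
    memcpy(b->data, disk[blockno], BSIZE);
    disk_reads++;
    b->blockno = blockno;
    b->valid = 1;
    return b;
}

/* Read `first` now and prefetch `ahead` so a later sequential read hits the cache. */
struct buf *breada_sketch(int first, int ahead)
{
    assert(first >= 0 && first < NBLOCK);
    struct buf *b = fetch(first, 0);
    if (ahead >= 0 && ahead < NBLOCK && ahead != first)
        fetch(ahead, b);   /* a kernel would start this read and not wait for it */
    return b;
}
```

After breada_sketch(4, 5), a later request for block 5 is served from the cache without another disk read, which is the benefit read-ahead gives sequential file access.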
Release Disk Block
algorithm
Writing Disk Blocks
algorithm
Reading Disk Blocks
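[Figure: functions involved in reading a disk block, listed top to bottom as they appear on the slide. High-level block device handler: block_read(), block_write(); then bread(), breada(); then getblk(), ll_rw_block().]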
Advantages and Disadvantages
Advantages
Allows uniform disk access
Eliminates the need for special alignment of user buffers
• by copying data from user buffers to system buffers
Reduces the amount of disk traffic
• fewer disk accesses
Ensures file system integrity
• a disk block is held in only one buffer at a time
Disadvantages
Can be vulnerable to crashes
• when writes are delayed
Requires an extra data copy
• when reading and writing to and from user processes
What has happened to the buffer so far
Logging
Logging Layer
One of the most interesting problems in file system design is crash recovery.
xv6 implements file system fault tolerance through a simple logging mechanism
• System calls do not directly write file system data structures
• Instead:
1. A system call first writes a description of all the disk writes that it wishes to perform
to a log on the disk
2. It then writes a special commit record to the log to specify that it contains a complete
operation
3. Next it copies the required writes to the on-disk file system data structures
4. Finally, it deletes the log
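The four steps can be sketched in C against a simulated disk. This is an illustration under stated assumptions, not xv6's implementation: the log area and data blocks are plain arrays, and the names log_append, log_commit, log_install and log_clear are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define BSIZE   8      /* bytes per block (assumed value) */
#define NBLOCK  16     /* data blocks on the simulated disk (assumed value) */
#define LOGSIZE 4      /* maximum blocks per logged operation (assumed value) */

/* Simulated on-disk state: data blocks plus a log area. */
static char disk[NBLOCK][BSIZE];
static char log_data[LOGSIZE][BSIZE];   /* logged copies of the new contents */
static int  log_dest[LOGSIZE];          /* home block of each logged copy */
static int  log_pending;                /* logged writes not yet committed */
static int  log_committed;              /* commit record: > 0 marks a complete operation */

/* Step 1: describe a write in the log instead of touching the data block. */
void log_append(int blockno, const char *data)
{
    assert(log_pending < LOGSIZE && blockno >= 0 && blockno < NBLOCK);
    memcpy(log_data[log_pending], data, BSIZE);
    log_dest[log_pending] = blockno;
    log_pending++;
}

/* Step 2: write the commit record; from here on the operation is durable. */
void log_commit(void)
{
    log_committed = log_pending;
}

/* Step 3: copy the logged writes to their home locations. */
void log_install(void)
{
    for (int i = 0; i < log_committed; i++)
        memcpy(disk[log_dest[i]], log_data[i], BSIZE);
}

/* Step 4: delete the log. */
void log_clear(void)
{
    log_committed = 0;
    log_pending = 0;
}
```

The order matters: the data blocks are untouched until log_commit has run, so a crash at any earlier point leaves the file system exactly as it was.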
Recovery
▪ In case of a reboot, the file system performs recovery by looking at the log.
▪ If the log contains the commit record, the recovery code copies the required writes to the on-disk
data structures.
▪ If the log does not contain a complete operation, it is ignored and deleted.
▪ If the crash occurs before the commit record, the log will be ignored, and the state of the disk will
stay unmodified.
▪ If the crash occurs after the commit record, then the recovery will replay all of the operation’s
writes, even repeating them if the crash occurred during the write to the on-disk data structures.
▪ In both cases, the correctness of the file system is preserved: either all writes are reflected on the
disk or none are.
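A minimal C sketch of the recovery rule follows, assuming a simulated disk and a log area whose commit record is a count of valid entries (zero meaning "no complete operation"). The struct logarea layout and the function name recover are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <string.h>

#define BSIZE   8      /* bytes per block (assumed value) */
#define NBLOCK  16     /* data blocks on the simulated disk (assumed value) */
#define LOGSIZE 4      /* maximum blocks per logged operation (assumed value) */

struct logarea {                 /* simulated on-disk log */
    int  committed;              /* commit record: number of valid entries, 0 if none */
    int  dest[LOGSIZE];          /* home block of each logged copy */
    char data[LOGSIZE][BSIZE];   /* logged block contents */
};

static char disk[NBLOCK][BSIZE];   /* simulated data blocks */

/* Run at boot: replay a committed log, discard an uncommitted one.
   Returns the number of blocks replayed. */
int recover(struct logarea *lg)
{
    int n = lg->committed;
    assert(n >= 0 && n <= LOGSIZE);
    for (int i = 0; i < n; i++)        /* copying again is harmless (idempotent) */
        memcpy(disk[lg->dest[i]], lg->data[i], BSIZE);
    memset(lg, 0, sizeof *lg);         /* delete the log in both cases */
    return n;
}
```

Because the replay simply overwrites each home block with its logged copy, repeating it after a crash in the middle of installation produces the same final state.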
Log Design
▪ To allow concurrent execution of file system operations by different processes, the logging system can
accumulate the writes of multiple system calls into one transaction.
▪ Each system call indicates the start and end of its sequence of writes, which delimits the transaction
▪ Only one system call can be in a transaction at any given time to ensure correctness
▪ The log holds at most one transaction at a time
▪ Only read system calls can execute concurrently with a transaction
▪ A fixed amount of space on the disk is dedicated to hold the log
▪ No system call can write more distinct blocks than the size of the log
▪ Large writes are broken into multiple smaller writes so that each write can fit in the log
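The last point, splitting a large write so that each piece fits in the log, can be sketched as a simple loop. The constants LOGSIZE and RESERVED and the function names are assumptions made for this illustration; the stand-in write_in_one_transaction only records what a real begin/commit pair would enclose.

```c
#include <assert.h>

#define LOGSIZE  10   /* blocks the log can hold (assumed value) */
#define RESERVED 3    /* blocks kept free for metadata updates (assumed value) */

static int transactions;   /* how many transactions were used */
static int largest;        /* most data blocks placed in any one transaction */

/* Stand-in for a begin/commit pair around `nblocks` logged data writes. */
static void write_in_one_transaction(int first_block, int nblocks)
{
    (void)first_block;                       /* unused in this simulation */
    assert(nblocks + RESERVED <= LOGSIZE);   /* the piece must fit in the log */
    transactions++;
    if (nblocks > largest)
        largest = nblocks;
}

/* Write `total` consecutive blocks starting at `start`, one log-sized piece at a time. */
void write_large(int start, int total)
{
    const int max = LOGSIZE - RESERVED;      /* data blocks per transaction */
    for (int done = 0; done < total; ) {
        int n = total - done;
        if (n > max)
            n = max;
        write_in_one_transaction(start + done, n);
        done += n;
    }
}
```

With these assumed values, a 20-block write is carried out as three transactions of 7, 7 and 6 blocks. Each piece is atomic on its own, but the write as a whole is not.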
Typical system call usage of log
    begin_trans();
    ...
    bp = bread(...);
    bp->data[...] = ...;
    log_write(bp);
    ...
    commit_trans();
▪ begin_trans: Waits until it obtains exclusive use of the log
▪ log_write:
    ▪ Appends the block’s new content to the log on the disk
    ▪ Leaves the modified block in the buffer cache so that subsequent reads of
      the block during the transaction will yield the updated state
    ▪ Records the block’s sector number in memory to find out when a block is
      written multiple times during a transaction and overwrite the block’s previous
      copy in the log
▪ commit_trans:
    1. Writes the log’s header block to disk, updating the count
    2. Calls install_trans to copy each block from the log to the relevant location on
       the disk
    3. Sets the count in the log header to zero
Log_Write
void
log_write(struct buf *b)
{
  int i;

  // The transaction must fit in the log.
  if (log.lh.n >= LOGSIZE || log.lh.n >= log.size - 1)
    panic("too big a transaction");
  // log_write is only legal inside a transaction.
  if (log.outstanding < 1)
    panic("log_write outside of trans");

  acquire(&log.lock);
  for (i = 0; i < log.lh.n; i++) {
    if (log.lh.block[i] == b->blockno)   // log absorption
      break;
  }
  log.lh.block[i] = b->blockno;
  if (i == log.lh.n)                     // block not logged before: use a new slot
    log.lh.n++;
  b->flags |= B_DIRTY; // prevent eviction
  release(&log.lock);
}
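The absorption loop in log_write can be tried out in isolation. The following standalone C sketch keeps only the header bookkeeping; the struct logheader layout mirrors the fields used above, while the function name log_record, its return value and the LOGSIZE value are assumptions made for this example.

```c
#include <assert.h>

#define LOGSIZE 4     /* slots in the log header (assumed value) */

struct logheader {
    int n;                 /* number of logged blocks */
    int block[LOGSIZE];    /* block number held in each log slot */
};

/* Record blockno in the header and return the slot it occupies.
   A block that is already logged reuses its slot (absorption). */
int log_record(struct logheader *lh, int blockno)
{
    int i;
    for (i = 0; i < lh->n; i++)
        if (lh->block[i] == blockno)
            break;                 /* already logged: absorb this write */
    assert(i < LOGSIZE);           /* a new block needs a free slot */
    lh->block[i] = blockno;
    if (i == lh->n)
        lh->n++;                   /* first write of this block in the transaction */
    return i;
}
```

Logging blocks 33, 44 and then 33 again leaves n at 2, so repeated writes to one block consume a single log slot.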