System-Level I/O
15-213/18-243: Introduction to Computer Systems
16th Lecture, Mar. 15, 2010
Instructors: Gregory Kesden and Anthony Rowe
Carnegie Mellon
Today
- Unix I/O
- RIO (robust I/O) package
- Metadata, sharing, and redirection
- Standard I/O
- Conclusions and examples
Unix Files
- Regular file
  - File containing user/app data (binary, text, whatever)
  - OS does not know anything about the format
- Directory file
  - A file that contains the names and locations of other files
- Socket
  - A file type used for network communication between processes
Unix I/O
Key Features
- Elegant mapping of files to devices allows kernel to export simple interface called Unix I/O
- Important idea: All input and output is handled in a consistent and uniform way

[Figure: a file as a sequence of bytes B0, B1, ..., Bk-1, Bk, Bk+1, ...; current file position = k]
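The current file position can be observed directly. The following is a minimal sketch, assuming the POSIX `lseek` call is used to query the position; the helper name `position_after_writes` and the scratch path passed to it are hypothetical.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Create a scratch file at path, write n single bytes to it, and return the
 * file position the kernel reports afterwards, or -1 on error.
 * The scratch file is removed before returning.
 */
long position_after_writes(const char *path, int n)
{
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    off_t pos;
    int i;

    if (fd < 0)
        return -1;
    for (i = 0; i < n; i++) {
        if (write(fd, "x", 1) != 1) {   /* each write advances the position by 1 */
            close(fd);
            unlink(path);
            return -1;
        }
    }
    pos = lseek(fd, 0, SEEK_CUR);       /* offset 0 from "current": query, do not move */
    close(fd);
    unlink(path);
    return (long)pos;
}
```

After five one-byte writes the reported position is 5, which corresponds to k in the figure.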
Opening Files
Opening a file informs the kernel that you are getting ready to access that file

    int fd;   /* file descriptor */

    if ((fd = open("/etc/hosts", O_RDONLY)) < 0) {
        perror("open");
        exit(1);
    }

Each process created by a Unix shell begins life with three open files associated with a terminal:
- 0: standard input
- 1: standard output
- 2: standard error
Closing Files
Closing a file informs the kernel that you are finished accessing that file

    int fd;       /* file descriptor */
    int retval;   /* return value */

    if ((retval = close(fd)) < 0) {
        perror("close");
        exit(1);
    }

Closing an already closed file is a recipe for disaster in threaded programs (more on this later)
Moral: Always check return codes, even for seemingly benign functions such as close()
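Checking the return code of close() is what makes a double close visible. The sketch below is an illustration under simple assumptions (single-threaded caller, hypothetical helper name `errno_of_double_close`): the second close fails with EBADF. In a threaded program another thread could have reopened the same descriptor number in between, and the second close would then silently close that thread's file.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Open path, close the descriptor twice, and return the errno value produced
 * by the second close. Returns 0 if the second close unexpectedly succeeded
 * and -1 if the setup (open or first close) failed.
 */
int errno_of_double_close(const char *path)
{
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;
    if (close(fd) < 0)
        return -1;
    errno = 0;
    if (close(fd) < 0)
        return errno;   /* fd no longer names an open file */
    return 0;
}
```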
Reading Files
Reading a file copies bytes from the current file position to memory, and then updates file position

    char buf[512];
    int fd;       /* file descriptor */
    int nbytes;   /* number of bytes read */

    /* Open file fd ... */
    /* Then read up to 512 bytes from file fd */
    if ((nbytes = read(fd, buf, sizeof(buf))) < 0) {
        perror("read");
        exit(1);
    }
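"Up to 512 bytes" means read() may return fewer bytes than requested without that being an error. A minimal sketch, assuming a pipe as the data source and a hypothetical helper name `read_from_pipe`:

```c
#include <unistd.h>

/*
 * Put avail bytes (at most 512) into a pipe, close the write end, then ask
 * read() for up to 512 bytes. Returns whatever read() reported, or -1 if the
 * setup failed.
 */
long read_from_pipe(size_t avail)
{
    int p[2];
    char out[512] = {0};
    char in[512];
    ssize_t n;

    if (avail > sizeof(out) || pipe(p) < 0)
        return -1;
    if (write(p[1], out, avail) != (ssize_t)avail) {
        close(p[0]);
        close(p[1]);
        return -1;
    }
    close(p[1]);                      /* no more data will arrive */
    n = read(p[0], in, sizeof(in));   /* returns only what is available */
    close(p[0]);
    return (long)n;
}
```

With 5 bytes in the pipe, the call returns 5 even though 512 were requested; with nothing in the pipe and the write end closed, it returns 0 (EOF).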
Writing Files
Writing a file copies bytes from memory to the current file position, and then updates current file position

    char buf[512];
    int fd;       /* file descriptor */
    int nbytes;   /* number of bytes written */

    /* Open the file fd ... */
    /* Then write up to 512 bytes from buf to file fd */
    if ((nbytes = write(fd, buf, sizeof(buf))) < 0) {
        perror("write");
        exit(1);
    }
cpstdin.c
Note the use of error handling wrappers for read and write (Appendix A).
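A byte-at-a-time copy in the spirit of cpstdin.c could look like the sketch below. It assumes plain read and write with explicit checks instead of the error-handling wrappers from Appendix A, and the helper name `copy_bytes` is hypothetical.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Copy in_fd to out_fd one byte at a time until EOF.
 * Returns the number of bytes copied, or -1 on error.
 */
long copy_bytes(int in_fd, int out_fd)
{
    char c;
    long total = 0;
    ssize_t n;

    for (;;) {
        n = read(in_fd, &c, 1);
        if (n == 0)
            return total;             /* EOF */
        if (n < 0) {
            if (errno == EINTR)
                continue;             /* interrupted by a signal: retry */
            return -1;
        }
        do {
            n = write(out_fd, &c, 1);
        } while (n < 0 && errno == EINTR);
        if (n != 1)
            return -1;
        total++;
    }
}
```

Calling `copy_bytes(STDIN_FILENO, STDOUT_FILENO)` copies standard input to standard output. One system call pair per byte is simple but slow, which is part of the motivation for buffering.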
RIO (robust I/O) package
RIO is a set of wrappers that provide efficient and robust I/O in apps, such as network programs that are subject to short counts
RIO provides two different kinds of functions
- Unbuffered input and output of binary data
  - rio_readn and rio_writen
- Buffered input of binary data and text lines
  - rio_readlineb and rio_readnb
  - Buffered RIO routines are thread-safe and can be interleaved arbitrarily on the same descriptor
Same interface as Unix read and write
Especially useful for transferring data on network sockets

    #include "csapp.h"

    ssize_t rio_readn(int fd, void *usrbuf, size_t n);
    ssize_t rio_writen(int fd, void *usrbuf, size_t n);

    Return: num. bytes transferred if OK, 0 on EOF (rio_readn only), -1 on error
Implementation of rio_readn

    /*
     * rio_readn - robustly read n bytes (unbuffered)
     */
    ssize_t rio_readn(int fd, void *usrbuf, size_t n)
    {
        size_t nleft = n;
        ssize_t nread;
        char *bufp = usrbuf;

        while (nleft > 0) {
            if ((nread = read(fd, bufp, nleft)) < 0) {
                if (errno == EINTR)   /* interrupted by sig handler return */
                    nread = 0;        /* and call read() again */
                else
                    return -1;        /* errno set by read() */
            }
            else if (nread == 0)
                break;                /* EOF */
            nleft -= nread;
            bufp += nread;
        }
        return (n - nleft);           /* return >= 0 */
    }

csapp.c
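The write side can follow the same loop as rio_readn. The sketch below is an illustration modeled on that function, not the csapp.c source; the name `writen_sketch` is hypothetical. It keeps calling write() until all n bytes are transferred, so the caller never sees a short count.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Write exactly n bytes from usrbuf to fd, retrying after short counts and
 * after interruption by a signal handler. Returns n on success, -1 on error.
 */
ssize_t writen_sketch(int fd, const void *usrbuf, size_t n)
{
    size_t nleft = n;
    ssize_t nwritten;
    const char *bufp = usrbuf;

    while (nleft > 0) {
        if ((nwritten = write(fd, bufp, nleft)) < 0) {
            if (errno == EINTR)
                nwritten = 0;   /* nothing written: call write() again */
            else
                return -1;      /* errno set by write() */
        }
        nleft -= (size_t)nwritten;
        bufp += nwritten;
    }
    return (ssize_t)n;
}
```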
For reading from file
File has associated buffer to hold bytes that have been read from file but not yet read by user code

[Figure: buffer layout. Labels: already read, unread (rio_cnt bytes), not in buffer, unseen, current file position]
    typedef struct {
        int rio_fd;                  /* descriptor for this internal buf */
        int rio_cnt;                 /* unread bytes in internal buf */
        char *rio_bufptr;            /* next unread byte in internal buf */
        char rio_buf[RIO_BUFSIZE];   /* internal buffer */
    } rio_t;
Efficiently read text lines and binary data from a file partially cached in an internal memory buffer

    #include "csapp.h"

    void rio_readinitb(rio_t *rp, int fd);
    ssize_t rio_readlineb(rio_t *rp, void *usrbuf, size_t maxlen);

    Return: num. bytes read if OK, 0 on EOF, -1 on error
Reading stops when
- maxlen bytes read
- EOF encountered
Calls to rio_readlineb and rio_readnb can be interleaved arbitrarily on the same descriptor
Warning: Don't interleave with calls to rio_readn
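To make the buffering concrete, here is a compact sketch of a buffered line reader built on a structure with the same fields as rio_t. It is an illustration of the design described here, not the csapp.c implementation: the `sk_` names and the 8192-byte buffer size are assumptions, and maxlen is assumed to be at least 1.

```c
#include <errno.h>
#include <string.h>
#include <unistd.h>

#define SK_BUFSIZE 8192   /* assumed buffer size for this sketch */

typedef struct {
    int fd;                 /* descriptor for this internal buf */
    int cnt;                /* unread bytes in internal buf */
    char *bufptr;           /* next unread byte in internal buf */
    char buf[SK_BUFSIZE];   /* internal buffer */
} sk_rio_t;

/* Associate an empty buffer with descriptor fd. */
void sk_readinitb(sk_rio_t *rp, int fd)
{
    rp->fd = fd;
    rp->cnt = 0;
    rp->bufptr = rp->buf;
}

/* Copy up to n buffered bytes to usrbuf, refilling the buffer when empty. */
static ssize_t sk_read(sk_rio_t *rp, char *usrbuf, size_t n)
{
    size_t cnt;

    while (rp->cnt <= 0) {                 /* refill if buf is empty */
        ssize_t r = read(rp->fd, rp->buf, sizeof(rp->buf));
        if (r < 0) {
            if (errno != EINTR)
                return -1;                 /* real error */
        } else if (r == 0) {
            return 0;                      /* EOF */
        } else {
            rp->cnt = (int)r;
            rp->bufptr = rp->buf;          /* reset to start of buffer */
        }
    }
    cnt = n < (size_t)rp->cnt ? n : (size_t)rp->cnt;
    memcpy(usrbuf, rp->bufptr, cnt);
    rp->bufptr += cnt;
    rp->cnt -= (int)cnt;
    return (ssize_t)cnt;
}

/* Read one text line of at most maxlen-1 bytes and NUL-terminate it. */
ssize_t sk_readlineb(sk_rio_t *rp, void *usrbuf, size_t maxlen)
{
    char c, *bufp = usrbuf;
    size_t n;

    for (n = 1; n < maxlen; n++) {
        ssize_t rc = sk_read(rp, &c, 1);
        if (rc == 1) {
            *bufp++ = c;
            if (c == '\n') {
                n++;
                break;                     /* newline ends the line */
            }
        } else if (rc == 0) {
            if (n == 1)
                return 0;                  /* EOF, no data read */
            break;                         /* EOF, some data was read */
        } else {
            return -1;                     /* error */
        }
    }
    *bufp = '\0';
    return (ssize_t)(n - 1);
}
```

Each call to `sk_readlineb` fetches bytes one at a time from the buffer, but the underlying read() system call happens only when the buffer runs dry. Mixing such calls with unbuffered reads on the same descriptor would skip whatever is still sitting in the buffer, which is the reason for the warning about interleaving.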
RIO Example
cpfile.c
Metadata, sharing, and redirection
File Metadata
Metadata is data about data, in this case file data
Per-file metadata maintained by kernel
- accessed by users with the stat and fstat functions

    /* Metadata returned by the stat and fstat functions */
    struct stat {
        dev_t         st_dev;      /* device */
        ino_t         st_ino;      /* inode */
        mode_t        st_mode;     /* protection and file type */
        nlink_t       st_nlink;    /* number of hard links */
        uid_t         st_uid;      /* user ID of owner */
        gid_t         st_gid;      /* group ID of owner */
        dev_t         st_rdev;     /* device type (if inode device) */
        off_t         st_size;     /* total size, in bytes */
        unsigned long st_blksize;  /* blocksize for filesystem I/O */
        unsigned long st_blocks;   /* number of blocks allocated */
        time_t        st_atime;    /* time of last access */
        time_t        st_mtime;    /* time of last modification */
        time_t        st_ctime;    /* time of last change */
    };
statcheck.c
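As an illustration of the kind of query stat supports, the sketch below classifies a path by its st_mode field and checks the owner's read permission. The helper names `file_kind` and `owner_can_read` are hypothetical; the S_ISREG and S_ISDIR macros and the S_IRUSR bit come from `<sys/stat.h>`.

```c
#include <sys/stat.h>

/*
 * Return 'r' for a regular file, 'd' for a directory, 'o' for any other
 * file type, or 0 if stat fails (for example, the path does not exist).
 */
char file_kind(const char *path)
{
    struct stat st;

    if (stat(path, &st) < 0)
        return 0;
    if (S_ISREG(st.st_mode))
        return 'r';
    if (S_ISDIR(st.st_mode))
        return 'd';
    return 'o';
}

/* Return 1 if the owner has read permission, 0 if not, -1 if stat fails. */
int owner_can_read(const char *path)
{
    struct stat st;

    if (stat(path, &st) < 0)
        return -1;
    return (st.st_mode & S_IRUSR) ? 1 : 0;
}
```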
Accessing Directories
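One common way to read a directory on a POSIX system is the opendir/readdir/closedir family from `<dirent.h>`. The sketch below assumes that interface; the helper name `dir_contains` is hypothetical.

```c
#include <dirent.h>
#include <errno.h>
#include <string.h>

/*
 * Return 1 if directory dirpath has an entry called name, 0 if it does not,
 * and -1 if the directory cannot be opened or read.
 */
int dir_contains(const char *dirpath, const char *name)
{
    DIR *dp = opendir(dirpath);
    struct dirent *ep;
    int found = 0;

    if (dp == NULL)
        return -1;
    errno = 0;                      /* lets us tell end-of-directory from error */
    while ((ep = readdir(dp)) != NULL) {
        if (strcmp(ep->d_name, name) == 0) {
            found = 1;
            break;
        }
    }
    if (!found && errno != 0) {     /* readdir returned NULL because of an error */
        closedir(dp);
        return -1;
    }
    if (closedir(dp) < 0)
        return -1;
    return found;
}
```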
Two descriptors referencing two distinct open disk files. Descriptor 1 (stdout) points to terminal, and descriptor 4 points to open disk file

[Figure: descriptor table (one table per process), open file table (shared by all processes), and v-node table (shared by all processes). Open file table entries hold file pos and refcnt (refcnt=1); v-node entries hold file access, file size, file type. File A is the terminal.]
File Sharing
Two distinct descriptors sharing the same disk file through two distinct open file table entries
- E.g., calling open twice with the same filename argument

[Figure: descriptor table (one table per process), open file table (shared by all processes), and v-node table (shared by all processes). Each open file table entry holds its own file pos and refcnt=1; the v-node entry for File A (disk) holds file access, file size, file type.]
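Because each open call creates its own open file table entry, each descriptor has its own file position. A minimal sketch of that behavior, assuming a scratch file that the helper creates and removes itself (the name `first_byte_via_second_open` is hypothetical):

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Create a scratch file at path containing "ab", open it twice, read one
 * byte through each descriptor, and return the byte seen through the second
 * descriptor, or -1 on error. The scratch file is removed before returning.
 */
int first_byte_via_second_open(const char *path)
{
    int fd, fd1, fd2, result = -1;
    char c;

    if ((fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600)) < 0)
        return -1;
    if (write(fd, "ab", 2) != 2) {
        close(fd);
        unlink(path);
        return -1;
    }
    close(fd);

    fd1 = open(path, O_RDONLY);
    fd2 = open(path, O_RDONLY);
    if (fd1 >= 0 && fd2 >= 0 &&
        read(fd1, &c, 1) == 1 &&   /* moves only fd1's position to 1 */
        read(fd2, &c, 1) == 1)     /* fd2's position was still 0 */
        result = c;
    if (fd1 >= 0)
        close(fd1);
    if (fd2 >= 0)
        close(fd2);
    unlink(path);
    return result;
}
```

The second read returns 'a', not 'b', because the read through fd1 did not move the position stored in fd2's open file table entry.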
[Figure: open file table and v-node table (both shared by all processes) with a child process. Labels: File A (terminal); file pos; refcnt=1 before, refcnt=2 once the child's descriptors (fd 0 to fd 4) refer to the same open file table entries.]
I/O Redirection

[Figure: descriptor table entries before (a, b) and after (b, b) redirection; open file table entries labeled with file pos and refcnt (refcnt=1, refcnt=0)]
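On POSIX systems one standard way to perform this kind of redirection is `dup2(oldfd, newfd)`, which makes newfd refer to the same open file table entry as oldfd. The sketch below assumes that call; the helper name `write_redirected` and the scratch path are hypothetical.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Temporarily redirect descriptor 1 (standard output) to the file at path,
 * write len bytes of msg through descriptor 1, then restore descriptor 1.
 * Returns 0 on success, -1 on error.
 */
int write_redirected(const char *path, const char *msg, size_t len)
{
    int fd, saved, ok;

    if ((fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600)) < 0)
        return -1;
    if ((saved = dup(STDOUT_FILENO)) < 0) {   /* remember where 1 pointed */
        close(fd);
        return -1;
    }
    ok = dup2(fd, STDOUT_FILENO) >= 0 &&      /* 1 and fd now share one entry */
         write(STDOUT_FILENO, msg, len) == (ssize_t)len;
    dup2(saved, STDOUT_FILENO);               /* point 1 back at the original */
    close(saved);
    close(fd);
    return ok ? 0 : -1;
}
```

The bytes written to descriptor 1 land in the file rather than on the terminal, which matches the (b, b) state in the figure: two descriptor table entries referring to the same open file.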
Standard I/O
The C standard library (libc.so) contains a collection of higher-level standard I/O functions
- Documented in Appendix B of K&R.
Standard I/O functions use buffered I/O

    printf("h");
    printf("e");
    printf("l");
    printf("l");
    printf("o");
    printf("\n");

[Figure: the characters accumulate in a buffer: buf = h e l l o \n . .]
You can see this buffering in action for yourself, using the always fascinating Unix strace program:

    linux> strace ./hello
    execve("./hello", ["hello"], [/* ... */]).
    ...
    write(1, "hello\n", 6) = 6
    ...
    exit_group(0) = ?

    #include <stdio.h>
    #include <stdlib.h>

    int main()
    {
        printf("h");
        printf("e");
        printf("l");
        printf("l");
        printf("o");
        printf("\n");
        fflush(stdout);
        exit(0);
    }
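The same effect can be checked from inside a program, without strace. The sketch below is an illustration under stated assumptions: it uses a private stream opened with `fdopen` on a pipe instead of stdout, forces full buffering with `setvbuf`, and uses a non-blocking read to confirm that no bytes reach the descriptor before `fflush`. The helper name `bytes_visible_after_flush` is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L   /* expose fdopen() under strict -std= modes */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Write "hello\n" one character at a time to a fully buffered stream on a
 * pipe. Returns the number of bytes readable from the pipe after fflush, or
 * -1 on error (including the case where bytes arrived before the flush).
 */
int bytes_visible_after_flush(void)
{
    int p[2];
    char buf[16];
    FILE *out;
    ssize_t before, after;

    if (pipe(p) < 0)
        return -1;
    if (fcntl(p[0], F_SETFL, O_NONBLOCK) < 0 ||
        (out = fdopen(p[1], "w")) == NULL) {
        close(p[0]);
        close(p[1]);
        return -1;
    }
    setvbuf(out, NULL, _IOFBF, BUFSIZ);      /* force full buffering */

    fputs("h", out);
    fputs("e", out);
    fputs("l", out);
    fputs("l", out);
    fputs("o", out);
    fputs("\n", out);

    before = read(p[0], buf, sizeof(buf));   /* pipe should still be empty */
    if (before != -1 || (errno != EAGAIN && errno != EWOULDBLOCK)) {
        fclose(out);
        close(p[0]);
        return -1;
    }
    fflush(out);                             /* one write() of all 6 bytes */
    after = read(p[0], buf, sizeof(buf));
    fclose(out);
    close(p[0]);
    return (int)after;
}
```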
Conclusions
Standard I/O and RIO are implemented using low-level Unix I/O

[Figure: layering of the I/O packages. A C application program sits on top of the Standard I/O functions (fdopen, fwrite, fprintf, sprintf, fputs, fseek) and the RIO functions (rio_readn, rio_writen, rio_readinitb, rio_readlineb, rio_readnb), which in turn sit on the Unix I/O functions (accessed via system calls).]
Pros and Cons of Unix I/O
Pros
- Unix I/O is the most general and lowest overhead form of I/O.
  - All other I/O packages are implemented using Unix I/O functions.
- Unix I/O provides functions for accessing file metadata.
- Unix I/O functions are async-signal-safe and can be used safely in signal handlers.
Cons
- Dealing with short counts is tricky and error prone.
- Efficient reading of text lines requires some form of buffering, also tricky and error prone.
- Both of these issues are addressed by the standard I/O and RIO packages.
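The async-signal-safety point can be illustrated with a handler that reports through write() rather than printf(). This is a minimal sketch under simple assumptions (single-threaded process, SIGUSR1 chosen arbitrarily, a pipe as the notification channel); the names `on_sigusr1` and `signal_roundtrip` are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L   /* expose sigaction() under strict -std= modes */
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

static int notify_fd = -1;   /* write end of a pipe, set before the handler runs */

/* Handler: only async-signal-safe calls (write) are made here. */
static void on_sigusr1(int sig)
{
    int saved_errno = errno;   /* write() may change errno */
    char c = 'S';
    ssize_t r;

    (void)sig;
    r = write(notify_fd, &c, 1);
    (void)r;
    errno = saved_errno;
}

/*
 * Install the handler, raise SIGUSR1 in this process, and return the byte
 * the handler wrote to the pipe, or -1 on error.
 */
int signal_roundtrip(void)
{
    int p[2];
    char c = 0;
    int result = -1;
    struct sigaction sa, old;

    if (pipe(p) < 0)
        return -1;
    notify_fd = p[1];

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigusr1;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, &old) == 0) {
        if (raise(SIGUSR1) == 0 && read(p[0], &c, 1) == 1)
            result = c;
        sigaction(SIGUSR1, &old, NULL);   /* restore previous disposition */
    }
    close(p[0]);
    close(p[1]);
    return result;
}
```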
Pros and Cons of Standard I/O
Pros:
- Buffering increases efficiency by decreasing the number of read and write system calls
- Short counts are handled automatically
Cons:
- Provides no function for accessing file metadata
- Standard I/O functions are not async-signal-safe, and not appropriate for signal handlers.
- Standard I/O is not appropriate for input and output on network sockets
  - There are poorly documented restrictions on streams that interact badly with restrictions on sockets (CS:APP2e, Sec 10.9)
String functions

- Unix programming
- TCP/IP (the protocol that makes the Internet work)
- Unix network programming
- Unix IPC programming