Lecture 6b - Unix Programming Odds and Ends (Errors, Standards, Streams, Buffering)

Uploaded by

Umama Zafran

CIS*3110

Lecture 6b: a quick note on error handling in Unix (and other things)

Based in part on Advanced Programming in the Unix Environment (3ed), Stevens and Rago
Announcements
Schedule for this week (Feb 12-16)

• Monday (today) - normal class


• Wednesday - no class: office hours instead, to help you with A1
• Friday - no class: class time and room are used for writing the re-scheduled Midterm 1
Operating system standards

• The behaviour - and programming API - of an operating system corresponds to a standard


• These standards define exactly how, for example, a new process is created by fork, how a
pipe behaves when the write end is closed, etc.

• These standards change as the OS evolves


• Each version of the OS comes with its own standard API set
• New features introduced in the latest OS would not be available on a previous one
• The older features may continue as before (for backwards compatibility) or may be updated
The POSIX standard in Unix

• POSIX is a family of standards for Unix-style operating systems, initially developed by the IEEE
• It stands for Portable Operating System Interface (for Unix, presumably)
• It defines the services that an operating system must provide in order to be "POSIX compliant"
• "POSIX compliance" is conferred through formal (and apparently expensive) certification
• Officially POSIX-certified OS include
• Server-oriented OS (e.g. AIX - an IBM Unix system)
• macOS (Apple's desktop OS, built on Darwin, which draws in part on FreeBSD)
• EulerOS (Huawei's Linux distro)
• etc
The POSIX standard in Unix

• Most Unix-like systems are officially "mostly compliant" with the POSIX standard
• In practice, they either made small modifications where they wanted to, or didn't bother
with the official certification because of the cost

• This includes
• Linux
• Android
• etc.
Specifying the POSIX standard

• Different POSIX-compliant / mostly compliant OS have slightly different implementations

• This means that the code can compile for / run on systems that support a specific version of
POSIX

• When compiling your code, you may need to specify the minimum version of POSIX that your
code is compatible with - and that must be supported by the libraries available to the
compiler
Specifying the POSIX standard

• In our case, if we want to compile our code with no implementation-defined constants in the
Linux image running in Docker, we need to define the constant _POSIX_C_SOURCE and
associate a version of POSIX with it

• The 2008-09 version of POSIX is a common modern version


• This appeared in the context of the kill() function
• The Linux man pages for it require _POSIX_C_SOURCE to be defined when we are compiling
with standard (i.e. non-GNU) C
• Other Unix systems - e.g. macOS - do not require this
Specifying the POSIX standard

• This is called a feature test macro

• All the POSIX headers use this constant to exclude any implementation-defined definitions
when _POSIX_C_SOURCE is defined.

• By setting _POSIX_C_SOURCE to 200809L we state that we only want to use the POSIX
2008-09 specification
• Nothing newer, no implementation-defined extensions
Specifying the POSIX standard

• This can be done in two ways:


• As an argument to gcc, e.g:
gcc -D_POSIX_C_SOURCE=200809L file.c

• As a macro on the first line of your code, before any #include, e.g.:


#define _POSIX_C_SOURCE 200809L

• For us, either one works:


• You can create a standard "all useful unix headers" header
• You can just set it in your Makefile for each target

Error handling approaches

• Functions that perform relatively complex tasks and can fail for a variety of reasons follow a general
pattern, regardless of the language they are written in

• They often need to return a "thing" on success (file descriptor, socket, rendering context, etc.) or to
perform a task (e.g. initialize a pipe)

• If an error occurs, they need to pass the error details to the caller, so the error can be identified and
properly handled

• Returning just NULL/not NULL is insufficient


• NULL tells us that the function encountered an error, but gives no idea what the error was
• Functions returning value types (i.e. not pointers / references) have no ability to return NULL at all
• So a "success" and an "error" need to be handled separately, with enough information about the
situation

Error handling approaches

• In modern programming languages, we usually use exception handling for this:


• A function returns a "thing" on success
• A function throws an exception on error
• The exception is handled using the catch/finally clauses - or passed up the call stack
• Languages with a functional flavour may take a different approach and use a special type (a
sum type), whose value is
• An object on success
• An error type on failure
• No exceptions are necessary
• E.g. the Either type (built into Haskell; available through libraries in C# and Kotlin)
Error handling in Unix

• C has none of these mechanisms, so it defaults to using global state - some global variables
• When a process finishes executing a system function in kernel mode, it returns to user mode
with the desired results and a return value, which is normally 0 for success or -1 for error.
• Some functions return a pointer - which points to a valid object on success and is NULL on error
• In case of error, the external global variable errno (defined in errno.h) contains an error
code which identifies the error.

• For example, the open function returns either a non-negative file descriptor if all is well, or -1
if an error occurs
• An error from open has about 15 possible errno values: file does not exist, permission problem,
etc.
errno - the magic error "variable"

• The file errno.h defines the symbol errno and constants for each value that errno can
assume.
• Each of these constants begins with the character E
• E.g. if errno is equal to the constant EACCES, this indicates a permission problem, such as
insufficient permission to open the requested file

• errno is set to zero at program startup

• Any function can modify its value to some value different from zero, generally to indicate
specific categories of errors

• In this course, we do not need to know what these constants are - we should not access
errno directly
errno - the magic error "variable"

• In the older versions of Unix, errno was indeed just a variable


• Once threading was introduced (which we will discuss in the upcoming lectures), a single
global variable was no longer thread-safe (again, a concept we will discuss shortly)

• In practice, on modern Unix systems, errno is defined as a "service"


• It can be implemented internally as a function call or a macro
• We will usually not access it directly
Turning errno into something readable
Option 1: strerror

• We can use strerror (string.h) to get a string representation of the error based on its
code
#include <string.h>

char *strerror(int errnum);


• Returns: pointer to message string
• errno is usually passed as the argument
• We can do whatever we need with the string returned by strerror
• Note that the string returned by strerror was not malloc'ed - it's a pointer to a standard
string stored somewhere (likely in the read-only text segment) by the C runtime - so it does
not need to be freed
Turning errno into something readable
Option 1: strerror

• We can pass the string returned by strerror to some other function that displays it on the
screen - or sends it to STDERR, to be more specific

• see errorExample1.c
Turning errno into something readable
Option 2: perror

• Alternatively, we can call perror (declared in stdio.h)


void perror(const char *msg);

• It outputs the string pointed to by msg


• followed by a colon and a space
• followed by the error message corresponding to the value of errno
• followed by a newline
• It outputs to STDERR
• So it is essentially a call to strerror(), followed by a call to fprintf(stderr, ...)
• see errorExample1.c
Reporting errors

• Which one you use is up to you


• The function strerror is probably more generally useful - but you must remember to print it
to STDERR, not STDOUT

• perror is nice and short, useful for debugging


Review / preview: std... file streams

• Each login process - i.e. you opening an instance of the shell as a specific user - opens three
file streams on its terminal:
• stdin for input (standard input)
• stdout for output (standard output)
• stderr for error output (standard error)
• Defaults are
• stdin is user input from the terminal (e.g. typing some characters)
• stdout is for normal output to the terminal/screen (e.g. that's what printf does by default)
• stderr is for abnormal output - i.e. errors. By default, it is also sent to the terminal/screen
Review / preview: std... file streams
Each is a pointer to a FILE structure in the execution image’s heap

FILE *stdin    // points to FILE structure
    char fbuf[SIZE]
    int counter, index, etc.
    int fd = 0;    // fd[0] in PROC <== from KEYBOARD

FILE *stdout   // points to FILE structure
    char fbuf[SIZE]
    int counter, index, etc.
    int fd = 1;    // fd[1] in PROC ==> to TERMINAL (SCREEN)

FILE *stderr   // points to FILE structure
    char fbuf[SIZE]
    int counter, index, etc.
    int fd = 2;    // fd[2] in PROC ==> to TERMINAL (SCREEN)
Standard streams in the standard C library

• scanf, gets read from STDIN (keyboard)


• printf, puts write to STDOUT (terminal)
• fprintf/fscanf, putc/getc require you to specify which FILE struct you are interacting
with
• stdin, stdout, stderr are pre-defined file handles
• you can open your own file
• e.g.
• fprintf(stdout, "I am normal output\n");
• fprintf(stderr, "I indicate an error\n");
Standard streams in the Unix library

• read, write work with any file descriptor

• STDIN_FILENO, STDOUT_FILENO, and STDERR_FILENO are pre-defined file descriptor
constants

• E.g.
• write(STDOUT_FILENO, "I am normal output\n", 19);
• write(STDERR_FILENO, "I indicate an error\n", 20);

Stream redirection

• Each stream can be redirected


• Instead of reading from / writing to the default "location", we can do something else, e.g.
• Read from a file instead of STDIN
• Write to a file instead of STDOUT or STDERR
• Have STDOUT of one process be the STDIN of another, to pass data between processes
• etc.
Stream redirection
Command line

• One application is separating outputs from a program


• For example, we may point STDOUT and STDERR to different files, so we can examine them
separately
• ./a.out 1> normalOutput.txt //redirecting stream 1, i.e. STDOUT
• ./a.out 2> errorOutput.txt //redirecting stream 2, i.e. STDERR
• Useful for debugging, e.g.
• valgrind prints to STDERR by default
• your program can write to STDOUT
• valgrind debugging information won't be mixed with our program's output

• We will see many other applications in the course


Stream redirection example: ignoring errors

• A very common use for stream redirection is ignoring errors from shell commands
• E.g. we want to find out where the header unistd.h is
• $ find / -name unistd.h
• will generate a lot of errors (why?)
• hard to tell errors and output apart
• $ find / -name unistd.h 2>/dev/null
• Will redirect all error messages to the special file /dev/null - the Unix garbage bin
• We will only see the proper output, i.e. locations of all files named unistd.h in the directories
that our user has read access to
Standard streams and buffering
When to buffer

• I/O can be buffered or unbuffered:

• unbuffered - happens immediately
• buffered - things are placed into a buffer, then moved from the buffer to the final destination at
the discretion of the OS or the relevant library

• Buffered output is the default for performance reasons

• When dealing with multiple sequential reads / writes, it is often faster to treat them as one big
read / write (fewer function calls, more efficient use of the disk, etc.)

• As a result, typical C standard I/O is buffered - e.g. reading from STDIN or writing to STDOUT
• So printf and scanf are buffered by default
Standard streams and buffering
When not to buffer

• Buffered output means that there may be a delay between the time your output function
returns and the time when the output appears on the screen

• However, this can be a problem:

• We may want errors to display immediately, rather than be buffered
• Helps figure out exactly where in the code the error happened
• Program may terminate before buffered output is written to the screen
Standard streams and buffering
When not to buffer

• Also, when debugging concurrent code (processes, threads), output from multiple concurrent
sources will be mixed together

• Having it be buffered makes it very difficult to tell exactly when each message was printed
• Output may be printed out of order
• E.g.
• Process 1 calls printf
• writes to buffer, OS scheduler puts it to sleep
• Process 2 calls printf
• writes to buffer, gets placed on the screen
• buffer from Process 1 finally gets written to the screen
Standard streams and buffering
Who does what

• C standard library
• stdin and stdout are buffered
• stderr is unbuffered
• Unix read / write
• Unbuffered by default
• In this course, we will be debugging and dealing with concurrency a lot
• Unbuffered output is nice for clarity and debugging, so we usually want our errors and debug
messages to be unbuffered
Unbuffered output options
C library options

• If we do not care about keeping STDOUT and STDERR separate, we can just use
fprintf(stderr,...)
• unbuffered, one line of code, but everything goes to STDERR
• If we do want to keep the streams separate, we use
int fflush(FILE* stream)
• If the given stream was open for writing (or if it was open for updating and the last i/o operation
was an output operation) any unwritten data in its output buffer is written to the file
• Returns 0 on success, EOF on error
• E.g.
printf("Hello world\n");
fflush(stdout);
Buffering in the example code

• E.g. in signalDebug.c
• Errors related to registering a handler for SIGUSR2 are written to STDERR
• Normal output is written to STDOUT
• I will sometimes also use write for unbuffered debug/error messages
• Or fprintf(stderr, ...) - which is OK for errors, but could be considered sloppy for
debugging
