
Kernel Data Structures: User Space

The document summarizes kernel data structures and processes in Unix systems. It discusses: 1) Kernel data structures like the process table and open file table contain information about each process and open files in the system. 2) Processes are created using the fork system call to clone the current process, followed by the exec system call to replace the image and execute a new program. 3) Each process has information maintained like its working directory, file descriptors, process ID, parent process ID, and environment variables.

Uploaded by

bonhome
Copyright
© Attribution Non-Commercial (BY-NC)

Lecture 3

Processes and Filters

Kernel Data Structures

• Information about each process.
• Process table: contains an entry for every process in the system.
• Open-file table: contains at least one entry for every open file in the system.

[Figure: three processes, each with Code, Data, and Process Info, shown with the Open File Table and the Process Table; regions are labeled User Space and Kernel Space.]

Unix Processes

Process: An entity of execution
• Definitions
– program: collection of bytes stored in a file that can be run
– image: computer execution environment of program
– process: execution of an image
• Unix can execute many processes simultaneously.

Process Creation

• Interesting trait of UNIX
• fork system call clones the current process
[Figure: process A becomes two copies of A.]
• exec system call replaces current process
[Figure: process A becomes process B.]
• A fork is typically followed by an exec

Process Setup

• All of the per-process information is copied with the fork operation
– Working directory
– Open files
• Copy-on-write makes this efficient
• Before exec, these values can be modified

fork and exec

• Example: the shell

    while (1) {
        display_prompt();
        read_input(cmd, params);
        pid = fork();                   /* create child */
        if (pid > 0)
            waitpid(-1, &stat, 0);      /* parent waits */
        else if (pid == 0)
            execve(cmd, params, 0);     /* child execs */
        /* pid < 0: fork failed, no child to wait for */
    }
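The point that per-process information is copied at fork time can be observed from the shell itself. The sketch below assumes a POSIX shell; the variable names are made up for illustration. A command substitution runs in a forked child, so changing the working directory there leaves the parent untouched.

```shell
# Record the parent's working directory.
parent_before=$(pwd)

# $( ... ) forks a child shell; the cd only changes the child's copy.
child_dir=$(cd / && pwd)

# The parent's working directory is unchanged.
parent_after=$(pwd)

echo "child saw: $child_dir"
echo "parent still in: $parent_after"
```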

Unix process genealogy

Process generation
[Figure: the Init process (process 1) forks init processes; each init execs getty, getty execs login, and login execs /bin/sh.]

Background Jobs

• By default, executing a command in the shell will wait for it to exit before printing out the next prompt
• Trailing a command with & allows the shell and command to run simultaneously

    $ /bin/sleep 10 &
    [1] 3424
    $
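A small sketch of a background job in a script, assuming a POSIX shell. The variable names are illustrative. `$!` holds the process id of the most recent background command, and `wait` collects its exit status.

```shell
# Start a job in the background; the shell does not wait for it.
sleep 1 &
job_pid=$!
echo "shell continues while $job_pid sleeps"

# Later, wait for the job explicitly and pick up its exit status.
job_status=0
wait "$job_pid" || job_status=$?
echo "background job exited with status $job_status"
```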

Program Arguments

• When a process is started, it is sent a list of strings
– argv, argc
• The process can use this list however it wants to

Ending a process

• When a process ends, there is a return code associated with the process
• This is a non-negative integer
– 0 means success
– >0 represents various kinds of failure; the meaning is up to the process
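Arguments and return codes can both be inspected from the shell. This is a sketch assuming a POSIX `sh`; the word `demo` is a placeholder that fills `$0`, and the variable names are illustrative.

```shell
# The child sh receives its argument list; $# is the count, $1 the first.
arg_count=$(sh -c 'echo $#' demo one two three)
first_arg=$(sh -c 'echo "$1"' demo one two three)

# $? holds the return code of the last command: 0 for success...
ok_status=0
true || ok_status=$?

# ...and a value greater than 0 for failure, chosen by the process itself.
fail_status=0
sh -c 'exit 3' || fail_status=$?

echo "args: $arg_count, first: $first_arg, ok: $ok_status, fail: $fail_status"
```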

Process Information Maintained

• Working directory
• File descriptor table
• Process id
– number used to identify process
• Process group id
– number used to identify set of processes
• Parent process id
– process id of the process that created the process

Process Information Maintained

• Umask
– Default file permissions for new file
We haven’t talked about these yet:
• Effective user and group id
– The user and group whose permissions the process runs with
• Real user and group id
– The user and group that invoked the process
• Environment variables

Setuid and Setgid Mechanisms

• The kernel can set the effective user and group ids of a process to something different than the real user and group
– Files executed with a setuid or setgid flag set cause these values to change
• Make it possible to do privileged tasks:
– Change your password
• Open up a can of worms for security if buggy

Environment of a Process

• A set of name-value pairs associated with a process
• Keys and values are strings
• Passed to child processes
• Cannot be passed back up
• Common examples:
– PATH: Where to search for programs
– TERM: Terminal type
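The one-way flow of the environment can be checked with a short sketch, assuming a POSIX shell. `DEMO_COLOR` is a made-up variable name used only for this illustration.

```shell
# Put a name-value pair into the environment.
DEMO_COLOR=blue
export DEMO_COLOR

# A child process receives a copy of it.
child_view=$(sh -c 'echo "$DEMO_COLOR"')

# The child may change its own copy...
child_changed=$(sh -c 'DEMO_COLOR=red; echo "$DEMO_COLOR"')

# ...but the change is not passed back up to the parent.
echo "child saw $child_view, child set $child_changed, parent has $DEMO_COLOR"
```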

The PATH environment variable

• Colon-separated list of directories.
• Non-absolute pathnames of executables are only executed if found in the list.
– Searched left to right
• Example:

    $ myprogram
    sh: myprogram not found
    $ PATH=/bin:/usr/bin:/home/kornj/bin
    $ myprogram
    hello!

Having . In Your Path

    $ ls
    foo
    $ ./foo
    Hello, foo.
    $ foo
    sh: foo: not found

• What not to do:

    $ PATH=.:/bin
    $ ls
    foo
    $ cd /usr/badguy
    $ ls
    Congratulations, your files have been removed
    and you have just sent email to Prof. Korn
    challenging him to a fight.
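The PATH lookup can be reproduced safely in a scratch directory. This sketch assumes a POSIX shell with `mktemp`; the program name `demo_myprogram` and the variable names are hypothetical. The new directory is appended rather than prepended, so it cannot shadow system commands.

```shell
# Create a tiny executable in a fresh temporary directory.
demo_dir=$(mktemp -d)
printf '#!/bin/sh\necho "hello!"\n' > "$demo_dir/demo_myprogram"
chmod +x "$demo_dir/demo_myprogram"

# The directory is not on PATH yet, so the name cannot be resolved.
if command -v demo_myprogram >/dev/null 2>&1; then
    found_before=yes
else
    found_before=no
fi

# Append the directory to PATH; now the bare name is found.
PATH="$PATH:$demo_dir"
program_output=$(demo_myprogram)
echo "$program_output"

rm -rf "$demo_dir"
```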

Shell Variables

• Shells have several mechanisms for creating variables. A variable is a name representing a string value. Example: PATH
– Shell variables can save time and reduce typing errors
• Allow you to store and manipulate information
– E.g.: ls $DIR > $FILE
• Two types: local and environmental
– local are set by the user or by the shell itself
– environmental come from the operating system and are passed to children

Variables (cont’d)

• Syntax varies by shell
– varname=value # sh, ksh
– set varname = value # csh
• To access the value: $varname
• Turn local variable into environment:
– export varname # sh, ksh
– setenv varname value # csh
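The difference between a local and an environment variable shows up as soon as a child process is started. A minimal sketch in sh syntax follows; `demo_local` is a made-up name.

```shell
# A local variable: visible in this shell only.
demo_local=one
child_before=$(sh -c 'echo "${demo_local:-unset}"')

# After export it becomes part of the environment passed to children.
export demo_local
child_after=$(sh -c 'echo "${demo_local:-unset}"')

echo "before export: $child_before, after export: $child_after"
```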

Environmental Variables

NAME     MEANING
$HOME    Absolute pathname of your home directory
$PATH    A list of directories to search for programs
$MAIL    Absolute pathname to mailbox
$USER    Your user id
$SHELL   Absolute pathname of login shell
$TERM    Type of your terminal
$PS1     Prompt

Inter-process Communication

Ways in which processes communicate:
• Passing arguments, environment
• Read/write regular files
• Exit values
• Signals
• Pipes

Signals

• Signal: A message a process can send to a process or process group, if it has appropriate permissions.
• Message type represented by a symbolic name
• For each signal, the receiving process can:
– Explicitly ignore signal
– Specify action to be taken upon receipt (signal handler)
– Otherwise, default action takes place (usually process is killed)
• Common signals:
– SIGKILL, SIGTERM, SIGINT
– SIGSTOP, SIGCONT
– SIGSEGV, SIGBUS

An Example of Signals

• When a child exits, it sends a SIGCHLD signal to its parent.
• If a parent wants to wait for a child to exit, it tells the system it wants to catch the SIGCHLD signal
• When a parent does not issue a wait, it ignores the SIGCHLD signal
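In the shell, the `trap` builtin covers the first two choices a receiver has: installing a handler and ignoring a signal. The sketch below assumes a POSIX `sh`. Each child shell sends SIGTERM to itself (`$$` is its own process id), so nothing outside the example is signaled.

```shell
# Handler: the child catches SIGTERM, runs the handler, and keeps going.
handled=$(sh -c 'trap "echo caught TERM" TERM; kill -TERM $$; echo still running')

# Ignore: an empty trap action makes the child ignore SIGTERM entirely.
ignored=$(sh -c 'trap "" TERM; kill -TERM $$; echo survived')

echo "$handled"
echo "$ignored"
```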

Process Subsystem utilities

• ps monitors status of processes
• kill sends a signal to a pid
• wait makes the parent process wait for one of its children to terminate
• nohup makes a command immune to the hangup signal (SIGHUP)
• sleep sleeps for a number of seconds
• nice runs processes at low priority

Pipes

One of the cornerstones of UNIX
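A sketch combining `sleep`, `kill`, and `wait`, assuming bash or a similar POSIX shell. The variable names are illustrative. Here SIGTERM takes its default action and kills the child; in these shells the status reported by `wait` is 128 plus the signal number, which is 143 for SIGTERM.

```shell
# Start a long-running child in the background.
sleep 30 &
child_pid=$!

# Send it SIGTERM; the default action terminates the process.
kill -TERM "$child_pid"

# wait reports how the child ended.
wait_status=0
wait "$child_pid" || wait_status=$?
echo "child ended with status $wait_status"
```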

Pipes

• General idea: The input of one program is the output of the other, and vice versa
[Figure: programs A and B connected to each other.]
• Both programs run at the same time

Pipes (2)

• Often, only one end of the pipe is used
[Figure: standard out of A feeds standard in of B.]
• Could this be done with files?

File Approach

• Run first program, save output into file
• Run second program, using file as input
[Figure: process 1 writes a file that process 2 then reads.]
• Unnecessary use of the disk
– Slower
– Can take up a lot of space
• Makes no use of multi-tasking

More about pipes

• What if a process tries to read data but nothing is available?
– UNIX puts the reader to sleep until data available
• What if a process can’t keep up reading from the process that’s writing?
– UNIX keeps a buffer of unread data
• This is referred to as the pipe size.
– If the pipe fills up, UNIX puts the writer to sleep until the reader frees up space (by doing a read)
• Multiple readers and writers possible with pipes.

More about Pipes

• Pipes are often chained together
– Called filters
[Figure: A, B, and C in a chain, each standard out feeding the next standard in.]

Interprocess Communication For Unrelated Processes

• FIFO (named pipes)
– A special file that when opened represents pipe
• System V IPC
– message queues
– semaphores
– shared memory
• Sockets (client/server model)
[Figure: processes p1 and p2.]
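A named pipe can be tried out with `mkfifo`. The sketch below assumes a POSIX system with `mktemp`; the file name `channel` is arbitrary. The writer and the reader are separate processes that only share the FIFO's pathname.

```shell
# Create the FIFO inside a scratch directory.
fifo_dir=$(mktemp -d)
mkfifo "$fifo_dir/channel"

# Writer: blocks in the background until some process opens the FIFO for reading.
echo "hello via fifo" > "$fifo_dir/channel" &

# Reader: opening the FIFO releases the writer; cat sees EOF when the writer closes.
message=$(cat "$fifo_dir/channel")
wait

rm -rf "$fifo_dir"
echo "$message"
```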

Pipelines

• Output of one program becomes input to another
– Uses concept of UNIX pipes
• Example: $ who | wc -l
– counts the number of users logged in
• Pipelines can be long

What’s the difference?

Both of these commands send input to command from a file instead of the terminal:

    $ cat file | command

vs.

    $ command < file

An Extra Process

    $ cat file | command

[Figure: cat feeding command through a pipe.]

    $ command < file

[Figure: command alone, reading the file directly.]

Introduction to Filters

• A class of Unix tools called filters.
– Utilities that read from standard input, transform the file, and write to standard out
• Using filters can be thought of as data oriented programming.
– Each step of the computation transforms data stream.
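Both forms give the command the same bytes on standard input; the redirect simply avoids the extra `cat` process. A sketch with `wc -l` standing in for `command`, assuming `mktemp` is available (the `tr -d ' '` strips the padding some `wc` implementations add):

```shell
# Build a three-line sample file.
sample=$(mktemp)
printf 'one\ntwo\nthree\n' > "$sample"

# Two processes: cat reads the file and pipes it to wc.
with_cat=$(cat "$sample" | wc -l | tr -d ' ')

# One process: the shell opens the file as wc's standard input.
with_redirect=$(wc -l < "$sample" | tr -d ' ')

rm -f "$sample"
echo "cat: $with_cat, redirect: $with_redirect"
```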

Examples of Filters

• Sort
– Input: lines from a file
– Output: lines from the file sorted
• Grep
– Input: lines from a file
– Output: lines that match the argument
• Awk
– Programmable filter

cat: The simplest filter

• The cat command copies its input to output unchanged (identity filter). When supplied a list of file names, it concatenates them onto stdout.
• Some options:
– -n number output lines (starting from 1)
– -v display control-characters in visible form (e.g. ^C)

    cat file*
    ls | cat -n
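The three filters named above can be tried on inline data. This is only a sketch; the sample words and scores are invented for the illustration.

```shell
# grep keeps the lines containing "p"; sort orders what is left.
filtered=$(printf 'pear\napple\nplum\nfig\n' | grep p | sort)

# awk as a programmable filter: print the name when the score exceeds 80.
high_scores=$(printf 'John 99\nAnne 75\nTim 95\n' | awk '$2 > 80 { print $1 }')

echo "$filtered"
echo "$high_scores"
```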

head

• Display the first few lines of a specified file
• Syntax: head [-n] [filename...]
– -n - number of lines to display, default is 10
– filename... - list of filenames to display
• When more than one filename is specified, the start of each file’s listing displays
==>filename<==

tail

• Displays the last part of a file
• Syntax: tail +|-number [lbc] [f] [filename]
or: tail +|-number [l] [rf] [filename]
– +number - begins copying at distance number from beginning of file, if number isn’t given, defaults to 10
– -number - begins from end of file
– l,b,c - number is in units of lines/block/characters
– r - print in reverse order (lines only)
– f - if input is not a pipe, do not terminate after end of file has been copied but loop. This is useful to monitor a file being written by another process

head and tail examples

    head /etc/passwd
    head *.c
    tail +20 /etc/passwd
    ls -lt | tail -3
    head -100 /etc/passwd | tail -5
    tail -f /usr/local/httpd/access_log

tee

[Figure: the output of a Unix command passes through tee to standard output and to file-list.]
• Copy standard input to standard output and one or more files
– Captures intermediate results from a filter in the pipeline
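The head and tail examples can be checked on a small numbered list. The sketch uses the POSIX spellings `head -n N`, `tail -n N`, and `tail -n +N`, which correspond to the older `head -N`, `tail -N`, and `tail +N` forms; current systems may no longer accept the older forms.

```shell
# Ten lines containing 1 through 10.
numbers=$(printf '%s\n' 1 2 3 4 5 6 7 8 9 10)

first_three=$(echo "$numbers" | head -n 3)            # lines 1-3
last_two=$(echo "$numbers" | tail -n 2)               # lines 9-10
from_eighth=$(echo "$numbers" | tail -n +8)           # line 8 to the end
middle=$(echo "$numbers" | head -n 6 | tail -n 2)     # lines 5-6

echo "$middle"
```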

tee (cont’d)

• Syntax: tee [ -ai ] file-list
– -a - append to output file rather than overwrite, default is to overwrite (replace) the output file
– -i - ignore interrupts
– file-list - one or more file names for capturing output
• Examples

    ls | head -10 | tee first_10 | tail -5
    who | tee user_list | wc

Unix Text Files: Delimited Data

Tab Separated

    John    99
    Anne    75
    Andrew  50
    Tim     95
    Arun    33
    Sowmya  76

Pipe-separated

    COMP1011|2252424|Abbot, Andrew John |3727|1|M
    COMP2011|2211222|Abdurjh, Saeed |3640|2|M
    COMP1011|2250631|Accent, Aac-Ek-Murhg |3640|1|M
    COMP1021|2250127|Addison, Blair |3971|1|F
    COMP4012|2190705|Allen, David Peter |3645|4|M
    COMP4910|2190705|Allen, David Pater |3645|4|M

Colon-separated

    root:ZHolHAHZw8As2:0:0:root:/root:/bin/ksh
    jas:nJz3ru5a/44Ko:100:100:John Shepherd:/home/jas:/bin/ksh
    cs1021:iZ3sO90O5eZY6:101:101:COMP1021:/home/cs1021:/bin/bash
    cs2041:rX9KwSSPqkLyA:102:102:COMP2041:/home/cs2041:/bin/csh
    cs3311:mLRiCIvmtI9O2:103:103:COMP3311:/home/cs3311:/bin/sh
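Returning to the tee examples: the sketch below mirrors the `head | tee | tail` pipeline on inline data so that both the captured intermediate result and the final output can be inspected. It assumes `mktemp`; the variable names are illustrative.

```shell
capture=$(mktemp)

# head keeps the first four lines; tee saves them to a file and passes them on.
final=$(printf '%s\n' a b c d e f | head -n 4 | tee "$capture" | tail -n 2)
captured=$(cat "$capture")
rm -f "$capture"

echo "saved by tee: $(echo "$captured" | tr '\n' ' ')"
echo "end of pipeline: $(echo "$final" | tr '\n' ' ')"
```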

cut: select columns

• The cut command prints selected parts of input lines.
– can select columns (assumes tab-separated input)
– can select a range of character positions
• Some options:
– -f listOfCols: print only the specified columns (tab-separated) on output
– -c listOfPos: print only chars in the specified positions
– -d c: use character c as the column separator
• Lists are specified as ranges (e.g. 1-5) or comma-separated (e.g. 2,4,5).

cut examples

    cut -f 1 < data
    cut -f 1-3 < data
    cut -f 1,4 < data
    cut -f 4- < data
    cut -d'|' -f 1-3 < data
    cut -c 1-4 < data

Unfortunately, there's no way to refer to "last column" without counting the columns.
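A sketch of cut on one pipe-separated record in the style of the sample data. For the "last column" limitation, one possible workaround (not from the slides) is awk, whose `$NF` refers to the last field.

```shell
record='COMP1011|2252424|Abbot, Andrew John|3727|1|M'

# Fields 1-2, using | as the column separator.
course_and_id=$(echo "$record" | cut -d'|' -f 1-2)

# Character positions 1-4.
first_chars=$(echo "$record" | cut -c 1-4)

# Last column without counting: awk's $NF is the final field.
last_field=$(echo "$record" | awk -F'|' '{ print $NF }')

echo "$course_and_id / $first_chars / $last_field"
```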

paste: join columns

• The paste command displays several text files "in parallel" on output.
• If the inputs are files a, b, c
– the first line of output is composed of the first lines of a, b, c
– the second line of output is composed of the second lines of a, b, c
[Figure: file a holds 1 and 2, file b holds 3 and 4, file c holds 5 and 6; the output lines are "1 3 5" and "2 4 6".]
• Lines from each file are separated by a tab character.
• If files are different lengths, output has all lines from longest file, with empty strings for missing lines.

paste example

    cut -f 1 < data > data1
    cut -f 2 < data > data2
    cut -f 3 < data > data3
    paste data1 data3 data2 > newdata
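The different-lengths rule is easy to see with three tiny files, where the third is one line short. This sketch assumes `mktemp`; the file names a, b, c follow the slide.

```shell
work=$(mktemp -d)
printf '1\n2\n' > "$work/a"
printf '3\n4\n' > "$work/b"
printf '5\n'    > "$work/c"     # shorter than a and b

# Columns are joined with tabs; the missing line of c becomes an empty string.
pasted=$(paste "$work/a" "$work/b" "$work/c")
rm -rf "$work"

echo "$pasted"
```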

sort: Sort lines of a file

• The sort command copies input to output but ensures that the output is arranged in ascending order of lines.
– By default, sorting is based on ASCII comparisons of the whole line.
• Other features of sort:
– understands text data that occurs in columns. (can also sort on a column other than the first)
– can distinguish numbers and sort appropriately
– can sort files "in place" as well as behaving like a filter
– capable of sorting very large files

sort: Options

• Syntax: sort [-dftnr] [-o filename] [filename(s)]
-d Dictionary order, only letters, digits, and whitespace are significant in determining sort order
-f Ignore case (fold into lower case)
-t Specify delimiter
-n Numeric order, sort by arithmetic value instead of first digit
-r Sort in reverse order
-o filename - write output to filename, filename can be the same as one of the input files
• Many more options…

sort: Specifying fields

• Delimiter : -td
• Old way:
– +f[.c][options] [-f[.c][options]]
• +2.1 -3 +0 -2 +3n
– Exclusive
– Start from 0 (unlike cut, which starts at 1)
• New way:
– -k f[.c][options][,f[.c][options]]
• -k2.1 -k0,1 -k3n
– Inclusive
– Start from 1

sort Examples

    sort +2nr < data
    sort -k2nr data
    sort -t: +4 /etc/passwd
    sort -o mydata mydata
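A sketch of the `-k` syntax on colon-separated records. The account lines are invented, and `LC_ALL=C` is set so that the default comparison is plain ASCII regardless of the local locale.

```shell
accounts=$(printf 'root:0\njas:100\ncs1021:20\n')

# Default: ASCII comparison of the whole line.
by_name=$(echo "$accounts" | LC_ALL=C sort)

# Field 2 only (-k2,2), numerically (n), with : as the delimiter.
by_id=$(echo "$accounts" | LC_ALL=C sort -t: -k2,2n)

# Same key, reversed (r).
by_id_desc=$(echo "$accounts" | LC_ALL=C sort -t: -k2,2nr)

echo "$by_id"
```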

uniq: list UNIQue items

• Remove or report adjacent duplicate lines
• Syntax: uniq [ -cdu] [input-file] [ output-file]
-c Supersede the -u and -d options and generate an output report with each line preceded by an occurrence count
-d Write only the duplicated lines
-u Write only those lines which are not duplicated
– The default output is the union (combination) of -d and -u

wc: Counting results

• The word count utility, wc, counts the number of lines, characters or words
• Options:
-l Count lines
-w Count words
-c Count characters
• Default: count lines, words and chars

wc and uniq Examples

    who | sort | uniq -d
    wc my_essay
    who | wc
    sort file | uniq | wc -l
    sort file | uniq -d | wc -l
    sort file | uniq -u | wc -l

tr: TRanslate Characters

• Copies standard input to standard output with substitution or deletion of selected characters
• Syntax: tr [ -cds ] [ string1 ] [ string2 ]
• -d delete all input characters contained in string1
• -c complements the characters in string1 with respect to the entire ASCII character set
• -s squeeze all strings of repeated output characters in the last operand to single characters
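The wc and uniq pipelines can be traced on a six-line sample. Because uniq only compares adjacent lines, the last command in this sketch shows why the input is sorted first. The sample letters are arbitrary.

```shell
lines=$(printf 'a\nb\na\nc\nb\na\n')

# Number of distinct lines (tr strips any padding wc adds).
distinct_count=$(echo "$lines" | sort | uniq | wc -l | tr -d ' ')

# Lines that occur more than once, and lines that occur exactly once.
repeated=$(echo "$lines" | sort | uniq -d)
singletons=$(echo "$lines" | sort | uniq -u)

# Without sort, no duplicates are adjacent, so uniq -d reports nothing.
unsorted_dups=$(echo "$lines" | uniq -d)

echo "distinct: $distinct_count"
```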

tr (continued)

• tr reads from standard input.
– Any character that does not match a character in string1 is passed to standard output unchanged
– Any character that does match a character in string1 is translated into the corresponding character in string2 and then passed to standard output
• Examples
– tr s z replaces all instances of s with z
– tr so zx replaces all instances of s with z and o with x
– tr a-z A-Z replaces all lower case characters with upper case characters
– tr -d a-c deletes all a-c characters

tr uses

• Change delimiter

    tr '|' ':'

• Rewrite numbers

    tr ,. .,

• Import DOS files

    tr -d '\r' < dos_file

• Find printable ASCII in a binary file

    tr -cd '\na-zA-Z0-9 ' < binary_file
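A few of these tr uses, run on inline strings. This is a sketch; the ranges are quoted so the shell cannot treat them as filename patterns.

```shell
# Change delimiter.
delimited=$(echo 'a|b|c' | tr '|' ':')

# Lower case to upper case.
upper=$(echo 'hello' | tr 'a-z' 'A-Z')

# Strip the carriage return from a DOS-style line.
no_cr=$(printf 'dos line\r\n' | tr -d '\r')

# Squeeze runs of spaces down to one.
squeezed=$(echo 'a   b' | tr -s ' ')

echo "$delimited $upper"
```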

xargs

• Unix limits the size of arguments and environment that can be passed down to child
• What happens when we have a list of 10,000 files to send to a command?
• xargs solves this problem
– Reads arguments as standard input
– Sends them to commands that take file lists
– May invoke program several times depending on size of arguments
[Figure: the arguments a1 … a300 flow into xargs, which runs cmd a1 a2 …, cmd a100 a101 …, and cmd a200 a201 ….]

find utility and xargs

• find . -type f -print | xargs wc -l
– -type f for files
– -print to print them out
– xargs invokes wc 1 or more times
• wc -l a b c d e f g
wc -l h i j k l m n o
…
• Compare to: find . -type f -exec wc -l {} \;
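The batching can be made visible by letting xargs run `echo` in front of the command, so each invocation prints the command line it would have executed. In this sketch `-n 2` forces a tiny batch size purely for illustration; by default xargs packs as many arguments as the system limit allows.

```shell
# Five file names arrive on standard input; xargs groups them two per invocation.
batches=$(printf '%s\n' a b c d e | xargs -n 2 echo wc -l)

echo "$batches"
```

Each output line corresponds to one separate invocation of the command.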

Next Time
• Regular Expressions
– Allow you to search for text in files
– grep command
• We will soon learn how to write scripts that
use these utilities in interesting ways.
