Linux CT2
1. What is a process
2. Process descriptor
3. Allocating and storing the process descriptor
4. Process states
5. Process context / context switching
6. User preemption
7. Process creation (fork, vfork & clone system calls)
8. Process termination
https://www.informit.com/articles/article.aspx?p=368650
(a five-page article covering all of the above topics)
1. What is a Process?
An instance of a running program is called a process. Every time you run a shell
command, a program is run and a process is created for it. Each process in Linux has a process
ID (PID) and is associated with a particular user and group account.
Linux is a multitasking operating system, which means that multiple programs can be
running at the same time (processes are also known as tasks). Each process has the
illusion that it is the only process on the computer. The tasks share common processing
resources (like CPU and memory).
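Not from the article, but as a minimal user-space sketch of this idea: a process can query its own PID and the user/group it runs as with standard POSIX calls.
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    /* every process has a unique PID and runs on behalf of a user and group */
    printf("pid=%d ppid=%d uid=%d gid=%d\n",
           (int)getpid(), (int)getppid(),
           (int)getuid(), (int)getgid());
    return 0;
}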
2. The Process Descriptor and the Task Structure
The kernel stores the list of processes in a circular doubly linked list called the task list.
Each element in the task list is a process descriptor of type struct task_struct, which
is defined in <linux/sched.h>. The process descriptor contains all the information about a
specific process.
The task_struct is a relatively large data structure, at around 1.7 kilobytes on a 32-bit
machine. This size, however, is quite small considering that the structure contains all the
information that the kernel has and needs about a process. The process descriptor
contains the data that describes the executing program—open files, the process's
address space, pending signals, the process's state, and much more.
3. Allocating the Process Descriptor
In the 2.6 kernel series, a small structure, struct thread_info, was introduced that lives at
the bottom of each task's kernel stack and holds a pointer to the full process descriptor.
The new structure also makes it rather easy to calculate offsets of its values for use in
assembly code.
struct thread_info {
    struct task_struct   *task;          /* main task structure */
    struct exec_domain   *exec_domain;   /* execution domain */
    unsigned long        flags;          /* low-level flags */
    unsigned long        status;         /* thread-synchronous flags */
    __u32                cpu;            /* current CPU */
    __s32                preempt_count;  /* 0 => preemptible, <0 => BUG */
    mm_segment_t         addr_limit;     /* thread address-space limit */
    struct restart_block restart_block;  /* for restarting system calls */
    unsigned long        previous_esp;   /* stack pointer of the previous stack */
    __u8                 supervisor_stack[0]; /* start of the kernel stack */
};
Each task's thread_info structure is allocated at the end of its stack. The task element of
the structure is a pointer to the task's actual task_struct.
On x86, current is calculated by masking out the 13 least significant bits of the stack
pointer to obtain the thread_info structure (13 bits because the kernel stack here is 8KB).
This is done by the current_thread_info() function. The assembly is shown here:
movl $-8192, %eax
andl %esp, %eax
Finally, current dereferences the task member of thread_info to return the task_struct:
current_thread_info()->task;
Contrast this approach with that taken by PowerPC (IBM's modern RISC-based
microprocessor), which stores the current task_struct in a register. Thus, current on PPC
merely returns the value stored in the register r2. PPC can take this approach because,
unlike x86, it has plenty of registers. Because accessing the process descriptor is a
common and important job, the PPC kernel developers deem using a register worthy for
the task.
4. Process States
The state field of the process descriptor describes the current condition of the process.
Each process on the system is in exactly one of five different states.
This value is represented by one of five flags:
TASK_RUNNING—The process is runnable; it is either currently running or on a
runqueue waiting to run.
TASK_INTERRUPTIBLE—The process is sleeping (blocked), waiting for some condition
to exist; it wakes when the condition exists or when it receives a signal.
TASK_UNINTERRUPTIBLE—Identical to TASK_INTERRUPTIBLE, except that it does not
wake up on receiving a signal.
TASK_ZOMBIE—The task has terminated, but its parent has not yet issued a wait4()
system call. The task's process descriptor must remain in case the parent wants to
access it. If the parent calls wait4(), the process descriptor is deallocated.
TASK_STOPPED—Process execution has stopped; the task is not running nor is it
eligible to run, typically after receiving a SIGSTOP, SIGTSTP, SIGTTIN, or SIGTTOU signal.
5. Process Context
System calls and exception handlers are well-defined interfaces into the kernel. A
process can begin executing in kernel-space only through one of these interfaces—all
access to the kernel is through these interfaces.
6. Process Creation
fork(), vfork(), and thread-creation library calls are all implemented on top of the clone()
system call, whose glibc wrapper has this signature:
int clone(int (*fn)(void *), void *stack, int flags, void *arg, ...
          /* pid_t *parent_tid, void *tls, pid_t *child_tid */ );
Let's break down some parts to understand more; a minimal example follows.
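The flag-by-flag breakdown is not in these notes; as a minimal sketch of process creation, here is a plain fork() example (fork() rather than clone(), since its semantics are simpler). For contrast: vfork() behaves like fork() except that the child shares the parent's address space and the parent is suspended until the child calls exec() or exits, while clone() lets the caller pick exactly what is shared via flags such as CLONE_VM and CLONE_FILES.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void)
{
    pid_t pid = fork();               /* duplicate the calling process */
    if (pid < 0) {
        perror("fork");
        exit(EXIT_FAILURE);
    } else if (pid == 0) {
        /* child: gets a copy of the parent's address space */
        printf("child: pid=%d\n", (int)getpid());
        _exit(0);
    } else {
        /* parent: reap the child so it does not linger as a zombie */
        int status;
        waitpid(pid, &status, 0);
        printf("parent: child %d exited\n", (int)pid);
    }
    return 0;
}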
Nice Values
The Linux kernel implements two separate priority ranges. The first is the nice value, a
number from -20 to +19 with a default of 0. Larger nice values correspond to a lower
priority—you are being "nice" to the other processes on the system. Processes with a
lower nice value (higher priority) receive a larger proportion of the system's processor
compared to processes with a higher nice value (lower priority). Nice values are the
standard priority range used in all Unix systems, although different Unix systems apply
them in different ways, reflective of their individual scheduling algorithms. In other Unix-
based systems, such as Mac OS X, the nice value is a control over the absolute
timeslice allotted to a process; in Linux, it is a control over the proportion of
timeslice. You can see a list of the processes on your system and their respective nice
values (under the column marked NI) with the command ps -el.
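A small sketch (standard POSIX calls, not from the article) showing a process reading and then lowering its own priority; note that an unprivileged process can raise its nice value but not lower it back.
#include <stdio.h>
#include <errno.h>
#include <unistd.h>
#include <sys/resource.h>

int main(void)
{
    errno = 0;
    int before = getpriority(PRIO_PROCESS, 0);  /* 0 = calling process */
    int after = nice(5);                        /* add 5 to our nice value */
    if (after == -1 && errno != 0)
        perror("nice");
    else
        printf("nice value: %d -> %d\n", before, after);
    return 0;
}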
RT Priority
The second range is the real-time priority. The values are configurable, but by default
range from 0 to 99, inclusive. Opposite from nice values, higher real-time priority values
correspond to a greater priority. All real-time processes are at a higher priority than
normal processes; that is, the real-time priority and nice value are in disjoint value
spaces. Linux implements real-time priorities in accordance with the relevant Unix
standards, specifically POSIX.1b. All modern Unix systems implement a similar
scheme. You can see a list of the processes on your system and their respective real-
time priority (under the column marked RTPRIO) with the command ps -eo
state,uid,pid,ppid,rtprio,time,comm. A value of "-" means the process is not real-time.
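As a hedged sketch of how a process asks for a real-time priority: sched_setscheduler() is the standard Linux/POSIX API, and running this normally requires root or CAP_SYS_NICE.
#include <stdio.h>
#include <sched.h>

int main(void)
{
    struct sched_param sp = { .sched_priority = 50 };  /* RT range is 1..99 */
    /* pid 0 = calling process; a SCHED_FIFO task runs until it blocks or yields */
    if (sched_setscheduler(0, SCHED_FIFO, &sp) == -1) {
        perror("sched_setscheduler");  /* typically EPERM without privilege */
        return 1;
    }
    printf("now SCHED_FIFO at real-time priority %d\n", sp.sched_priority);
    return 0;
}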
Time Slice
The timeslice is the numeric value that represents how long a task can run until it is
preempted. The scheduler policy must dictate a default timeslice, which is not a trivial
exercise. Too long a timeslice causes the system to have poor interactive performance;
the system will no longer feel as if applications are concurrently executed. Too short a
timeslice causes significant amounts of processor time to be wasted on the overhead of
switching processes, because a significant percentage of the system's time is spent
switching from one process with a short timeslice to the next. Furthermore, the
conflicting goals of I/O-bound versus processor-bound processes again arise: I/O-bound
processes do not need longer timeslices (although they do like to run often), whereas
processor-bound processes crave long timeslices (to keep their caches hot). With this
argument, it would seem that any long timeslice would result in poor interactive
performance. In many operating systems, this observation is taken to heart, and the
default timeslice is rather low—for example, 10 milliseconds. Linux's CFS scheduler,
however, does not directly assign timeslices to processes. Instead, in a novel approach,
CFS assigns processes a proportion of the processor. On Linux, therefore, the amount
of processor time that a process receives is a function of the load of the system. This
assigned proportion is further affected by each process's nice value. The nice value acts
as a weight, changing the proportion of the processor time each process receives.
Processes with higher nice values (a lower priority) receive a deflationary weight,
yielding them a smaller proportion of the processor; processes with smaller nice values
(a higher priority) receive an inflationary weight, netting them a larger proportion of the
processor.
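To make the "nice value as weight" idea concrete, here is a worked sketch. It assumes the well-known CFS rule of thumb that each nice step scales a task's weight by roughly a factor of 1.25 (the kernel uses a precomputed prio_to_weight table to the same effect); a task's share of the CPU is its weight divided by the total weight of all runnable tasks. Compile with -lm.
#include <stdio.h>
#include <math.h>

/* approximate CFS load weight: weight ~= 1024 / 1.25^nice (assumption, see above) */
static double nice_to_weight(int nice)
{
    return 1024.0 / pow(1.25, nice);
}

int main(void)
{
    double w0 = nice_to_weight(0);   /* default priority */
    double w5 = nice_to_weight(5);   /* "nicer", lower priority */
    printf("nice 0 gets %.1f%% of the CPU\n", 100.0 * w0 / (w0 + w5));
    printf("nice 5 gets %.1f%% of the CPU\n", 100.0 * w5 / (w0 + w5));
    return 0;
}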
Locking
The fundamental issue surrounding locking is the need to provide synchronization in
certain code paths in the kernel. These code paths, called critical sections, require some
combination of concurrency or re-entrancy protection and proper ordering with respect to
other events. The typical result without proper locking is called a race condition. Realize
how even a simple i++ is dangerous if i is shared! Consider the case where one
processor reads i, then another, then they both increment it, then they both write i back
to memory. If i were originally 2, it should now be 4, but in fact it would be 3!
This is not to say that the only locking issues arise from SMP (symmetric
multiprocessing). Interrupt handlers create locking issues, as does the new preemptible
kernel, and any code can block (go to sleep). Of these, only SMP is considered true
concurrency, i.e., only with SMP can two things actually occur at the exact same time.
The other situations—interrupt handlers, the preemptible kernel, and blocking methods—
provide pseudo concurrency: code is not actually executed concurrently, but separate
code paths can mangle one another's data.
These critical regions require locking. The Linux kernel provides a family of locking
primitives that developers can use to write safe and efficient code.
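A minimal kernel-style sketch of protecting the i++ critical section with one of those primitives, the spinlock (spin_lock()/spin_unlock() are real kernel APIs; this fragment is illustrative, not a complete module):
#include <linux/spinlock.h>

static DEFINE_SPINLOCK(counter_lock);
static int i;

void safe_increment(void)
{
    spin_lock(&counter_lock);    /* only one CPU may hold the lock at a time */
    i++;                         /* the read-modify-write can no longer interleave */
    spin_unlock(&counter_lock);
}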
Further topics:
1. Priority arrays
2. User-level threads and kernel-level threads
3. What is a shell
4. Types of shells
5. How to execute and run a shell script
6. Shell programming (variables in Linux, rules for naming variables, advanced system variables)
7. Quotes in shell scripts
8. echo options in shell scripts
9. Shell arithmetic
10. Command-line arguments in shell scripts; the bc command; if statement; expr and test statements
11. If-else statement
12. For loop, case statement, user interface
9. Nice Values
Nice values are user-space values that we can use to control the priority of a process.
The nice value range is -20 to +19, where -20 is the highest priority, 0 the default, and
+19 the lowest.
Source:
https://medium.com/@chetaniam/a-brief-guide-to-priority-and-nice-values-in-the-linux-ecosystem-fb39e49815e0
10. RT Priority
Linux real-time scheduling: all tasks with a static priority less than 100 are real-time
tasks. If the highest-priority runnable task is a SCHED_FIFO task, it runs until it
suspends, which prevents all other (lower-priority) tasks from running. If it is a
SCHED_RR task, it runs until its timeslice expires.
RT in the top command: in the top and htop tools, processes (or threads, depending on
display settings) having the highest real-time priority (99 from the userland API point of
view) with either the SCHED_RR or SCHED_FIFO scheduling policy have their priority
displayed as RT.
11. Time Slice
A time slice is the short period of CPU time allocated to each process in a preemptive
multitasking system. Each time the scheduler selects a process, the process runs for at
most its time slice before being preempted.
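Linux exposes the round-robin quantum to user space; a small sketch using the standard sched_rr_get_interval() call (pid 0 means the calling process):
#include <stdio.h>
#include <sched.h>
#include <time.h>

int main(void)
{
    struct timespec ts;
    if (sched_rr_get_interval(0, &ts) == -1) {
        perror("sched_rr_get_interval");
        return 1;
    }
    printf("SCHED_RR timeslice: %ld.%09ld seconds\n",
           (long)ts.tv_sec, ts.tv_nsec);
    return 0;
}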
13. Run Queue
Active processes are placed in an array called a run queue, or runqueue. The run queue
may contain priority values for each process, which will be used by the scheduler to
determine which process to run next. To ensure each program has a fair share of
resources, each one is run for some time period (quantum) before it is paused and
placed back into the run queue. When a program is stopped to let another run, the
program with the highest priority in the run queue is then allowed to execute.
Processes are also removed from the run queue when they ask to sleep, are waiting on a
resource to become available, or have been terminated.
In the Linux operating system (prior to kernel 2.6.23), each CPU in the system is given a
run queue, which maintains both an active and expired array of processes. Each array
contains 140 (one for each priority level) pointers to doubly linked lists, which in turn
reference all processes with the given priority. The scheduler selects the next process
from the active array with highest priority. When a process' quantum expires, it is placed
into the expired array with some priority. When the active array contains no more
processes, the scheduler swaps the active and expired arrays, hence the name O(1)
scheduler.
In UNIX or Linux, the sar command is used to check the run queue.
The vmstat UNIX or Linux command can also be used to determine the number of
processes that are queued to run or waiting to run. These appear in the 'r' column.
There are two models for run queues: one assigns a run queue to each physical
processor; the other has a single run queue shared by the whole system.
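Besides sar and vmstat, the runnable count is visible in /proc. A small sketch reading /proc/loadavg, whose fourth field is runnable/total scheduling entities:
#include <stdio.h>

int main(void)
{
    FILE *f = fopen("/proc/loadavg", "r");
    if (!f) {
        perror("fopen");
        return 1;
    }
    double l1, l5, l15;
    int running, total;
    /* e.g. "0.42 0.30 0.25 2/713 12345" -> 2 runnable of 713 entities */
    if (fscanf(f, "%lf %lf %lf %d/%d", &l1, &l5, &l15, &running, &total) == 5)
        printf("runnable: %d of %d\n", running, total);
    fclose(f);
    return 0;
}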
14. The Priority Arrays
Each runqueue contains two priority arrays, the active and the expired array. Priority arrays
are defined in kernel/sched.c as struct prio_array. Priority arrays are the data
structures that provide O(1) scheduling. Each priority array contains one queue of runnable
processes per priority level. These queues contain lists of the runnable processes at each
priority level. The priority arrays also contain a priority bitmap used to efficiently discover
the highest-priority runnable task in the system.
struct prio_array {
    int              nr_active;           /* number of tasks in the queues */
    unsigned long    bitmap[BITMAP_SIZE]; /* priority bitmap */
    struct list_head queue[MAX_PRIO];     /* priority queues */
};
MAX_PRIO is the number of priority levels on the system. By default, this is 140. Thus, there
is one struct list_head for each priority. BITMAP_SIZE is the size that an array
of unsigned long typed variables would have to be to provide one bit for each valid priority
level. With 140 priorities and 32-bit words, this is five. Thus, bitmap is an array with five
elements and a total of 160 bits.
Each priority array contains a bitmap field that has at least one bit for every priority on the
system. Initially, all the bits are zero. When a task of a given priority becomes runnable
(that is, its state is set to TASK_RUNNING), the corresponding bit in the bitmap is set to one.
For example, if a task with priority seven is runnable, then bit seven is set. Finding the
highest priority task on the system is therefore only a matter of finding the first set bit in
the bitmap. Because the number of priorities is static, the time to complete this search is
constant and unaffected by the number of running processes on the system. Furthermore,
each supported architecture in Linux implements a fast find first set algorithm to quickly
search the bitmap. This method is called sched_find_first_bit(). Many architectures
provide a find-first-set instruction that operates on a given word[4]. On these systems,
finding the first set bit is as trivial as executing this instruction at most a couple of times.
[4] On the x86 architecture, this instruction is called bsfl. On PPC, cntlzw is used for this purpose.
The priority array also contains a counter, nr_active. This is the number of runnable tasks
in this priority array.
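A portable C sketch of the bitmap search (the kernel's sched_find_first_bit() uses architecture-specific instructions such as bsfl instead of a loop; the constants below follow the text's 32-bit example):
#include <stdio.h>

#define MAX_PRIO      140
#define BITS_PER_WORD 32
#define BITMAP_SIZE   ((MAX_PRIO + BITS_PER_WORD - 1) / BITS_PER_WORD)  /* = 5 */

/* index of the first set bit = numerically lowest (i.e. highest) runnable priority */
static int find_first_set(const unsigned long bitmap[BITMAP_SIZE])
{
    for (int word = 0; word < BITMAP_SIZE; word++) {
        if (bitmap[word] == 0)
            continue;
        for (int bit = 0; bit < BITS_PER_WORD; bit++)
            if (bitmap[word] & (1UL << bit))
                return word * BITS_PER_WORD + bit;
    }
    return MAX_PRIO;  /* no runnable task */
}

int main(void)
{
    unsigned long bitmap[BITMAP_SIZE] = { 0 };
    bitmap[0] |= 1UL << 7;   /* a priority-7 task became runnable */
    printf("next priority to run: %d\n", find_first_set(bitmap));
    return 0;
}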
Advantages of User-Level Threads
Some of the advantages of user-level threads are as follows:
User-level threads are easier and faster to create than kernel-level threads. They can
also be more easily managed.
User-level threads can be run on any operating system.
No kernel-mode privileges are required for switching between user-level threads.
Disadvantages of Kernel-Level Threads
Some of the disadvantages of kernel-level threads are as follows:
A mode switch to kernel mode is required to transfer control from one thread to another
within a process.
Kernel-level threads are slower to create and manage than user-level threads.
Difference between User-Level and Kernel-Level Threads
- User-level threads are implemented by user libraries; kernel-level threads are implemented by the OS.
- The OS does not recognize user-level threads; kernel-level threads are recognized by the OS.
- If one user-level thread performs a blocking operation, the entire process is blocked; if one kernel-level thread performs a blocking operation, other threads can continue execution.
- User-level threads are designed as dependent threads; kernel-level threads are designed as independent threads.
- Examples: Java threads and POSIX threads (user level); Windows and Solaris threads (kernel level).
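On Linux, POSIX threads created with pthread_create() are kernel-level (1:1) threads, so one thread blocking in the kernel does not stop its siblings; a minimal sketch (compile with -pthread):
#include <stdio.h>
#include <pthread.h>
#include <unistd.h>

static void *worker(void *arg)
{
    sleep(1);   /* this thread blocks in the kernel... */
    printf("worker %ld done\n", (long)arg);
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, (void *)1L);
    pthread_create(&t2, NULL, worker, (void *)2L);
    printf("main keeps running while the workers block\n");  /* ...the process is not blocked */
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;
}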