Performance Analysis of N-Computing Device Under Various Load Conditions
Department of Computer Science and Engineering, Institute of Technology and Management, GIDA, Gorakhpur, India
Abstract: At present, a personal computer has far more processing power than a single-user system requires. Hence, it can be used effectively as a multiuser system that serves several users concurrently. This can be achieved by N-computing. We create a multiuser environment on a uniprocessor system; this can be achieved when there is a separate kernel context for each user within the same operating system. This concept is related to the mapping between user level threads and kernel level threads, which is discussed in this paper. Further, the system's performance under various load conditions is analyzed. In other words, we want to utilize the full processing power of a personal computer for a number of users simultaneously while providing better performance at less cost.

Keywords: Kernel, mapping, n-computing, thread, uniprocessor.
I. Introduction
N-computing is a technology that allows multiple users to share a single computer simultaneously; with N-computing, one ordinary desktop computer can cater to several users at the same time. The effectiveness of parallel computing depends to a great extent on the performance of the primitives that are used to express and control the parallelism within programs. Even a coarse-grained parallel program can exhibit poor performance if the cost of creating and managing parallelism is high, while even a fine-grained program can achieve good performance if that cost is low. Threads are a lighter-weight abstraction than processes; multiple threads share an address space and its resources, and communication can be accomplished through shared data. Kernel level threads are, effectively, processes that share code and data space, whereas user level threads are implemented at the application level. This research divides responsibility for thread management between the kernel and the application. Multithreading has emerged as a leading paradigm for the development of applications with demanding performance requirements. As the number of users increases, so does the number of processes and hence the number of threads, making performance a major issue. Better performance can be achieved through better thread management by the kernel and better mapping of kernel level threads to user level threads, as explained in this paper.
II. Threads
A thread is a light-weight process. The implementation of threads and processes differs from one operating system to another, but in most cases a thread is contained inside a process. Multiple threads can exist within the same process and share resources such as memory, while different processes do not share these resources. On a single processor, multithreading generally occurs by time-division multiplexing (as in multitasking): the processor switches between different threads. This context switching generally happens frequently enough that the user perceives the threads or tasks as running at the same time. On a multiprocessor (including a multi-core system), the threads or tasks actually run at the same time, with each processor or core running a particular thread or task. Many modern operating systems directly support both time-sliced and multiprocessor threading with a process scheduler. The kernel of an operating system allows programmers to manipulate threads via the system call interface; threads implemented at this level are called kernel threads, and a lightweight process (LWP) is a specific type of kernel thread that shares the same state and information. Multithreading has emerged as a leading paradigm for the development of applications with demanding performance requirements. (For real-time systems, a dynamic uniprocessor scheduling algorithm has an O(n log n) worst-case complexity [1].) Generally, threads are kept on shared data structures: a shared run queue for ready threads and shared communication structures for blocked threads. Access to these shared resources is controlled through lock-based mechanisms, which ensure safe access to such critical sections of code [2]. There are two commonly used thread models: kernel level threads and user level threads. Kernel level threads suffer from the cost of frequent user-kernel domain crossings and fixed kernel scheduling priorities. User level threads are not integrated with the kernel, so all threads of a process block whenever one of them blocks. The Scheduler Activations model, proposed by Anderson et al., combines kernel CPU allocation decisions with application control over thread scheduling.
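As an illustration of the lock-based run queue described above, the following is a minimal sketch in C using POSIX threads. The names (struct tcb, rq_push, rq_pop) are illustrative, not taken from any particular system: a shared list holds the ready threads and a mutex guards the critical section.

    /* Minimal sketch of a lock-protected shared run queue. */
    #include <pthread.h>
    #include <stdlib.h>

    struct tcb {                 /* thread control block */
        void (*entry)(void *);   /* thread body */
        void *arg;
        struct tcb *next;
    };

    static struct tcb *ready_head = NULL;
    static pthread_mutex_t rq_lock = PTHREAD_MUTEX_INITIALIZER;

    /* Enqueue a ready thread; the lock ensures safe concurrent access. */
    void rq_push(struct tcb *t)
    {
        pthread_mutex_lock(&rq_lock);
        t->next = ready_head;
        ready_head = t;
        pthread_mutex_unlock(&rq_lock);
    }

    /* Dequeue the next runnable thread, or NULL if the queue is empty. */
    struct tcb *rq_pop(void)
    {
        pthread_mutex_lock(&rq_lock);
        struct tcb *t = ready_head;
        if (t)
            ready_head = t->next;
        pthread_mutex_unlock(&rq_lock);
        return t;
    }

In a real thread system the same lock discipline would protect the communication structures for blocked threads as well.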
Figure 1: User space and kernel space views of processes and threads; the kernel maintains a thread table.
Kernel level threads share some of the disadvantages of processes. Switching between them is slow, taking an order of magnitude more time than a user level thread context switch. They are also scheduled by the kernel, with no application control, which can negatively affect performance. For example, if threads have different priorities and the priorities are not visible to the kernel, a low priority thread may be scheduled in place of a high priority one. User level thread systems control their own scheduling decisions, but because they are not integrated with the kernel, when one thread blocks (e.g., to perform I/O), all of the user level threads sharing the process are blocked. An advantage of a multithreaded program is that it can operate faster on computer systems that have multiple CPUs, CPUs with multiple cores, or a cluster of machines, because the threads of the program naturally lend themselves to truly concurrent execution. In such a case, the programmer must be careful to avoid race conditions and other non-intuitive behaviors. For data to be manipulated correctly, threads often need to rendezvous in time in order to process the data in the correct order. Threads may also require mutually exclusive operations (often implemented using semaphores) to prevent common data from being modified simultaneously, or read while in the process of being modified. Careless use of such primitives can lead to deadlocks. The operating system kernel has complete control over the allocation of processors among address spaces, including the ability to change the number of processors assigned to an application during its execution. To achieve this, the kernel notifies the address space's thread scheduler of every kernel event affecting the address space, allowing the application to have complete knowledge of its scheduling state. For fixed priority scheduling on uniprocessors, under certain conditions, the system's schedulability is maximized when the priorities are chosen in the inverse order of the tasks' deadlines [4]. The thread system in each address space notifies the kernel of the subset of user-level thread operations that can affect processor allocation decisions, preserving good performance for the majority of operations that do not need to be reflected to the kernel [5].
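The claim that a user level context switch is an order of magnitude cheaper than a kernel thread switch rests on the fact that it needs no kernel crossing. A minimal sketch using the POSIX ucontext API (still available on Linux/glibc, though marked obsolescent by POSIX) makes this concrete: swapcontext() saves and restores register state entirely in user space, and the two cooperative "threads" below ping-pong control without a single scheduling-related system call.

    #include <stdio.h>
    #include <stdlib.h>
    #include <ucontext.h>

    static ucontext_t main_ctx, worker_ctx;

    static void worker(void)
    {
        puts("worker: running in user space");
        /* Yield back to main without any system call. */
        swapcontext(&worker_ctx, &main_ctx);
        puts("worker: resumed");
    }

    int main(void)
    {
        char *stack = malloc(64 * 1024);

        getcontext(&worker_ctx);
        worker_ctx.uc_stack.ss_sp = stack;
        worker_ctx.uc_stack.ss_size = 64 * 1024;
        worker_ctx.uc_link = &main_ctx;      /* return here when worker ends */
        makecontext(&worker_ctx, worker, 0);

        swapcontext(&main_ctx, &worker_ctx); /* switch to worker */
        puts("main: worker yielded");
        swapcontext(&main_ctx, &worker_ctx); /* resume worker */
        puts("main: worker finished");
        free(stack);
        return 0;
    }

This is also exactly the mechanism whose weakness the text identifies: if worker() made a blocking system call instead of yielding, the whole process, and every user-level thread in it, would stall.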
Today's operating systems provide kernel threads for parallel applications and multi-threaded servers. Scheduling plays an important role with regard to efficiency and fairness, especially for distributed applications, multimedia processing, and server processes. A multi-threaded application should be able to specify the scheduling strategy for its threads itself. In most modern operating systems the scheduling strategy is hardcoded into the kernel and cannot be changed by the user. There are a few user-level thread packages available in which users can define the scheduling strategy; yet user-level threads are not suitable for applications that interact with the operating system frequently, such as server processes or distributed applications. In this paper we present a concept that allows kernel-thread scheduling to be handled from the user level using hierarchical schedulers. Each application can have one or more of its own schedulers, which can define the application-specific scheduling strategy. Thus, the programmer can implement his own scheduling strategy for his application, or even for subsystems inside the application [6].
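POSIX does expose one narrow hook for application-chosen scheduling of kernel threads: a per-thread policy and priority. The following sketch, assuming a Linux-like system, shows the API; SCHED_FIFO normally requires privileges and is used here only for illustration.

    #include <pthread.h>
    #include <sched.h>
    #include <stdio.h>
    #include <string.h>

    static void *worker(void *arg)
    {
        (void)arg;
        puts("worker running");
        return NULL;
    }

    int main(void)
    {
        pthread_t tid;
        pthread_attr_t attr;
        struct sched_param sp = { .sched_priority = 10 };

        pthread_attr_init(&attr);
        /* Do not inherit the creator's policy; use the one set below. */
        pthread_attr_setinheritsched(&attr, PTHREAD_EXPLICIT_SCHED);
        pthread_attr_setschedpolicy(&attr, SCHED_FIFO);
        pthread_attr_setschedparam(&attr, &sp);

        int rc = pthread_create(&tid, &attr, worker, NULL);
        if (rc != 0)
            fprintf(stderr, "pthread_create: %s (SCHED_FIFO may need privileges)\n",
                    strerror(rc));
        else
            pthread_join(tid, NULL);
        pthread_attr_destroy(&attr);
        return 0;
    }

Note how coarse this is compared with the hierarchical schedulers proposed in [6]: the application can pick among a few fixed kernel policies, but cannot define a strategy of its own.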
III. Thread Scheduling
IV. Mapping Between User Level Threads And Kernel Level Threads
Threads can be supported either at user level or in the kernel, and neither approach has been fully satisfactory. User-level threads are managed by runtime library routines linked into each application, so thread management operations require no kernel intervention. The result can be excellent performance: in such systems, the cost of a user-level thread operation is within an order of magnitude of the cost of a procedure call. User-level threads are also flexible; they can be customized to the needs of the language or user without kernel modification. User-level threads execute within the context of traditional processes; indeed, user-level thread systems are typically built without any modification to the underlying operating system kernel. The thread package views each process as a virtual processor and treats it as a physical processor executing under its control; each virtual processor runs user-level code that pulls threads off the ready list and runs them. In reality, though, these virtual processors are multiplexed across real, physical processors by the underlying kernel. Real-world operating system activity, such as multiprogramming, I/O, and page faults, distorts the equivalence between virtual and physical processors; in the presence of these factors, user-level threads built on top of traditional processes can exhibit poor performance or even incorrect behavior.

Three arrangements are possible. With pure user threads, the kernel is unaware of them; when one thread blocks, the remaining runnable threads of the process get no chance to run. With kernel threads, each thread is scheduled by the kernel and is, in the kernel's view, simply a lightweight process: each such thread is a schedulable entity. With user threads running on top of kernel threads, the kernel is aware of the kernel threads backing the user space, so a user-level thread can make a blocking call while the kernel runs other threads from the same process. However, if a user-level thread blocks, the kernel thread on which it was running also blocks; and if that kernel thread was the only one running on a processor, the processor becomes unusable. One remedy is to create multiple kernel threads and context switch between them whenever a user-level thread blocks, but if all the kernel threads become blocked (because their corresponding user-level threads block), none will be running in the system. The Scheduler Activations approach avoids the need to create multiple kernel threads up front: the operating system kernel provides each user-level thread system with its own virtual multiprocessor, the abstraction of a dedicated physical machine, except that the kernel may change the number of processors in that machine during the execution of the program [5]. There are several aspects to this abstraction. The kernel allocates processors to address spaces and has complete control over how many processors each address space's virtual multiprocessor receives. Each address space's user-level thread system has complete control over which threads to run on its allocated processors, as it would if the application were running on the bare physical machine. The kernel notifies the user-level thread system whenever it changes the number of processors assigned to it; the kernel also notifies the thread system whenever a user-level thread blocks or wakes up in the kernel (e.g., on I/O or on a page fault). The kernel's role is to vector events to the appropriate thread scheduler, rather than to interpret these events on its own.
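The event-vectoring discipline of [5] can be sketched in code. The following is an illustrative simulation, not a real kernel interface: every name is hypothetical, and the "kernel" side is mimicked inside one user-space program, but the division of labor (upcalls for events, downcalls only when processor allocation is affected) is the one described above.

    #include <stdio.h>

    /* --- upcalls: "kernel" -> user-level thread scheduler ------------- */
    static int processors_owned = 0;

    static void upcall_processor_added(int vcpu)
    {
        processors_owned++;
        printf("upcall: processor %d added (now own %d)\n", vcpu, processors_owned);
        /* A real scheduler would pull a ready thread and run it here. */
    }

    static void upcall_thread_blocked(int tid)
    {
        printf("upcall: thread %d blocked in kernel (e.g. I/O)\n", tid);
        /* A real scheduler would dispatch another ready thread. */
    }

    static void upcall_thread_unblocked(int tid)
    {
        printf("upcall: thread %d unblocked, back on ready list\n", tid);
    }

    /* --- downcall: user level -> "kernel", only when allocation matters */
    static void sys_request_processors(int n)
    {
        printf("downcall: request %d more processor(s)\n", n);
        for (int i = 0; i < n; i++)
            upcall_processor_added(100 + i);  /* simulated grant */
    }

    int main(void)
    {
        sys_request_processors(2); /* app has ready threads to run     */
        upcall_thread_blocked(7);  /* kernel vectors a blocking event  */
        upcall_thread_unblocked(7);
        return 0;
    }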
The user-level thread system notifies the kernel when the application needs more or fewer processors, and the kernel uses this information to allocate processors among address spaces. However, the user level notifies the kernel only about the subset of user-level thread operations that might affect processor allocation decisions. As a result, performance is not compromised: the majority of thread operations do not suffer the overhead of communication with the kernel. The application programmer sees no difference, except for performance, from programming directly with kernel threads. There are thus two types of threads to be managed in a modern system: user threads and kernel threads. User threads are supported above the kernel and are managed without kernel support; these are the threads that application programmers put into their programs. Kernel threads are supported within the kernel of the OS itself. All modern operating systems support kernel level threads, allowing the kernel to perform multiple simultaneous tasks and to service multiple kernel system calls simultaneously. In a specific implementation, the user threads must be mapped to kernel threads using one of the following strategies:
Figure 2: Mapping between user level threads and kernel level threads in a multithreaded process; the kernel dispatcher manages run queues of runnable kernel threads.

Many-To-One Model
In the many-to-one model, many user-level threads are all mapped onto a single kernel thread. Thread management is handled by the thread library in user space, which is very efficient. However, if a blocking system call is made, the entire process blocks, even if the other user threads would otherwise be able to continue. Because a single kernel thread can operate only on a single CPU, the many-to-one model does not allow individual processes to be split across multiple CPUs. Green threads, a thread library available for Solaris, and GNU Portable Threads implement the many-to-one model.
One-To-One Model
The one-to-one model creates a separate kernel thread to handle each user thread. It overcomes the problems noted above involving blocking system calls and the splitting of processes across multiple CPUs. However, managing the one-to-one model is more expensive: creating a kernel thread for every user thread adds overhead and can slow down the system, so most implementations of this model place a limit on how many threads can be created. Linux, and Windows from 95 to XP, implement the one-to-one model for threads.
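Under a one-to-one implementation such as Linux's NPTL, each pthread_create() call below produces its own kernel-schedulable entity, so a blocking system call in one thread never stalls the other. A minimal sketch:

    #include <pthread.h>
    #include <stdio.h>
    #include <unistd.h>

    static void *blocker(void *arg)
    {
        (void)arg;
        sleep(1);                 /* blocking system call */
        puts("blocker: woke up");
        return NULL;
    }

    static void *worker(void *arg)
    {
        (void)arg;
        puts("worker: still running while the other thread blocks");
        return NULL;
    }

    int main(void)
    {
        pthread_t a, b;
        pthread_create(&a, NULL, blocker, NULL);
        pthread_create(&b, NULL, worker, NULL);
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        return 0;
    }

Under a many-to-one library, the same program would print nothing from worker() until blocker() returned from its sleep.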
Many-to-Many Model
The many-to-many model multiplexes any number of user threads onto an equal or smaller number of kernel threads, combining the best features of the one-to-one and many-to-one models. Users face no restrictions on the number of threads created; blocking kernel system calls do not block the entire process; and processes can be split across multiple processors. Individual processes may be allocated variable numbers of kernel threads, depending on the number of CPUs present and other factors. One popular variation of the many-to-many model is the two-tier model, which allows either many-to-many or one-to-one operation. IRIX, HP-UX, and Tru64 UNIX use the two-tier model, as did Solaris prior to Solaris 9. Threads are the vehicle for concurrency in many approaches to parallel programming. Threads can be supported either by the operating system kernel or by user-level library code in the application address space, but neither approach has been fully satisfactory. The effectiveness of parallel computing depends to a great extent on the performance of the primitives that are used to express and control the parallelism within programs. Even a coarse-grained parallel program can exhibit poor performance if the cost of creating and managing parallelism is high, while even a fine-grained program can achieve good performance if that cost is low. One way to construct a parallel program is to share memory between a collection of traditional UNIX-like processes, each consisting of a single address space and a single sequential execution stream within that address space [5].
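On two-tier systems, POSIX contention scope selects the mapping per thread: PTHREAD_SCOPE_PROCESS asks for many-to-many (user-level) scheduling, while PTHREAD_SCOPE_SYSTEM asks for one-to-one. A brief sketch of the API; note that Linux, being purely one-to-one, supports only system scope and reports an error for process scope.

    #include <pthread.h>
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        pthread_attr_t attr;
        pthread_attr_init(&attr);

        /* Request M:N scheduling for threads created with this attribute. */
        int rc = pthread_attr_setscope(&attr, PTHREAD_SCOPE_PROCESS);
        if (rc != 0) {
            printf("M:N scope unsupported (%s); falling back to 1:1\n",
                   strerror(rc));
            pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM);
        }
        pthread_attr_destroy(&attr);
        return 0;
    }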
V. Multitasking On Uniprocessor
Multitasking is the ability of a computer to run more than one program, or task, at the same time. Multitasking contrasts with single-tasking, where one process must finish entirely before another can begin.
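The time-division multiplexing described earlier can be made visible on a multiprocessor by restricting the process to a single CPU. The sketch below assumes Linux (sched_setaffinity is Linux-specific): with the whole process pinned to CPU 0, the two busy threads share one processor and their output may interleave as the scheduler switches between them.

    #define _GNU_SOURCE
    #include <pthread.h>
    #include <sched.h>
    #include <stdio.h>

    static void *chatty(void *name)
    {
        for (int i = 0; i < 5; i++)
            printf("%s: slice %d\n", (char *)name, i);
        return NULL;
    }

    int main(void)
    {
        cpu_set_t one;
        CPU_ZERO(&one);
        CPU_SET(0, &one);                        /* CPU 0 only */
        sched_setaffinity(0, sizeof(one), &one); /* emulate a uniprocessor */

        pthread_t a, b;
        pthread_create(&a, NULL, chatty, "A");
        pthread_create(&b, NULL, chatty, "B");
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        return 0;
    }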
Figure 3: Microkernel architecture: a user application issues system calls through the O.S. interface to the microkernel, which forwards them to separate memory, process, and file modules; there is no direct data exchange between the modules.
VI. Conclusion