Dynamic Scheduler For Multi-Core Processor - Final Report - All 4 Names
Bachelor of Engineering
in Information Technology by
Vidya Pratishthans College of Engineering Baramati 413133, Dist- Pune (M.S.) INDIA April 2012
VPCOE, Baramati
Department of Information Technology
Certificate
This is to certify that the dissertation entitled
Examiner 1:
Examiner 2:
Acknowledgements
Our time at the Department of Information Technology, Vidya Pratishthan's College of Engineering, was highly eventful, and working with such a devoted community of professors remains a most memorable experience of our lives. This acknowledgement is therefore a humble attempt to sincerely thank all those who were directly or indirectly involved in our project and were of immense help to us.
We would personally like to thank Prof. S. A. Takale, HOD of the Information Technology department, who with undying interest reviewed this project report. We take this opportunity to thank our respected project guide, Prof. Dinesh A. Zende, for his generous assistance. Lastly, we would like to thank our Principal, Dr. S. B. Deosarkar, who created a healthy environment for all of us to learn in the best possible way.
Abstract
Many dynamic scheduling algorithms have been proposed in the past. With the advent of multi-core processors, there is a need to schedule multiple tasks on multiple cores, and the scheduling algorithm must utilize all the available cores efficiently. Multicore processors may be SMPs or AMPs with a shared memory architecture. In this work, we propose a dynamic scheduling algorithm in which the scheduler resides on all cores of a multi-core processor and accesses a shared Task Data Structure (TDS) to pick up ready-to-execute tasks. This method is unique in the sense that the processor has the onus of picking up tasks whenever it is idle. We discuss the proposed scheduling algorithm using a set of tasks as an example.
High performance on multicore processors also requires that schedulers be reinvented. Traditional schedulers focus on keeping execution units busy by assigning each core a thread to run. Schedulers ought to focus, however, on high utilization of on-chip memory, rather than of execution cores, to reduce the impact of expensive DRAM and remote cache accesses. A challenge in achieving good use of on-chip memory is that the memory is split up among the cores in the form of many small caches. This motivates scheduling that assigns each object and its operations to a specific core, moving a thread among the cores as it uses different objects.
Contents
Acknowledgements
Abstract
Keywords
Notation and Abbreviations
1 Introduction
  1.1 Introduction
  1.2 Motivation
  1.3 Related Theory
2 Literature Survey
  2.1 Need of the Topic
3 Proposed Work
  3.1 Problem Definition
  3.2 Project Scope
  3.3 Project Objectives
  3.4 Project Constraints
4 Research Methodology
5 Project Design
  5.1 Hardware Requirements
  5.2 Software Requirements
  5.3 Risk Analysis
  5.4 Data Flow Diagrams
  5.5 Project Schedules
  5.6 UML Documentation
6 System Implementations
  6.1 Important Functions
  6.2 Important Algorithms
  6.3 Important Data Structure
List of Tables
5.1 Schedule
7.1 Test Case
8.1 Dependency Table
List of Figures
5.1 Editing the GRUB 2 Menu During Boot
5.2 DFD
5.3 Gantt Chart
5.4 Use Case
5.5 Flow Chart
5.6 Sequence Diagram
Keywords
Dynamic scheduler; multi-core systems; load balancing; work load distribution; affinity scheduling; thread migration; thread scheduling
Chapter 1
Introduction
1.1 Introduction
Multi-core processors have two or more processing elements, or cores, on a single chip. These cores may share a similar architecture (symmetric multiprocessing, SMP) or differ in architecture (asymmetric multiprocessing, AMP). All the cores use a shared memory architecture. Multicore processors existed previously in the form of MPSoCs (Multi-Processor Systems on Chip), but those were limited to a segment of applications such as networking. The easy availability of multicore hardware has forced software programmers to change the way they think about and write their applications. Unfortunately, most applications written so far are sequential in nature. We can extract the inherent parallelism in such applications to exploit the available multi-core architecture. However, converting sequential code to parallel code, or writing parallel applications from scratch, may not alone solve the problem optimally. There is a definite need for scheduling algorithms suited to shared memory architectures to increase the efficiency of multi-core processors in the presence of multiple tasks within an application. Most of the scheduling algorithms proposed for multi-core processors concentrate on scheduling tasks that are independent of each other; that is, the execution of one task neither affects nor depends on the result of other tasks, so the tasks may execute concurrently. To utilize multi-core processors more efficiently for embedded applications, where only a single application executes at any time, the application should be divided into
subtasks. This demands a scheduling algorithm efficient enough to exploit the multicore architecture and achieve an optimal schedule in terms of execution time and processor utilization.
1.2 Motivation
With the emergence of multicore chips, future distributed shared memory (DSM) systems will have less powerful processor cores, but tens of thousands of them. Performance asymmetry in multicore platforms is another trend, due to budget issues such as power consumption and area limitations as well as varying degrees of parallelism in different applications [5]. We call such a system a heterogeneous manycore DSM system. Processor cores belonging to the same level (e.g., the same chip or board) frequently share memory resources; for instance, cores on the same chip may share an L2 or L3 cache. The shared-memory programming model is capable of attaining the benefits of large-scale parallel computing without surrendering much programmability [8]. Using the shared-memory model, a program can be written as if it were running on a large processor-count SMP machine. From the perspective of application developers, all processors provide identical performance, and the memory access time from each processor is uniform. This model has been widely accepted and used for a long time. If we now compare the real architecture with the developer's vision of it, there is a big gap between them, and a number of long-standing assumptions are broken. Instead of a uniform memory access time, there are various memory latencies; the immediate result is that placing threads on arbitrary processors may lead to suboptimal performance when threads access data in common. Heterogeneous cores provide different compute powers, yet developers should still be able to write portable programs regardless of the machine. When the number of user-level threads is greater than the number of kernel threads, affinity-based thread scheduling must be taken into account to maximize program locality.
If a number of cores share a certain level of cache, problems may arise due to resource contention. We hope to find a method to reschedule threads that closes the above gap and improves multithreaded program performance. The scheduling method should be automatic and applicable to a variety of general-purpose programs. Another issue is that multicore chips consist of relatively simple processor cores and will be underutilized if user programs cannot provide sufficient thread-level parallelism. It is the developer's responsibility to write high-performance parallel software that fully utilizes the processor cores. To achieve high performance, we believe that new parallel multicore software should have the following two characteristics. Fine-grain threads: we need a high degree of parallelism to keep every processor core busy; moreover, a core often has a small cache or scratch buffer to work with, which requires developers to decompose a task into smaller tasks. Asynchronous program execution: when there are many processor cores, the presence of a synchronization point can seriously affect program performance, and eliminating unnecessary synchronization points increases the degree of parallelism accordingly. Therefore, we want to adapt the current scheduling approach to design a new dynamic scheduler for multicore architectures. The dynamic scheduling approach places fine-grain computational tasks in a directed acyclic graph (DAG) and schedules them dynamically depending on data dependence, program locality, and the critical path.
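The DAG-driven idea described above can be made concrete with a small dependency-counting structure. The following C sketch is ours, not the report's implementation, and all names (task_t, tds_pick_ready, tds_complete) are illustrative: each task tracks its number of unfinished predecessors, and completing a task releases its successors.

```c
#include <assert.h>

#define MAX_TASKS 16

/* Hypothetical sketch of dynamic DAG scheduling: a task becomes ready
 * once all of its predecessor tasks have finished. */
typedef struct {
    int deps;            /* number of unfinished predecessor tasks */
    int succ[MAX_TASKS]; /* indices of successor tasks */
    int nsucc;
    int done;
} task_t;

typedef struct {
    task_t tasks[MAX_TASKS];
    int ntasks;
} tds_t; /* stands in for the shared Task Data Structure (TDS) */

/* Return the index of any ready-to-execute task, or -1 if none is ready. */
int tds_pick_ready(tds_t *t)
{
    for (int i = 0; i < t->ntasks; i++)
        if (!t->tasks[i].done && t->tasks[i].deps == 0)
            return i;
    return -1;
}

/* Mark task i complete and decrement the dependency count of its
 * successors, possibly making them ready. */
void tds_complete(tds_t *t, int i)
{
    t->tasks[i].done = 1;
    for (int s = 0; s < t->tasks[i].nsucc; s++)
        t->tasks[t->tasks[i].succ[s]].deps--;
}
```

A real scheduler would additionally weigh program locality and the critical path when several tasks are ready, as the text describes.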
1.3 Related Theory
Since IBM released the Power4 (dual core) in 2001 and Sun Microsystems released the UltraSPARC T1 (eight cores) in 2005, a great number of multicore chips have been implemented by various vendors [6]. Traditional microarchitectures typically rely on increasing the complexity of logic, wiring, and design to find more Instruction-Level
Parallelism (ILP), such as out-of-order and speculative instruction execution within a sequential program. This trend cannot continue, due to the diminishing returns for large increases in complexity and the exponentially rising processor clock rates [7]. Compared to traditional microarchitectures, multicore chips have a simpler design, a higher performance-to-area ratio, and better power efficiency. Thus, hardware architects have changed course to rely on multicore architectures. Multicore (or manycore) processors with hundreds of processing cores on a single die are imminent in the near future. Both shared-memory and distributed-memory platforms may consist of multicore systems, and there are two programming models for developing parallel programs: the shared-memory programming model and the distributed-memory programming model. This dissertation first depicts what a future shared-memory multicore machine will look like and then proposes a static scheduling method to improve program performance. Next, it studies how to use the dynamic directed acyclic graph (DAG) scheduling approach to develop new parallel software for both shared-memory and distributed-memory multicore systems.
Chapter 2
Literature Survey
Many researchers have proposed various dynamic scheduling techniques over the past few years. This section gives an overview of some of the prominent work in this area. Megel et al. discuss an improvement over the optimal finish time (OFT) algorithm for reducing pre-emption in embedded real-time applications. Kurzak and Dongarra propose a data-flow based scheduler and discuss provisions for data reuse; their scheduling algorithm is intended for numerical-computation based applications, and the proposed method relies on data dependency analysis between tasks in a sequential representation of programs. Jooya et al.'s method, based on recording application resource utilization and throughput to adaptively change the cores assigned to applications at runtime, is applicable only to heterogeneous multi-core processors; it also saves power by downgrading applications with low resource utilization to weaker cores. Manikandan Baskaran et al. discuss a compile-time technique that dynamically extracts inter-tile dependencies and schedules the parallel tiles on the cores for improved scalability on multicores; the approach is applicable to multi-statement input programs with statements of different dimensionalities. Ali presents a framework for expressing, evaluating, and dynamically executing schedules for FFT computations on hierarchical and shared-memory multicore architectures; it describes an FFT schedule specification language used to generate one-dimensional serial, multi-dimensional serial, and parallel FFT schedules. Wang presents a scheduler modeled to accept and analyze a task graph and then put the tasks in a scheduling queue; the algorithm involves prediction according to the history record of task scheduling, and it rearranges a long task into smaller subtasks to form another task state graph before scheduling them in parallel. Blagojevic et al. examine user-level schedulers that dynamically right-size the dimensions and degrees of parallelism on the Cell Broadband Engine, and they mention a new method that uses sampling of dominant execution phases to converge to the optimal scheduling algorithm.
2.1 Need of the Topic
The aim is to utilize multiple cores of the system and to investigate how to effectively schedule threads to improve program performance on multicore architectures. This work formulates the affinity-based thread scheduling problem on shared-memory multicore systems and proposes a static feedback-directed approach to computing optimized thread schedules that improve effectiveness on every level of a complex memory hierarchy while keeping load balance. The dissertation also studies the dynamic data-availability driven scheduling approach for fine-grain parallel programs and demonstrates the scalability and practicality of the approach on both shared-memory and distributed-memory multicore systems.
Chapter 3
Proposed Work
3.1 Problem Definition
We propose to develop a dynamic scheduler for multicore processor systems: a dynamic scheduling algorithm in which the scheduler resides on all cores of a multi-core processor and accesses a shared Task Data Structure (TDS) to pick up ready-to-execute tasks.
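As a rough sketch of this idea (our own simplified code, not the project's): every core runs the same scheduler loop, and whenever a core is idle it locks the shared TDS and pulls the next ready task itself. Here the TDS is reduced to a mutex-protected index over a fixed task list, and all names are illustrative.

```c
#include <assert.h>
#include <pthread.h>

#define NTASKS 100
#define MAX_CORES 8

/* Simplified shared Task Data Structure: a lock plus the next
 * unclaimed task index. Layout and names are hypothetical. */
typedef struct {
    pthread_mutex_t lock;
    int next;          /* next unclaimed task index */
    int done[NTASKS];  /* set once a core has "executed" the task */
} shared_tds;

/* The per-core scheduler loop: an idle core locks the TDS, claims the
 * next ready task, and executes it. */
static void *core_loop(void *arg)
{
    shared_tds *tds = (shared_tds *)arg;
    for (;;) {
        pthread_mutex_lock(&tds->lock);
        int i = (tds->next < NTASKS) ? tds->next++ : -1;
        pthread_mutex_unlock(&tds->lock);
        if (i < 0)
            break;        /* nothing ready: this core would idle here */
        tds->done[i] = 1; /* stand-in for executing task i */
    }
    return NULL;
}

/* Run `cores` scheduler loops concurrently; returns tasks executed. */
int run_scheduler(int cores)
{
    static shared_tds tds;
    pthread_t th[MAX_CORES];
    pthread_mutex_init(&tds.lock, NULL);
    tds.next = 0;
    for (int i = 0; i < NTASKS; i++)
        tds.done[i] = 0;
    for (int c = 0; c < cores && c < MAX_CORES; c++)
        pthread_create(&th[c], NULL, core_loop, &tds);
    for (int c = 0; c < cores && c < MAX_CORES; c++)
        pthread_join(th[c], NULL);
    int n = 0;
    for (int i = 0; i < NTASKS; i++)
        n += tds.done[i];
    return n;
}
```

Note the onus is on the (simulated) core, not on a central dispatcher: no thread ever pushes work to another; each pulls work when idle.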
3.2 Project Scope
Until now, many applications have been developed that run efficiently on single-core processor systems but not on multicore systems; that is, these applications do not utilize all cores of the processor equally. Generally an application runs on the first core while the other cores go unutilized and the first core is burdened with all the assigned tasks.
3.3 Project Objectives
Multi-core processors have two or more processing elements, or cores, on a single chip. These cores may share a similar architecture (symmetric multiprocessing, SMP) or differ in architecture (asymmetric multiprocessing, AMP). All the cores use a shared memory architecture. Multicore processors existed
previously in the form of MPSoCs (Multi-Processor Systems on Chip), but those were limited to a segment of applications such as networking. The easy availability of multicore hardware has forced software programmers to change the way they think about and write their applications. Unfortunately, most applications written so far are sequential in nature. We can extract the inherent parallelism in such applications to exploit the available multi-core architecture. However, converting sequential code to parallel code, or writing parallel applications from scratch, may not alone solve the problem optimally. There is a definite need for scheduling algorithms suited to shared memory architectures to increase the efficiency of multi-core processors in the presence of multiple tasks within an application. Most of the scheduling algorithms proposed for multi-core processors concentrate on scheduling tasks that are independent of each other; that is, the execution of one task neither affects nor depends on the result of other tasks, so the tasks may execute concurrently. To utilize multi-core processors more efficiently for embedded applications, where only a single application executes at any time, the application should be divided into subtasks. This demands a scheduling algorithm efficient enough to exploit the multicore architecture and achieve an optimal schedule in terms of execution time and processor utilization.
3.4 Project Constraints
The dynamic multicore scheduler exceeds performance expectations in some workloads on multicore systems, but it still shows weaknesses in others; in particular, it can be unresponsive in 3D games. In the currently implemented dynamic scheduling policy we cannot handle deadlocks that occur while scheduling tasks and balancing them across multiple cores.
Chapter 4
Research Methodology
With the emergence of multicore chips, future distributed shared memory (DSM) systems will have less powerful processor cores, but tens of thousands of them. Performance asymmetry in multicore platforms is another trend, due to budget issues such as power consumption and area limitations as well as varying degrees of parallelism in different applications [Balakrishnan et al., 2005; Kumar et al., 2004; Kumar et al., 2006]. We call such a system a heterogeneous manycore DSM system. Processor cores belonging to the same level (e.g., the same chip or board) frequently share memory resources; for instance, cores on the same chip may share an L2 or L3 cache. The shared-memory programming model is capable of attaining the benefits of large-scale parallel computing without surrendering much programmability [Lu et al., 1995]. Using the shared-memory model, a program can be written as if it were running on a large processor-count SMP machine. From the perspective of application developers, all processors provide identical performance, and the memory access time from each processor is uniform. This model has been widely accepted and used for a long time. If we now compare the real architecture with the developer's vision of it, there is a big gap between them, and a number of long-standing assumptions are broken. We hope to find a method to reschedule threads that closes the above gap and improves multithreaded program performance. The scheduling method should be automatic and applicable to a variety of general-purpose programs. Another issue is that multicore chips consist of relatively simple processor cores and will be underutilized if user
programs cannot provide sufficient thread-level parallelism. It is the developer's responsibility to write high-performance parallel software that fully utilizes the processor cores. To achieve high performance, we believe that new parallel multicore software should have the following two characteristics:
1. Fine-grain threads. We need a high degree of parallelism to keep every processor core busy. Moreover, a core often has a small cache or scratch buffer to work with, which requires developers to decompose a task into smaller tasks.
2. Asynchronous program execution. When there are many processor cores, the presence of a synchronization point can seriously affect program performance, and eliminating unnecessary synchronization points increases the degree of parallelism accordingly.
Therefore, we want to adapt the current scheduling approach to design a new dynamic scheduler for multicore architectures. The dynamic scheduling approach places fine-grain computational tasks in a directed acyclic graph and schedules them dynamically depending on data dependence, program locality, and the critical path. The most significant change in the 2.6 Linux kernel that improved scalability on multiprocessor systems was in the kernel process scheduler. The design of the Linux 2.6 scheduler is based on per-CPU runqueues and priority arrays, which allow the scheduler to perform its tasks in O(1) time. This mechanism solved many scalability issues, but the scheduler still did not perform as expected on Hyper-Threaded systems and on higher-end NUMA systems. In the case of Hyper-Threading, more than one logical CPU shares the processor resources, cache, and memory hierarchy; in the case of NUMA, different nodes have different access latencies to memory. These non-uniform relationships between the CPUs in the system pose a significant challenge to the scheduler, which must be aware of these differences and distribute load accordingly. To address this, the 2.6 Linux kernel scheduler introduced a concept called scheduling domains [SD]. The 2.6 kernel uses hierarchical scheduler domains constructed dynamically depending on the CPU topology of the system. Each scheduler domain contains a list of scheduler groups having a common property. The load balancer runs at each domain
level, and scheduling decisions happen between the scheduling groups in that domain. On a high-end NUMA system with processors capable of Hyper-Threading, there will be three scheduling domains, one each for HT, SMP, and NUMA. In the presence of Hyper-Threading, when the system has fewer tasks than logical CPUs, the scheduler must distribute the load uniformly between the physical packages. This avoids scenarios in which one physical package has more than one logical CPU busy while another physical package is completely idle. Uniform load distribution between physical packages leads to lower resource contention and higher throughput, and the Hyper-Threading scheduling domain helps the scheduler achieve this equal distribution. Similarly, the NUMA scheduling domain helps avoid unnecessary task migration from one node to another, ensuring that tasks stay most of the time in their home node (where the task has allocated most of its memory).
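The O(1) pick that per-CPU runqueues and priority arrays enable can be illustrated with a bitmap. This is a simplified sketch of ours, not kernel code: 32 priority levels instead of the kernel's 140, and plain counters standing in for the per-priority task lists. Choosing the next task is a find-first-set on the bitmap rather than a scan of all queued tasks.

```c
#include <assert.h>

#define NPRIO 32

typedef struct {
    unsigned int bitmap; /* bit p set => some task queued at priority p */
    int count[NPRIO];    /* tasks queued per priority (stand-in for lists) */
} prio_array_t;

/* Queue a task at the given priority (0 = highest). */
void enqueue(prio_array_t *a, int prio)
{
    a->count[prio]++;
    a->bitmap |= 1u << prio;
}

/* Dequeue from the highest-priority non-empty level in O(1);
 * returns that priority, or -1 if the array is empty. */
int pick_next_prio(prio_array_t *a)
{
    if (!a->bitmap)
        return -1;
    int p = __builtin_ctz(a->bitmap); /* lowest set bit = best priority */
    if (--a->count[p] == 0)
        a->bitmap &= ~(1u << p);
    return p;
}
```

(`__builtin_ctz` is a GCC/Clang builtin; the kernel uses its own find-first-set primitives.)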
Chapter 5
Project Design
5.1 Hardware Requirements
5.2 Software Requirements
Operating System: Ubuntu xx.xx.xx (any Linux OS)
Application Software:
1. HackBench
2. GCC Compiler
3. GTK
4. GEdit
5. Latest Kernel
5.3 Risk Analysis
While developing and installing a kernel with the dynamic scheduler on the current Linux operating system, a number of problems occur, but they can be solved.
How To Enable the Root User (Super User) in Ubuntu
By default, the root account password is locked in Ubuntu. While compiling the new kernel, the default Linux directory containing all the .o files is created in the /usr/src directory, but ordinary user accounts do not have permission to it. So, when you do su -, you'll get an "Authentication failure" error message as shown below.
$ su -
Password:
su: Authentication failure
First, unlock the root user and set a password for it as shown below.
$ sudo passwd root
[sudo] password for project:
Enter new UNIX password:
Retype new UNIX password:
passwd: password updated successfully
How Do I Update Ubuntu Linux Software?
In a newly installed Linux OS there is no guarantee that all necessary packages are present; installing a new kernel requires special packages such as ncurses, gtk, qtk, gcc, make, and yum, so these need to be added before starting the actual task. This can be done by updating and upgrading the system, either with GUI tools or with traditional command-line tools such as apt-get:
apt-get update: resynchronize the package index files from their sources via the Internet.
apt-get upgrade: install the newest versions of all packages currently installed on the system.
apt-get install package-name: install is followed by one or more packages desired for installation; if a package is already installed, apt-get will try to update it to the latest version. Note that update and upgrade are apt-get sub-commands, not package names, so passing them to install fails:
$ sudo apt-get install update && sudo apt-get install upgrade
Reading package lists... Done
Building dependency tree
Reading state information... Done
E: Unable to locate package update
If all the packages update successfully, then the Ubuntu kernel compilation will proceed easily.
Editing the GRUB 2 Menu During Boot
After completing the task we need to reboot the system to see the new kernel, but sometimes problems occur while loading the system, due to problems in the menu.lst file. This can be solved by editing the file at boot time using the following steps. If the menu is displayed, the automatic countdown may be stopped by pressing any key other than the ENTER key. If the menu is not normally displayed during boot, hold down the SHIFT key as the computer attempts to boot; in certain circumstances, if holding the SHIFT key does not display the menu, pressing the ESC key repeatedly may display it. The user can then edit entries in the GRUB 2 menu as follows: with the menu displayed, press any key (except ENTER) to halt the countdown timer, select the desired entry with the up/down arrow keys, and press the e key to reveal the selection's settings.
Use the keyboard to position the cursor. In this example, the cursor has been moved so the user can change or delete the numeral 9. Make one or many changes to any line, but do not use ENTER to move between lines. Tab completion is available, which is especially useful when entering kernel and initrd entries. When complete, determine the next step: CTRL-X boots with the changed settings (highlighted for emphasis); C goes to the command line to perform diagnostics, load modules, change settings, etc.; ESC discards all changes and returns to the main menu. The choices are listed at the bottom of the screen as a reminder. Edits made to the menu in this manner are non-persistent: they remain in effect only for the current boot and must be re-entered on the next boot.
Once successfully booted, the changes can be made permanent by editing the appropriate file, saving it, and running update-grub as root.
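For instance (a hypothetical excerpt for a stock Ubuntu layout; the file location and keys may differ on other setups), a kernel command-line change can be made permanent via /etc/default/grub:

```
# /etc/default/grub (excerpt)
GRUB_DEFAULT=0
GRUB_TIMEOUT=5
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"

# After saving the file, regenerate the menu as root:
#   update-grub
```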
5.4 Data Flow Diagrams
5.5 Project Schedules
Table 5.1: Schedule
#      Task                                                        Days  Start        End          Assigned to
T1     Beginning Phase: track out why this project is needed       12    07-10-2011   21-10-2011
T1.1   Establish Project Scope                                     1     07-10-2011   07-10-2011
T1.2   Establish Project Scope                                     1     10-10-2011   10-10-2011
T1.3   Create Test Plan                                            1     11-10-2011   11-10-2011
T1.4   Create Manufacturing Plan                                   1     12-10-2011   12-10-2011
T1.5   Establish Engineering Requirements                          1     13-10-2011   13-10-2011
T1.6   Establish Communications                                    1     14-10-2011   14-10-2011
T1.7   Establish Project Goals                                     1     17-10-2011   17-10-2011
T1.8   Staff Project                                               1     18-10-2011   18-10-2011
T1.9   Establish Training Requirements                             1     19-10-2011   19-10-2011
T1.10  Establish Engineering Requirements                          1     20-10-2011   20-10-2011
T1.11  Establish Communications                                    1     21-10-2011   21-10-2011
T2     Analysis Phase: analyse the various constraints involved
       in project development                                      11    31-10-2011   14-11-2011
T2.1   Develop Project Specifications                              5     31-10-2011   04-11-2011   Sachin Janani, Abbas Baramatiwala
T2.2   Develop Initial Documentation                               6
T2.3   Conduct User Training                                       1
T2.4   Create Manufacturing Plan                                   4
T2.5   Create Marketing Plan                                       5
T3     Design Phase: actual designing of the project takes place   14
T3.1   Develop Prototype                                           14
T4     Estimating Phase: estimate effort, time, cost, etc.         12
T4.2   Estimate Costs, Savings and/or Revenues                     0     11-11-2011   11-11-2011
T5     Coding Phase: actual development of the project             50    14-11-2011   10-01-2012   Sachin Janani, Abbas Baramatiwala, Vaijnath Jadhav, Balaji Ankamwar
T5.1   Calculate the independent threads in each process,
       for scheduling                                              50    14-11-2011   10-01-2012
T5.2   Complete Open Items                                         1     15-11-2011   16-11-2011
T5.3   Run Performance Tests                                       1
T5.4   Develop Prototype                                           1
T6     Debugging Phase: the bugs in the project are removed        46
T6.1   Finalize Testing                                            1
T6.2   Correct Problems                                            1
T6.3   Conduct Alpha Testing                                       43
T7     Maintenance Phase: after alpha and beta testing, actual
       maintenance of the software takes place                           20-02-2012   23-02-2012
T7.2   Evaluate Systems                                            2     20-02-2012   21-02-2012
T7.3   Conduct Beta Testing                                        3     21-02-2012   23-02-2012
5.6 UML Documentation
Chapter 6
System Implementations
6.1 Important Functions
As the scheduler implemented is for multi-core processors, it should also remain compatible with unicore processors, so we have to place the scheduler code inside a conditional compilation block, i.e.:
#ifdef CONFIG_SMP
-----
------
#endif
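A minimal, self-contained illustration of this pattern (our own example, not the project's code): the SMP-only path compiles in only when CONFIG_SMP is defined, so the same source still builds for a unicore processor.

```c
#include <assert.h>

#define CONFIG_SMP 1 /* comment this out to build the unicore variant */

/* Returns how many runqueues this (hypothetical) scheduler build manages. */
int runqueue_count(void)
{
#ifdef CONFIG_SMP
    return 4; /* e.g., one per-core runqueue on a quad-core */
#else
    return 1; /* single runqueue on a unicore build */
#endif
}
```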
1. void wait_task_inactive(task_t * p)
{
    unsigned long flags;
    runqueue_t *rq;
repeat:
    rq = task_rq(p);
    while (unlikely(rq->curr == p)) {
        cpu_relax();
        barrier();
    }
    rq = lock_task_rq(p, &flags);
    if (unlikely(rq->curr == p)) {
        unlock_task_rq(rq, &flags);
        goto repeat;
    }
    unlock_task_rq(rq, &flags);
}
This function is generally used for SMP or multicore scheduling. It waits for a process to unschedule and is used by the exit() and ptrace() code.
2. static int try_to_wake_up(task_t * p, int synchronous)
{
    unsigned long flags;
    int success = 0;
    runqueue_t *rq;

    rq = lock_task_rq(p, &flags);
    p->state = TASK_RUNNING;
    if (!p->array) {
        activate_task(p, rq);
        if ((rq->curr == rq->idle) || (p->prio < rq->curr->prio))
            resched_task(rq->curr);
        success = 1;
    }
    unlock_task_rq(rq, &flags);
    return success;
}
This function wakes up a process: it puts the process on the runqueue if it is not already there. The current process is always on the runqueue (except when the actual reschedule is in progress), and as such you're allowed to do the simpler current->state = TASK_RUNNING to mark yourself runnable without the overhead of this function.
3. int wake_up_process(task_t * p)
{
    return try_to_wake_up(p, 0);
}
This function simply calls try_to_wake_up() described above.
4. void sched_task_migrated(task_t *new_task)
{
    wait_task_inactive(new_task);
    new_task->cpu = smp_processor_id();
    wake_up_process(new_task);
}
This function is generally used by the SMP message-passing code whenever a new task arrives at the target CPU: we move the new task into the local runqueue, and this function performs the migration. It must be called with interrupts disabled. The function works as
follows: a) The new task first waits for the old task to unschedule, via wait_task_inactive(), which was explained above. b) A CPU is assigned to the new task using the statement new_task->cpu = smp_processor_id(). c) After assigning the CPU, the new process is woken up using wake_up_process(new_task).
5. void kick_if_running(task_t *p)
{
    if (p == task_rq(p)->curr)
        resched_task(p);
}

This function kicks a remote CPU if the task is currently running there. It is used by the signal code to signal user-mode tasks as quickly as possible. (Note that this is done locklessly: if the task does anything while the message is in flight, it will notice the sigpending condition anyway.)
6. static inline unsigned int double_lock_balance(runqueue_t *this_rq,
        runqueue_t *busiest, int this_cpu, int idle, unsigned int nr_running)
{
    if (unlikely(!spin_trylock(&busiest->lock))) {
        if (busiest < this_rq) {
            spin_unlock(&this_rq->lock);
            spin_lock(&busiest->lock);
            spin_lock(&this_rq->lock);
            /* Need to recalculate nr_running */
            if (idle || (this_rq->nr_running > this_rq->prev_nr_running[this_cpu]))
                nr_running = this_rq->nr_running;
            else
                nr_running = this_rq->prev_nr_running[this_cpu];
        } else
            spin_lock(&busiest->lock);
    }
    return nr_running;
}

This function locks the busiest runqueue as well; this_rq is locked already. nr_running is recalculated in case the runqueue lock had to be dropped.
7. static void load_balance(runqueue_t *this_rq, int idle)
{
    int imbalance, nr_running, load, max_load, idx, i,
        this_cpu = smp_processor_id();
    task_t *next = this_rq->idle, *tmp;
    runqueue_t *busiest, *rq_src;
    prio_array_t *array;
    list_t *head, *curr;

    /*
     * We search all runqueues to find the most busy one.
     * We do this lockless to reduce cache-bouncing overhead,
     * and we re-check the best source CPU later on again, with
     * the lock held.
     *
     * We fend off statistical fluctuations in runqueue lengths by
     * saving the runqueue length during the previous load-balancing
     * operation and using the smaller one of the current and saved
     * lengths. If a runqueue is long enough for a longer amount of
     * time then we recognize it and pull tasks from it.
     *
     * The current runqueue length is a statistical maximum variable,
     * for that one we take the longer one - to avoid fluctuations in
     * the other direction. So for a load-balance to happen it needs
     * a stable long runqueue on the target CPU and a stable short
     * runqueue on the local CPU.
     *
     * We make an exception if this CPU is about to become idle - in
     * that case we are less picky about moving a task across CPUs and
     * take what can be taken.
     */
    if (idle || (this_rq->nr_running > this_rq->prev_nr_running[this_cpu]))
        nr_running = this_rq->nr_running;
    else
        nr_running = this_rq->prev_nr_running[this_cpu];

    busiest = NULL;
    max_load = 1;
    for (i = 0; i < smp_num_cpus; i++) {
        rq_src = cpu_rq(cpu_logical_map(i));
        if (idle || (rq_src->nr_running < this_rq->prev_nr_running[i]))
            load = rq_src->nr_running;
        else
            load = this_rq->prev_nr_running[i];
        this_rq->prev_nr_running[i] = rq_src->nr_running;
    /*
     * It needs at least ~25% imbalance to trigger balancing.
     */
    if (!idle && (imbalance < (max_load + 3) / 4))
        return;

    nr_running = double_lock_balance(this_rq, busiest, this_cpu, idle, nr_running);

    /*
     * Make sure nothing changed since we checked the
     * runqueue length.
     */
    if (busiest->nr_running <= this_rq->nr_running + 1)
        goto out_unlock;

    /*
     * We first consider expired tasks. Those will likely not be
     * executed in the near future, and they are most likely to
     * be cache-cold, thus switching CPUs has the least effect
     * on them.
     */
    if (busiest->expired->nr_active)
        array = busiest->expired;
    else
        array = busiest->active;

new_array:
    /*
     * Load-balancing does not affect RT tasks, so we start the
     * searching at priority 128.
     */
    idx = MAX_RT_PRIO;
skip_bitmap:
    idx = find_next_bit(array->bitmap, MAX_PRIO, idx);
    if (idx == MAX_PRIO) {
        if (array == busiest->expired) {
            array = busiest->active;
            goto new_array;
        }
        goto out_unlock;
    }
    head = array->queue + idx;
    curr = head->prev;
skip_queue:
    tmp = list_entry(curr, task_t, run_list);
    /*
     * We do not migrate tasks that are:
     * 1) running (obviously), or
     * 2) cannot be migrated to this CPU due to cpus_allowed, or
     * 3) are cache-hot on their current CPU.
     */
#define CAN_MIGRATE_TASK(p, rq, this_cpu)                       \
    ((jiffies - (p)->sleep_timestamp > cache_decay_ticks) &&    \
        ((p) != (rq)->curr) &&                                  \
        ((p)->cpus_allowed & (1 << (this_cpu))))

    if (!CAN_MIGRATE_TASK(tmp, busiest, this_cpu)) {
        curr = curr->next;
        if (curr != head)
            goto skip_queue;
        idx++;
        goto skip_bitmap;
    }
    next = tmp;
    /*
     * Take the task out of the other runqueue and
     * put it into this one:
     */
    dequeue_task(next, array);
    busiest->nr_running--;
    next->cpu = this_cpu;
    this_rq->nr_running++;
    enqueue_task(next, this_rq->active);
    if (next->prio < current->prio)
        current->work.need_resched = 1;
    if (!idle && --imbalance) {
        if (array == busiest->expired) {
            array = busiest->active;
            goto new_array;
        }
    }
out_unlock:
    spin_unlock(&busiest->lock);
}

This function performs load balancing when a single processor is overloaded. Tasks are pulled out of the busiest runqueue and put into the short one. If the local runqueue already has ready tasks, it is preferable to run those instead of migrating tasks from the busiest runqueue.
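The pull logic above can be condensed into a runnable sketch. This is our own simplification, not kernel code: plain counters stand in for runqueues, keeping only the "find the busiest queue, require roughly 25% imbalance, pull half the difference" shape of load_balance().

```c
#include <assert.h>

/* Simplified pull-model load balancer (our own sketch): find the
 * busiest queue, require roughly 25% imbalance, then pull half the
 * difference over to this CPU. Returns the number of tasks pulled. */
int load_balance_sketch(int nr_running[], int ncpus, int this_cpu)
{
    int busiest = -1, max_load = nr_running[this_cpu];

    for (int i = 0; i < ncpus; i++)
        if (i != this_cpu && nr_running[i] > max_load) {
            max_load = nr_running[i];
            busiest = i;
        }
    if (busiest < 0)
        return 0;

    int imbalance = (max_load - nr_running[this_cpu]) / 2;
    if (imbalance < (max_load + 3) / 4)   /* needs ~25% imbalance */
        return 0;

    nr_running[busiest] -= imbalance;     /* pull tasks to this CPU */
    nr_running[this_cpu] += imbalance;
    return imbalance;
}
```

For example, with loads {8, 0}, CPU 1 pulls 4 tasks and both queues end up at 4; with loads {5, 4} the imbalance is below the threshold and nothing moves.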
8. schedule() is the main scheduler function:

asmlinkage void schedule(void)
{
    task_t *prev = current, *next;
    runqueue_t *rq = this_rq();
    prio_array_t *array;
    list_t *queue;
    int idx;

    if (unlikely(in_interrupt()))
        BUG();
    release_kernel_lock(prev, smp_processor_id());
    spin_lock_irq(&rq->lock);
    switch (prev->state) {
    case TASK_RUNNING:
        prev->sleep_timestamp = jiffies;
        break;
    case TASK_INTERRUPTIBLE:
        if (unlikely(signal_pending(prev))) {
            prev->state = TASK_RUNNING;
            prev->sleep_timestamp = jiffies;
            break;
        }
    default:
        deactivate_task(prev, rq);
    }

    if (unlikely(!rq->nr_running)) {
#if CONFIG_SMP
        load_balance(rq, 1);
        if (rq->nr_running)
            goto pick_next_task;
#endif
        next = rq->idle;
        rq->expired_timestamp = 0;
        goto switch_tasks;
    }

pick_next_task:
    array = rq->active;
    if (unlikely(!array->nr_active)) {
        /*
         * Switch the active and expired arrays.
         */
        rq->active = rq->expired;
        rq->expired = array;
        array = rq->active;
        rq->expired_timestamp = 0;
    }
    idx = sched_find_first_bit(array->bitmap);
    queue = array->queue + idx;
    next = list_entry(queue->next, task_t, run_list);

switch_tasks:
    prefetch(next);
    prev->work.need_resched = 0;
    if (likely(prev != next)) {
        rq->nr_switches++;
        rq->curr = next;
        context_switch(prev, next);
        /*
         * The runqueue pointer might be from another CPU
         * if the new task was last running on a different
         * CPU - thus re-load it.
         */
        barrier();
6.2 Important Algorithms
1. Algorithm for process execution
   1. Start
   2. Repeat steps 3-6
   3. If a new process arrives, calculate its process dependencies
   4. Recalculate process priorities depending on the number of dependencies
   5. Take the process for execution
   6. Execute the process
   7. Stop
2. Algorithm for per-tick dependency resolution
   1. Start
   2. For each CPU tick:
      resolve the dependencies of processes
      mark the resolved processes as ready
      recalculate the process priorities
   3. If the burst time of a process > 0, go to step 2
   4. Stop
6.3
1. struct runqueue {
    spinlock_t lock;
    unsigned long nr_running, nr_switches, expired_timestamp;
    task_t *curr, *idle;
    prio_array_t *active, *expired, arrays[2];
    int prev_nr_running[NR_CPUS];
} ____cacheline_aligned;

This structure is used to create the runqueue for each CPU or core in the system. Some code paths require locking multiple runqueues; such lock-acquire operations must be ordered by ascending runqueue address.
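The ascending-order rule can be illustrated in user space with ordinary pthread mutexes. This is our own sketch, not the kernel's code: whichever runqueue has the lower address is locked first, so any two CPUs locking the same pair acquire the locks in the same order and cannot deadlock against each other.

```c
#include <pthread.h>

/* Minimal stand-in for the kernel runqueue (our own sketch). */
struct rq_sketch {
    pthread_mutex_t lock;
    unsigned long nr_running;
};

/* Lock two runqueues in ascending address order, mirroring the
 * ordering rule stated above: AB-BA deadlock becomes impossible
 * because every caller acquires a given pair in the same order. */
void double_rq_lock(struct rq_sketch *a, struct rq_sketch *b)
{
    if (a < b) {
        pthread_mutex_lock(&a->lock);
        pthread_mutex_lock(&b->lock);
    } else {
        pthread_mutex_lock(&b->lock);
        pthread_mutex_lock(&a->lock);
    }
}

void double_rq_unlock(struct rq_sketch *a, struct rq_sketch *b)
{
    pthread_mutex_unlock(&a->lock);
    pthread_mutex_unlock(&b->lock);
}

/* Move one task between two runqueues under both locks. */
void migrate_one(struct rq_sketch *src, struct rq_sketch *dst)
{
    double_rq_lock(src, dst);
    if (src->nr_running > 0) {
        src->nr_running--;
        dst->nr_running++;
    }
    double_rq_unlock(src, dst);
}
```

Note that migrate_one() can be called with the two runqueues in either argument order; the ordering decision is made inside double_rq_lock(), not by the caller.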
2. The scheduler will reside in the shared memory of the multi-core system. This ensures that all the cores share the scheduler code. The same scheduler code will be executing on different cores, and we will maintain a shared Task Data Structure (TDS) that contains task information. The TDS stores information such as status, list of dependent tasks, data and stack pointers, etc. A detailed description of the TDS is shown in Figure 4.2. The scheduler programs executing on different cores (scheduler instances) will share this TDS. Access to the TDS is exclusive to each scheduler instance; exclusivity is achieved through a locking mechanism such as spinlocks or semaphores.

The scheduler executes on each individual core as a separate thread or instance. Whenever a core is idle, the scheduler thread is invoked and it checks the shared TDS for the list of ready-to-execute tasks. The shared TDS has the elements shown in Figure 4.2.
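This per-core loop can be sketched in user space, with POSIX threads standing in for cores. The task list, single mutex, and claim counter below are our own simplification of the TDS and its locking (the report suggests spinlocks or semaphores; a pthread mutex is the user-space stand-in):

```c
#include <pthread.h>

#define NCORES 4
#define NTDS   16

/* Shared TDS access is made exclusive with a single lock. */
static pthread_mutex_t tds_lock = PTHREAD_MUTEX_INITIALIZER;
static int next_ready;            /* index of next ready-to-execute task */
static int claimed_by[NTDS];      /* which "core" executed each task */

/* The scheduler instance: whenever this "core" is idle it locks the
 * shared TDS and picks up the next ready task, as described above. */
static void *scheduler_instance(void *arg)
{
    long core_id = (long)arg;
    for (;;) {
        pthread_mutex_lock(&tds_lock);
        if (next_ready >= NTDS) {             /* no ready tasks left */
            pthread_mutex_unlock(&tds_lock);
            return NULL;
        }
        int task = next_ready++;              /* claim exclusively */
        pthread_mutex_unlock(&tds_lock);
        claimed_by[task] = (int)core_id;      /* "execute" the task */
    }
}

/* Spawn one scheduler instance per core; returns how many tasks
 * were claimed by exactly one valid core. */
int run_schedulers(void)
{
    pthread_t cores[NCORES];
    next_ready = 0;
    for (int i = 0; i < NTDS; i++)
        claimed_by[i] = -1;
    for (long i = 0; i < NCORES; i++)
        pthread_create(&cores[i], NULL, scheduler_instance, (void *)i);
    for (int i = 0; i < NCORES; i++)
        pthread_join(cores[i], NULL);
    int ok = 0;
    for (int i = 0; i < NTDS; i++)
        if (claimed_by[i] >= 0 && claimed_by[i] < NCORES)
            ok++;
    return ok;
}
```

Because the claim happens while the lock is held, every task is picked up by exactly one core, no matter how the four scheduler instances interleave.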
Definitions:
Ti     - Task ID
Tis    - Task status of Ti
Tid    - Number of dependencies that must be resolved before execution of Ti can start
Tia(n) - List of tasks that become available due to execution of task Ti
Tip    - Priority number of Ti, based on the tasks that become available due to its execution
Tidp   - Pointer to data required for executing task Ti
Tisp   - Stack pointer for Ti
Tix    - Execution time for Ti
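These fields map naturally onto a C structure. The field names, the status encoding, and the helper below are our own rendering of the notation (the report gives no code for the TDS); the status values follow the worked example of Chapter 8, where -1 means waiting, 1 ready, 2 executing, and 3 completed.

```c
#define MAX_RELEASED 8

/* Task status encoding used in the worked example of Chapter 8. */
enum tds_status {
    TDS_WAITING = -1, TDS_READY = 1, TDS_EXECUTING = 2, TDS_COMPLETED = 3
};

/* One entry of the shared Task Data Structure (our own sketch). */
struct tds_entry {
    int id;                        /* Ti     - task ID */
    enum tds_status status;        /* Tis    - task status */
    int deps_remaining;            /* Tid    - unresolved dependencies */
    int released[MAX_RELEASED];    /* Tia(n) - tasks released by Ti */
    int n_released;
    int prio;                      /* Tip    - priority of Ti */
    void *data_ptr;                /* Tidp   - pointer to task data */
    void *stack_ptr;               /* Tisp   - stack pointer */
    unsigned exec_time;            /* Tix    - execution time */
};

/* Resolve one dependency of a TDS entry; the entry becomes ready
 * when its last dependency resolves. Returns 1 on that transition. */
int tds_resolve_dep(struct tds_entry *t)
{
    if (t->deps_remaining > 0 && --t->deps_remaining == 0) {
        t->status = TDS_READY;
        return 1;
    }
    return 0;
}
```

A scheduler instance that finishes task Ti would call tds_resolve_dep() on each entry in Ti's released list, moving newly unblocked tasks into the ready set.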
Chapter 7
System Testing
Table 7.1: Test Cases

1. Test case: Checking the throughput of the scheduler
   Operation: Check the number of processes that the scheduler can properly balance across the cores.
   Expected output: The scheduler should handle 100 processes.
   Result: Pass

2. Test case: Checking the intelligence of the scheduler
   Operation: Check how the scheduler splits a large process, and to which core it assigns a small process.
   Expected output: A very short process can be assigned directly to any free core; a large process is divided into small threads, each assigned to a different core.
   Result: Pass

3. Test case: Checking the performance of the scheduler
   Operation: Check the overall performance of the scheduler in terms of application speedup and level of parallelism.
   Expected output: Speedup should be greater than 1.5.
   Result: Pass
Chapter 8
Experimental Results
We discuss the proposed scheduling algorithm with the help of the following example. Table 8.1 below shows a dependency table for a set of six tasks. Each number indicates the time unit at which the dependency of a particular task is resolved. This table represents, in simplified form, the output of an offline dependency analysis on sequential code. An entry Tij means that task j can be started only after task i has finished Tij units of execution, where i is the row number and j is the column number.
Table 8.1: Dependency Table

Task    T0    T1    T2    T3    T4
T0       0   100   200   150     0
T1       0     0    50   150     0
T2       0     0     0    50   100
T3       0     0     0     0    50
T4       0     0     0     0     0
T5       0     0     0     0     0
Figure 8.1 shows the simulation output of the dynamic scheduler for the dependencies given in Table 8.1. We have assumed the time unit to be seconds. Column Tx gives the total execution time of the corresponding task. Each cell in the dependency table contains Tij for task j, which means task j can be started only after task i has finished Tij units of execution. For example, task T2 starts only after task T0 finishes 200 s and task T1 finishes 50 s of execution, so T02 = 200 and T12 = 50. At time t = 0 all the cores will try to acquire the lock on the TDS and one of them (P1) gets
that lock and finds that task T0 is ready for execution.

At t = 0:
T0s = 1 (ready), T1s = -1, T2s = -1, T3s = -1, T4s = -1, T5s = -1
T0d = 0, T1d = 1, T2d = 2, T3d = 3, T4d = 2, T5d = 4
T0x = 250, T1x = 300, T2x = 200, T3x = 100, T4x = 200, T5x = 50
T0 will release 3 tasks (T1, T2, T3), so T0p is highest; T0 starts executing at t = 0.

At t = t1 (100) the dependency of task T1 is resolved (T01 = 100), so task T1 is ready for execution:
T0s = 2 (executing), T1s = 1 (ready), T2s = -1, T3s = -1, T4s = -1, T5s = -1
Processor P4 locks the data structure and starts executing task T1.

At t = t2 (200) both dependencies of T2 have been resolved (T02 = 200 and T12 = 50), so task T2 is ready for execution on free processor P1:
T0s = 2 (executing), T1s = 2, T2s = 1, T3s = -1, T4s = -1, T5s = -1

At t = t3 (250) all three dependencies of T3 (T03, T13, T23) have been resolved, so task T3 is ready for execution on free processor P2:
T0s = 3 (completed), T1s = 2, T2s = 2, T3s = 1 (ready), T4s = -1, T5s = -1

Using similar logic, the algorithm will continue to schedule tasks, and it can be
concluded that tasks T4 and T5 will be scheduled for execution on processors P3 and P4 respectively.
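This walkthrough can be checked mechanically. The sketch below replays the T0-T3 portion of Table 8.1 (the columns recoverable from the table); assuming a free core is always available, as in the example, each task starts at the moment its last dependency resolves.

```c
#define NT 4   /* tasks T0..T3 from Table 8.1 */

/* dep[i][j] = Tij: task j may start only after task i has executed
 * for Tij time units; 0 means no dependency. */
static const int dep[NT][NT] = {
    /*        T0   T1   T2   T3 */
    /* T0 */ { 0, 100, 200, 150 },
    /* T1 */ { 0,   0,  50, 150 },
    /* T2 */ { 0,   0,   0,  50 },
    /* T3 */ { 0,   0,   0,   0 },
};

/* A task's start time is the latest point at which any predecessor
 * finishes the required amount of execution (free core assumed). */
void compute_start_times(int start[NT])
{
    for (int j = 0; j < NT; j++) {
        start[j] = 0;
        for (int i = 0; i < j; i++)
            if (dep[i][j] && start[i] + dep[i][j] > start[j])
                start[j] = start[i] + dep[i][j];
    }
}
```

Running this reproduces the timeline of the walkthrough: T0 at t = 0, T1 at t = 100, T2 at t = 200, and T3 at t = 250.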
Chapter 9
Conclusion
The scheduling algorithm discussed attempts to increase the utilization of multi-core processors. This algorithm is different in the sense that the processor owns the responsibility of picking up tasks for execution whenever it is idle. The method gives priority to tasks that resolve more dependencies, and ensures that updates to the ready-to-execute task list are made accordingly. The scheduler resides on each core as a separate thread or instance and hence is specific to a core, so we can conclude that the proposed scheduler will be more efficient and will balance the load properly.

This project covered most of the important aspects of the Linux scheduler. The kernel scheduler is one of the most frequently executed components in a Linux system; hence it has received a great deal of attention from kernel developers, who have strived to put the most optimized algorithms and code into the scheduler. The different algorithms used in the kernel scheduler were discussed in this project. The dynamic multi-core scheduler achieves good performance and responsiveness while being relatively simple compared with previous algorithms such as O(1). It exceeds performance expectations in some workloads on multi-core systems, but it still shows some weaknesses in other workloads, for example some unresponsiveness in the 3D game area.
Chapter 10
Future Scope
Future work includes delving deeper into the scheduling and process code so that we can implement a new scheduling algorithm in the kernel. Though this project gives a vivid overview and the basic steps of configuring and compiling the kernel and implementing scheduling policies such as Dynamic Scheduling and SCHED_IDLE (with a lower priority), there were some challenges associated with it. One of the challenges was interpreting the change in scheduling policy through the process runtime. The goal for the future is to address such challenges and develop efficient techniques for kernel scheduling. In the currently implemented Dynamic Scheduling policy we have considered deadlock handling of processes; our next goal is to improve this Dynamic Scheduling policy by implementing a better deadlock-handling policy.
Appendix A
Appendix
Kernel Compilation

Compiling a custom kernel has its own advantages and disadvantages. However, new Linux users and admins find it difficult to compile a Linux kernel. Compiling a kernel requires understanding a few things and then just typing a couple of commands. This step-by-step how-to covers compiling Linux kernel version 2.6.xx under Debian GNU/Linux; the instructions remain the same for any other distribution except for the apt-get command.
Step # 1 Get the latest Linux kernel code

Visit http://kernel.org/ and download the latest source code. The file name will be linux-x.y.z.tar.bz2, where x.y.z is the actual version number. For example, the file linux-2.6.25.tar.bz2 represents kernel version 2.6.25. Use the wget command to download the kernel source code:

$ cd /tmp
$ wget http://www.kernel.org/pub/linux/kernel/v2.6/linux-x.y.z.tar.bz2

Note: Replace x.y.z with the actual version number.
APPENDIX A. APPENDIX
# cd /usr/src

Step # 3 Configure the kernel

Before you configure the kernel, make sure you have the development tools (gcc compiler and related tools) installed on your system. If they are not installed, use the apt-get command under Debian Linux to install them.
# apt-get install gcc

Now you can start the kernel configuration by typing any one of these commands:
$ make menuconfig - Text-based color menus, radiolists and dialogs. This option is also useful on a remote server if you want to compile the kernel remotely.

$ make xconfig - X Windows (Qt) based configuration tool; works best under the KDE desktop.

$ make gconfig - X Windows (Gtk) based configuration tool; works best under the GNOME desktop.

For example, the make menuconfig command launches the configuration screen. You have to select different options as per your need. Each configuration option has a HELP button associated with it, so select the help button to get help.
Step # 4 Compile the kernel

Start compiling to create a compressed kernel image, enter:
$ make
Start compiling the kernel modules:
$ make modules
Install the kernel modules (become the root user, using the su command):
$ su
# make modules_install

Step # 5 Install the kernel

So far we have compiled the kernel and installed the kernel modules. It is time to install the kernel itself.
# make install
It will install three files into the /boot directory, as well as make a modification to your GRUB kernel configuration file:
1. System.map-2.6.25
2. config-2.6.25
3. vmlinuz-2.6.25
# cd /boot
# mkinitrd -o initrd.img-2.6.25 2.6.25
The initrd image contains the device drivers needed to load the rest of the operating system later on. Not all computers require an initrd, but it is safe to create one.
# vi /boot/grub/menu.lst
savedefault
boot

Remember to set up the correct root=/dev/hdXX device. Save and close the file. If you think editing and writing all the lines by hand is too much for you, try the update-grub command to update the lines for each kernel in the /boot/grub/menu.lst file. Just type the command:
# update-grub
Step # 8 Reboot the computer and boot into your new kernel

Just issue the reboot command:
# reboot
References
[1] D. Tam, R. Azimi, and M. Stumm. Thread clustering: sharing-aware scheduling on SMP-CMP-SMT multiprocessors. In Proceedings of the 2nd ACM SIGOPS/EuroSys European Conference on Computer Systems, pages 47-58, New York, NY, USA, 2007. ACM.
[2] F. Bellosa and M. Steckermeier. The performance implications of locality information usage in shared-memory multiprocessors. J. Parallel Distrib. Comput., 37(1):113-121, 1996.
[3] M. C. Carlisle and A. Rogers. Software caching and computation migration in Olden. In Proceedings of the 5th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 1995.
[4] J. Kurzak and J. Dongarra. Fully Dynamic Scheduler for Numerical Computing on Multicore Processors. LAPACK Working Note 220, UT-CS-09-643, June 4, 2009.
[5] S. Balakrishnan, R. Rajwar, M. Upton, and K. K. Lai. The impact of performance asymmetry in emerging multicore architectures. In 32nd International Symposium on Computer Architecture (ISCA 2005), 4-8 June 2005, Madison, Wisconsin, USA, pages 506-517. IEEE Computer Society, 2005.
[6] K. Asanovic et al. The Landscape of Parallel Computing Research: A View from Berkeley. Technical Report UCB/EECS-2006-183, University of California at Berkeley, December 2006.
[7] K. Olukotun, L. Hammond, J. Laudon, Chip Multiprocessor Architecture: Techniques to Improve Throughput and Latency, Synthesis Lectures on Computer Architecture, Morgan and Claypool, 2007.
[8] H. Lu, S. Dwarkadas, A. L. Cox, and W. Zwaenepoel. Message passing versus distributed shared memory on networks of workstations. In Supercomputing '95: Proceedings of the 1995 ACM/IEEE Conference on Supercomputing (CDROM), page 37. ACM Press, 1995.
[10] http://www.ibiblio.org/pub/Linux/docs/HOWTO/KernelAnalysis-HOWTO, <This link tries to explain the most important components of the Linux Kernel>
[11] http://www.barrelfish.org, <The site is exploring how to structure an OS for future Multi- and Many-Core Systems>
[12] http://www.intel.com/core, <The site is exploring architecture of various Intel Core Processor>
[13] http://www.multicoreinfo.com.