0% found this document useful (0 votes)
47 views

Scheduling Algorithms: (Or The Process of Choosing The Best Job To Run)

This document discusses scheduling algorithms used in computing. It begins with an introduction to scheduling and why it is important for maximizing application performance. It then describes common scheduling algorithms like round-robin and fair share used in Unix and commercial workload managers. The document also discusses cluster managers like Condor and PBS and the algorithms they use including backfill, priority, and fair share. It concludes by looking at future directions for scheduling in grid environments and some related research projects.

Uploaded by

Gokhan Ozcan
Copyright
© Attribution Non-Commercial (BY-NC)
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
47 views

Scheduling Algorithms: (Or The Process of Choosing The Best Job To Run)

This document discusses scheduling algorithms used in computing. It begins with an introduction to scheduling and why it is important for maximizing application performance. It then describes common scheduling algorithms like round-robin and fair share used in Unix and commercial workload managers. The document also discusses cluster managers like Condor and PBS and the algorithms they use including backfill, priority, and fair share. It concludes by looking at future directions for scheduling in grid environments and some related research projects.

Uploaded by

Gokhan Ozcan
Copyright
© Attribution Non-Commercial (BY-NC)
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
You are on page 1/ 36

Scheduling Algorithms

(or the Process of Choosing the Best Job to Run)

Scheduling Algorithms

Introduction By default Workload managers, their schedulers, and algorithms Future directions

Introduction

Efficient scheduling across nodes is necessary to maximize application performance regardless of the efficiency of your parallel algorithms. Dynamic scheduling in a heterogeneous environment is significantly more complicated. Programming parallel applications is difficult, and not worth the effort unless large performance gains can be realized.

Scheduling

Scheduling is a key part of the workload management software which usually perform some or all of:

Queuing Scheduling Monitoring Resource management Accounting

Cont

The difficult part of scheduling is to balance policy enforcement with resource optimization in order to pick the best job to run. Essentially one can think of the scheduler performing the following loop:

Select the best job to run, according to policy and available resources. Start the job. Stop the job and/or clean up after completion. repeat.

Cont
Scheduling can be described as either

Application level

APST, AMWAT, MARS, (AppLeS) Prophet UNIX, Condor, Maui, PBS

Resource level

By Default: Unix Scheduling Algorithms

UNIX is fundamentally a time-share (TS) operating system [1978] TS scheduler provides illusion to each user that they are the only one accessing system resources Employs a round-robin scheduling algorithm

Round-robin

Each process is placed in a run-queue Allocated a service quantum of time (commonly set to 10 milliseconds) Processes that demand less time run without being interrupted Processes exceeding the service quantum are interrupted and returned to the back of the run-queue to await further processing

Round-robin Cont

Round-robin favours short process demands


Consequently it is biased in favour of a greedy user who runs many short-demand processes The UNIX scheduler schedules processes not users What was needed was a fair-share (FS) scheduler

Fair-share

Implemented on UNIX systems circa 1988 In context of computer resources, fair is meant to imply equity not equality in resource consumption

Fair-share Cont

The FS scheduler has two levels of scheduling: process and user Process level scheduling same as in standard UNIX (priority and nice values act as bias to scheduler as it repositions processes in the run-queue) User level scheduler relationship can be seen in the following (simplified) pseudo-code

Fair-share Algorithm: User Level Scheduling


/*Every 400 milliseconds, accumulate the CPU usage per user*/
Parameters: delta = 4 Sec; decayUsage; For (I=0; I<USERS; I++) { usage[I] *= decayUsage; usage[I] += cost[I]; cost[I] = 0; }

Fair-share Algorithm: Process Level Scheduling


Parameters: priorityDecay = Real number in the range [0..1]

Every 100 milliseconds, decay ALL process priority values


for (k=0; k<PROCS; k++) { sharepriority[k] *= priorityDecay; } priorityDecay = a * p_nice[k] + b; /*Every 10 milliseconds (or tick) alter the active process priorities per

user*/

for (i=0; i<USERS; i++) { sharepriority[i] += usage[i] * p_active[i]; }

Fair-share Cont

Whereas process-level scheduling still occurs 100 times a second, user-level scheduling adjustments (usage parameter) occur once every 4 seconds Also, once a second, process-level priority adjustments that were made in the previous second begin to be forgotten. This is to avoid starving a process

Commercial Products
O/S
AIX

Manager
WLM

Capping
Yes

Parameter
Targets/limits

Compaq
HP-UX IRIX Solaris

Tru64
PRM SHARE II SRM

NO
YES NO NO

Shares
Entitlements Shares Shares

Cluster Managers, Their Schedulers and Algorithms

Condor

Full featured batch system Used to manage clusters of beowulf nodes Based on Network Queuing System (NQS) 1986 NASA first on Cray and then ported to other architectures IEEE POSIX standard (1003.2d)

Portable Batch Scheduler (PBS)

Cluster Managers, Their Schedulers and Algorithms

Maui

Developed to manage machines at the Maui HPCC External scheduler uses native schedulers such as PBS and LoadLeveler Included in cluster building toolkits such as Rocks and OSCAR Genetic algorithms

GAs

Condor

DAGman (Directed Acyclic Graph Manager)


Meta-scheduler for condor jobs Submits jobs in an order represented by a DAG and processes the results

Condor Cont
An example input file Job A A.condor Job B B.condor Job C C.condor Job D D.condor Parent A child B C Parent B C child D #Nodes B and C are run in parallel

A
B D C

Condor Cont

Jobs queues are not strictly FIFO


Condor implements priority scheduling Priority preemption can also be used Priority calculations are based on ratios instead of absolutes

Starvation is prevented by a FS algorithm which attempts to give all users the same amount of machine allocation time over a specified interval (configurable)

PBS

Includes several built-in schedulers Default is FIFO but not strict (based on queue priority)

Behaviour is to maximize CPU utilization Starving-jobs mechanism helps

PBS Scheduling Policies

PBS has seven scheduling algorithms (* represents default)


Decaying FS algorithm Round-robin for jobs and queues Queue priority * Job type or job ordering Strict FIFO Load balancing jobs between hosts Dedicated times/nodes

Maui

Scheduling behaviour is constrained by way of throttling policies


Both soft and hard limits used Applied to each iteration

Three main algorithms used


Backfill Priority FS

Maui Backfill Algorithm


Backfill is scheduling optimization Based on earliest-job-start information

Two passes

Pass 1 jobs that meet soft policies Pass 2 expands list from pass 1 to include hard fairness policies

Maui applies the backfill algorithm specified by the BACKFILLPOLICY parameter, be it firstfit, bestfit, or balfit

Backfill Algorithm

Assuming bestfit the following steps apply:


1. feasible backfill jobs are filtered selecting those that actually fit the current backfill window 2. base degree-of-fit on scheduling criteria (I.e. processors, seconds etc.) 3. job with best fit is started and the backfill window size adjusted 4. while backfill jobs and idle resources remain repeat step 1

Maui Priority Algorithm

Default is trivial FIFO but is weighted and combined based on service, requested resources, fair-share, direct priority, target, and bypass

Priority = serviceweight * servicefactor + resourceweight * resourcefactor + fairshareweight * fairsharefactor + directspecweight * directspecfactor + targetweight * targetfactor + bypassweight * bypassfactor

Each *weight value is a configurable parameter and each *factor is calculated from subcomponents listed above (I.e. user, group, priority, QoS etc.)

Maui Fare-share Algorithm

Composed of several parts which handle historical information, fairshare windows, usage, and impact All are site configurable parameters Purpose of a fairshare algorithm is to steer existing workload

Genetic Algorithms (GA)


A typical GA structure
Genetic Algorithm () [ Initialize population;/*an initial population of strings randomly generated*/ Evaluate population;/*check fitness of string against fitness function*/ While termination criterion not reached [ /*genetic operators*/ Select solution for next population; Perform crossover and mutation; Evaluate population; ] ]

GAs Cont

GAs mimic the evolutionary process. Introduced by J.H Holland 1975. In nature, organisms are always competing for scant resources such as food and space.

GAs Cont

A GA can be analyzed by breaking it up into the following sequence:

Encoding mechanism convert variables

into a form that can be processed by the GA. Fitness function - each generated string has a fitness value (or weight) assigned which determines the survival rate of the string.

GAs Cont

Operators.

Reproduction process by which individual strings are

copied according to their fitness values. Higher fitness values are more likely to be selected. Crossover all strings in the population are randomly paired up and may or may not be subjected to a crossover depending on a randomly generated outcome. Mutation an occasional alteration to the strings in the population and is used to safeguard against premature convergence.

Future Directions

High Performance Grid Schedulers

Scheduling holds the key to performance in grid environments; however, high-performance application scheduling on computation grids represent a world in which much progress needs to be made. Part of the problem lies in the inherent difficulty of the scheduling problem, whose optimal solution is considered infeasible even in the simplest environments.

Current High Performance Grid Scheduler Projects

AppLeS

Application Level Scheduler

Best of candidate schedules based on users performance criteria

MARS Meta-computer Adaptive Runtime Scheduler

Determines candidate schedule that minimizes execution time

Prophet

Determines schedule with the minimal predicted execution time Virtual Distributed Computing Environment

VDCE

List scheduling used to match resources with application tasks

SEA

Scheduling Expert Advisor

Ready tasks enabled in program graph are next to be scheduled

Projects continued

I-SOFT

Centralized scheduler maintains user queues and static capacities (FIFO) Offline genetic algorithm (GA) mappings indexed by dynamic parameters used to determine mapping for current iteration Determination of performance model for candidate schedules with minimal execution time Globally controlled or locally controlled load balancing

IOS

SPP(X)

DOME

Conclusion

Scheduling in a heterogeneous environment is hard. Keywords to remember


Adaptive Dynamic

Links

http://www.openpbs.org/ http://www.cs.wisc.edu/condor http://www.supercluster.org/maui

You might also like