10.5 QUICKSORT-BASED ALGORITHMS

In Section 10.5 we will develop three parallel sorting algorithms suitable for implementation on MIMD computers. Developing parallel algorithms is easiest when a cost-optimal PRAM algorithm exists in which the processor interactions match the underlying architecture. Unfortunately, we do not have that luxury when it comes to sorting. The cost-optimal PRAM sorting algorithm of Leighton has time complexity O(log n) with n processing elements, but its enormous constant of proportionality makes it impractical to use (Leighton 1984). Bitonic merge-sort has cost O(n log^2 n), which is higher than the cost of the best sequential sorting algorithms. For this reason we turn to the best general-purpose sequential sorting algorithm, quicksort, as the basis for our parallel algorithms.

Quicksort is the sorting algorithm most commonly used on serial computers. Its popularity is due to its asymptotically optimal average-case behavior of Θ(n log n) comparisons. Quicksort selects an element of the unsorted list to serve as the partitioning key, then divides the list into two sublists: elements less than or equal to the key are placed to its left, and elements greater than the key are placed to its right. With the key now in its final sorted position, the algorithm recursively partitions each of the two sublists. Parallelism does not come from dividing the work of a single partitioning step; rather, each partitioning step increases the number of independent subproblems that can be solved simultaneously.

Consider the following parallel quicksort algorithm. A number of identical processes, one per processor, cooperate to sort the list. The elements to be sorted are stored in an array in global memory, and a stack in global memory stores the indices of the subarrays that are still unsorted. When a process needs work, it pops the indices of an unsorted subarray from this stack and partitions it into two smaller subarrays, choosing an element as the partitioning key and rearranging the subarray so that smaller elements precede the key and larger elements follow it. After the partitioning step, which is identical to the partitioning step performed by the sequential quicksort algorithm, the process pushes the indices of one subarray onto the global stack of unsorted subarrays and repeats the partitioning process on the other subarray. Figure 10-22 illustrates the parallel algorithm.

What speedup can be expected from parallel quicksort? Note that partitioning a subarray requires one comparison for every element other than the key. The expected speedup is computed by assuming that one comparison takes one unit of time and finding the ratio of the expected number of comparisons performed by the sequential algorithm to the expected time required by the parallel algorithm. To simplify the analysis, assume that n = 2^k - 1 and p = 2^m, where m <= k, and that every partitioning step divides a subarray into two halves of equal size. The first iteration requires time n - 1 to perform n - 1 comparisons, since only one process is active. The second iteration requires time (n - 1)/2 - 1 = (n - 3)/2 to perform n - 3 comparisons. If p >= 4, the third iteration requires time [(n - 1)/2 - 1]/2 - 1 = (n - 7)/4 to perform n - 7 comparisons. For the first log p iterations there are at least as many processes as partitions, and the time required by this phase of the parallel quicksort algorithm is

    T1(n, p) = 2[(n + 1)(1 - 1/p) - log p]

The number of comparisons performed is

    C1(n, p) = (n + 1) log p - 2(p - 1)

In the second phase of the parallel algorithm there are more subarrays to be sorted than processes, and all the processes are active. If we assume that each process performs an equal share of the comparisons, then the time required is simply the number of comparisons performed divided by p. Hence

    C2(n, p) = T(n) - C1(n, p)
    T2(n, p) = C2(n, p) / p

where T(n) is the number of comparisons (and hence the time) required by the sequential algorithm. The estimated speedup achievable by the parallel quicksort algorithm is the sequential time divided by the parallel time:

    Speedup = T(n) / [T1(n, p) + T2(n, p)]

For example, the best speedup we could expect with n = 65,535 and p = 16 is approximately 5.6.
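To make the model concrete, the short C program below evaluates these formulas. It is our own sketch, not code from the book: the closed form used for the sequential comparison count, T(n) = (n + 1) log2(n + 1) - 2n, follows from the same equal-split assumption, and the variable names are invented. For n = 65,535 and p = 16 it prints a predicted speedup of about 5.6, the figure quoted above. It needs the math library (-lm) for log2().

/* Predicted speedup of the stack-based parallel quicksort under the
   equal-split assumption (a sketch; the names and the closed form for
   the sequential comparison count are our own, not the book's). */
#include <math.h>
#include <stdio.h>

int main(void)
{
    double n = 65535.0;              /* number of elements, 2^16 - 1 */
    double p = 16.0;                 /* number of processes          */
    double log_p = log2(p);

    /* Sequential comparisons when every partition splits a subarray in half:
       T(n) = (n + 1) log2(n + 1) - 2n                                        */
    double t_seq = (n + 1.0) * log2(n + 1.0) - 2.0 * n;

    /* Phase 1: the first log p iterations, enough processes for every subarray. */
    double t1 = 2.0 * ((n + 1.0) * (1.0 - 1.0 / p) - log_p);
    double c1 = (n + 1.0) * log_p - 2.0 * (p - 1.0);

    /* Phase 2: the remaining comparisons are shared equally by the p processes. */
    double c2 = t_seq - c1;
    double t2 = c2 / p;

    printf("Predicted speedup: %.1f\n", t_seq / (t1 + t2));   /* about 5.6 */
    return 0;
}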
Why is speedup so low? The problem with quicksort is its divide-and-conquer nature. Until the first subarray is partitioned, there are no other partitionings to do. Even after the first partitioning step is complete, there are only two subarrays to work with. Hence many processes are idle at the beginning of the parallel algorithm's execution, waiting for work.

Figure 10-24 contains pseudocode for a UMA multiprocessor-oriented parallel quicksort algorithm, which uses the strategy we have discussed. Function INITIALIZE.STACK initializes the shared stack containing the indices of unsorted subarrays. When a process calls function STACK.DELETE, it receives the indices of an unsorted subarray if the stack contains indices; otherwise, the "low" index is greater than the "high" index, meaning there is no useful work to do at this point. Function STACK.INSERT adds the indices of an unsorted subarray to the stack. Since all these functions access the same shared data structure, their execution must be mutually exclusive. Function ADD.TO.SORTED increases the count of elements that are in their correct positions, and execution of this function, too, must be mutually exclusive. We use monitors to implement these functions.

As the pseudocode algorithm shows, we use the familiar strategy of switching from quicksort to insertion sort when the size of the array to be partitioned falls below a predetermined threshold (Sedgewick 1988).

Figure 10-25 compares the predicted speedup with the actual speedup achieved by a Sequent C implementation of the algorithm sorting 65,535 integers on a lightly loaded Symmetry multiprocessor. The correlation is reasonably good, considering the analysis made the simplifying assumption that each partitioning step always divides an unsorted subarray into two subarrays of equal size.

QUICKSORT (UMA MULTIPROCESSOR):

Global   n               {Size of array of unsorted elements}
         a[0...(n - 1)]  {Array of elements to be sorted}
         sorted          {Number of elements in sorted position}
         min.partition   {Smallest subarray that is sorted directly rather
                          than partitioned}
Local    bounds          {Indices of unsorted subarray}
         median          {Final position in subarray of partitioning key}

begin
   sorted <- 0
   INITIALIZE.STACK()
   for all P_i, where 0 <= i < p, do
      while sorted < n do
         bounds <- STACK.DELETE()
         while bounds.low < bounds.high do
            if bounds.high - bounds.low < min.partition then
               INSERTION.SORT (a, bounds.low, bounds.high)
               ADD.TO.SORTED (bounds.high - bounds.low + 1)
               exit while
            else
               median <- PARTITION (bounds.low, bounds.high)
               STACK.INSERT (median + 1, bounds.high)
               bounds.high <- median - 1
               if bounds.low = bounds.high then ADD.TO.SORTED (2)
               else ADD.TO.SORTED (1)
               endif
            endif
         endwhile
      endwhile
   endfor
end

FIGURE 10-24
Multiprocessor-oriented parallel quicksort algorithm. A shared stack contains the indices of unsorted subarrays. Processes must execute functions STACK.DELETE(), ADD.TO.SORTED(), and STACK.INSERT() inside critical sections to ensure mutual exclusion.
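Readers who want to run the algorithm on a modern shared-memory machine can use something like the following pthreads sketch. It is not the book's Sequent C program: the monitors become a single pthread mutex guarding both the stack and the sorted counter, the names (range_t, MIN_PARTITION, NUM_THREADS, worker) are our own, and the bookkeeping is simplified slightly so that every element is counted exactly once (small subarrays, including single elements popped from the stack, are finished with insertion sort). Idle threads simply retry the pop, mirroring the busy-waiting behavior of the pseudocode.

/* A rough pthreads sketch of the Figure 10-24 strategy.  This is not the
   book's Sequent C code: the monitors are replaced by one pthread mutex,
   and names such as range_t, MIN_PARTITION, and NUM_THREADS are invented
   for this example. */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

#define N             65535
#define NUM_THREADS   4
#define MIN_PARTITION 32      /* subarrays this small are insertion sorted */

typedef struct { long low, high; } range_t;

static int a[N];
static range_t stack[N];      /* shared stack of unsorted subarray bounds */
static long top = 0;
static long sorted = 0;       /* elements known to be in final position   */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void stack_insert(long low, long high) {
    if (low > high) return;   /* nothing to remember for an empty subarray */
    pthread_mutex_lock(&lock);
    stack[top].low = low; stack[top].high = high; top++;
    pthread_mutex_unlock(&lock);
}

static range_t stack_delete(void) {
    range_t r = { 0, -1 };    /* low > high signals "no work right now" */
    pthread_mutex_lock(&lock);
    if (top > 0) r = stack[--top];
    pthread_mutex_unlock(&lock);
    return r;
}

static void add_to_sorted(long k) {
    pthread_mutex_lock(&lock); sorted += k; pthread_mutex_unlock(&lock);
}

static long sorted_count(void) {
    pthread_mutex_lock(&lock);
    long s = sorted;
    pthread_mutex_unlock(&lock);
    return s;
}

static void insertion_sort(long low, long high) {
    for (long i = low + 1; i <= high; i++) {
        int key = a[i];
        long j = i - 1;
        while (j >= low && a[j] > key) { a[j + 1] = a[j]; j--; }
        a[j + 1] = key;
    }
}

/* Lomuto partition: returns the final position of the partitioning key. */
static long partition(long low, long high) {
    int pivot = a[high];
    long i = low - 1;
    for (long j = low; j < high; j++)
        if (a[j] <= pivot) { i++; int t = a[i]; a[i] = a[j]; a[j] = t; }
    int t = a[i + 1]; a[i + 1] = a[high]; a[high] = t;
    return i + 1;
}

static void *worker(void *arg) {
    (void)arg;
    while (sorted_count() < N) {
        range_t r = stack_delete();
        while (r.low <= r.high) {
            if (r.high - r.low + 1 <= MIN_PARTITION) {
                insertion_sort(r.low, r.high);    /* small subarray: finish it */
                add_to_sorted(r.high - r.low + 1);
                break;
            }
            long median = partition(r.low, r.high);
            stack_insert(median + 1, r.high);     /* right half onto the stack */
            add_to_sorted(1);                     /* the key is now in place   */
            r.high = median - 1;                  /* keep working on the left  */
        }
    }
    return NULL;
}

int main(void) {
    pthread_t t[NUM_THREADS];
    for (long i = 0; i < N; i++) a[i] = rand();
    stack_insert(0, N - 1);                       /* the whole array is unsorted */
    for (int i = 0; i < NUM_THREADS; i++) pthread_create(&t[i], NULL, worker, NULL);
    for (int i = 0; i < NUM_THREADS; i++) pthread_join(t[i], NULL);
    for (long i = 1; i < N; i++)
        if (a[i - 1] > a[i]) { printf("not sorted\n"); return 1; }
    printf("sorted %d elements\n", N);
    return 0;
}

On a POSIX system this builds with cc -pthread. A production version would replace the busy-waiting with a condition variable and choose better pivots, but the structure above is enough to observe the limited speedup predicted by the analysis.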
