Dependency Analysis of For-Loop Structures For Automatic Parallelization of C Code
Tim Jacobson
Mathematics and Computer Science Department
South Dakota School of Mines and Technology
[email protected]
Gregg Stubbendieck
Mathematics and Computer Science Department
South Dakota School of Mines and Technology
[email protected]
Abstract
Dependency analysis techniques used in parallelizing compilers can be used to produce
coarse grained code for distributed memory systems such as a cluster of workstations.
Nested for-loops provide opportunities for this coarse grained parallelization. This paper
describes current dependency analysis tests that can be used to identify ways of
transforming sequential C code into parallel C code. Methods for searching nested for-
loops and array references are discussed, as well as the differences between the
dependency tests.
1 Introduction
Dependency analysis has predominantly resided in the realm of compilers, where the goal
was to speed the performance of programs by eliminating conflicts caused by data
accesses. These dependency analysis techniques gained popularity in parallelizing
compilers for shared memory systems. This paper is part of an ongoing project attempting
to use existing dependency analysis techniques on a distributed memory system. The
purpose is to convert sequential C programs into parallel C programs that can be run on a
cluster of workstations using a message passing protocol. One of the tenets of this project
is that the parallel code be coarse grained, as opposed to the fine grained parallel code
used with shared memory systems. One way that coarse grained parallelization can be
achieved is by performing transformations of loop structures. Assuming the data
dependencies found within the loop structures are intentional, dependency analysis will
be used to catalog them and to verify that transformations do not change the algorithm.
In order to perform dependency analysis, information must first be gathered about the
loop structures. Once the loop structure information has been accumulated, it is used to
determine which dependency tests to perform. Each dependency test is designed for a
specific type of data reference found in the loops. Pairs of data references are compared
to determine whether a dependency exists between them. Some dependency tests are
known as exact tests because they can accurately determine dependence or independence.
Due to the complexity of some data references, tests performed on them must use
symbolic representations of dependencies. The simplification of complex data references
may only allow dependency tests to approximate dependencies. The following sections
describe specific dependency analysis techniques used on for-loops of C code and the
difficulties that are encountered.
It should be noted that statements s1 and s2 may be the same statement. Classifying the
dependence between two statements is important for analysis, since different
dependencies will be parallelized in different ways.
When looking at loop structures it is valuable to know the stride each loop takes per
iteration. The stride value is referred to as the iteration number [3]. Knowing whether the
iteration strides in an increasing or decreasing direction is also of value. In figure 1,
loop i has an iteration number of 1 and a positive stride, loop j has an iteration number
of 2 with a positive stride, and loop k has an iteration number of 1 with a negative
stride.
When analyzing a nested loop structure, dependency tests need a way of keeping track of
the level at which each statement is nested, called the nesting level [3]. Finding the
nesting level is a simple counting exercise: the outermost level is level one, and each
level inside is counted incrementally. Each statement is cataloged by the level at which
it lies in the nest. Once statements are identified in the nest, the focus turns to how
those statements change from iteration to iteration.
Each statement lying in a nested loop structure has an iteration vector that records
the iteration values of all the loops at that given point [3]. For a nest of n loops, the
iteration vector α = { α1, α2, α3, ..., αn }, where αk, 1 ≤ k ≤ n, represents the iteration
number for the loop at nesting level k. The set of all iteration vectors for a given
statement is called the iteration space of the statement. The iteration space gives a
complete view of how memory is accessed by a statement throughout the lifetime
of the nested loop. Referring to figure 1, the statements have the following iteration
vectors:
Notice that there is no dependency between the two statements, since the first always
contains the constant 0 in the first array reference and the second always contains the
value 10. There is no possibility for the two statements to refer to the same memory
location.
Now that information has been gathered about each statement, the next step is to begin
identifying dependencies between statements. In a nest of loops there is a dependence
from statement s1 to statement s2 if there exist iteration vectors α for s1 and β for s2
such that s1 accesses a memory location on iteration α and s2 accesses the same memory
location on iteration β [3]. It would be possible to perform comparisons between every
pair of iteration vectors for two statements, but this may take a very long time for loops
of any significant size. In fact, if the loop structures are large enough to be worth
parallelizing, then a one-to-one comparison of iteration vectors will be prohibitively
slow.
One way to lessen the computation is to view iteration vectors in a more abstract way.
When a loop dependence occurs between iteration vectors α and β, the distance between
them is defined as the distance vector d(α, β), which is the componentwise difference
between the two vectors [3]:

    d(α, β)k = βk − αk.

After obtaining the distance vector, the values can be further abstracted to a set of
symbols representing the three directions. The direction vector D(α, β) is defined
componentwise as:

    D(α, β)k = "<" if d(α, β)k > 0
               "=" if d(α, β)k = 0
               ">" if d(α, β)k < 0
These symbols will be used to verify that the transformations performed preserve the
dependencies. For example:
As mentioned earlier, a dependence between two statements occurs when both of them
access the same memory location and at least one of them writes to that location. When
considering loop structures, the dependency between two statements may occur in the
same iteration or in separate iterations of a loop. If the dependency is found in the same
iteration of the loop, it is called a loop-independent dependence [3]. For example, if s1
and s2 are two statements at the same level of a loop and both access memory location M
during one iteration, then on the next iteration both statements will access a new
memory location N. Since no iteration accesses memory locations from previous
iterations, each iteration is independent of all others. A loop-carried dependence [3]
occurs when a statement accesses a memory location and then, on another iteration, there
is a second access of that memory location. Thus, s1 accesses memory location M and
then, on a subsequent iteration, s2 also accesses memory location M. Again, s1 and s2
may be the same statement, where one access is a write and one is a read.
Expanding on the focus of indices and subscripts raises the question of how many indices
are found in the subscripts of an array reference. The number of indices is called the
complexity of the array reference [2][3]. There are three levels of complexity commonly
used for dependency analysis. The first is the zero index variable (ZIV): an array
reference that uses no index variables in a particular subscript pair. The second is the
single index variable (SIV), which uses one index variable in a particular subscript pair
of an array reference. Finally, the third is the multiple index variable (MIV), which
occurs when more than one index variable is referenced in a particular subscript. Note
that any pair of multi-dimensional array references may contain several different
complexity values among its subscripts. For example, the nested loop in figure 1
contains a ZIV in the first subscript, an SIV in the second subscript, and an MIV in the
third. Different dependency tests are performed for ZIV, SIV, and MIV subscripts.
A[i][j][j] = A[i][j][k];
The first subscript is separable because it contains only the index i, but the second and
third are coupled because they share the variable j. Before dependency testing can begin,
a pair of array references must be broken into separable and minimally coupled
subscripts. For the separable subscripts, there are methods for getting exact answers
from dependency tests. Due to their complexity, exact answers are harder to obtain for
coupled subscripts. In the event that an exact answer cannot be determined, a more
conservative ruling is made: a dependency may be reported that does not really exist.
However, the opposite never happens; dependency tests do not declare independence
between subscripts when a dependency really exists.
One of the simplest dependency analysis tests compares a pair of subscripts that contain
no index variables. Referring to figure 1, the subscripts 0 and 10 will be tested as a
ZIV pair. Since these subscripts hold integer values, they can easily be compared by
subtraction: if the difference is non-zero, there is no dependency. If the subscripts held
variables, then either the values of the variables are known or they are unknown. If the
values are known, the test can compare them similarly by taking the difference. If they
are unknown, there are two possible situations. One is that the same variable is used in
both subscripts, possibly with an arithmetic operator applied to it; a symbolic
comparison of the variables can then verify whether they differ. Otherwise, one or both
variables are initialized at runtime, and the conservative approach is to assume they
hold the same value, because it cannot be proven that they differ.
SIV tests may be divided into several sub-categories based on information associated
with the common index of a subscript pair. An index may be combined with a number or
variable through an arithmetic operator in a subscript, so SIV subscripts can be viewed
as linear expressions of the form ai + c, where a and c are either numbers or variables
representing numbers and i is the index. A pair of subscripts can then be written as two
linear functions f(i) = a1i + c1 and g(i) = a2i + c2. Because these are linear, they can
be plotted as lines in a rectangular coordinate system.
The two major classifications of SIV subscript pairs are parallel lines, or strong SIV
subscripts, and non-parallel lines, or weak SIV subscripts [2][3]. Weak SIV testing may
be further divided into sub-categories that are discussed later. Note that the lines
displayed graphically represent more than the actual data references. The only locations
of concern are points on the lines that actually have integer coordinates since subscripts
must be integer values. Dependencies occur between the two lines only when there exists
a horizontal line that can intersect both subscript lines and find integer coordinates at
those intersections. The horizontal line test is the equivalent of finding two references to
the same memory location.
The strong SIV test compares two subscripts containing an index i of the forms
a1i + c1 and a2i + c2 with equal coefficients, a1 = a2. Since these can be represented as
parallel lines with the same slope, the dependence distance between the two lines is
merely the difference between any two points intersected by a horizontal line. This
horizontal difference can be calculated from the intercepts with the horizontal axis:
the intercepts are −c1/a1 and −c2/a1, so the difference between them is

    d = (c1 − c2) / a1.
[Diagram 1: the parallel lines f(i) and g(i) plotted over the index range L to U, with a
horizontal line marking a dependency where it intersects both lines at integer
coordinates.]
The weak SIV test applies when the two subscripts in question form intersecting lines:
for subscripts a1i + c1 and a2i + c2, a1 ≠ a2, so the slopes of the lines are not the
same [3]. There are a couple of special cases of weak SIVs that are easy to test for
dependencies. One is the scenario where one of the subscripts forms a horizontal line,
called a weak-zero SIV subscript. The other is when the subscripts form lines whose
slopes are negatives of each other, called weak-crossing SIV subscripts. Both of these
cases have fast algorithms for determining dependence. If the subscripts do not fall
into one of these categories, they must undergo the more expensive exact SIV test.
The weak-zero SIV test compares two subscripts where only one contains an index
variable. The general form of the subscripts is a1i + c1 and a2i + c2, where either
a1 = 0 or a2 = 0. The expressions form two linear functions where one is a horizontal
line and the other is non-horizontal. If the intersection of the two lines has an integer
coordinate for i and that location is within the lower and upper bounds of the index,
there is a dependency, see diagram 2. Assuming that a2 = 0, the dependency can be
calculated by the expression

    i = (c2 − c1) / a1 [3].

Weak-zero dependencies have great potential for parallelization because only one
iteration actually carries a dependency, and it can be treated as a special case outside
the loop.
[Diagram 2: the lines f(i) and g(i) over the index range L to U, one of them horizontal;
their intersection marks the single dependency.]
The weak-crossing SIV test handles the situation where the subscript pair contains a
common index whose coefficients are negatives of each other. The general form is
a1i + c1 and a2i + c2 with a1 = −a2 [3]. Represented as functions, these produce two
lines that are mirror images of each other, reflected about a vertical line passing
through their intersection. The point at which they intersect is the location of
interest. If the intersection point has integer coordinates, or its horizontal
coordinate lies halfway between two integer values, then there are dependencies to
identify. As before, the dependencies must occur between the lower and upper limits to
be valid, see diagram 3. Two dependencies may be found: one a flow dependency and the
other an anti-dependency, with a change from one to the other at the point of
intersection.
[Diagram 3: the crossing lines f(i) and g(i) over the index range L to U; flow and
anti-dependencies lie on either side of their intersection.]
The exact SIV test solves the linear Diophantine equation formed by setting a pair of
subscripts of the form ax + c and by + d equal to one another, which gives
ax − by = d − c, where g = gcd(a, b). Dividing the equation by g yields

    (a/g)x − (b/g)y = (d − c)/g.

Since g divides a and g divides b, a/g and b/g are integers. If (d − c)/g is also an
integer, then there are integer solutions for x and y, which indicate dependencies.
Finding those dependencies requires knowing the linear combination

    a·na + b·nb = g.

The greatest common divisor and the linear combination can be calculated using the
extended Euclidean algorithm. Finally, the following families of solutions can be
derived:

    xk = ((d − c)/g)·na + (b/g)·k  and  yk = −((d − c)/g)·nb + (a/g)·k [3].
The variables xk and yk are specific solutions of the index i where there is a dependency
between the subscripts. The variable k is any integer value and guarantees the
dependence will occur for the subscript pair at that value. There are an infinite number of
values of k that can be used, but the only ones of concern will be those that provide
solutions xk and yk within the boundaries of the loop. It is possible that dependencies
will only exist outside the boundaries. Testing for values within the lower and upper
boundaries will tell if a dependency truly exists. The exact SIV test requires much more
work to be certain of dependencies, but has the advantage of identifying all dependencies
for linear SIV subscripts.
An MIV is any subscript pair that contains more than one index; figure 2 gives an
example. There, multiple dependencies take place on different iterations of the outer
loop, see diagram 4. However, it is not as obvious how to derive the linear equations
for A[i] using the techniques mentioned previously, and the use of Diophantine
equations also becomes difficult or computationally expensive with multiple variables.
One of the goals of dependency analysis is to avoid testing every possible scenario
between the subscripts in question. Several dependency tests have been designed
specifically for MIV subscripts; this paper covers some of the basic concepts.
The subscripts described in figure 2 are known as Restricted Double Index Variable
(RDIV) subscripts. The general form for RDIV subscripts is a1i + c1 and a2j + c2
[2][3]. Since both subscripts are linear expressions, testing for dependencies in the
RDIV case is similar to the tests described for SIV; the RDIV test utilizes information
about the loop boundaries to augment the SIV tests.
[Diagram 4: accesses to A[1] through A[5] plotted over the index range L to U, showing
dependencies recurring on different iterations of the outer loop.]
The greatest common divisor (GCD) test is based on forming affine functions out of
multiple-index subscripts:

    a1x1 − b1y1 + ... + anxn − bnyn = a0 − b0 [3].

If gcd(a1, ..., an, b1, ..., bn) divides a0 − b0, then there is a dependency somewhere.
However, the dependency may not lie within the loop bounds, so further testing is
needed. Also, it is not uncommon for the gcd of several numbers to be one, which divides
everything; thus the GCD test alone is not always exact. One way to improve the GCD test
is to use the Banerjee inequalities. This technique abstracts the corresponding
coefficients of the affine functions into the inequality symbols described for
dependence directions. These dependence directions can be formed into direction vectors
and compared against a0 − b0 to determine in which loop the dependence occurs. The
details of the Banerjee GCD method are lengthy and will not be explained in this paper.
The Banerjee GCD test is one of the oldest tests and is unofficially used as a benchmark
against which many other MIV tests are compared.
One MIV test that is often compared to the Banerjee GCD test is the Delta test [2][3].
The Delta test takes constraints from SIV subscripts and incorporates them into MIV
subscripts with the hope of eliminating a common index. The MIV subscript may be
reduced to an SIV or, at the very least, to a less complex MIV. For example, suppose the
first subscript pair contains only the index i. Analyzing the first pair with the strong
SIV test produces a dependence distance of 1. The Delta test then takes the first
subscript and subtracts it from the second in both array references. The second
subscripts can now be analyzed with the strong SIV test, producing a dependence distance
of −1, so the distance vector of the two is d(1, −1). This method of incorporating one
subscript's constraint into another subscript is called propagating constraints. If the
index that is propagated is completely eliminated from the subscripts, the Delta test
produces exact results. If the index cannot be completely eliminated, then at least the
complexity has been reduced, which aids the other MIV tests.
There are many other MIV tests that have been developed. Each has unique advantages
and disadvantages. Some use the divide and conquer approach where subscripts are
divided into separable and minimally coupled groups. However, analyzing subscripts
individually and combining may produce false dependencies between subscripts. Other
techniques try to overcome false dependencies by analyzing all the subscript pairs
together, which adds computational complexity and expense. In an attempt to ease
calculations and performance time, some tests abstract the elements of each subscript.
Abstraction may also lead to less accurate results. Tests that are not exact are classified
as approximate tests. Some of the MIV tests that are worth mentioning are the Fourier-
Motzkin Elimination test, the Constraint-Matrix test, the λ-test, the Power test, the I-test
and finally the Omega test [2][3][5][6]. These tests will be examined in the future as
part of the ongoing project.
4 Conclusion
This paper has described what data dependencies are and how they are detected in for-
loops. In order for statements to be analyzed, information about the loops they are nested
in must first be gathered. This information is used to determine the types of data
references contained in the statements. Often data references are arrays and the
subscripts found in the arrays are the focus of most dependency analysis tests. Some of
the dependency analysis tests are able to exactly determine if a dependency exists. Other
tests use abstraction as a way of simplifying the analysis and may only produce
approximate answers. The purpose of studying these dependency analysis techniques is
to incorporate them into an automatic parallelization tool that will transform sequential C
code into parallel C code that can be executed on a cluster of workstations. This
parallelization will need to be coarse grained, but must maintain the dependencies that are
inherent to the sequential version. Dependency analysis is the key to finding this coarse
grained parallelization.
5 References
1. Kulkarni, D., & Stumm, M. (1993, June). Loop and Data Transformations: A Tutorial,
Computer Systems Research Institute, University of Toronto, Toronto, Canada.
2. Goff, G., Kennedy, K., & Tseng, C. (1991, June). Practical Dependence Testing,
Proceedings of the SIGPLAN Conference on Programming Language Design and
Implementation, Toronto, Canada.
3. Allen, R., & Kennedy, K. (2002). Optimizing Compilers for Modern Architectures,
San Francisco: Morgan Kaufmann Publishers.