
Journal of Symbolic Computation 37 (2004) 749–760

www.elsevier.com/locate/jsc

Parallel computation of determinants of matrices with polynomial entries

Ana Marco∗, José-Javier Martínez

Departamento de Matemáticas, Universidad de Alcalá, Campus Universitario, 28871 Alcalá de Henares, Madrid, Spain

Received 25 June 2001; accepted 26 November 2003

Abstract

An algorithm for computing the determinant of a matrix whose entries are multivariate
polynomials is presented. It is based on classical multivariate Lagrange polynomial interpolation, and
it exploits the Kronecker product structure of the coefficient matrix of the linear system associated
with the interpolation problem. From this approach, the parallelization of the algorithm arises
naturally. The reduction of the intermediate expression swell is also a remarkable feature of the
algorithm.
© 2004 Elsevier Ltd. All rights reserved.

1. Introduction

In computer algebra many problems of moderate size become intractable because of the time and the space required for their solution. This is why reducing time and space requirements as much as possible is an essential point in developing computer algebra algorithms. An approach that has provided solutions to this situation is the design of parallelized algorithms.
In the last decade many computer algebra algorithms have been parallelized. For
example, in Hong et al. (2000) a parallel algorithm for the computation of the GCD of
two polynomials in the context of quantifier elimination is described. More examples and
references of parallelization of symbolic algorithms can be found in Khanin and Cartmell
(2001).

∗ Corresponding author.
E-mail addresses: [email protected] (A. Marco), [email protected] (J.-J. Martínez).

0747-7171/$ - see front matter © 2004 Elsevier Ltd. All rights reserved.
doi:10.1016/j.jsc.2003.11.002

In this paper we will show how the computation of the determinant of a matrix whose
entries are multivariate polynomials can be done in parallel. Our approach is based on
classical multivariate Lagrange polynomial interpolation, and our way to parallelization is
mainly based on exploiting the matrix structure of an appropriate interpolation problem.
An example of this situation is the Sylvester resultant of two polynomials, which is
the determinant of the Sylvester matrix. A parallel resultant computation is presented in
Blochinger et al. (1999), but it must be observed that, as in the case of Hong et al. (2000),
the approach used in Blochinger et al. (1999) consists of parallelizing standard modular
techniques.
Another example of parallelization in the field of resultant computation is the parallel
computation of the entries of the Dixon determinant which appears in Chionh (1997).
Let us observe that the computation of determinants of matrices whose entries are multivariate polynomials is very common in different problems of computer algebra, such as solving systems of polynomial equations, computing multipolynomial resultants, implicitizing curves and surfaces, and so on. In this sense our work can serve in many applications as a component of more complex algorithms, something that has been pointed out in Hong (2000) as an important aspect of research.
For example, in the field of implicitization, the need for improved algorithms, which could be met by means of parallelization, is indicated as one of the conclusions in Aries and Senoussi (2001), where it is remarked that the last step of the implicitization algorithm, corresponding to the computation of the determinant, consumes much of the time. Our work addresses that step of the implicitization algorithm, for which those authors used the standard Maple function ffgausselim.
Our method will also be useful as a complement to the techniques of implicitization
by the methods of moving curves or moving surfaces. For example, the method of
moving curves described in Sederberg et al. (1997) generates the implicit equation as
the determinant of a matrix whose entries are quadratic polynomials in x and y. The
order of that matrix is half the order of the corresponding Bézout matrix and, according
to the authors, these compressed expressions may lead to faster evaluation algorithms.
But, as can be read in Manocha and Canny (1993), the bottleneck in these methods is the
symbolic expansion of determinants, a problem which is not considered in Sederberg et al.
(1997).
A brief survey on the solution of linear systems in a computer algebra context, including
the computation of determinants, has been recently given in Kaltofen and Saunders (2003).
In that work, some important issues relevant to our work such as Gaussian elimination,
minor expansion, structured matrices and parallel algorithms are considered.
Let M be a square matrix of order k whose entries are polynomials in R[x_1, ..., x_r], where r ≥ 2. The coefficients will usually be integer, rational or algebraic numbers. We will compute the determinant of M, which we denote by F(x_1, ..., x_r) ∈ R[x_1, ..., x_r], by means of classical multivariate Lagrange polynomial interpolation, so we need bounds on the degree of F(x_1, ..., x_r) in each one of the variables x_i (i = 1, ..., r), i.e. n_i ∈ N with deg_{x_i}(F(x_1, ..., x_r)) ≤ n_i, i = 1, ..., r. In other words, we need to determine an appropriate interpolation space to which F(x_1, ..., x_r) belongs: a vector space, denoted by Π_{n_1,...,n_r}(x_1, ..., x_r), of polynomials in x_1, ..., x_r with n_i being the maximum exponent in the variable x_i (i = 1, ..., r). For instance, Π_{1,2}(x_1, x_2) is spanned by the monomials 1, x_2, x_2^2, x_1, x_1 x_2, x_1 x_2^2.

In some problems the n_i are known from previous theoretical results (for example in the case of curve implicitization considered in Marco and Martínez (2001)), while in other cases they can be calculated from the matrix. Let us observe that the total degree of F is always a bound for the degree in each variable, although not necessarily the best one. From now on, we suppose that the n_i (i = 1, ..., r) are known.
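A crude bound can always be read off from the matrix itself: each term in the expansion of the determinant contains exactly one entry from every row, so deg_x(det(M)) is at most the sum over the rows of the maximal x-degree appearing in each row. In Maple this could be computed as follows (a sketch of our own, not taken from the paper, with M a k × k matrix of the linalg package):

    # Row-wise degree bound for det(M) in the variable x
    xbound := add(max(seq(degree(M[i, j], x), j = 1 .. k)), i = 1 .. k);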
In Section 2 a complete description of the algorithm is given for the case r = 2 and the
case r > 2 is sketched. In Section 3 a detailed example is given, leaving to Section 4 some
comparisons of our algorithm with the Maple command gausselim. Finally Section 5
summarizes some important properties of our approach.

2. Derivation of the algorithm

Let M be a square matrix of order k whose entries are bivariate polynomials in R[x, y], and let n and m be the bounds on the degree of the determinant of M in x and y, respectively. As we have seen in Section 1, the determinant of M is a polynomial F(x, y) ∈ Π_{n,m}(x, y). Our aim is to compute the polynomial F(x, y) by means of classical bivariate Lagrange interpolation. A good introduction to the theory of interpolation can be found in Davis (1975).
If we consider the interpolation nodes (x_i, y_j) (i = 0, ..., n; j = 0, ..., m) and the interpolation space Π_{n,m}(x, y), the interpolation problem is stated as follows:
Given (n + 1)(m + 1) values

f_{ij} ∈ R (i = 0, ..., n; j = 0, ..., m)

(the interpolation data), find a polynomial

$$F(x, y) = \sum_{(i,j) \in I} c_{ij}\, x^i y^j \in \Pi_{n,m}(x, y)$$

(where I is the index set I = {(i, j) | i = 0, ..., n; j = 0, ..., m}) such that

F(x_i, y_j) = f_{ij} ∀ (i, j) ∈ I.
If we consider for the interpolation space Π_{n,m}(x, y) the basis

{x^i y^j | i = 0, ..., n; j = 0, ..., m} = {1, y, ..., y^m, x, xy, ..., xy^m, ..., x^n, x^n y, ..., x^n y^m}

with that precise ordering, and the interpolation nodes in the corresponding order

{(x_i, y_j) | i = 0, ..., n; j = 0, ..., m} = {(x_0, y_0), (x_0, y_1), ..., (x_0, y_m), (x_1, y_0), (x_1, y_1), ..., (x_1, y_m), ..., (x_n, y_0), ..., (x_n, y_m)},

then the (n + 1)(m + 1) interpolation conditions F(x_i, y_j) = f_{ij} can be written as a linear system

Ac = f,

where the coefficient matrix A is given by the Kronecker product

Vx ⊗ Vy

(where the Kronecker product B ⊗ D is defined by blocks as (b_{kl} D), with B = (b_{kl})), with
Vx being the Vandermonde matrix

$$V_x = \begin{pmatrix} 1 & x_0 & x_0^2 & \cdots & x_0^n \\ 1 & x_1 & x_1^2 & \cdots & x_1^n \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_n & x_n^2 & \cdots & x_n^n \end{pmatrix},$$

and similarly for Vy,

c = (c_{00}, ..., c_{0m}, c_{10}, ..., c_{1m}, ..., c_{n0}, ..., c_{nm})^T,

and

f = (f_{00}, ..., f_{0m}, f_{10}, ..., f_{1m}, ..., f_{n0}, ..., f_{nm})^T.
As is well known, see for example Horn and Johnson (1991) and Martínez (1999), if Vx and Vy are nonsingular matrices the Kronecker product Vx ⊗ Vy will also be nonsingular. The Vandermonde matrices Vx and Vy are nonsingular if x_0, x_1, ..., x_n are pairwise distinct and y_0, y_1, ..., y_m are pairwise distinct.
Since we are not considering a finite field, we choose in our algorithm x_i = i for i = 0, ..., n and y_j = j for j = 0, ..., m, so that Vx and Vy are nonsingular and therefore A = Vx ⊗ Vy, the coefficient matrix of the linear system Ac = f, is nonsingular. We have proved in this way that the interpolation problem has a unique solution.
This means that if we choose the interpolation data f_{ij} as the corresponding values F(i, j) of the determinant of M at the points (i, j), then the unique solution of the interpolation problem (i.e. of the linear system Ac = f) will give us the coefficients of the determinant F(x, y).
The values f_{ij} = F(i, j) can be computed by means of the following Maple procedure, which evaluates M at each interpolation node (i, j), and then computes the determinant of the corresponding constant matrix:
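The original procedure is not reproduced here; the following sketch shows one way to carry out this step (the name interpdata and the use of the classical linalg package are our assumptions, not the authors' code):

    # Evaluate the matrix M (entries in R[x,y]) at each node (i,j) and
    # take the determinant of the resulting constant matrix.
    interpdata := proc(M, n, m)
      local f, i, j, Mij;
      f := array(0 .. n, 0 .. m);
      for i from 0 to n do
        for j from 0 to m do
          Mij := map(p -> subs(x = i, y = j, p), M);   # constant matrix
          f[i, j] := linalg[det](Mij);                 # interpolation datum
        od;
      od;
      eval(f);
    end: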

Let us observe that this general algorithm could be replaced with another one designed
for each particular case in order to reduce the computational cost of the process.

For example, in the case of implicitizing polynomial curves, the computation of each determinant can be replaced by the computation of a resultant of two polynomials in R[t], and so the computational complexity (in terms of arithmetic operations) will be O(k^2) instead of O(k^3) (see Marco and Martínez, 2001).
For the computation of the solution vector c we use an algorithm that reduces the problem of solving a linear system with coefficient matrix Vx ⊗ Vy to solving n + 1 systems with the same matrix Vy and m + 1 systems with the same matrix Vx. The algorithm for solving linear systems with a Kronecker product coefficient matrix is given (for a generalized Kronecker product) in Martínez (1999), and a general algorithm for the Kronecker product of several matrices (not necessarily Vandermonde matrices) is given in Buis and Dyksen (1996). Taking into account that every linear system to be solved is a Vandermonde linear system, it is convenient to use the Björck–Pereyra algorithm (Björck and Pereyra, 1970; Golub and Van Loan, 1996) to solve those linear systems, since it takes advantage of the special structure of the coefficient matrices Vx and Vy. A serial implementation of the algorithm in Maple follows:
Stage I. Solution of n + 1 linear systems with coefficient matrix Vy .
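The original listing is not reproduced here; a sketch of this stage (our reconstruction, acting on the array f returned by the hypothetical interpdata procedure above) follows. For the equispaced nodes y_j = j chosen earlier, the divided-difference denominators of the Björck–Pereyra algorithm reduce to k + 1, and each row of f is overwritten in place:

    # Stage I: for each i, solve the Vandermonde system with matrix Vy
    # by the Bjorck-Pereyra algorithm, overwriting row i of f in place.
    for i from 0 to n do
      # forward step: divided differences (nodes y_j = j)
      for k from 0 to m-1 do
        for j from m by -1 to k+1 do
          f[i, j] := (f[i, j] - f[i, j-1]) / (k+1);
        od;
      od;
      # backward step: Newton form -> monomial coefficients
      for k from m-1 by -1 to 0 do
        for j from k to m-1 do
          f[i, j] := f[i, j] - k * f[i, j+1];
        od;
      od;
    od: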

Stage II. Solution of m + 1 linear systems with coefficient matrix Vx .
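And correspondingly for the columns (the same solve, now with matrix Vx, again a sketch under the same assumptions):

    # Stage II: for each j, solve the Vandermonde system with matrix Vx
    # on column j of f; afterwards f[i,j] holds the coefficient c_ij.
    for j from 0 to m do
      for k from 0 to n-1 do
        for i from n by -1 to k+1 do
          f[i, j] := (f[i, j] - f[i-1, j]) / (k+1);
        od;
      od;
      for k from n-1 by -1 to 0 do
        for i from k to n-1 do
          f[i, j] := f[i, j] - k * f[i+1, j];
        od;
      od;
    od: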



Since c_{ij} is the coefficient of x^i y^j, the ordering of the components of the solution vector

c = (c_{00}, ..., c_{0m}, c_{10}, ..., c_{1m}, ..., c_{n0}, ..., c_{nm})

corresponds to the lexicographic order of the monomials, c_{nm} being the leading coefficient.
Let us observe that the algorithm overwrites f_{ij} with the solution vector component c_{ij}, and it does not construct either the matrix A or the Vandermonde matrices Vx and Vy, which implies an additional saving both in memory cost and in computational cost.
The parallelization of the algorithm that we have just described can be easily done
because it performs the same computations on different sets of data without the necessity
of communication between the processors. It is trivial to see that in the computation of the
interpolation data each datum can be computed independently. As for the solution of the
linear system Ac = f (where A = Vx ⊗ Vy ), the solution of the n + 1 Vandermonde linear
systems with the same matrix Vy of stage I can be done independently, and once we have
the solutions of all of these systems, the same happens with the m + 1 Vandermonde linear
systems with matrix Vx of stage II. Taking this into account and considering n = m, we
see that the time required by the algorithm for computing the determinant can be divided
by n + 1 when we have n + 1 processors. This is an important consequence of the fact that
A has a Kronecker product structure (see Buis and Dyksen, 1996; Martı́nez, 1999).
As for the generalization of the algorithm to the case r > 2, the situation is completely analogous to the bivariate case. Now, let M be a matrix whose entries are polynomials in R[x_1, ..., x_r] with r > 2, and let n_i be the bound on the degree of F(x_1, ..., x_r) in x_i (i = 1, ..., r). For computing F(x_1, ..., x_r) by means of classical multivariate Lagrange polynomial interpolation we have to solve a linear system Ac = f where A = Vx_1 ⊗ Vx_2 ⊗ ··· ⊗ Vx_r, with Vx_i the Vandermonde matrix generated by the coordinates x_i of the interpolation nodes, for i = 1, ..., r. In the same way as in the bivariate case, we can reduce the solution of the system Ac = f to the solution of the corresponding linear systems with coefficient matrices Vx_i for i = 1, ..., r. Since these matrices are Vandermonde matrices, we use the Björck–Pereyra algorithm to solve those linear systems.
Now we give a theorem about the computational complexity (in terms of arithmetic operations) of the whole algorithm. For the sake of clarity, we assume n_1 = ··· = n_r = n. The result can easily be extended to the general situation. As is usual in applications, we also assume that the cost of the evaluation of the matrix at an interpolation node is small when compared to the computation of the determinant of the evaluated matrix, and so it is not considered here.
Theorem 2.1. Let M be a square matrix of order k whose entries are polynomials in R[x_1, ..., x_r] with r ≥ 2, and let deg_{x_i}(det(M)) ≤ n for all i. The computational cost of computing det(M) using our algorithm is (in terms of arithmetic operations):

(a) O(n^r k^3) for the computation of the interpolation data.

(b) O(n^{r+1}) for the solution of the corresponding linear system (Vx_1 ⊗ Vx_2 ⊗ ··· ⊗ Vx_r) c = f.
Proof. (a) The number of arithmetic operations required to compute each interpolation datum is the number of arithmetic operations required to compute the determinant of a matrix of order k: O(k^3). As (n + 1)^r interpolation data have to be computed, the computational cost of this stage is of O(n^r k^3) arithmetic operations.
(b) As indicated before, the problem of solving a linear system (B ⊗ D) c = f (where B is of order p and D is of order q) is reduced to solving p systems with the same matrix D and q systems with the same matrix B.
So, in the bivariate case we reduce the solution of the system (Vx_1 ⊗ Vx_2) c = f to the solution of n + 1 systems of order n + 1 with the same Vandermonde matrix Vx_2 and n + 1 systems of order n + 1 with the same Vandermonde matrix Vx_1. Therefore, the case r = 2 requires O(n^3) arithmetic operations, since the solution of each Vandermonde linear system by means of the Björck–Pereyra algorithm requires O(n^2) arithmetic operations.
Now, taking into account that

Vx_1 ⊗ Vx_2 ⊗ ··· ⊗ Vx_r = (Vx_1 ⊗ ··· ⊗ Vx_{r−1}) ⊗ Vx_r,

an induction argument completes the proof for r > 2. □

Let us point out that, without exploiting the Kronecker product structure, the solution by Gaussian elimination of a linear system with coefficient matrix of order n^r would require O(n^{3r}) arithmetic operations.
As for the parallelization, we are in the same situation as in the bivariate case. Each interpolation datum can be computed independently, and each one of the systems with the same matrix Vx_i can be solved independently (see Buis and Dyksen (1996) for a general situation not involving Vandermonde matrices). So, if n_i is the bound for the degree in the variable x_i, N = (n_1 + 1)(n_2 + 1) ··· (n_r + 1), and we have a large enough number of processors, the computing time of the corresponding stage of the algorithm is divided by N/(n_i + 1).
It should be understood that one of our aims has been to present our method in a clear
and complete way so that it can be easily applied using Maple. Of course, all the stages of
the algorithm could be improved: using optimized methods for evaluating the matrix at the
interpolation nodes, taking advantage of the particular matrix structure when computing the
numerical determinants, and solving the linear systems with faster methods. For instance,
the Vandermonde systems could be solved using superfast methods like those given in Lu
(1994) and Bini and Pan (1994). Nevertheless, it must be remarked that the strength of our
method comes mainly from the possibility of parallelization.

3. A small example in detail


We illustrate the algorithm described in Section 2 with the following example. Consider the polynomials

$$\begin{aligned} g(x, y, t) ={}& 1 + x + y + xy + y^2 + xy^2 + t + xt + yt + xyt + y^2 t + xy^2 t \\ &- 2t^2 + xt^2 + yt^2 + xyt^2 + y^2 t^2 + xy^2 t^2 \end{aligned}$$

and

$$\begin{aligned} h(x, y, t) ={}& 1 + y + t - 2x + xt + t^2 + y^2 + xy^2 + xy + xyt + yt + y^2 t + xy^2 t \\ &+ xt^2 + yt^2 + xyt^2 + y^2 t^2 + xy^2 t^2 + x^2 yt + x^2 t + x^2 y^2 + x^2 y \\ &+ x^2 y^2 t + x^2 t^2 + x^2 yt^2 + x^2 y^2 t^2 + x^2. \end{aligned}$$

Our aim is to compute Res_t(g, h), the resultant with respect to t of g(x, y, t) and h(x, y, t), i.e. the determinant of the Sylvester matrix of g(x, y, t) and h(x, y, t) considered as polynomials in t. A formula for computing the bounds of Res_t(g, h) in x and y is well known (see, for example, Winkler, 1996). Such bounds are n = 6 and m = 8 in our example.
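In detail (a worked instance of that standard bound, with the degrees read off from g and h above): deg_t(g) = deg_t(h) = 2, deg_x(g) = 1, deg_x(h) = 2 and deg_y(g) = deg_y(h) = 2, so

$$n \le \deg_x(g)\deg_t(h) + \deg_x(h)\deg_t(g) = 1\cdot 2 + 2\cdot 2 = 6, \qquad m \le \deg_y(g)\deg_t(h) + \deg_y(h)\deg_t(g) = 2\cdot 2 + 2\cdot 2 = 8.$$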
We will consider the coefficients of g(x, y, t) and h(x, y, t) with respect to t (polynomials in R[x, y]) expressed in their Horner form, to reduce as much as possible the computational cost of the evaluation.
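In Maple, one way to do this (a sketch; the names gh and hh are ours, and the degree 2 in t is read off from g and h) is to rewrite each coefficient with convert/horner:

    # Horner form, with respect to x and y, of each coefficient of t^i
    gh := add(convert(coeff(g, t, i), horner, [x, y]) * t^i, i = 0 .. 2):
    hh := add(convert(coeff(h, t, i), horner, [x, y]) * t^i, i = 0 .. 2):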
We are going to show the interpolation data, the results of stage I and the results of stage II in compact form, using matrices to avoid unnecessary notation. The rows of the matrix below contain the interpolation data, the interpolation datum f_{ij} being its (i + 1, j + 1) entry:

$$\begin{pmatrix}
9 & 81 & 441 & 1521 & 3969 & 8649 & 16641 & 29241 & 47961 \\
-18 & 810 & 6570 & 25470 & 69822 & 156042 & 304650 & 540270 & 891630 \\
63 & 6399 & 46503 & 175239 & 474903 & 1055223 & 2053359 & 3633903 & 5988879 \\
900 & 26568 & 179208 & 661428 & 1777140 & 3931560 & 7631208 & 13483908 & 22198788 \\
3789 & 76869 & 495405 & 1804149 & 4820229 & 10633149 & 20604789 & 36369405 & 59833629 \\
10674 & 178686 & 1117566 & 4033026 & 10733634 & 23630814 & 45738846 & 80674866 & 132658866 \\
24147 & 359235 & 2199915 & 7887195 & 20932587 & 46018107 & 88996275 & 156890115 & 257893155
\end{pmatrix}$$

Now we compute the coefficients of Res_t(g, h) making use of the algorithm described in Section 2. We detail the solution of all the Vandermonde linear systems involved in the process.
The solutions of the seven linear systems with the same Vandermonde coefficient matrix Vy (i.e. the output of stage I) are the rows of the matrix below:

$$\begin{pmatrix}
9 & 18 & 27 & 18 & 9 & 0 & 0 & 0 & 0 \\
-18 & 72 & 243 & 342 & 171 & 0 & 0 & 0 & 0 \\
63 & 882 & 2025 & 2286 & 1143 & 0 & 0 & 0 & 0 \\
900 & 4392 & 8613 & 8442 & 4221 & 0 & 0 & 0 & 0 \\
3789 & 13842 & 25191 & 22698 & 11349 & 0 & 0 & 0 & 0 \\
10674 & 33768 & 58887 & 50238 & 25119 & 0 & 0 & 0 & 0 \\
24147 & 70002 & 118773 & 97542 & 48771 & 0 & 0 & 0 & 0
\end{pmatrix}$$
The results of stage II, i.e. the solutions of the nine linear systems with the same Vandermonde coefficient matrix Vx and with data vectors the columns of the matrix above, are the columns of the following matrix, where c_{ij}, the coefficient of x^i y^j, is the (i + 1, j + 1) entry:

$$\begin{pmatrix}
9 & 18 & 27 & 18 & 9 & 0 & 0 & 0 & 0 \\
-27 & 0 & 27 & 54 & 27 & 0 & 0 & 0 & 0 \\
27 & 0 & 54 & 108 & 54 & 0 & 0 & 0 & 0 \\
-54 & 0 & 54 & 108 & 54 & 0 & 0 & 0 & 0 \\
27 & 54 & 81 & 54 & 27 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{pmatrix}$$

So, the resultant Res_t(g, h) is

$$\begin{aligned} F(x, y) ={}& 27x^4y^4 + 54x^4y^3 + 81x^4y^2 + 54x^4y + 27x^4 + 54x^3y^4 + 108x^3y^3 \\ &+ 54x^3y^2 - 54x^3 + 54x^2y^4 + 108x^2y^3 + 54x^2y^2 + 27x^2 + 27xy^4 \\ &+ 54xy^3 + 27xy^2 - 27x + 9y^4 + 18y^3 + 27y^2 + 18y + 9. \end{aligned}$$

Let us observe that we do not need to solve all nine Vandermonde linear systems with matrix Vx. We only need to solve the first five, because the solution of the other four is obviously the zero vector. This situation appears in all the examples where the bound on the degree of the determinant in y is not the exact degree of the polynomial in y (in this example the bound was m = 8 while the exact degree is 4).
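As an independent check, the result can be compared with Maple's built-in resultant (a quick verification sketch of our own):

    F := 27*x^4*y^4 + 54*x^4*y^3 + 81*x^4*y^2 + 54*x^4*y + 27*x^4
         + 54*x^3*y^4 + 108*x^3*y^3 + 54*x^3*y^2 - 54*x^3
         + 54*x^2*y^4 + 108*x^2*y^3 + 54*x^2*y^2 + 27*x^2
         + 27*x*y^4 + 54*x*y^3 + 27*x*y^2 - 27*x
         + 9*y^4 + 18*y^3 + 27*y^2 + 18*y + 9:
    expand(resultant(g, h, t) - F);    # should return 0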

4. Some comparisons

Of course, for problems of small size like the example of Section 3, any method will
usually give the answer in little time. But when the size of the problem is not small
the differences appear clearly. In this section we will include five examples showing
the performance of the method that we have presented. All the results here included are
obtained by using Maple 8 on a personal computer.
In Examples 1–4 M is the Sylvester matrix constructed with Maple by means of the
command
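    # Presumably the Sylvester-matrix constructor of the linalg package
    # (our reconstruction; the original command display is not legible here):
    M := linalg[sylvester](f, g, t):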

where
f = (t + 1)^n − x(t + 2)^n, g = (t + 3)^n − y(t + 4)^n, n = 13, 15, 17, 19.
In Example 5 M is again a Sylvester matrix, but now

f = (t + 1)^8 + √2 − x(t + 2)^8, g = (t + 3)^8 + √3 − y(t + 4)^8.
We have chosen the computation of the determinant of a Sylvester matrix for
three important reasons: the relevance of such matrices in applications (computation of
resultants, implicitization, . . . ), the availability of the degree bounds n i needed to apply
our method and the possibility of an easy reproduction of our experiments by any reader.
Since all the examples correspond to the implicitization of rational curves, using a result quoted in Marco and Martínez (2001) we have in Examples 1–4 n_1 = n_2 = n (n = 13, 15, 17, 19, respectively), and in Example 5 n_1 = n_2 = 8.
Now we briefly indicate the results obtained when using a sequential implementation of
our algorithm and the command gausselim of the Maple package linalg. Let us observe
that we have also compared our algorithm with the commands det and ffgausselim, both
in the Maple package linalg, and with the command Determinant of the Maple package
LinearAlgebra, considering the options fracfree, minor, multivar and algnum. As
the results obtained when using these Maple commands are not better than the results
obtained with gausselim, we have only included the results for this latter Maple
command.
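For readers wishing to reproduce the experiments, the setup for Example 1 can be sketched as follows. The names interpdata and the stage loops are the hypothetical reconstructions from Section 2, and the optional arguments 'r' and 'd' of gausselim (which we believe return the rank and the determinant) are an assumption worth checking against the linalg documentation; timings will of course differ on other hardware:

    with(linalg):
    nn := 13:
    f := expand((t+1)^nn - x*(t+2)^nn):
    g := expand((t+3)^nn - y*(t+4)^nn):
    M := sylvester(f, g, t):
    st := time():
    data := interpdata(M, nn, nn):     # then apply stages I and II to data
    time() - st;                       # CPU seconds for the evaluation stage
    st := time():
    gausselim(M, 'r', 'd'):            # 'd' is assigned det(M) (assumption)
    time() - st;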

Example 1 (n = 13). The time and the space required by our algorithm for computing the determinant of M are 18.719 s and 6 224 780 bytes. The time and the space needed by gausselim in computing the same determinant are 47.069 s and 7 535 260 bytes.

Example 2 (n = 15). The time required by our algorithm for computing the determinant of M is 42.669 s, whereas the time spent by gausselim is 214.521 s (i.e. our algorithm is approximately 5 times faster). As for the space, our algorithm has needed 6 224 780 bytes (more or less the same as in Example 1), while gausselim has required 10 483 840 bytes.

Example 3 (n = 17). The time spent by our algorithm in computing the determinant of M is 91.625 s; it is more than 6 times faster than gausselim, which has required 647.954 s. As for the space, our algorithm has required 6 421 352 bytes, half the space required by gausselim, which has needed 13 301 372 bytes.

Example 4 (n = 19). The time and the space required by our algorithm for computing the determinant of M are 192.194 s and 6 355 828 bytes. The time and the space needed by gausselim in computing the same determinant are 2176.711 s and 19 264 056 bytes.

Example 5. In this example the polynomials f and g have lower degree in t, but some of their coefficients are algebraic numbers, which makes the computation slower.
The time spent by our algorithm in computing the determinant of M is 10.978 s; it is 3 times faster than gausselim, which has required 33.773 s. As for the space, our algorithm has required 6 224 780 bytes, while gausselim has needed 16 970 716 bytes.
Remark 1. Let us point out that in spite of the length of the numbers involved, the
interpolation stage (i.e. the solution of the linear system) has taken less than 1 s in all
the examples.
Remark 2. If our algorithm had been fully parallelized using a large enough number of
processors for each case, the determinants of all the examples would have been computed
in a couple of seconds.

5. Conclusions and final remarks

In the previous sections we have shown how to construct a parallelized algorithm for
computing the determinant of a matrix whose entries are multivariate polynomials. Our
approach is based on classical multivariate Lagrange polynomial interpolation and takes
advantage of the matrix structure of the linear system associated with the interpolation
problem.
Let us observe that we do not consider the case of sparse interpolation, since the
dense case arises naturally in many applications (Aries and Senoussi, 2001). For the
sparse multivariate polynomial interpolation problem, a practical algorithm has appeared in
Murao and Fujise (1996), in which modular techniques are used and the parallelization of
some steps of the algorithm is considered. The approach used in Murao and Fujise (1996)
also needs bounds for the degree of each variable of the interpolating polynomial, which
in our case gives the interpolation space without any additional work. Moreover, we do not
need to construct explicitly the coefficient matrix of the corresponding linear system, which

means an additional saving in computing time. And finally the possibility of parallelization
is given by the Kronecker product structure of that coefficient matrix.
In the following remarks we state in detail some of the features of the algorithm:
1. In contrast with the approach of Manocha and Canny (1993), where a different
interpolation technique is considered, we do not use modular techniques. This is
why we do not need to look for big primes and so we get the same results even when
the coefficients are not rational numbers. Of course, as Example 5 shows, if we work
with algebraic numbers the cost of arithmetic operations will increase.
2. The algorithm only makes use of arithmetic operations with the coefficients of
the polynomials in the entries of the matrix. In this way we reduce enormously
the problem of “intermediate expression swell” that appears when computing the
determinant of matrices with polynomial entries (see Section 2.8 of Davenport et al.,
1988). It is clear from the examples of Section 4 that the interpolation data are
very big numbers, but that cannot be avoided if one wishes to use interpolation.
Nevertheless, it should also be observed that our method only works with numbers
and not with more complicated expressions: they are big, but they are numbers, and
in that case the arithmetic of Maple (at least for the rational case) performs very
efficiently.
3. The algorithm has a high degree of intrinsic parallelism, both in the computation of the interpolation data and in solving all the linear systems with the same Vandermonde matrix Vx_i, i = 1, ..., r. This is why the parallelized algorithm greatly reduces the time required for computing the determinant.
4. The algorithm is completely deterministic in nature, i.e. it does not contain any
probabilistic step.
5. The loop structure of the algorithm allows us to estimate in advance the time the algorithm will spend in solving each specific problem. To be precise, if we must compute N determinants (the interpolation data), the total time will be less than the time required to compute the datum f_{n_1,...,n_r} multiplied by N. And, analogously, the time needed to solve all the Vandermonde systems can be estimated by multiplying the number of systems to be solved by the time required for solving one linear system.

Acknowledgements
The authors wish to thank two anonymous referees whose suggestions led to several
improvements of the paper, including the addition of Section 4.
This research was partially supported by Research Grant BFM 2003–03510 from the Spanish Ministerio de Ciencia y Tecnología.

References
Aries, F., Senoussi, R., 2001. An implicitization algorithm for rational surfaces with no base points.
J. Symbolic Computation 31, 357–365.
Bini, D., Pan, V.Y., 1994. Polynomial and Matrix Computations, Volume 1: Fundamental Algorithms. Birkhäuser, Boston.

Björck, A., Pereyra, V., 1970. Solution of Vandermonde systems of equations. Math. Comp. 24,
893–903.
Blochinger, W., Küchlin, W., Ludwig, C., Weber, A., 1999. An object-oriented platform for
distributed high-performance symbolic computation. Math. Comput. Simulation 49, 161–178.
Buis, P.E., Dyksen, W.R., 1996. Efficient vector and parallel manipulation of tensor products. ACM
Trans. Math. Software 22 (1), 18–23.
Chionh, E.W., 1997. Concise parallel Dixon determinant. Comput. Aided Geom. Design 14,
561–570.
Davenport, J.H., Siret, Y., Tournier, E., 1988. Computer Algebra: Systems and Algorithms for
Algebraic Computation. Academic Press, New York.
Davis, P.J., 1975. Interpolation and Approximation. Dover Publications Inc., New York.
Golub, G.H., Van Loan, C.F., 1996. Matrix Computations, third edition. Johns Hopkins University Press, Baltimore, MD.
Hong, H., 2000. J. Symbolic Computation 29, 3–4 (Editorial).
Hong, H., Liska, R., Robidoux, N., Steinberg, S., 2000. Elimination of variables in parallel. SIAM
News 33 (8), 1, 12–13.
Horn, R.A., Johnson, C.R., 1991. Topics in Matrix Analysis. Cambridge University Press,
Cambridge.
Kaltofen, E., Saunders, B.D., 2003. Linear Systems. In: Grabmeier, J., Kaltofen, E.,
Weispfenning, V. (Eds.), Computer Algebra Handbook. Springer-Verlag, Berlin (section 2.3.1).
Khanin, R., Cartmell, M., 2001. Parallelization of perturbation analysis: application to large-scale engineering problems. J. Symbolic Computation 31, 461–473.
Lu, H., 1994. Fast solution of confluent Vandermonde linear systems. SIAM J. Matrix Anal. Appl.
15 (4), 1277–1289.
Manocha, D., Canny, J.F., 1993. Multipolynomial resultant algorithms. J. Symbolic Computation
15, 99–122.
Marco, A., Martínez, J.J., 2001. Using polynomial interpolation for implicitizing algebraic curves.
Comput. Aided Geom. Design 18, 309–319.
Martínez, J.J., 1999. A generalized Kronecker product and linear systems. Internat. J. Math. Ed. Sci.
Tech. 30 (1), 137–141.
Murao, H., Fujise, T., 1996. Modular algorithm for sparse multivariate polynomial interpolation and
its parallel implementation. J. Symbolic Computation 21, 377–396.
Sederberg, T., Goldman, R., Du, H., 1997. Implicitizing rational curves by the method of moving
algebraic curves. J. Symbolic Computation 23, 153–175.
Winkler, F., 1996. Polynomial Algorithms in Computer Algebra. Springer-Verlag, Wien, New York.
