subroutine rscale (lftblk, array, nrwblk, nbloks, rgtblk, * b, pivot, iflag, nparts, work) c double precision lftblk(1), array(1), rgtblk(1), b(1), work(1) integer nrwblk, nbloks, pivot(1), iflag, nparts c ***----------------------------------------------------------*** c * This subroutine solves the linear system A x = b where * c * A is an Almost Block Diagonal matrix of the form * c * * c * lftblk rgtblk * c * array(,,1) * c * array(,,2) * c * . * c * . * c * . * c * array(,,nbloks) * c * * c * lftblk and rgtblk are each nrwblkxnrwblk, array(,,k) * c * is nrwblkx2*nrwblk, and array(,,k) and array(,,k+1) * c * overlap by nrwblk columns. The linear system is square * c * and of order (nbloks+1)*nrwblk. * c * * c * THE ALGORITHM: * c * * c * The system is decomposed and solved using a variant * c * of the parallel Rescaling algorithm described in [1]. * c * Parallelism is achieved by slicing the system into * c * 'nparts' partitions in such a way that each partition * c * can be processed independently. Assuming at least one * c * processor is available per partition, a speed-up of S * c * (over sequential Rescaling) may be attained where * c * * c * 1 <= S < nparts, * c * * c * with S = 1 if nbloks < 2*nparts, * c * and S ~ nparts if nparts << nbloks/nparts. * c * * c * In other words, for systems of sufficiently high order, * c * speed-up is approximately linear with respect to nparts * c * when nparts is sufficiently small. * c * * c * PARAMETERS: * c * * c * on entry * c * * c * lftblk [double precision(nrwblk,nrwblk)] * c * The top left block of the ABD matrix. * c * * c * array [double precision(nrwblk,2*nrwblk,nbloks)] * c * array(,,k) contains the k-th nrwblkx2*nrwblk * c * block of the ABD matrix. * c * * c * nrwblk [integer] * c * The number of rows in lftblk, array(,,k), * c * and rgtblk. The number of columns in * c * lftblk and rgtblk. There are 2*nrwblk * c * columns in array(,,k). * c * * c * nbloks [integer] * c * The number of nrwblkx2*nrwblk blocks * c * in array(,,). * c * * c * rgtblk [double precision(nrwblk,nrwblk)] * c * The top right block of the ABD matrix. * c * * c * b [double precision((nbloks+1)*nrwblk)] * c * The right-hand side vector. * c * * c * pivot [integer((nbloks+1)*nrwblk)] * c * Work space to hold the pivoting strategy. * c * * c * nparts [integer] * c * The number of partitions to use in the * c * decomposition and solve. * c * * c * work [double precision * c * ((nbloks+2*nparts+1)*nrwblk**2)] * c * Work space to hold fill-in and local * c * storage for BLAS. * c * * c * on return * c * * c * lftblk, array, rgtblk, work * c * The desired decomposition of the ABD matrix. * c * * c * [ Note: If iflag = -1 the factorization is * c * not complete. ] * c * * c * nrwblk, nbloks * c * Unchanged. * c * * c * b [double precision((nbloks+1)*nrwblk)] * c * The solution vector (if iflag = 0). * c * * c * pivot [integer((nbloks+1)*nrwblk)] * c * The pivoting strategy (if iflag = 0). * c * * c * iflag [integer] * c * = 0 on normal return * c * = -1 if the ABD matrix is singular * c * * c * [ Note: Only exact singularity is detected; * c * iflag = 0 is not a guarantee of well- * c * conditioning. ] * c * * c * nparts [integer] * c * Normally unchanged. If, however, the * c * requested number of partitions would * c * result in fewer than 2 blocks of array(,,) * c * per partition (i.e. if nbloks < 2*nparts), * c * the subroutine automatically resets nparts * c * to 1 and uses non-partitioned Rescaling. * c * * c * SUBROUTINES CALLED: * c * * c * rscfa (lftblk, array, nrwblk, nbloks, rgtblk, * c * pivot, iflag, nparts, work) * c * * c * Factors the ABD matrix using parallel Rescaling. * c * Parameters are as described above. * c * * c * rscsl (lftblk, array, nrwblk, nbloks, rgtblk, * c * b, pivot, nparts, work) * c * * c * Uses the factors returned by 'rscfa' to perform * c * forward elimination and back-solve on right-hand * c * side b. Parameters are as described above. * c * * c * SOLVING FOR MULTIPLE RIGHT-HAND SIDES: * c * * c * 'rscale' is called only once for a given system A x = b. * c * If iflag = 0 the system is solved. In order to solve for * c * a different right-hand side (i.e. A x = b'), 'rscsl' is * c * called directly. The arrays lftblk, array, rgtblk, work, * c * and pivot contain the decomposition of A and pivoting * c * strategy on return from 'rscale' and therefore must not * c * be altered between successive calls to 'rscsl'. b is * c * the only parameter that may be changed. * c * * c * REFERENCES: * c * * c * [1] K.R. Jackson and R.N. Pancer, The parallel solution * c * of ABD systems arising in numerical methods for * c * BVPs for ODEs, University of Toronto, Department * c * of Computer Science, Technical Report 255/91, 1992. * c ***----------------------------------------------------------*** c call rscfa (lftblk, array, nrwblk, nbloks, rgtblk, * pivot, iflag, nparts, work) if (iflag .eq. 0) then call rscsl (lftblk, array, nrwblk, nbloks, rgtblk, * b, pivot, nparts, work) end if return end