The PMIRKDC package
PMIRKDC is a collection of Fortran subroutines
for solving boundary value ordinary differential equations
(BVODEs) on a parallel shared-memory computer.
PMIRKDC is based on the package MIRKDC [1], which employs
mono-implicit Runge-Kutta schemes within a defect control algorithm.
The primary computational costs involve the treatment of large,
almost block diagonal (ABD) linear systems. The most significant
feature of PMIRKDC is the replacement of sequential ABD software,
COLROW, with new parallel ABD software, RSCALE, based on a parallel
block eigenvalue rescaling algorithm. Other modifications parallelize
the setup of the ABD systems and solution interpolants, as well as
defect estimation. When run on a sequential computer, all parallel
directives in PMIRKDC are ignored (they are treated as comments).
The software contained on this page was used to generate many of
the numerical results discussed in [2], in which a modification of the
nonlinear problem Swirling Flow III [3] is solved with the
PMIRKDC/RSCALE package. Complete download and installation instructions,
along with sample input and output, are included below.
Source code
There are 10 files in total:
- Parallel MirkDC subroutines
- Swirling Flow III (SWF3) subroutines (solution uses parameter
continuation on SWF3's epsilon)
- Timing statistics collection subroutines
- ABD system reformatting subroutines
- Parallel processor set-up subroutines
Installation
PMIRKDC must be linked with RSCALE, LAPACK and the level-3 BLAS. Follow these
steps to install on a sequential computer:
- Download the 10 files listed above and the 16 files listed
on the RSCALE page
to a common directory.
- Compile with f77 -c -u *.f
- Link and load with
f77 *.o -LYourLibrary -llapack -lblas
Execution / input and output
The executable a.out reads from standard (keyboard) input and
writes to standard (terminal screen) output. Input/output may be redirected
from/to files with a.out < input.txt > output.txt
a.out expects the input described below. Further details
on these input values may be found in swf_iii
and pmirkdc.
- order of MIRK scheme (integer)
- defect tolerance (double precision)
- number of subintervals in initial mesh (integer)
- SWF3's epsilon (double precision)
- SWF3's disk coordinates (double precision)
- number of partitions (integer)
- output control
(integer; controls verbosity of PMIRKDC's diagnostics -
see pmirkdc for further details)
- selection criterion for hybrid ABD system solver kernel
(integer; ignored in this implementation - just set to 0)
- parameter continuation strategy nconts
(integer; see below)
The SWF3 problem is solved using a parameter continuation
strategy on epsilon. The input value nconts is
interpreted as follows:
- if (nconts = 1 .or. nconts = 0) parameter continuation
is not used; i.e., we attempt to solve the problem for the given
value of epsilon with a single call to PMIRKDC
- if (iabs(nconts) > 1) parameter continuation is used;
the next line of input contains iabs(nconts) values for epsilon,
in descending order of magnitude, with the last value being the desired
epsilon for the final solution. The sign of nconts
specifies how statistics are recorded:
  - if (nconts > 0) statistics are accumulated
  over all nconts continuation problems
  - if (nconts < 0) statistics are recorded for
  the last continuation problem only
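The rules above can be restated as a small decision function. The following Python sketch is illustrative only and is not part of PMIRKDC; the function name and the returned labels are hypothetical. Note that the description does not say what happens for nconts = -1, so the sketch rejects that value rather than guessing.

```python
def interpret_nconts(nconts):
    """Restate the nconts rules described above.

    Returns (use_continuation, n_epsilon_values, stats_mode).
    The labels in stats_mode are hypothetical names chosen for this sketch.
    """
    if nconts in (0, 1):
        # No continuation: one call to the solver for the given epsilon.
        return False, 0, "single_call"
    if abs(nconts) > 1:
        # Continuation: the next input line holds |nconts| epsilon values.
        mode = "accumulated" if nconts > 0 else "last_only"
        return True, abs(nconts), mode
    # nconts == -1 is not covered by the description.
    raise ValueError("nconts = -1 is not covered by the documented rules")


if __name__ == "__main__":
    for value in (0, 1, 5, -5):
        print(value, interpret_nconts(value))
```

For instance, a value of -5 means that five epsilon values follow on the next input line and that statistics are kept for the last continuation problem only.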
For example, the data given in input.txt
specifies an (epsilon = .0001) SWF3 problem on the interval [-1, 1].
The problem is to be solved using a 4th-order MIRK scheme, starting with
an initial uniform mesh of 10 subintervals. Five continuation iterations
will be invoked, for epsilon = .002, .001, .0004, .0002,
and finally .0001. For each continuation problem, the computed solution
must satisfy a defect tolerance of 1.d-07. All parallelized code segments
are to be partitioned into 2 slices and executed concurrently on 2 processors.
Timing statistics will be reported for the final continuation iteration only.
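To make the ordering of the input values concrete, the Python sketch below assembles them in the order listed earlier. It is not a reproduction of input.txt. Several details are assumptions: that each item sits on its own line, that the disk coordinates are the two interval endpoints, that the Fortran reads accept a d exponent in this layout, and that 0 is an acceptable output-control value (a placeholder here). The function names are hypothetical.

```python
def fortran_double(x):
    """Format a float with a Fortran-style d exponent, e.g. 1.000000d-07."""
    return f"{x:.6e}".replace("e", "d")


def compose_input(order, tol, n_sub, eps, left, right,
                  n_part, out_ctl, solver, nconts, eps_seq=()):
    """Build input text in the documented order (the layout is an assumption)."""
    lines = [
        str(order),                  # order of MIRK scheme
        fortran_double(tol),         # defect tolerance
        str(n_sub),                  # subintervals in initial mesh
        fortran_double(eps),         # SWF3's epsilon
        f"{fortran_double(left)} {fortran_double(right)}",  # disk coordinates
        str(n_part),                 # number of partitions
        str(out_ctl),                # output control
        str(solver),                 # solver kernel criterion (ignored, 0)
        str(nconts),                 # continuation strategy
    ]
    if abs(nconts) > 1:
        if len(eps_seq) != abs(nconts):
            raise ValueError("expected |nconts| epsilon values")
        lines.append(" ".join(fortran_double(e) for e in eps_seq))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Values taken from the example description; out_ctl=0 is a placeholder.
    print(compose_input(4, 1e-7, 10, 1e-4, -1.0, 1.0, 2, 0, 0, -5,
                        [2e-3, 1e-3, 4e-4, 2e-4, 1e-4]), end="")
```

The authoritative layout is the one in input.txt itself and in the input documentation of swf_iii and pmirkdc.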
When a.out is run with
input.txt, the result is
output.txt.
The output is largely self-explanatory. Three tables are generated,
showing a breakdown of (1) the monitored non-(Linear Algebra) time,
(2) the monitored Linear Algebra time, and (3) the overall statistics.
("Linear Algebra", in this context, refers to the factorization
and solution of the ABD linear systems, and "non-(Linear Algebra)"
refers to the residual and Jacobian evaluations, and defect estimation.)
The total time "not monitored" shown in the overall statistics refers
to the cumulative time taken by the non-parallelized segments of PMIRKDC.
It is the smallest contribution when a problem is solved on one processor,
but its share of the total time grows as more processors are used
and the parallelized segments of the code speed up.
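The growing share of the non-parallelized time follows from simple arithmetic, in the spirit of Amdahl's law. The Python sketch below is illustrative only: the timings are hypothetical, not measured PMIRKDC results, and it assumes the parallelized segments speed up ideally (by a factor equal to the number of processors).

```python
def unmonitored_share(t_seq, t_par, p):
    """Fraction of total time spent in non-parallelized code on p processors.

    t_seq: time in non-parallelized segments (unchanged by p)
    t_par: one-processor time of the parallelized segments
    Assumes ideal speedup of the parallelized segments (an idealization).
    """
    total = t_seq + t_par / p
    return t_seq / total


if __name__ == "__main__":
    # Hypothetical timings: 5 time units sequential, 95 parallelizable.
    for p in (1, 2, 4, 8):
        print(p, round(unmonitored_share(5.0, 95.0, p), 3))
```

Under these assumptions the sequential share starts at 5% on one processor and rises with each doubling of the processor count, even though the sequential time itself does not change.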
References
[1] W.H. Enright and P.H. Muir,
Runge-Kutta software with defect control for boundary value ODEs,
SIAM J. Sci. Comput., 17 (1996) pp. 479-497.
[2] P.H. Muir, R.N. Pancer and K.R. Jackson,
PMIRKDC: a parallel mono-implicit Runge-Kutta code with
defect control for boundary value ODEs,
Parallel Comput., 29/6 (2003) pp. 711-741.
[3] U.M. Ascher, R.M.M. Mattheij and R.D. Russell,
Numerical Solution of Boundary Value Problems for
Ordinary Differential Equations,
Classics in Applied Mathematics Series, SIAM, Philadelphia (1995).
Last modified by Richard Pancer, 13 November 2004.
Comments, complaints and reports of broken links to
pancer@utsc.utoronto.ca.