Abstract</A>
</H1>
MLD2P4 (M<SMALL>ULTI-</SMALL>L<SMALL>EVEL </SMALL>D<SMALL>OMAIN </SMALL>D<SMALL>ECOMPOSITION </SMALL>P<SMALL>ARALLEL </SMALL>P<SMALL>RECONDITIONERS </SMALL>P<SMALL>ACKAGE BASED ON
</SMALL>PSBLAS) provides parallel Algebraic MultiGrid (AMG) and Domain Decomposition
preconditioners, to be used in the iterative solution of linear systems.
<P>
Multigrid preconditioners, coupled with Krylov iterative
solvers, are widely used in the parallel solution of large and sparse linear systems,
because of their optimality in the solution of linear systems arising from the
discretization of scalar elliptic Partial Differential Equations (PDEs) on regular grids.
Optimality, also known as algorithmic scalability, is the property
of having a computational cost per iteration that depends linearly on
the problem size, and a convergence rate that is independent of the problem size.
Multigrid preconditioners are based on a recursive application of a two-grid process
consisting of smoother iterations and a coarse-space (or coarse-level) correction.
The smoothers may be either basic iterative methods, such as the Jacobi and Gauss-Seidel ones,
or more complex subspace-correction methods, such as the Schwarz ones.
The coarse-space correction consists of solving, in an appropriately chosen
coarse space, the residual equation associated with the approximate solution computed
by the smoother, and of using the solution of this equation to correct the
previous approximation. The transfer of information between the original
(fine) space and the coarse one is performed by using suitable restriction and
prolongation operators. The construction of the coarse space and the corresponding
transfer operators is carried out by applying a so-called coarsening algorithm to the system
matrix. Two main approaches can be used to perform coarsening: the geometric approach,
which exploits the knowledge of some physical grid associated with the matrix
and requires the user to define transfer operators from the fine
to the coarse level and vice versa, and the algebraic approach, which builds
the coarse-space correction and the associate transfer operators using only matrix
information. The first approach may be difficult when the system comes from
discretizations on complex geometries;
furthermore, ad hoc one-level smoothers may be required to get an efficient
interplay between fine and coarse levels, e.g., when matrices with highly varying coefficients
are considered. The second approach performs a fully automatic coarsening and enforces the
interplay between fine and coarse level by suitably choosing the coarse space and
the coarse-to-fine interpolation (see, e.g., [<A
HREF="node27.html#Briggs2000">2</A>,<A
HREF="node27.html#Stuben_01">27</A>,<A
HREF="node27.html#dd2_96">25</A>] for details).
MLD2P4 uses a pure algebraic approach, based on the smoothed
aggregation algorithm [<A
HREF="node27.html#BREZINA_VANEK">1</A>,<A
HREF="node27.html#VANEK_MANDEL_BREZINA">29</A>],
for building the sequence of coarse matrices and transfer operators,
starting from the original one.
A decoupled version of this algorithm is implemented, where the smoothed
aggregation is applied locally to each submatrix [<A
HREF="node27.html#TUMINARO_TONG">28</A>].
A brief description of the AMG preconditioners implemented in MLD2P4 is given in
Sections <A HREF="node12.html#sec:multilevel">4.1</A>-<A HREF="#sec:smoothers">4.3</A>. For further details the reader
is referred to [<A
HREF="node27.html#para_04">3</A>,<A
HREF="node27.html#aaecc_07">4</A>,<A
HREF="node27.html#apnum_07">5</A>,<A
HREF="node27.html#MLD2P4_TOMS">9</A>].
We note that optimal multigrid preconditioners do not necessarily correspond
to minimum execution times in a parallel setting. Indeed, to obtain effective parallel
multigrid preconditioners, a tradeoff between the optimality and the cost of building and
applying the smoothers and the coarse-space corrections must be achieved. Effective
parallel preconditioners require algorithmic scalability to be coupled with implementation
scalability, i.e., a computational cost per iteration which remains (almost) constant as
the number of parallel processors increases.
<BR><HR>
<P>
<I>Domain Decomposition</I> (DD) preconditioners, coupled with Krylov iterative
solvers, are widely used in the parallel solution of large and sparse linear systems.
These preconditioners are based on the divide and conquer technique: the matrix
to be preconditioned is divided into submatrices, a ``local'' linear system
involving each submatrix is (approximately) solved, and the local solutions are used
to build a preconditioner for the whole original matrix. This process
often corresponds to dividing a physical domain associated to the original matrix
into subdomains, e.g. in a PDE discretization, to (approximately) solving the
subproblems corresponding to the subdomains and to building an approximate
solution of the original problem from the local solutions
[<A
HREF="node28.html#Cai_Widlund_92">6</A>,<A
HREF="node28.html#dd1_94">7</A>,<A
HREF="node28.html#dd2_96">23</A>].
<P>
<I>Additive Schwarz</I> preconditioners are DD preconditioners using overlapping
submatrices, i.e. with some common rows, to couple the local information
related to the submatrices (see, e.g., [<A
HREF="node28.html#dd2_96">23</A>]).
The main motivation for choosing Additive Schwarz preconditioners is their
intrinsic parallelism. A drawback of these
preconditioners is that the number of iterations of the preconditioned solvers
generally grows with the number of submatrices. This may be a serious limitation
on parallel computers, since the number of submatrices usually matches the number
of available processors. Optimal convergence rates, i.e. iteration numbers
independent of the number of submatrices, can be obtained by correcting the
preconditioner through a suitable approximation of the original linear system
in a coarse space, which globally couples the information related to the single
submatrices.
<P>
<I>Two-level Schwarz</I> preconditioners are obtained
by combining basic (one-level) Schwarz preconditioners with a coarse-level
correction. In this context, the one-level preconditioner is often
called `smoother'. Different two-level preconditioners are obtained by varying the
choice of the smoother and of the coarse-level correction, and the
way they are combined [<A
HREF="node28.html#dd2_96">23</A>]. The same reasoning can be applied starting
from the coarse-level system, i.e. a coarse-space correction can be built
from this system, thus obtaining <I>multi-level</I> preconditioners.
<P>
It is worth noting that optimal preconditioners do not necessarily correspond
to minimum execution times. Indeed, to obtain effective multi-level preconditioners
a tradeoff between optimality of convergence and the cost of building and applying
the coarse-space corrections must be achieved. The choice of the number of levels,
i.e. of the coarse-space corrections, also affects the effectiveness of the
preconditioners. A further goal is to make the convergence rate as insensitive
as possible to variations in the matrix coefficients.
<P>
Two main approaches can be used to build coarse-space corrections. The geometric approach
applies coarsening strategies based on the knowledge of some physical grid associated
to the matrix and requires the user to define grid transfer operators from the fine
to the coarse levels and vice versa. This may be difficult for complex geometries;
furthermore, suitable one-level preconditioners may be required to get efficient
interplay between fine and coarse levels, e.g. when matrices with highly varying coefficients
are considered. The algebraic approach builds coarse-space corrections using only matrix
information. It performs a fully automatic coarsening and enforces the interplay between
the fine and coarse levels by suitably choosing the coarse space and the coarse-to-fine
interpolation [<A
HREF="node28.html#Stuben_01">25</A>].
<P>
MLD2P4 uses a pure algebraic approach for building the sequence of coarse matrices
starting from the original matrix. The algebraic approach is based on the <I>smoothed
aggregation</I> algorithm [<A
HREF="node28.html#BREZINA_VANEK">1</A>,<A
HREF="node28.html#VANEK_MANDEL_BREZINA">27</A>]. A decoupled version
of this algorithm is implemented, where the smoothed aggregation is applied locally
to each submatrix [<A
HREF="node28.html#TUMINARO_TONG">26</A>]. In the next two subsections we provide
a brief description of the multi-level Schwarz preconditioners and of the smoothed
aggregation technique as implemented in MLD2P4. For further details the reader
is referred to the references cited above.
<P>
Let <IMG
WIDTH="18" HEIGHT="15" ALIGN="BOTTOM" BORDER="0"
SRC="img2.png"
ALT="$A$"> be a nonsingular sparse matrix with a symmetric nonzero pattern, and
let <IMG
WIDTH="92" HEIGHT="36" ALIGN="MIDDLE" BORDER="0"
SRC="img5.png"
ALT="$G=(W,E)$"> be the adjacency graph of <IMG
WIDTH="18" HEIGHT="15" ALIGN="BOTTOM" BORDER="0"
SRC="img2.png"
ALT="$A$">, where <!-- MATH
$W=\{1, 2, \ldots, n\}$
-->
<IMG
WIDTH="139" HEIGHT="36" ALIGN="MIDDLE" BORDER="0"
SRC="img6.png"
ALT="$W=\{1, 2, \ldots, n\}$">
and <!-- MATH
$E=\{(i,j) : a_{ij} \neq 0\}$
-->
<IMG
WIDTH="162" HEIGHT="36" ALIGN="MIDDLE" BORDER="0"
SRC="img7.png"
ALT="$E=\{(i,j) : a_{ij} \neq 0\}$"> are the vertex set and the edge set of <IMG
WIDTH="18" HEIGHT="16" ALIGN="BOTTOM" BORDER="0"
SRC="img8.png"
ALT="$G$">,
respectively. Two vertices are called adjacent if there is an edge connecting
them. For any integer <IMG
WIDTH="45" HEIGHT="34" ALIGN="MIDDLE" BORDER="0"
SRC="img9.png"
ALT="$\delta > 0$">, a <IMG
WIDTH="13" HEIGHT="16" ALIGN="BOTTOM" BORDER="0"
SRC="img10.png"
ALT="$\delta$">-overlap
partition of <IMG
WIDTH="23" HEIGHT="16" ALIGN="BOTTOM" BORDER="0"
SRC="img11.png"
ALT="$W$"> can be defined recursively as follows.
Given a 0-overlap (or non-overlapping) partition of <IMG
WIDTH="23" HEIGHT="16" ALIGN="BOTTOM" BORDER="0"
SRC="img11.png"
ALT="$W$">,
i.e. a set of <IMG
WIDTH="20" HEIGHT="18" ALIGN="BOTTOM" BORDER="0"
SRC="img12.png"
ALT="$m$"> disjoint nonempty sets <!-- MATH
$W_i^0 \subset W$
-->
<IMG
WIDTH="73" HEIGHT="39" ALIGN="MIDDLE" BORDER="0"
SRC="img13.png"
ALT="$W_i^0 \subset W$"> such that
<!-- MATH
$\cup_{i=1}^m W_i^0 = W$
-->
<IMG
WIDTH="108" HEIGHT="39" ALIGN="MIDDLE" BORDER="0"
SRC="img14.png"
ALT="$\cup_{i=1}^m W_i^0 = W$">, a <IMG
WIDTH="13" HEIGHT="16" ALIGN="BOTTOM" BORDER="0"
SRC="img10.png"
ALT="$\delta$">-overlap
partition of <IMG
WIDTH="23" HEIGHT="16" ALIGN="BOTTOM" BORDER="0"
SRC="img11.png"
ALT="$W$"> is obtained by considering the sets
<!-- MATH
$W_i^\delta \supset W_i^{\delta-1}$
-->
<IMG
WIDTH="97" HEIGHT="41" ALIGN="MIDDLE" BORDER="0"
SRC="img15.png"
ALT="$W_i^\delta \supset W_i^{\delta-1}$"> obtained by including the vertices that
are adjacent to any vertex in <!-- MATH
$W_i^{\delta-1}$
-->
<IMG
WIDTH="48" HEIGHT="41" ALIGN="MIDDLE" BORDER="0"
SRC="img16.png"
ALT="$W_i^{\delta-1}$">.
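The recursive definition above can be illustrated with a small sketch (Python is used for brevity; MLD2P4 itself is a Fortran library, and the 6-vertex chain graph and the helper name <TT>expand_overlap</TT> below are made up for this example):
<PRE>
```python
# Illustrative sketch of a delta-overlap partition (not MLD2P4 code).
# The adjacency list below describes a hypothetical 6-vertex chain graph.
adj = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4, 6], 6: [5]}

def expand_overlap(part, adj):
    """Grow each subset by the vertices adjacent to it (one more overlap level)."""
    return [sorted(set(w) | {j for i in w for j in adj[i]}) for w in part]

# 0-overlap (disjoint) partition W_1^0, W_2^0 of W = {1,...,6}
parts0 = [[1, 2, 3], [4, 5, 6]]
parts1 = expand_overlap(parts0, adj)   # 1-overlap partition
print(parts1)                          # [[1, 2, 3, 4], [3, 4, 5, 6]]
```
</PRE>
Applying <TT>expand_overlap</TT> again would yield the 2-overlap partition, and so on.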
<P>
Let <IMG
WIDTH="22" HEIGHT="39" ALIGN="MIDDLE" BORDER="0"
SRC="img17.png"
ALT="$n_i^\delta$"> be the size of <IMG
WIDTH="31" HEIGHT="39" ALIGN="MIDDLE" BORDER="0"
SRC="img18.png"
ALT="$W_i^\delta$"> and <!-- MATH
$R_i^{\delta} \in
\Re^{n_i^\delta \times n}$
-->
<IMG
WIDTH="93" HEIGHT="45" ALIGN="MIDDLE" BORDER="0"
SRC="img19.png"
ALT="$R_i^{\delta} \in
\Re^{n_i^\delta \times n}$"> the restriction operator that maps
a vector <IMG
WIDTH="57" HEIGHT="34" ALIGN="MIDDLE" BORDER="0"
SRC="img20.png"
ALT="$v \in \Re^n$"> onto the vector <!-- MATH
$v_i^{\delta} \in \Re^{n_i^\delta}$
-->
<IMG
WIDTH="70" HEIGHT="45" ALIGN="MIDDLE" BORDER="0"
SRC="img21.png"
ALT="$v_i^{\delta} \in \Re^{n_i^\delta}$">
containing the components of <IMG
WIDTH="14" HEIGHT="18" ALIGN="BOTTOM" BORDER="0"
SRC="img22.png"
ALT="$v$"> corresponding to the vertices in
<IMG
WIDTH="31" HEIGHT="39" ALIGN="MIDDLE" BORDER="0"
SRC="img18.png"
ALT="$W_i^\delta$">. The transpose of <IMG
WIDTH="25" HEIGHT="39" ALIGN="MIDDLE" BORDER="0"
SRC="img23.png"
ALT="$R_i^{\delta}$"> is a
prolongation operator from <!-- MATH
$\Re^{n_i^\delta}$
-->
<IMG
WIDTH="33" HEIGHT="24" ALIGN="BOTTOM" BORDER="0"
SRC="img24.png"
ALT="$\Re^{n_i^\delta}$"> to <IMG
WIDTH="26" HEIGHT="16" ALIGN="BOTTOM" BORDER="0"
SRC="img25.png"
ALT="$\Re^n$">.
The matrix <!-- MATH
$A_i^\delta=R_i^\delta A (R_i^\delta)^T \in
\Re^{n_i^\delta \times n_i^\delta}$
-->
<IMG
WIDTH="201" HEIGHT="45" ALIGN="MIDDLE" BORDER="0"
SRC="img26.png"
ALT="$A_i^\delta=R_i^\delta A (R_i^\delta)^T \in
\Re^{n_i^\delta \times n_i^\delta}$"> can be considered
as a restriction of <IMG
WIDTH="18" HEIGHT="15" ALIGN="BOTTOM" BORDER="0"
SRC="img2.png"
ALT="$A$"> corresponding to the set <IMG
WIDTH="30" HEIGHT="39" ALIGN="MIDDLE" BORDER="0"
SRC="img27.png"
ALT="$W_i^{\delta}$">.
<P>
The <I>classical one-level AS</I> preconditioner is defined by
<BR><P></P>
<DIV ALIGN="CENTER">
<!-- MATH
\begin{displaymath}
M_{AS}^{-1}= \sum_{i=1}^m (R_i^{\delta})^T
(A_i^\delta)^{-1} R_i^{\delta},
\end{displaymath}
-->
<IMG
WIDTH="207" HEIGHT="58" BORDER="0"
SRC="img28.png"
ALT="\begin{displaymath}
M_{AS}^{-1}= \sum_{i=1}^m (R_i^{\delta})^T
(A_i^\delta)^{-1} R_i^{\delta},
\end{displaymath}">
</DIV>
<BR CLEAR="ALL">
<P></P>
where <IMG
WIDTH="25" HEIGHT="39" ALIGN="MIDDLE" BORDER="0"
SRC="img29.png"
ALT="$A_i^\delta$"> is assumed to be nonsingular. Its application
to a vector <IMG
WIDTH="57" HEIGHT="34" ALIGN="MIDDLE" BORDER="0"
SRC="img20.png"
ALT="$v \in \Re^n$"> within a Krylov solver requires the following
three steps:
<OL>
<LI>restriction of <IMG
WIDTH="14" HEIGHT="18" ALIGN="BOTTOM" BORDER="0"
SRC="img22.png"
ALT="$v$"> as <!-- MATH
$v_i = R_i^{\delta} v$
-->
<IMG
WIDTH="71" HEIGHT="39" ALIGN="MIDDLE" BORDER="0"
SRC="img30.png"
ALT="$v_i = R_i^{\delta} v$">, <IMG
WIDTH="96" HEIGHT="33" ALIGN="MIDDLE" BORDER="0"
SRC="img31.png"
ALT="$i=1,\ldots,m$">;
</LI>
<LI>solution of the linear systems <!-- MATH
$A_i^\delta w_i = v_i$
-->
<IMG
WIDTH="80" HEIGHT="39" ALIGN="MIDDLE" BORDER="0"
SRC="img32.png"
ALT="$A_i^\delta w_i = v_i$">,
<IMG
WIDTH="96" HEIGHT="33" ALIGN="MIDDLE" BORDER="0"
SRC="img31.png"
ALT="$i=1,\ldots,m$">;
</LI>
<LI>prolongation and sum of the <IMG
WIDTH="22" HEIGHT="31" ALIGN="MIDDLE" BORDER="0"
SRC="img33.png"
ALT="$w_i$">'s, i.e. <!-- MATH
$w = \sum_{i=1}^m (R_i^{\delta})^T w_i$
-->
<IMG
WIDTH="145" HEIGHT="39" ALIGN="MIDDLE" BORDER="0"
SRC="img34.png"
ALT="$w = \sum_{i=1}^m (R_i^{\delta})^T w_i$">.
</LI>
</OL>
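The three steps can be sketched as follows (an illustrative Python fragment, not MLD2P4 code; the 4&times;4 matrix, the overlapping index sets, and the tiny dense solver are hypothetical, and the local systems are solved exactly here):
<PRE>
```python
# Illustrative sketch of the three application steps of M_AS^{-1} (not MLD2P4 code).

def solve(M, b):
    """Tiny dense Gaussian elimination with partial pivoting (local solves)."""
    n = len(b)
    M = [row[:] for row in M]; b = b[:]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]; b[k], b[p] = b[p], b[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n):
                M[i][j] -= f * M[k][j]
            b[i] -= f * b[k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (b[i] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

A = [[2.0, -1.0, 0.0, 0.0],
     [-1.0, 2.0, -1.0, 0.0],
     [0.0, -1.0, 2.0, -1.0],
     [0.0, 0.0, -1.0, 2.0]]
W = [[0, 1, 2], [1, 2, 3]]          # overlapping index sets W_1^delta, W_2^delta
v = [1.0, 1.0, 1.0, 1.0]

w = [0.0] * len(v)
for idx in W:
    v_i = [v[j] for j in idx]                       # step 1: restriction
    A_i = [[A[r][c] for c in idx] for r in idx]     # A_i = R_i A (R_i)^T
    w_i = solve(A_i, v_i)                           # step 2: local solve
    for k, j in enumerate(idx):                     # step 3: prolongation and sum
        w[j] += w_i[k]
```
</PRE>
Note how the entries shared by both index sets receive contributions from both local solves, which is exactly what the RAS variant described below modifies.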
Note that the linear systems at step 2 are usually solved approximately,
e.g. using incomplete LU factorizations such as ILU(<IMG
WIDTH="13" HEIGHT="31" ALIGN="MIDDLE" BORDER="0"
SRC="img35.png"
ALT="$p$">), MILU(<IMG
WIDTH="13" HEIGHT="31" ALIGN="MIDDLE" BORDER="0"
SRC="img35.png"
ALT="$p$">) and
ILU(<IMG
WIDTH="27" HEIGHT="31" ALIGN="MIDDLE" BORDER="0"
SRC="img36.png"
ALT="$p,t$">) [<A
HREF="node28.html#Saad_book">22</A>, Chapter 10].
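As an illustration of the incomplete factorizations just mentioned (a Python sketch with a hypothetical 4&times;4 matrix, not the package's Fortran implementation), ILU(0) performs Gaussian elimination while discarding any fill-in outside the nonzero pattern of the matrix; for a tridiagonal matrix, as used here, no fill-in arises and ILU(0) coincides with the exact factorization:
<PRE>
```python
# Illustrative ILU(0) sketch (not MLD2P4 code): L and U are stored together in LU,
# and any entry outside the sparsity pattern of A is dropped.
A = [[4.0, -1.0, 0.0, 0.0],
     [-1.0, 4.0, -1.0, 0.0],
     [0.0, -1.0, 4.0, -1.0],
     [0.0, 0.0, -1.0, 4.0]]
n = len(A)
LU = [row[:] for row in A]
for k in range(n - 1):
    for i in range(k + 1, n):
        if A[i][k] != 0.0:              # update only within the pattern of A
            LU[i][k] /= LU[k][k]        # multiplier, stored in the L part
            for j in range(k + 1, n):
                if A[i][j] != 0.0:      # fill-in outside the pattern is dropped
                    LU[i][j] -= LU[i][k] * LU[k][j]
```
</PRE>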
<P>
A variant of the classical AS preconditioner that outperforms it
in terms of convergence rate and of computation and communication
time on parallel distributed-memory computers is the so-called <I>Restricted AS
(RAS)</I> preconditioner [<A
HREF="node28.html#CAI_SARKIS">5</A>,<A
HREF="node28.html#EFSTATHIOU">15</A>]. It
is obtained by zeroing the components of <IMG
WIDTH="22" HEIGHT="31" ALIGN="MIDDLE" BORDER="0"
SRC="img33.png"
ALT="$w_i$"> corresponding to the
overlapping vertices when applying the prolongation. Therefore,
RAS differs from classical AS by the prolongation operators: the modified
restriction operator whose transpose appears in the formula below is obtained
by zeroing the rows of <IMG
WIDTH="25" HEIGHT="39" ALIGN="MIDDLE" BORDER="0"
SRC="img39.png"
ALT="$R_i^\delta$">
corresponding to the vertices in <!-- MATH
$W_i^\delta \backslash W_i^0$
-->
<IMG
WIDTH="66" HEIGHT="39" ALIGN="MIDDLE" BORDER="0"
SRC="img40.png"
ALT="$W_i^\delta \backslash W_i^0$">:
<BR><P></P>
<DIV ALIGN="CENTER">
<!-- MATH
\begin{displaymath}
M_{RAS}^{-1}= \sum_{i=1}^m (\tilde{R}_i^0)^T
(A_i^\delta)^{-1} R_i^{\delta}.
\end{displaymath}
-->
<IMG
WIDTH="217" HEIGHT="58" BORDER="0"
SRC="img41.png"
ALT="\begin{displaymath}
M_{RAS}^{-1}= \sum_{i=1}^m (\tilde{R}_i^0)^T
(A_i^\delta)^{-1} R_i^{\delta}.
\end{displaymath}">
</DIV>
<BR CLEAR="ALL">
<P></P>
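The effect of the RAS prolongation can be sketched as follows (illustrative Python, not MLD2P4 code; the index sets and the local solution vectors are made up): compared with classical AS, each entry of the result is written by exactly one subdomain, so no contributions are summed on the overlap.
<PRE>
```python
# Illustrative sketch of the RAS prolongation step (not MLD2P4 code).
# W0 holds the disjoint sets W_i^0, Wd the 1-overlap sets W_i^delta (0-based).
W0 = [[0, 1], [2, 3]]
Wd = [[0, 1, 2], [1, 2, 3]]

# Made-up local vectors w_i produced by the local solves on Wd[i]:
w_local = [[1.5, 2.0, 1.5], [1.5, 2.0, 1.5]]

w_ras = [0.0] * 4
for i, idx in enumerate(Wd):
    for k, j in enumerate(idx):
        if j in W0[i]:                  # RAS: keep only entries owned by W_i^0
            w_ras[j] += w_local[i][k]
print(w_ras)                            # [1.5, 2.0, 2.0, 1.5]
```
</PRE>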
Analogously, the AS variant called <I>AS with Harmonic extension (ASH)</I>
is obtained by applying the zeroing to the restriction instead of the
prolongation, i.e., by discarding the components corresponding to the
overlapping vertices when restricting the vector to each submatrix.
<BR><HR>
The M<SMALL>ULTI-</SMALL>L<SMALL>EVEL </SMALL>D<SMALL>OMAIN </SMALL>D<SMALL>ECOMPOSITION </SMALL>P<SMALL>ARALLEL </SMALL>P<SMALL>RECONDITIONERS </SMALL>P<SMALL>ACKAGE BASED ON
</SMALL>PSBLAS (MLD2P4) provides parallel Algebraic MultiGrid (AMG) and Domain
Decomposition preconditioners (see, e.g., [<A
HREF="node27.html#Briggs2000">2</A>,<A
HREF="node27.html#Stuben_01">27</A>,<A
HREF="node27.html#dd2_96">25</A>]),
to be used in the iterative solution of linear systems,
<BR>
<DIV ALIGN="RIGHT">
Ax=b,
</DIV>
where <IMG
WIDTH="18" HEIGHT="15" ALIGN="BOTTOM" BORDER="0"
SRC="img2.png"
ALT="$A$"> is a square, real or complex, sparse matrix. The name of the package comes from its original implementation, containing
multi-level additive and hybrid Schwarz preconditioners, as well as one-level additive
Schwarz preconditioners. The current version extends the original plan by including
multi-level cycles and smoothers widely used in multigrid methods.
<P>
The multi-level preconditioners implemented in MLD2P4 are obtained by combining
AMG cycles with smoothers and coarsest-level solvers. The V-, W-, and
K-cycles [<A
HREF="node27.html#Briggs2000">2</A>,<A
HREF="node27.html#Notay2008">23</A>] are available, which allow the user to define
almost all the preconditioners in the package, including the multi-level hybrid
Schwarz ones; a specific cycle is implemented to obtain multi-level additive
Schwarz preconditioners. The Jacobi, hybrid forward/backward Gauss-Seidel, block-Jacobi, and additive Schwarz methods
are available as smoothers. An algebraic approach is used to generate a hierarchy of
coarse-level matrices and operators, without explicitly using any information on the
geometry of the original problem, e.g., the discretization of a PDE. To this end,
the smoothed aggregation technique [<A
HREF="node27.html#BREZINA_VANEK">1</A>,<A
HREF="node27.html#VANEK_MANDEL_BREZINA">29</A>]
is applied. Either exact or approximate solvers can be used on the coarsest-level
system. Specifically, different sparse LU factorizations from external
packages, and native incomplete LU factorizations and Jacobi, hybrid Gauss-Seidel,
and block-Jacobi solvers are available. All smoothers can also be exploited as one-level
preconditioners.
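As a purely illustrative counterpart of the description above, the following Python sketch (not MLD2P4 code, which is written in Fortran 2003) applies a two-grid cycle combining a damped-Jacobi smoother, an exact coarsest-level solve, and plain (unsmoothed) aggregation on a made-up 4&times;4 system; MLD2P4 uses the smoothed variant of aggregation and many more options:
<PRE>
```python
# Illustrative two-grid cycle with piecewise-constant aggregation (not MLD2P4 code).
A = [[2.0, -1.0, 0.0, 0.0],
     [-1.0, 2.0, -1.0, 0.0],
     [0.0, -1.0, 2.0, -1.0],
     [0.0, 0.0, -1.0, 2.0]]
b = [1.0, 0.0, 0.0, 1.0]
agg = [0, 0, 1, 1]          # vertex -> aggregate (tentative, unsmoothed prolongator)

def matvec(M, x):
    return [sum(r[j] * x[j] for j in range(len(x))) for r in M]

def jacobi(x, nsweeps=2, omega=0.8):
    """Damped Jacobi smoother for A x = b."""
    for _ in range(nsweeps):
        r = [bi - yi for bi, yi in zip(b, matvec(A, x))]
        x = [xi + omega * ri / A[i][i] for i, (xi, ri) in enumerate(zip(x, r))]
    return x

def two_grid(x):
    x = jacobi(x)                                        # pre-smoothing
    r = [bi - yi for bi, yi in zip(b, matvec(A, x))]
    rc = [sum(r[i] for i in range(4) if agg[i] == k) for k in range(2)]  # restrict
    # coarse matrix Ac = P^T A P for the piecewise-constant prolongator P (2x2 here)
    Ac = [[sum(A[i][j] for i in range(4) if agg[i] == p
                        for j in range(4) if agg[j] == q)
           for q in range(2)] for p in range(2)]
    det = Ac[0][0] * Ac[1][1] - Ac[0][1] * Ac[1][0]      # exact 2x2 coarse solve
    ec = [(rc[0] * Ac[1][1] - Ac[0][1] * rc[1]) / det,
          (Ac[0][0] * rc[1] - Ac[1][0] * rc[0]) / det]
    x = [xi + ec[agg[i]] for i, xi in enumerate(x)]      # prolong and correct
    return jacobi(x)                                     # post-smoothing

x = [0.0] * 4
for _ in range(20):
    x = two_grid(x)
res = max(abs(bi - yi) for bi, yi in zip(b, matvec(A, x)))
```
</PRE>
For this system the exact solution is the vector of all ones, and the residual <TT>res</TT> drops rapidly over the cycles.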
<P>
MLD2P4 is written in Fortran 2003, following an
object-oriented design through the exploitation of features
such as abstract data type creation, type extension, functional overloading, and
dynamic memory management.
The parallel implementation is based on a Single Program Multiple Data
(SPMD) paradigm. Single and
double precision implementations of MLD2P4 are available for both the
real and the complex case, which can be used through a single
interface.
<P>
MLD2P4 has been designed to implement scalable and easy-to-use
multilevel preconditioners in the context of the PSBLAS (Parallel Sparse BLAS)