<H1><BR>
Multigrid Background
</H1>
Multigrid preconditioners, coupled with Krylov iterative
solvers, are widely used in the parallel solution of large and sparse linear systems,
because of their optimality in the solution of linear systems arising from the
discretization of scalar elliptic Partial Differential Equations (PDEs) on regular grids.
Optimality, also known as algorithmic scalability, is the property
of having a computational cost per iteration that depends linearly on
the problem size, and a convergence rate that is independent of the problem size.
Multigrid preconditioners are based on a recursive application of a two-grid process
consisting of smoother iterations and a coarse-space (or coarse-level) correction.
The smoothers may be either basic iterative methods, such as the Jacobi and Gauss-Seidel methods,
or more complex subspace-correction methods, such as the Schwarz methods.
The coarse-space correction consists of solving, in an appropriately chosen
coarse space, the residual equation associated with the approximate solution computed
by the smoother, and of using the solution of this equation to correct the
previous approximation. The transfer of information between the original
(fine) space and the coarse one is performed by using suitable restriction and
prolongation operators. The construction of the coarse space and the corresponding
transfer operators is carried out by applying a so-called coarsening algorithm to the system
matrix. Two main approaches can be used to perform coarsening: the geometric approach,
which exploits the knowledge of some physical grid associated with the matrix
and requires the user to define transfer operators from the fine
to the coarse level and vice versa, and the algebraic approach, which builds
the coarse-space correction and the associated transfer operators using only matrix
information. The first approach may be difficult to apply when the system comes from
discretizations on complex geometries;
furthermore, ad hoc one-level smoothers may be required to get an efficient
interplay between fine and coarse levels, e.g., when matrices with highly varying coefficients
are considered. The second approach performs a fully automatic coarsening and enforces the
interplay between fine and coarse levels by suitably choosing the coarse space and
the coarse-to-fine interpolation (see, e.g., [<A
HREF="node27.html#Briggs2000">3</A>,<A
HREF="node27.html#Stuben_01">23</A>,<A
HREF="node27.html#dd2_96">21</A>] for details).
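The two-grid process described above can be sketched in a few lines. The following Python illustration (MLD2P4 itself is written in Fortran; all names here are ours, for illustration only) applies damped-Jacobi smoothing and a coarse-space correction to a 1D Poisson matrix, with linear-interpolation prolongation, its transpose as restriction, and a Galerkin coarse matrix:

```python
# Toy two-grid cycle on the 1D Poisson matrix: pre-smoothing,
# coarse-space correction for the residual equation, post-smoothing.

def poisson1d(n):
    """Tridiagonal 1D Poisson matrix as a dense list of lists."""
    return [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0
             for j in range(n)] for i in range(n)]

def matvec(A, x):
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def jacobi(A, b, x, sweeps=2, omega=2.0 / 3.0):
    """Damped-Jacobi smoother: x <- x + omega * D^{-1} (b - A x)."""
    for _ in range(sweeps):
        r = [bi - ri for bi, ri in zip(b, matvec(A, x))]
        x = [xi + omega * ri / A[i][i]
             for i, (xi, ri) in enumerate(zip(x, r))]
    return x

def gauss_solve(A, b):
    """Dense Gaussian elimination (no pivoting; fine for SPD matrices)."""
    n = len(b)
    A = [row[:] for row in A]; b = b[:]
    for k in range(n):
        for i in range(k + 1, n):
            m = A[i][k] / A[k][k]
            for j in range(k, n):
                A[i][j] -= m * A[k][j]
            b[i] -= m * b[k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(A[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / A[i][i]
    return x

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def prolongation(nf):
    """Linear interpolation from the coarse grid (nf = 2*nc + 1)."""
    nc = (nf - 1) // 2
    P = [[0.0] * nc for _ in range(nf)]
    for j in range(nc):
        i = 2 * j + 1
        P[i][j] = 1.0
        P[i - 1][j] += 0.5
        P[i + 1][j] += 0.5
    return P

def two_grid(A, b, x, P):
    R = [list(col) for col in zip(*P)]    # restriction R = P^T
    Ac = matmul(R, matmul(A, P))          # Galerkin coarse matrix A_c = R A P
    x = jacobi(A, b, x)                   # pre-smoothing
    r = [bi - ri for bi, ri in zip(b, matvec(A, x))]
    ec = gauss_solve(Ac, matvec(R, r))    # coarse residual equation
    e = matvec(P, ec)                     # prolongate the correction
    x = [xi + ei for xi, ei in zip(x, e)]
    return jacobi(A, b, x)                # post-smoothing

n = 15
A, b, x = poisson1d(n), [1.0] * n, [0.0] * n
P = prolongation(n)
for _ in range(20):
    x = two_grid(A, b, x, P)
res = max(abs(bi - ri) for bi, ri in zip(b, matvec(A, x)))
print(res)  # residual norm drops by a roughly constant factor per cycle
```

A recursive application of `two_grid` to the coarse system, instead of the exact solve, yields the multigrid V-cycle.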
MLD2P4 uses a purely algebraic approach, based on the smoothed
aggregation algorithm [<A
HREF="node27.html#BREZINA_VANEK">2</A>,<A
HREF="node27.html#VANEK_MANDEL_BREZINA">25</A>],
for building the sequence of coarse matrices and transfer operators,
starting from the original one.
A decoupled version of this algorithm is implemented, where the smoothed
aggregation is applied locally to each submatrix [<A
HREF="node27.html#TUMINARO_TONG">24</A>].
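The core of smoothed aggregation can be illustrated with a toy Python sketch (again not MLD2P4 code, and with hypothetical names): unknowns are grouped into aggregates, a piecewise-constant tentative prolongator is built from the aggregates, and one damped-Jacobi step smooths it, P = (I - omega D^{-1} A) P_tent:

```python
# Toy smoothed-aggregation prolongator: aggregate unknowns, build the
# piecewise-constant tentative prolongator, smooth it with damped Jacobi.

def poisson1d(n):
    return [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0
             for j in range(n)] for i in range(n)]

def smoothed_prolongator(A, agg, omega=2.0 / 3.0):
    """agg[i] = index of the aggregate containing fine unknown i."""
    n, nc = len(A), max(agg) + 1
    # tentative prolongator: column j is the indicator of aggregate j
    Pt = [[1.0 if agg[i] == j else 0.0 for j in range(nc)]
          for i in range(n)]
    # one damped-Jacobi smoothing step: P = (I - omega D^{-1} A) Pt
    return [[Pt[i][j] - omega / A[i][i] *
             sum(A[i][k] * Pt[k][j] for k in range(n))
             for j in range(nc)] for i in range(n)]

n = 6
A = poisson1d(n)
agg = [i // 3 for i in range(n)]   # two aggregates: {0,1,2} and {3,4,5}
P = smoothed_prolongator(A, agg)
```

The smoothing step makes the columns of P overlap across aggregate boundaries, which improves the interpolation of smooth error; the coarse matrix is then formed as the Galerkin product P^T A P.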
A brief description of the AMG preconditioners implemented in MLD2P4 is given in
Sections <A HREF="node12.html#sec:multilevel">4.1</A>-<A HREF="#sec:smoothers">4.3</A>. For further details the reader
is referred to [<A
HREF="node27.html#para_04">4</A>,<A
HREF="node27.html#aaecc_07">5</A>,<A
HREF="node27.html#apnum_07">7</A>,<A
HREF="node27.html#MLD2P4_TOMS">8</A>].
We note that optimal multigrid preconditioners do not necessarily correspond
to minimum execution times in a parallel setting. Indeed, to obtain effective parallel
multigrid preconditioners, a tradeoff between the optimality and the cost of building and
applying the smoothers and the coarse-space corrections must be achieved. Effective
parallel preconditioners require algorithmic scalability to be coupled with implementation
scalability, i.e., a computational cost per iteration which remains (almost) constant as
the number of parallel processors increases.
<BR><HR>
The M<SMALL>ULTI-</SMALL>L<SMALL>EVEL </SMALL>D<SMALL>OMAIN </SMALL>D<SMALL>ECOMPOSITION </SMALL>P<SMALL>ARALLEL </SMALL>P<SMALL>RECONDITIONERS </SMALL>P<SMALL>ACKAGE BASED ON
</SMALL>PSBLAS (MLD2P4) provides parallel Algebraic MultiGrid (AMG) and Domain
Decomposition preconditioners (see, e.g., [<A
HREF="node27.html#Briggs2000">3</A>,<A
HREF="node27.html#Stuben_01">23</A>,<A
HREF="node27.html#dd2_96">21</A>]),
to be used in the iterative solution of linear systems,
<BR>
<DIV ALIGN="RIGHT">
multi-level cycles and smoothers widely used in multigrid methods.
The multi-level preconditioners implemented in MLD2P4 are obtained by combining
AMG cycles with smoothers and coarsest-level solvers. The V-, W-, and
K-cycles [<A
HREF="node27.html#Briggs2000">3</A>,<A
HREF="node27.html#Notay2008">19</A>] are available, which allow one to define
almost all the preconditioners in the package, including the multi-level hybrid
Schwarz ones; a specific cycle is implemented to obtain multi-level additive
Schwarz preconditioners. The Jacobi, hybrid forward/backward Gauss-Seidel, block-Jacobi, and additive Schwarz methods
are available as smoothers. An algebraic approach is used to generate a hierarchy of
coarse-level matrices and operators, without explicitly using any information on the
geometry of the original problem, e.g., the discretization of a PDE. To this end,
the smoothed aggregation technique [<A
HREF="node27.html#BREZINA_VANEK">2</A>,<A
HREF="node27.html#VANEK_MANDEL_BREZINA">25</A>
is applied. Either exact or approximate solvers can be used on the coarsest-level
system. Specifically, different sparse LU factorizations from external
packages, and native incomplete LU factorizations and Jacobi, hybrid Gauss-Seidel,
interface.
MLD2P4 has been designed to implement scalable and easy-to-use
multilevel preconditioners in the context of the PSBLAS (Parallel Sparse BLAS)
computational framework [<A
HREF="node27.html#psblas_00">15</A>,<A
HREF="node27.html#PSBLAS3">14</A>]. PSBLAS provides basic linear algebra
operators and data management facilities for distributed sparse matrices,
as well as parallel Krylov solvers which can be used with the MLD2P4 preconditioners.
The choice of PSBLAS has been mainly motivated by the need to have
few black-box routines at the upper layer allow all users to easily
build and apply any preconditioner available in MLD2P4;
facilities are also available allowing expert users to extend the set of smoothers
and solvers for building new versions of the preconditioners (see Section <A HREF="node24.html#sec:adding">7</A>).
We note that the user interface of MLD2P4 2.1 has been extended with respect to the
previous versions in order to separate the construction of the multi-level hierarchy from
the construction of the smoothers and solvers, and to allow for more flexibility
at each level. The software architecture described in [<A
HREF="node26.html#MLD2P4_TOMS">8</A>] has significantly
HREF="node27.html#MLD2P4_TOMS">8</A>] has significantly
evolved too, in order to fully exploit the Fortran 2003 features implemented in PSBLAS 3.
However, compatibility with previous versions has been preserved.
preconditioners with the Krylov solvers implemented in PSBLAS are reported
in Section <A HREF="node13.html#sec:started">5</A>, where the Fortran codes of a few sample programs
are also shown. A reference guide for the user interface routines is provided
in Section <A HREF="node15.html#sec:userinterface">6</A>. Information on the extension of the package
through the addition of new smoothers and solvers is reported in Section <A HREF="node24.html#sec:adding">7</A>.
The error handling mechanism used by the package
is briefly described in Section <A HREF="node25.html#sec:errors">8</A>. The copyright terms concerning the
distribution and modification of MLD2P4 are reported in Appendix <A HREF="node26.html#sec:license">A</A>.