</H2><FONT SIZE="+1"><FONT SIZE="+1"><FONT SIZE="+1">
The smoothers implemented in MLD2P4 include the Jacobi and block-Jacobi methods,
a hybrid version of the forward and backward Gauss-Seidel methods, and
additive Schwarz (AS) methods (see, e.g., [<A
HREF="node29.html#Saad_book">20</A>,<A
HREF="node29.html#dd2_96">21</A>]).
The hybrid Gauss-Seidel
version is considered because the original Gauss-Seidel method is inherently sequential.
At each iteration of the hybrid version, each parallel process uses the most recent values
of its own local variables and the values of the non-local variables computed at the
previous iteration, obtained by exchanging data with other processes before
the beginning of the current iteration.
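<P>The hybrid sweep described above can be illustrated with a minimal sequential sketch. It is not the MLD2P4 implementation: the function name <code>hybrid_gauss_seidel_sweep</code>, the dense list-of-rows storage, and the simulation of parallel processes by index blocks are assumptions made only for illustration.</P>

```python
def hybrid_gauss_seidel_sweep(A, b, x, blocks):
    """One forward hybrid Gauss-Seidel sweep (illustrative sketch).

    A is a dense matrix stored as a list of rows, b and x are lists, and
    blocks is a partition of the row indices, one index list per simulated
    process.
    """
    x_old = list(x)   # non-local values, as exchanged before the sweep starts
    x_new = list(x)
    for rows in blocks:           # in a parallel run, blocks proceed concurrently
        local = set(rows)
        for i in rows:
            s = b[i]
            for j, a_ij in enumerate(A[i]):
                if j == i or a_ij == 0.0:
                    continue
                # local unknowns use the most recent value,
                # non-local unknowns use the previous iterate
                s -= a_ij * (x_new[j] if j in local else x_old[j])
            x_new[i] = s / A[i][i]
    return x_new


# Small strictly diagonally dominant test problem (assumed data).
A = [[4.0, -1.0, 0.0, 0.0],
     [-1.0, 4.0, -1.0, 0.0],
     [0.0, -1.0, 4.0, -1.0],
     [0.0, 0.0, -1.0, 4.0]]
b = [2.0, 4.0, 6.0, 13.0]   # chosen so that the exact solution is [1, 2, 3, 4]

x = [0.0] * 4
for _ in range(60):
    x = hybrid_gauss_seidel_sweep(A, b, x, blocks=[[0, 1], [2, 3]])
```

<P>With a single block the sweep reduces to the classical forward Gauss-Seidel method, and with one row per block it reduces to the Jacobi method.</P>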
In the AS methods, the index space <IMG
WIDTH="25" HEIGHT="18" ALIGN="BOTTOM" BORDER="0"
SRC="img9.png"
ALT="$\Omega^k$"> is divided into <IMG
WIDTH="28" HEIGHT="31" ALIGN="MIDDLE" BORDER="0"
SRC="img50.png"
ALT="$m_k$">
subsets <IMG
WIDTH="25" HEIGHT="39" ALIGN="MIDDLE" BORDER="0"
SRC="img51.png"
ALT="$\Omega^k_i$"> of size <IMG
WIDTH="32" HEIGHT="31" ALIGN="MIDDLE" BORDER="0"
SRC="img52.png"
ALT="$n_{k,i}$">, possibly
overlapping. For each <IMG
WIDTH="11" HEIGHT="18" ALIGN="BOTTOM" BORDER="0"
SRC="img30.png"
ALT="$i$"> we consider the restriction
operator <!-- MATH
$R_i^k \in \mathbb{R}^{n_{k,i} \times n_k}$
-->
<IMG
WIDTH="110" HEIGHT="39" ALIGN="MIDDLE" BORDER="0"
SRC="img53.png"
ALT="$R_i^k \in \mathbb{R}^{n_{k,i} \times n_k}$">
that maps a vector <IMG
WIDTH="23" HEIGHT="19" ALIGN="BOTTOM" BORDER="0"
SRC="img54.png"
ALT="$x^k$"> to the vector <IMG
WIDTH="22" HEIGHT="39" ALIGN="MIDDLE" BORDER="0"
SRC="img55.png"
ALT="$x_i^k$"> made of the components of <IMG
WIDTH="23" HEIGHT="19" ALIGN="BOTTOM" BORDER="0"
SRC="img54.png"
ALT="$x^k$">
with indices in <IMG
WIDTH="25" HEIGHT="39" ALIGN="MIDDLE" BORDER="0"
SRC="img51.png"
ALT="$\Omega^k_i$">, and the prolongation operator
<!-- MATH
$P^k_i = (R_i^k)^T$
-->
<IMG
WIDTH="95" HEIGHT="39" ALIGN="MIDDLE" BORDER="0"
SRC="img56.png"
ALT="$P^k_i = (R_i^k)^T$">. These operators are then used to build
<!-- MATH
$A_i^k=R_i^kA^kP_i^k$
-->
<IMG
WIDTH="113" HEIGHT="39" ALIGN="MIDDLE" BORDER="0"
SRC="img57.png"
ALT="$A_i^k=R_i^kA^kP_i^k$">, which is the restriction of <IMG
WIDTH="26" HEIGHT="18" ALIGN="BOTTOM" BORDER="0"
SRC="img41.png"
ALT="$A^k$"> to the index
space <IMG
WIDTH="25" HEIGHT="39" ALIGN="MIDDLE" BORDER="0"
SRC="img51.png"
ALT="$\Omega^k_i$">.
The classical AS preconditioner <IMG
WIDTH="41" HEIGHT="39" ALIGN="MIDDLE" BORDER="0"
SRC="img58.png"
ALT="$M^k_{AS}$"> is defined as
</FONT></FONT></FONT>
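<P></P>
<DIV ALIGN="CENTER">
\begin{displaymath}
M^k_{AS} = \sum_{i=1}^{m_k} P_i^k \, (A_i^k)^{-1} R_i^k ,
\end{displaymath}
</DIV>
<P></P><FONT SIZE="+1"><FONT SIZE="+1"><FONT SIZE="+1">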
where <IMG
WIDTH="26" HEIGHT="39" ALIGN="MIDDLE" BORDER="0"
SRC="img60.png"
ALT="$A_i^k$"> is assumed to be nonsingular. Note that an approximate
inverse of <IMG
WIDTH="26" HEIGHT="39" ALIGN="MIDDLE" BORDER="0"
SRC="img60.png"
ALT="$A_i^k$"> is usually considered instead of <IMG
WIDTH="57" HEIGHT="39" ALIGN="MIDDLE" BORDER="0"
SRC="img61.png"
ALT="$(A_i^k)^{-1}$">.
The setup of <IMG
WIDTH="41" HEIGHT="39" ALIGN="MIDDLE" BORDER="0"
SRC="img58.png"
ALT="$M^k_{AS}$"> during the multilevel build phase
involves
</FONT></FONT></FONT>
<UL>
<LI>the definition of the index subsets <IMG
WIDTH="25" HEIGHT="39" ALIGN="MIDDLE" BORDER="0"
SRC="img62.png"
ALT="$\Omega_i^k$"> and of the corresponding
operators <IMG
WIDTH="26" HEIGHT="39" ALIGN="MIDDLE" BORDER="0"
SRC="img63.png"
ALT="$R_i^k$"> (and <IMG
WIDTH="26" HEIGHT="39" ALIGN="MIDDLE" BORDER="0"
SRC="img64.png"
ALT="$P_i^k$">);
</LI>
<LI>the computation of the submatrices <IMG
WIDTH="26" HEIGHT="39" ALIGN="MIDDLE" BORDER="0"
SRC="img60.png"
ALT="$A_i^k$">;
</LI>
<LI>the computation of their inverses (usually approximated
through some form of incomplete factorization).
</LI>
</UL><FONT SIZE="+1"><FONT SIZE="+1"><FONT SIZE="+1">
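<P>The first two setup steps can be sketched on a small dense example. The helper names (<code>restriction_matrix</code>, <code>submatrix</code>, <code>matmul</code>, <code>transpose</code>), the test matrix, and the choice of subsets are hypothetical and serve only to show that the triple product R A P with P equal to the transpose of R amounts to selecting rows and columns; the library itself works on distributed sparse matrices.</P>

```python
def restriction_matrix(idx, n):
    """Explicit restriction operator: row r has a single 1 in column idx[r]."""
    return [[1.0 if j == idx[r] else 0.0 for j in range(n)]
            for r in range(len(idx))]


def matmul(X, Y):
    """Dense matrix product of two lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]


def transpose(X):
    """Transpose of a dense matrix stored as a list of rows."""
    return [list(col) for col in zip(*X)]


def submatrix(A, idx):
    """Local matrix obtained by selecting the rows and columns in idx."""
    return [[A[r][c] for c in idx] for r in idx]


# Assumed 4x4 example matrix and two overlapping index subsets.
A = [[4.0, -1.0, 0.0, 0.0],
     [-1.0, 4.0, -1.0, 0.0],
     [0.0, -1.0, 4.0, -1.0],
     [0.0, 0.0, -1.0, 4.0]]
subsets = [[0, 1, 2], [1, 2, 3]]

# Step 1: restriction operators; step 2: local submatrices.
restrictions = [restriction_matrix(idx, len(A)) for idx in subsets]
local_blocks = [submatrix(A, idx) for idx in subsets]
```

<P>In practice the operators are never formed explicitly; index selection gives the same local matrices at a much lower cost.</P>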
The computation of <!-- MATH
$z^k=M^k_{AS}w^k$
-->
<IMG
WIDTH="102" HEIGHT="39" ALIGN="MIDDLE" BORDER="0"
SRC="img65.png"
ALT="$z^k=M^k_{AS}w^k$">, with <!-- MATH
$w^k \in \mathbb{R}^{n_k}$
-->
<IMG
WIDTH="76" HEIGHT="39" ALIGN="MIDDLE" BORDER="0"
SRC="img66.png"
ALT="$w^k \in \mathbb{R}^{n_k}$">, during the
multilevel application phase, requires
</FONT></FONT></FONT>
<UL>
<LI>the restriction of <IMG
WIDTH="25" HEIGHT="19" ALIGN="BOTTOM" BORDER="0"
SRC="img67.png"
ALT="$w^k$"> to the subspaces <!-- MATH
$\mathbb{R}^{n_{k,i}}$
-->
<IMG
WIDTH="41" HEIGHT="15" ALIGN="BOTTOM" BORDER="0"
SRC="img68.png"
ALT="$\mathbb{R}^{n_{k,i}}$">,
i.e. <!-- MATH
$w_i^k = R_i^{k} w^k$
-->
<IMG
WIDTH="91" HEIGHT="39" ALIGN="MIDDLE" BORDER="0"
SRC="img69.png"
ALT="$w_i^k = R_i^{k} w^k$">;
</LI>
<LI>the computation of the vectors <!-- MATH
$z_i^k=(A_i^k)^{-1} w_i^k$
-->
<IMG
WIDTH="119" HEIGHT="39" ALIGN="MIDDLE" BORDER="0"
SRC="img70.png"
ALT="$z_i^k=(A_i^k)^{-1} w_i^k$">;
</LI>
<LI>the prolongation and the sum of the previous vectors,
i.e. <!-- MATH
$z^k = \sum_{i=1}^{m_k} P_i^k z_i^k$
-->
<IMG
WIDTH="127" HEIGHT="39" ALIGN="MIDDLE" BORDER="0"
SRC="img71.png"
ALT="$z^k = \sum_{i=1}^{m_k} P_i^k z_i^k$">.
</LI>
</UL><FONT SIZE="+1"><FONT SIZE="+1"><FONT SIZE="+1">
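<P>The three application steps (restriction, local solve, prolongation and sum) can be sketched as follows. This is an illustration under simplifying assumptions, not the library code: the names <code>solve_dense</code> and <code>apply_as</code> are hypothetical, matrices are dense lists of rows, and each local system is solved exactly by Gaussian elimination, whereas an approximate inverse is usually applied in practice.</P>

```python
def solve_dense(M, rhs):
    """Solve M y = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(M[r]) + [rhs[r]] for r in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    y = [0.0] * n
    for r in reversed(range(n)):
        s = aug[r][n] - sum(aug[r][k] * y[k] for k in range(r + 1, n))
        y[r] = s / aug[r][r]
    return y


def apply_as(A, subsets, w):
    """Apply the classical AS preconditioner to w (illustrative sketch)."""
    z = [0.0] * len(w)
    for idx in subsets:
        w_i = [w[j] for j in idx]                     # restriction of w
        A_i = [[A[r][c] for c in idx] for r in idx]   # local submatrix
        z_i = solve_dense(A_i, w_i)                   # exact local solve
        for pos, j in enumerate(idx):                 # prolongation and sum
            z[j] += z_i[pos]
    return z


# Assumed example data.
A = [[4.0, -1.0, 0.0, 0.0],
     [-1.0, 4.0, -1.0, 0.0],
     [0.0, -1.0, 4.0, -1.0],
     [0.0, 0.0, -1.0, 4.0]]
w = [2.0, 4.0, 6.0, 13.0]

z_block_jacobi = apply_as(A, [[0, 1], [2, 3]], w)   # no overlap
z_overlap = apply_as(A, [[0, 1, 2], [1, 2, 3]], w)  # overlapping subsets
```

<P>With non-overlapping subsets the result coincides with a block-Jacobi application; with overlapping subsets the contributions on shared indices are summed.</P>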
Variants of the classical AS method, which use modifications of the
restriction and prolongation operators, are also implemented in MLD2P4.
Among them, the Restricted AS (RAS) preconditioner usually
outperforms the classical AS preconditioner in terms of convergence
rate and of computation and communication time on parallel distributed-memory
computers, and is therefore the most widely used among the AS
preconditioners [<A
HREF="node29.html#CAI_SARKIS">6</A>].
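<P>For reference, the RAS variant can be written in the notation used above. The tilde symbols are introduced here for illustration and are not part of the original text: the restricted prolongation is assumed to be obtained from the prolongation operator by zeroing the columns associated with the overlap indices, so that every index of the level is updated by exactly one subdomain.</P>

```latex
% Sketch under the stated assumptions: \tilde{\Omega}_i^k \subseteq \Omega_i^k
% are non-overlapping subsets whose union is \Omega^k.
\begin{displaymath}
M^k_{RAS} = \sum_{i=1}^{m_k} \tilde{P}_i^k \, (A_i^k)^{-1} R_i^k ,
\qquad
\tilde{P}_i^k \in \mathbb{R}^{n_k \times n_{k,i}},
\end{displaymath}
% where \tilde{P}_i^k coincides with P_i^k on the columns associated with
% \tilde{\Omega}_i^k and is zero on the remaining (overlap) columns.
```

<P>Under this formulation the prolongation and sum step only writes to indices owned by each subdomain, which is consistent with the reduced communication time mentioned above.</P>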
Direct solvers based on sparse LU factorizations, implemented in the
third-party libraries reported in Section <A HREF="node7.html#sec:third-party">3.2</A>, can be applied
as coarsest-level solvers by MLD2P4. Native inexact solvers based on
incomplete LU factorizations, as well as Jacobi, hybrid (forward) Gauss-Seidel,
and block Jacobi preconditioners are also available. Direct solvers usually
lead to more effective preconditioners in terms of algorithmic scalability;
however, this does not guarantee parallel efficiency.