<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"  
  "http://www.w3.org/TR/html4/loose.dtd">  
<html > 
<head><title>Smoothers and coarsest-level solvers</title> 
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"> 
<meta name="generator" content="TeX4ht (http://www.tug.org/tex4ht/)"> 
<meta name="originator" content="TeX4ht (http://www.tug.org/tex4ht/)"> 
<!-- html,3 --> 
<meta name="src" content="userhtml.tex"> 
<link rel="stylesheet" type="text/css" href="userhtml.css"> 
</head><body 
>
   <!--l. 216--><div class="crosslinks"><p class="noindent"><span 
class="cmr-12">[</span><a 
href="userhtmlsu7.html" ><span 
class="cmr-12">prev</span></a><span 
class="cmr-12">] [</span><a 
href="userhtmlsu7.html#tailuserhtmlsu7.html" ><span 
class="cmr-12">prev-tail</span></a><span 
class="cmr-12">] [</span><a 
href="#tailuserhtmlsu8.html"><span 
class="cmr-12">tail</span></a><span 
class="cmr-12">] [</span><a 
href="userhtmlse4.html#userhtmlsu8.html" ><span 
class="cmr-12">up</span></a><span 
class="cmr-12">] </span></p></div>
   <h4 class="subsectionHead"><span class="titlemark"><span 
class="cmr-12">4.3   </span></span> <a 
 id="x16-150004.3"></a><span 
class="cmr-12">Smoothers and coarsest-level solvers</span></h4>
<!--l. 218--><p class="noindent" ><span 
class="cmr-12">The smoothers implemented in MLD2P4 include the Jacobi and block-Jacobi methods,</span>
<span 
class="cmr-12">a hybrid version of the forward and backward Gauss-Seidel methods, and the additive</span>
<span 
class="cmr-12">Schwarz (AS) ones (see, e.g., </span><span class="cite"><span 
class="cmr-12">[</span><a 
href="userhtmlli4.html#XSaad_book"><span 
class="cmr-12">21</span></a><span 
class="cmr-12">,</span><span 
class="cmr-12">&#x00A0;</span><a 
href="userhtmlli4.html#Xdd2_96"><span 
class="cmr-12">22</span></a><span 
class="cmr-12">]</span></span><span 
class="cmr-12">).</span>
<!--l. 222--><p class="indent" >   <span 
class="cmr-12">The hybrid Gauss-Seidel version is considered because the original Gauss-Seidel</span>
<span 
class="cmr-12">method is inherently sequential. At each iteration of the hybrid version, each parallel</span>
<span 
class="cmr-12">process uses the most recent values of its own local variables and the values</span>
<span 
class="cmr-12">of the non-local variables computed at the previous iteration, obtained by</span>
<span 
class="cmr-12">exchanging data with other processes before the beginning of the current</span>
<span 
class="cmr-12">iteration.</span>
<!--l. 229--><p class="indent" >   <span 
class="cmr-12">In the AS methods, the index space &#x03A9;</span><sup><span 
class="cmmi-8">k</span></sup> <span 
class="cmr-12">is divided into </span><span 
class="cmmi-12">m</span><sub>
<span 
class="cmmi-8">k</span></sub> <span 
class="cmr-12">subsets &#x03A9;</span><sub><span 
class="cmmi-8">i</span></sub><sup><span 
class="cmmi-8">k</span></sup> <span 
class="cmr-12">of</span>
<span 
class="cmr-12">size </span><span 
class="cmmi-12">n</span><sub><span 
class="cmmi-8">k,i</span></sub><span 
class="cmr-12">, possibly overlapping. For each </span><span 
class="cmmi-12">i </span><span 
class="cmr-12">we consider the restriction operator</span>
<span 
class="cmmi-12">R</span><sub><span 
class="cmmi-8">i</span></sub><sup><span 
class="cmmi-8">k</span></sup> <span 
class="cmsy-10x-x-120">&#x2208; </span><span 
class="msbm-10x-x-120">&#x211D;</span><sup><span 
class="cmmi-8">n</span><sub><span 
class="cmmi-6">k,i</span></sub><span 
class="cmsy-8">&#x00D7;</span><span 
class="cmmi-8">n</span><sub><span 
class="cmmi-6">k</span></sub></sup> <span 
class="cmr-12">that maps a vector </span><span 
class="cmmi-12">x</span><sup><span 
class="cmmi-8">k</span></sup> <span 
class="cmr-12">to the vector </span><span 
class="cmmi-12">x</span><sub>
<span 
class="cmmi-8">i</span></sub><sup><span 
class="cmmi-8">k</span></sup> <span 
class="cmr-12">made of the components of</span>
<span 
class="cmmi-12">x</span><sup><span 
class="cmmi-8">k</span></sup> <span 
class="cmr-12">with indices in &#x03A9;</span><sub>
<span 
class="cmmi-8">i</span></sub><sup><span 
class="cmmi-8">k</span></sup><span 
class="cmr-12">, and the prolongation operator </span><span 
class="cmmi-12">P</span><sub>
<span 
class="cmmi-8">i</span></sub><sup><span 
class="cmmi-8">k</span></sup> <span 
class="cmr-12">= (</span><span 
class="cmmi-12">R</span><sub>
<span 
class="cmmi-8">i</span></sub><sup><span 
class="cmmi-8">k</span></sup><span 
class="cmr-12">)</span><sup><span 
class="cmmi-8">T</span> </sup><span 
class="cmr-12">. These</span>
<span 
class="cmr-12">operators are then used to build </span><span 
class="cmmi-12">A</span><sub><span 
class="cmmi-8">i</span></sub><sup><span 
class="cmmi-8">k</span></sup> <span 
class="cmr-12">= </span><span 
class="cmmi-12">R</span><sub>
<span 
class="cmmi-8">i</span></sub><sup><span 
class="cmmi-8">k</span></sup><span 
class="cmmi-12">A</span><sup><span 
class="cmmi-8">k</span></sup><span 
class="cmmi-12">P</span><sub>
<span 
class="cmmi-8">i</span></sub><sup><span 
class="cmmi-8">k</span></sup><span 
class="cmr-12">, which is the restriction of</span>
<span 
class="cmmi-12">A</span><sup><span 
class="cmmi-8">k</span></sup> <span 
class="cmr-12">to the index space &#x03A9;</span><sub>
<span 
class="cmmi-8">i</span></sub><sup><span 
class="cmmi-8">k</span></sup><span 
class="cmr-12">. The classical AS preconditioner </span><span 
class="cmmi-12">M</span><sub>
<span 
class="cmmi-8">AS</span></sub><sup><span 
class="cmmi-8">k</span></sup> <span 
class="cmr-12">is defined</span>
<span 
class="cmr-12">as</span>
   <center class="math-display" >
<img 
src="userhtml18x.png" alt="            m&#x2211;k
(M AkS )-1 =     Pki (Aki)-1Rki,
            i=1
" class="math-display" ></center>
<!--l. 241--><p class="nopar" > <span 
class="cmr-12">where </span><span 
class="cmmi-12">A</span><sub><span 
class="cmmi-8">i</span></sub><sup><span 
class="cmmi-8">k</span></sup> <span 
class="cmr-12">is supposed to be nonsingular. We observe that an approximate inverse of</span>
<span 
class="cmmi-12">A</span><sub><span 
class="cmmi-8">i</span></sub><sup><span 
class="cmmi-8">k</span></sup> <span 
class="cmr-12">is usually considered instead of (</span><span 
class="cmmi-12">A</span><sub>
<span 
class="cmmi-8">i</span></sub><sup><span 
class="cmmi-8">k</span></sup><span 
class="cmr-12">)</span><sup><span 
class="cmsy-8">-</span><span 
class="cmr-8">1</span></sup><span 
class="cmr-12">. The setup of </span><span 
class="cmmi-12">M</span><sub>
<span 
class="cmmi-8">AS</span></sub><sup><span 
class="cmmi-8">k</span></sup> <span 
class="cmr-12">during the multilevel</span>
<span 
class="cmr-12">build phase involves</span>
     <ul class="itemize1">
     <li class="itemize"><span 
class="cmr-12">the definition of the index subspaces &#x03A9;</span><sub><span 
class="cmmi-8">i</span></sub><sup><span 
class="cmmi-8">k</span></sup> <span 
class="cmr-12">and of the corresponding operators</span>
     <span 
class="cmmi-12">R</span><sub><span 
class="cmmi-8">i</span></sub><sup><span 
class="cmmi-8">k</span></sup> <span 
class="cmr-12">(and </span><span 
class="cmmi-12">P</span><sub>
<span 
class="cmmi-8">i</span></sub><sup><span 
class="cmmi-8">k</span></sup><span 
class="cmr-12">);</span>
     </li>
     <li class="itemize"><span 
class="cmr-12">the computation of the submatrices </span><span 
class="cmmi-12">A</span><sub><span 
class="cmmi-8">i</span></sub><sup><span 
class="cmmi-8">k</span></sup><span 
class="cmr-12">;</span>
     </li>
     <li class="itemize"><span 
class="cmr-12">the computation of their inverses (usually approximated through some form</span>
     <span 
class="cmr-12">of incomplete factorization).</span></li></ul>
                                                                               

                                                                               
<!--l. 253--><p class="noindent" ><span 
class="cmr-12">The computation of </span><span 
class="cmmi-12">z</span><sup><span 
class="cmmi-8">k</span></sup> <span 
class="cmr-12">= </span><span 
class="cmmi-12">M</span><sub>
<span 
class="cmmi-8">AS</span></sub><sup><span 
class="cmmi-8">k</span></sup><span 
class="cmmi-12">w</span><sup><span 
class="cmmi-8">k</span></sup><span 
class="cmr-12">, with </span><span 
class="cmmi-12">w</span><sup><span 
class="cmmi-8">k</span></sup> <span 
class="cmsy-10x-x-120">&#x2208; </span><span 
class="msbm-10x-x-120">&#x211D;</span><sup><span 
class="cmmi-8">n</span><sub><span 
class="cmmi-6">k</span></sub></sup><span 
class="cmr-12">, during the multilevel application</span>
<span 
class="cmr-12">phase, requires</span>
     <ul class="itemize1">
     <li class="itemize"><span 
class="cmr-12">the restriction of </span><span 
class="cmmi-12">w</span><sup><span 
class="cmmi-8">k</span></sup> <span 
class="cmr-12">to the subspaces </span><span 
class="msbm-10x-x-120">&#x211D;</span><sup><span 
class="cmmi-8">n</span><sub><span 
class="cmmi-6">k,i</span></sub></sup><span 
class="cmr-12">, i.e.</span><span 
class="cmr-12">&#x00A0;</span><span 
class="cmmi-12">w</span><sub>
<span 
class="cmmi-8">i</span></sub><sup><span 
class="cmmi-8">k</span></sup> <span 
class="cmr-12">= </span><span 
class="cmmi-12">R</span><sub>
<span 
class="cmmi-8">i</span></sub><sup><span 
class="cmmi-8">k</span></sup><span 
class="cmmi-12">w</span><sup><span 
class="cmmi-8">k</span></sup><span 
class="cmr-12">;</span>
     </li>
     <li class="itemize"><span 
class="cmr-12">the computation of the vectors </span><span 
class="cmmi-12">z</span><sub><span 
class="cmmi-8">i</span></sub><sup><span 
class="cmmi-8">k</span></sup> <span 
class="cmr-12">= (</span><span 
class="cmmi-12">A</span><sub>
<span 
class="cmmi-8">i</span></sub><sup><span 
class="cmmi-8">k</span></sup><span 
class="cmr-12">)</span><sup><span 
class="cmsy-8">-</span><span 
class="cmr-8">1</span></sup><span 
class="cmmi-12">w</span><sub>
<span 
class="cmmi-8">i</span></sub><sup><span 
class="cmmi-8">k</span></sup><span 
class="cmr-12">;</span>
     </li>
     <li class="itemize"><span 
class="cmr-12">the  prolongation  and  the  sum  of  the  previous  vectors,  i.e.</span><span 
class="cmr-12">&#x00A0;</span><span 
class="cmmi-12">z</span><sup><span 
class="cmmi-8">k</span></sup>    <span 
class="cmr-12">=</span>
     <span 
class="cmex-10x-x-120">&#x2211;</span>
        <sub><span 
class="cmmi-8">i</span><span 
class="cmr-8">=1</span></sub><sup><span 
class="cmmi-8">m</span><sub><span 
class="cmmi-6">k</span></sub></sup><span 
class="cmmi-12">P</span><sub>
<span 
class="cmmi-8">i</span></sub><sup><span 
class="cmmi-8">k</span></sup><span 
class="cmmi-12">z</span><sub>
<span 
class="cmmi-8">i</span></sub><sup><span 
class="cmmi-8">k</span></sup><span 
class="cmr-12">.</span></li></ul>
<!--l. 262--><p class="noindent" ><span 
class="cmr-12">Variants of the classical AS method, which use modifications of the restriction and</span>
<span 
class="cmr-12">prolongation operators, are also implemented in MLD2P4. Among them, the Restricted</span>
<span 
class="cmr-12">AS (RAS) preconditioner usually outperforms the classical AS preconditioner in terms</span>
<span 
class="cmr-12">of convergence rate and of computation and communication time on parallel</span>
<span 
class="cmr-12">distributed-memory computers, and is therefore the most widely used among the AS</span>
<span 
class="cmr-12">preconditioners</span><span 
class="cmr-12">&#x00A0;</span><span class="cite"><span 
class="cmr-12">[</span><a 
href="userhtmlli4.html#XCAI_SARKIS"><span 
class="cmr-12">6</span></a><span 
class="cmr-12">]</span></span><span 
class="cmr-12">.</span>
<!--l. 270--><p class="indent" >   <span 
class="cmr-12">Direct solvers based on sparse LU factorizations, implemented in the third-party</span>
<span 
class="cmr-12">libraries reported in Section</span><span 
class="cmr-12">&#x00A0;</span><a 
href="userhtmlsu2.html#x9-80003.2"><span 
class="cmr-12">3.2</span><!--tex4ht:ref: sec:third-party --></a><span 
class="cmr-12">, can be applied as coarsest-level solvers by</span>
<span 
class="cmr-12">MLD2P4. Native inexact solvers based on incomplete LU factorizations, as well as</span>
<span 
class="cmr-12">Jacobi, hybrid (forward) Gauss-Seidel, and block Jacobi preconditioners are</span>
<span 
class="cmr-12">also available. Direct solvers usually lead to more effective preconditioners in</span>
<span 
class="cmr-12">terms of algorithmic scalability; however, this does not guarantee parallel</span>
<span 
class="cmr-12">efficiency.</span>
                                                                               

                                                                               
                                                                               

                                                                               
   <!--l. 1--><div class="crosslinks"><p class="noindent"><span 
class="cmr-12">[</span><a 
href="userhtmlsu7.html" ><span 
class="cmr-12">prev</span></a><span 
class="cmr-12">] [</span><a 
href="userhtmlsu7.html#tailuserhtmlsu7.html" ><span 
class="cmr-12">prev-tail</span></a><span 
class="cmr-12">] [</span><a 
href="userhtmlsu8.html" ><span 
class="cmr-12">front</span></a><span 
class="cmr-12">] [</span><a 
href="userhtmlse4.html#userhtmlsu8.html" ><span 
class="cmr-12">up</span></a><span 
class="cmr-12">] </span></p></div>
<!--l. 1--><p class="indent" >   <a 
 id="tailuserhtmlsu8.html"></a>  
</body></html>