<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<html >
<head><title>Smoothers and coarsest-level solvers</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<meta name="generator" content="TeX4ht (http://www.tug.org/tex4ht/)">
<meta name="originator" content="TeX4ht (http://www.tug.org/tex4ht/)">
<!-- html,3 -->
<meta name="src" content="userhtml.tex">
<link rel="stylesheet" type="text/css" href="userhtml.css">
</head><body
>
<h4 class="subsectionHead"><span class="titlemark"><span
class="cmbx-12">4.3 </span></span> <a
id="x16-150004.3"></a><span
class="cmbx-12">Smoothers and coarsest-level solvers</span></h4>
<!--l. 218--><p class="noindent" ><span
class="cmbx-12">The smoothers implemented in MLD2P4 include the Jacobi and</span>
<span
class="cmbx-12">block-Jacobi methods, a hybrid version of the forward and backward</span>
<span
class="cmbx-12">Gauss-Seidel methods, and additive Schwarz (AS) methods (see, e.g.,</span>
<span class="cite"><span
class="cmbx-12">[</span><a
href="userhtmlli4.html#XSaad_book"><span
class="cmbx-12">23</span></a><span
class="cmbx-12">,</span><span
class="cmbx-12">&#x00A0;</span><a
href="userhtmlli4.html#Xdd2_96"><span
class="cmbx-12">24</span></a><span
class="cmbx-12">]</span></span><span
class="cmbx-12">).</span>
<!--l. 222--><p class="indent" > <span
class="cmbx-12">The hybrid Gauss-Seidel version is considered because the original</span>
<span
class="cmbx-12">Gauss-Seidel method is inherently sequential. At each iteration of the</span>
<span
class="cmbx-12">hybrid version, each parallel process uses the most recent values of its own</span>
<span
class="cmbx-12">local variables and the values of the non-local variables computed at the</span>
<span
class="cmbx-12">previous iteration, obtained by exchanging data with other processes before</span>
<span
class="cmbx-12">the beginning of the current iteration.</span>
|
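<p class="indent" >The update rule described above can be sketched in a few lines of Python. This is only an illustrative, serial simulation and not MLD2P4 code: the function name <span class="cmtt-12">hybrid_gauss_seidel</span>, the dense list-of-rows matrix, and the <span class="cmtt-12">owners</span> argument (the row indices assigned to each simulated process) are assumptions made for the example. Copying the vector at the start of a sweep stands in for the data exchange between processes.</p>

```python
def hybrid_gauss_seidel(A, b, x, owners, sweeps=1):
    """Forward hybrid Gauss-Seidel on a dense matrix A (list of rows).

    owners[p] lists the row indices owned by simulated process p
    (a hypothetical layout chosen for this sketch).
    """
    x = list(x)
    for _ in range(sweeps):
        # Snapshot of the previous iterate: plays the role of the values
        # received from other processes before the sweep starts.
        x_old = list(x)
        for rows in owners:
            local = set(rows)
            for i in rows:
                s = b[i]
                for j, a_ij in enumerate(A[i]):
                    if j == i:
                        continue
                    # Local unknowns use the newest value, non-local
                    # unknowns use the value from the previous sweep.
                    s -= a_ij * (x[j] if j in local else x_old[j])
                x[i] = s / A[i][i]
    return x
```

<p class="indent" >For the 4-by-4 tridiagonal matrix with 2 on the diagonal and -1 on the off-diagonals, right-hand side (1, 0, 0, 1), a zero initial guess, and two simulated processes owning rows {0, 1} and {2, 3}, one sweep returns (0.5, 0.25, 0.0, 0.5). Further sweeps approach the exact solution (1, 1, 1, 1).</p>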
<!--l. 229--><p class="indent" > <span
class="cmbx-12">In AS methods, the index space </span><span
class="cmr-12">Ω</span><sup><span
class="cmmi-8">k</span></sup> <span
class="cmbx-12">is divided into </span><span
class="cmmi-12">m</span><sub>
<span
class="cmmi-8">k</span></sub> <span
class="cmbx-12">subsets </span><span
class="cmr-12">Ω</span><sub><span
class="cmmi-8">i</span></sub><sup><span
class="cmmi-8">k</span></sup> <span
class="cmbx-12">of</span>
<span
class="cmbx-12">size </span><span
class="cmmi-12">n</span><sub><span
class="cmmi-8">k,i</span></sub><span
class="cmbx-12">, possibly overlapping. For each </span><span
class="cmmi-12">i </span><span
class="cmbx-12">we consider the restriction</span>
<span
class="cmbx-12">operator </span><span
class="cmmi-12">R</span><sub><span
class="cmmi-8">i</span></sub><sup><span
class="cmmi-8">k</span></sup> <span
class="cmsy-10x-x-120">∈ </span><span
class="msbm-10x-x-120">ℝ</span><sup><span
class="cmmi-8">n</span><sub><span
class="cmmi-6">k,i</span></sub><span
class="cmsy-8">×</span><span
class="cmmi-8">n</span><sub><span
class="cmmi-6">k</span></sub></sup> <span
class="cmbx-12">that maps a vector </span><span
class="cmmi-12">x</span><sup><span
class="cmmi-8">k</span></sup> <span
class="cmbx-12">to the vector </span><span
class="cmmi-12">x</span><sub>
<span
class="cmmi-8">i</span></sub><sup><span
class="cmmi-8">k</span></sup> <span
class="cmbx-12">made of the</span>
<span
class="cmbx-12">components of </span><span
class="cmmi-12">x</span><sup><span
class="cmmi-8">k</span></sup> <span
class="cmbx-12">with indices in </span><span
class="cmr-12">Ω</span><sub>
<span
class="cmmi-8">i</span></sub><sup><span
class="cmmi-8">k</span></sup><span
class="cmbx-12">, and the prolongation operator</span>
<span
class="cmmi-12">P</span><sub><span
class="cmmi-8">i</span></sub><sup><span
class="cmmi-8">k</span></sup> <span
class="cmr-12">= (</span><span
class="cmmi-12">R</span><sub>
<span
class="cmmi-8">i</span></sub><sup><span
class="cmmi-8">k</span></sup><span
class="cmr-12">)</span><sup><span
class="cmmi-8">T</span> </sup><span
class="cmbx-12">. These operators are then used to build </span><span
class="cmmi-12">A</span><sub>
<span
class="cmmi-8">i</span></sub><sup><span
class="cmmi-8">k</span></sup> <span
class="cmr-12">= </span><span
class="cmmi-12">R</span><sub>
<span
class="cmmi-8">i</span></sub><sup><span
class="cmmi-8">k</span></sup><span
class="cmmi-12">A</span><sup><span
class="cmmi-8">k</span></sup><span
class="cmmi-12">P</span><sub>
<span
class="cmmi-8">i</span></sub><sup><span
class="cmmi-8">k</span></sup><span
class="cmbx-12">, which is</span>
<span
class="cmbx-12">the restriction of </span><span
class="cmmi-12">A</span><sup><span
class="cmmi-8">k</span></sup> <span
class="cmbx-12">to the index subset </span><span
class="cmr-12">Ω</span><sub>
<span
class="cmmi-8">i</span></sub><sup><span
class="cmmi-8">k</span></sup><span
class="cmbx-12">. The classical AS preconditioner</span>
<span
class="cmmi-12">M</span><sub><span
class="cmmi-8">AS</span></sub><sup><span
class="cmmi-8">k</span></sup> <span
class="cmbx-12">is defined as</span>
<center class="math-display" >
<img
src="userhtml18x.png" alt="(M_{AS}^k)^{-1} = \sum_{i=1}^{m_k} P_i^k (A_i^k)^{-1} R_i^k," class="math-display" ></center>
<!--l. 241--><p class="nopar" > <span
class="cmbx-12">where </span><span
class="cmmi-12">A</span><sub><span
class="cmmi-8">i</span></sub><sup><span
class="cmmi-8">k</span></sup> <span
class="cmbx-12">is assumed to be nonsingular. In practice, an approximate</span>
<span
class="cmbx-12">inverse of </span><span
class="cmmi-12">A</span><sub><span
class="cmmi-8">i</span></sub><sup><span
class="cmmi-8">k</span></sup> <span
class="cmbx-12">is usually used instead of </span><span
class="cmr-12">(</span><span
class="cmmi-12">A</span><sub>
<span
class="cmmi-8">i</span></sub><sup><span
class="cmmi-8">k</span></sup><span
class="cmr-12">)</span><sup><span
class="cmsy-8">-</span><span
class="cmr-8">1</span></sup><span
class="cmbx-12">. The setup of </span><span
class="cmmi-12">M</span><sub>
<span
class="cmmi-8">AS</span></sub><sup><span
class="cmmi-8">k</span></sup>
<span
class="cmbx-12">during the multilevel build phase involves</span>
<ul class="itemize1">
<li class="itemize"><span
class="cmbx-12">the definition of the index subsets </span><span
class="cmr-12">Ω</span><sub><span
class="cmmi-8">i</span></sub><sup><span
class="cmmi-8">k</span></sup> <span
class="cmbx-12">and of the corresponding</span>
<span
class="cmbx-12">operators </span><span
class="cmmi-12">R</span><sub><span
class="cmmi-8">i</span></sub><sup><span
class="cmmi-8">k</span></sup> <span
class="cmbx-12">(and </span><span
class="cmmi-12">P</span><sub>
<span
class="cmmi-8">i</span></sub><sup><span
class="cmmi-8">k</span></sup><span
class="cmbx-12">);</span>
</li>
<li class="itemize"><span
class="cmbx-12">the computation of the submatrices </span><span
class="cmmi-12">A</span><sub><span
class="cmmi-8">i</span></sub><sup><span
class="cmmi-8">k</span></sup><span
class="cmbx-12">;</span>
</li>
<li class="itemize"><span
class="cmbx-12">the computation of their inverses (usually approximated through</span>
<span
class="cmbx-12">some form of incomplete factorization).</span></li></ul>
|
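<p class="indent" >The first two setup steps can be illustrated with a short Python sketch. It assumes a dense matrix stored as a list of rows and contiguous index blocks widened by a fixed overlap. The helper names <span class="cmtt-12">overlapping_subsets</span> and <span class="cmtt-12">restrict_matrix</span> are hypothetical and are not part of MLD2P4. The third step, the exact or incomplete factorization of each submatrix, is left out.</p>

```python
def overlapping_subsets(n, m, overlap):
    """Split range(n) into m contiguous blocks, each widened by `overlap`.

    The contiguous layout is an assumption of this sketch; any family of
    index subsets covering range(n) could be used instead.
    """
    bounds = [n * p // m for p in range(m + 1)]
    return [list(range(max(0, bounds[p] - overlap),
                       min(n, bounds[p + 1] + overlap)))
            for p in range(m)]


def restrict_matrix(A, idx):
    """Return A_i = R_i A P_i, i.e. the rows and columns of A listed in idx."""
    return [[A[r][c] for c in idx] for r in idx]
```

<p class="indent" >Applying the restriction operator to a matrix on both sides amounts to selecting the rows and columns whose indices lie in the subset, so no explicit operator matrices need to be stored.</p>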
<!--l. 253--><p class="noindent" ><span
class="cmbx-12">The computation of </span><span
class="cmmi-12">z</span><sup><span
class="cmmi-8">k</span></sup> <span
class="cmr-12">= (</span><span
class="cmmi-12">M</span><sub>
<span
class="cmmi-8">AS</span></sub><sup><span
class="cmmi-8">k</span></sup><span
class="cmr-12">)</span><sup><span
class="cmsy-8">-</span><span
class="cmr-8">1</span></sup><span
class="cmmi-12">w</span><sup><span
class="cmmi-8">k</span></sup><span
class="cmbx-12">, with </span><span
class="cmmi-12">w</span><sup><span
class="cmmi-8">k</span></sup> <span
class="cmsy-10x-x-120">∈ </span><span
class="msbm-10x-x-120">ℝ</span><sup><span
class="cmmi-8">n</span><sub><span
class="cmmi-6">k</span></sub></sup><span
class="cmbx-12">, during the multilevel</span>
<span
class="cmbx-12">application phase, requires</span>
<ul class="itemize1">
<li class="itemize"><span
class="cmbx-12">the restriction of </span><span
class="cmmi-12">w</span><sup><span
class="cmmi-8">k</span></sup> <span
class="cmbx-12">to the subspaces </span><span
class="msbm-10x-x-120">ℝ</span><sup><span
class="cmmi-8">n</span><sub><span
class="cmmi-6">k,i</span></sub></sup><span
class="cmbx-12">, i.e.</span><span
class="cmbx-12">&#x00A0;</span><span
class="cmmi-12">w</span><sub>
<span
class="cmmi-8">i</span></sub><sup><span
class="cmmi-8">k</span></sup> <span
class="cmr-12">= </span><span
class="cmmi-12">R</span><sub>
<span
class="cmmi-8">i</span></sub><sup><span
class="cmmi-8">k</span></sup><span
class="cmmi-12">w</span><sup><span
class="cmmi-8">k</span></sup><span
class="cmbx-12">;</span>
</li>
<li class="itemize"><span
class="cmbx-12">the computation of the vectors </span><span
class="cmmi-12">z</span><sub><span
class="cmmi-8">i</span></sub><sup><span
class="cmmi-8">k</span></sup> <span
class="cmr-12">= (</span><span
class="cmmi-12">A</span><sub>
<span
class="cmmi-8">i</span></sub><sup><span
class="cmmi-8">k</span></sup><span
class="cmr-12">)</span><sup><span
class="cmsy-8">-</span><span
class="cmr-8">1</span></sup><span
class="cmmi-12">w</span><sub>
<span
class="cmmi-8">i</span></sub><sup><span
class="cmmi-8">k</span></sup><span
class="cmbx-12">;</span>
</li>
<li class="itemize"><span
class="cmbx-12">the prolongation and summation of these vectors, i.e.</span><span
class="cmbx-12">&#x00A0;</span><span
class="cmmi-12">z</span><sup><span
class="cmmi-8">k</span></sup> <span
class="cmr-12">=</span>
<span
class="cmex-10x-x-120">∑</span>
<sub><span
class="cmmi-8">i</span><span
class="cmr-8">=1</span></sub><sup><span
class="cmmi-8">m</span><sub><span
class="cmmi-6">k</span></sub></sup><span
class="cmmi-12">P</span><sub>
<span
class="cmmi-8">i</span></sub><sup><span
class="cmmi-8">k</span></sup><span
class="cmmi-12">z</span><sub>
<span
class="cmmi-8">i</span></sub><sup><span
class="cmmi-8">k</span></sup><span
class="cmbx-12">.</span></li></ul>
|
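<p class="indent" >The three application steps can be sketched as follows. The code is an illustrative serial version under stated assumptions, not the MLD2P4 implementation: the matrix is dense, the hypothetical helper <span class="cmtt-12">solve</span> uses Gaussian elimination with partial pivoting in place of the (usually approximate) local solve, and the function names are invented for the example.</p>

```python
def solve(M, rhs):
    """Solve M y = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(map(float, M[r])) + [float(rhs[r])] for r in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    y = [0.0] * n
    for r in reversed(range(n)):
        s = aug[r][n] - sum(aug[r][k] * y[k] for k in range(r + 1, n))
        y[r] = s / aug[r][r]
    return y


def apply_as(A, subsets, w):
    """Return z = sum_i P_i (A_i)^{-1} R_i w for the classical AS method."""
    z = [0.0] * len(w)
    for idx in subsets:
        w_i = [w[g] for g in idx]                     # restriction R_i w
        A_i = [[A[r][c] for c in idx] for r in idx]   # A_i = R_i A P_i
        z_i = solve(A_i, w_i)                         # local solve
        for loc, g in enumerate(idx):                 # prolongation and sum
            z[g] += z_i[loc]
    return z
```

<p class="indent" >With non-overlapping subsets this reduces to a block-Jacobi application. With a single subset covering all indices it reduces to an exact solve. In a real setup the submatrices would be extracted and factorized once during the build phase rather than at every application, as done here for brevity.</p>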
<!--l. 262--><p class="noindent" ><span
class="cmbx-12">Variants of the classical AS method, which use modifications of the restriction</span>
<span
class="cmbx-12">and prolongation operators, are also implemented in MLD2P4. Among</span>
<span
class="cmbx-12">them, the Restricted AS (RAS) preconditioner usually outperforms</span>
<span
class="cmbx-12">the classical AS preconditioner in terms of convergence rate and of</span>
<span
class="cmbx-12">computation and communication time on parallel distributed-memory</span>
<span
class="cmbx-12">computers, and is therefore the most widely used among the AS</span>
<span
class="cmbx-12">preconditioners</span><span
class="cmbx-12">&#x00A0;</span><span class="cite"><span
class="cmbx-12">[</span><a
href="userhtmlli4.html#XCAI_SARKIS"><span
class="cmbx-12">6</span></a><span
class="cmbx-12">]</span></span><span
class="cmbx-12">.</span>
|
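<p class="indent" >The text above does not spell out how the operators are modified. As RAS is commonly formulated, the restriction still uses the overlapping subsets, while the prolongation keeps only the components that a subset owns in a non-overlapping partition. The Python sketch below illustrates that idea under these assumptions. The names <span class="cmtt-12">apply_ras</span>, <span class="cmtt-12">owned</span>, and <span class="cmtt-12">diagonal_solve</span> are hypothetical, and the diagonal scaling is only a crude stand-in for an approximate local inverse.</p>

```python
def diagonal_solve(A_i, w_i):
    """Crude approximate local inverse: divide by the diagonal of A_i."""
    return [w / A_i[k][k] for k, w in enumerate(w_i)]


def apply_ras(A, subsets, owned, w, local_solve=diagonal_solve):
    """Restricted AS sketch: overlapping restriction, owned-only prolongation.

    subsets[i] is the overlapping index subset of subdomain i and owned[i]
    the indices it owns; the owned lists are assumed to partition range(n).
    """
    z = [0.0] * len(w)
    for idx, own in zip(subsets, owned):
        keep = set(own)
        w_i = [w[g] for g in idx]                     # restriction with overlap
        A_i = [[A[r][c] for c in idx] for r in idx]   # local submatrix
        z_i = local_solve(A_i, w_i)                   # (approximate) local solve
        for loc, g in enumerate(idx):
            if g in keep:                             # discard overlap values
                z[g] += z_i[loc]
    return z
```

<p class="indent" >Because every index is owned by exactly one subset, each entry of the result receives a single local contribution, whereas the classical AS sum adds one contribution per subset containing the index. In a distributed setting this is also why the prolongation step needs no data exchange for the overlap.</p>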
<!--l. 270--><p class="indent" > <span
class="cmbx-12">Direct solvers based on sparse LU factorizations, implemented in</span>
<span
class="cmbx-12">the third-party libraries listed in Section</span><span
class="cmbx-12">&#x00A0;</span><a
href="userhtmlsu2.html#x9-80003.2"><span
class="cmbx-12">3.2</span><!--tex4ht:ref: sec:third-party --></a><span
class="cmbx-12">, can be applied as</span>
<span
class="cmbx-12">coarsest-level solvers by MLD2P4. Native inexact solvers based on</span>
<span
class="cmbx-12">incomplete LU factorizations, as well as Jacobi, hybrid (forward)</span>
<span
class="cmbx-12">Gauss-Seidel, and block-Jacobi preconditioners are also available. Direct</span>
<span
class="cmbx-12">solvers usually lead to more effective preconditioners in terms of</span>
<span
class="cmbx-12">algorithmic scalability; however, this does not guarantee parallel</span>
<span
class="cmbx-12">efficiency.</span>
<!--l. 1--><p class="indent" > <a
id="tailuserhtmlsu8.html"></a>
</body></html>