You cannot select more than 25 topics
Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
356 lines
13 KiB
HTML
356 lines
13 KiB
HTML
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
|
|
"http://www.w3.org/TR/html4/loose.dtd">
|
|
<html >
|
|
<head><title>Smoothers and coarsest-level solvers</title>
|
|
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
|
|
<meta name="generator" content="TeX4ht (https://tug.org/tex4ht/)">
|
|
<meta name="originator" content="TeX4ht (https://tug.org/tex4ht/)">
|
|
<!-- html,3 -->
|
|
<meta name="src" content="userhtml.tex">
|
|
<link rel="stylesheet" type="text/css" href="userhtml.css">
|
|
</head><body
|
|
>
|
|
<!--l. 216--><div class="crosslinks"><p class="noindent"><span
|
|
class="cmr-12">[</span><a
|
|
href="userhtmlsu7.html" ><span
|
|
class="cmr-12">prev</span></a><span
|
|
class="cmr-12">] [</span><a
|
|
href="userhtmlsu7.html#tailuserhtmlsu7.html" ><span
|
|
class="cmr-12">prev-tail</span></a><span
|
|
class="cmr-12">] [</span><a
|
|
href="#tailuserhtmlsu8.html"><span
|
|
class="cmr-12">tail</span></a><span
|
|
class="cmr-12">] [</span><a
|
|
href="userhtmlse4.html#userhtmlsu8.html" ><span
|
|
class="cmr-12">up</span></a><span
|
|
class="cmr-12">] </span></p></div>
|
|
<h4 class="subsectionHead"><span class="titlemark"><span
|
|
class="cmr-12">4.3 </span></span> <a
|
|
id="x16-150004.3"></a><span
|
|
class="cmr-12">Smoothers and coarsest-level solvers</span></h4>
|
|
<!--l. 218--><p class="noindent" ><span
|
|
class="cmr-12">The smoothers implemented in MLD2P4 include the Jacobi and block-Jacobi methods,</span>
|
|
<span
|
|
class="cmr-12">a hybrid version of the forward and backward Gauss-Seidel methods, and the additive</span>
|
|
<span
|
|
class="cmr-12">Schwarz (AS) ones (see, e.g., </span><span class="cite"><span
|
|
class="cmr-12">[</span><a
|
|
href="userhtmlli4.html#XSaad_book"><span
|
|
class="cmr-12">23</span></a><span
|
|
class="cmr-12">,</span><span
|
|
class="cmr-12"> </span><a
|
|
href="userhtmlli4.html#Xdd2_96"><span
|
|
class="cmr-12">24</span></a><span
|
|
class="cmr-12">]</span></span><span
|
|
class="cmr-12">).</span>
|
|
<!--l. 222--><p class="indent" > <span
|
|
class="cmr-12">The hybrid Gauss-Seidel version is considered because the original Gauss-Seidel</span>
|
|
<span
|
|
class="cmr-12">method is inherently sequential. At each iteration of the hybrid version, each parallel</span>
|
|
<span
|
|
class="cmr-12">process uses the most recent values of its own local variables and the values</span>
|
|
<span
|
|
class="cmr-12">of the non-local variables computed at the previous iteration, obtained by</span>
|
|
<span
|
|
class="cmr-12">exchanging data with other processes before the beginning of the current</span>
|
|
<span
|
|
class="cmr-12">iteration.</span>
|
|
<!--l. 229--><p class="indent" > <span
|
|
class="cmr-12">In the AS methods, the index space Ω</span><sup><span
|
|
class="cmmi-8">k</span></sup> <span
|
|
class="cmr-12">is divided into </span><span
|
|
class="cmmi-12">m</span><sub>
|
|
<span
|
|
class="cmmi-8">k</span></sub> <span
|
|
class="cmr-12">subsets Ω</span><sub><span
|
|
class="cmmi-8">i</span></sub><sup><span
|
|
class="cmmi-8">k</span></sup> <span
|
|
class="cmr-12">of</span>
|
|
<span
|
|
class="cmr-12">size </span><span
|
|
class="cmmi-12">n</span><sub><span
|
|
class="cmmi-8">k,i</span></sub><span
|
|
class="cmr-12">, possibly overlapping. For each </span><span
|
|
class="cmmi-12">i </span><span
|
|
class="cmr-12">we consider the restriction operator</span>
|
|
<span
|
|
class="cmmi-12">R</span><sub><span
|
|
class="cmmi-8">i</span></sub><sup><span
|
|
class="cmmi-8">k</span></sup> <span
|
|
class="cmsy-10x-x-120">∈ </span><span
|
|
class="msbm-10x-x-120">ℝ</span><sup><span
|
|
class="cmmi-8">n</span><sub><span
|
|
class="cmmi-6">k,i</span></sub><span
|
|
class="cmsy-8">×</span><span
|
|
class="cmmi-8">n</span><sub><span
|
|
class="cmmi-6">k</span></sub></sup> <span
|
|
class="cmr-12">that maps a vector </span><span
|
|
class="cmmi-12">x</span><sup><span
|
|
class="cmmi-8">k</span></sup> <span
|
|
class="cmr-12">to the vector </span><span
|
|
class="cmmi-12">x</span><sub>
|
|
<span
|
|
class="cmmi-8">i</span></sub><sup><span
|
|
class="cmmi-8">k</span></sup> <span
|
|
class="cmr-12">made of the components of</span>
|
|
<span
|
|
class="cmmi-12">x</span><sup><span
|
|
class="cmmi-8">k</span></sup> <span
|
|
class="cmr-12">with indices in Ω</span><sub>
|
|
<span
|
|
class="cmmi-8">i</span></sub><sup><span
|
|
class="cmmi-8">k</span></sup><span
|
|
class="cmr-12">, and the prolongation operator </span><span
|
|
class="cmmi-12">P</span><sub>
|
|
<span
|
|
class="cmmi-8">i</span></sub><sup><span
|
|
class="cmmi-8">k</span></sup> <span
|
|
class="cmr-12">= (</span><span
|
|
class="cmmi-12">R</span><sub>
|
|
<span
|
|
class="cmmi-8">i</span></sub><sup><span
|
|
class="cmmi-8">k</span></sup><span
|
|
class="cmr-12">)</span><sup><span
|
|
class="cmmi-8">T</span> </sup><span
|
|
class="cmr-12">. These</span>
|
|
<span
|
|
class="cmr-12">operators are then used to build </span><span
|
|
class="cmmi-12">A</span><sub><span
|
|
class="cmmi-8">i</span></sub><sup><span
|
|
class="cmmi-8">k</span></sup> <span
|
|
class="cmr-12">= </span><span
|
|
class="cmmi-12">R</span><sub>
|
|
<span
|
|
class="cmmi-8">i</span></sub><sup><span
|
|
class="cmmi-8">k</span></sup><span
|
|
class="cmmi-12">A</span><sup><span
|
|
class="cmmi-8">k</span></sup><span
|
|
class="cmmi-12">P</span><sub>
|
|
<span
|
|
class="cmmi-8">i</span></sub><sup><span
|
|
class="cmmi-8">k</span></sup><span
|
|
class="cmr-12">, which is the restriction of</span>
|
|
<span
|
|
class="cmmi-12">A</span><sup><span
|
|
class="cmmi-8">k</span></sup> <span
|
|
class="cmr-12">to the index space Ω</span><sub>
|
|
<span
|
|
class="cmmi-8">i</span></sub><sup><span
|
|
class="cmmi-8">k</span></sup><span
|
|
class="cmr-12">. The classical AS preconditioner </span><span
|
|
class="cmmi-12">M</span><sub>
|
|
<span
|
|
class="cmmi-8">AS</span></sub><sup><span
|
|
class="cmmi-8">k</span></sup> <span
|
|
class="cmr-12">is defined</span>
|
|
<span
|
|
class="cmr-12">as</span>
|
|
<center class="math-display" >
|
|
<img
|
|
src="userhtml18x.png" alt=" m∑k
|
|
(M AkS )-1 = Pki (Aki)-1Rki,
|
|
i=1
|
|
" class="math-display" ></center>
|
|
<!--l. 241--><p class="nopar" > <span
|
|
class="cmr-12">where </span><span
|
|
class="cmmi-12">A</span><sub><span
|
|
class="cmmi-8">i</span></sub><sup><span
|
|
class="cmmi-8">k</span></sup> <span
|
|
class="cmr-12">is supposed to be nonsingular. We observe that an approximate inverse of</span>
|
|
<span
|
|
class="cmmi-12">A</span><sub><span
|
|
class="cmmi-8">i</span></sub><sup><span
|
|
class="cmmi-8">k</span></sup> <span
|
|
class="cmr-12">is usually considered instead of (</span><span
|
|
class="cmmi-12">A</span><sub>
|
|
<span
|
|
class="cmmi-8">i</span></sub><sup><span
|
|
class="cmmi-8">k</span></sup><span
|
|
class="cmr-12">)</span><sup><span
|
|
class="cmsy-8">-</span><span
|
|
class="cmr-8">1</span></sup><span
|
|
class="cmr-12">. The setup of </span><span
|
|
class="cmmi-12">M</span><sub>
|
|
<span
|
|
class="cmmi-8">AS</span></sub><sup><span
|
|
class="cmmi-8">k</span></sup> <span
|
|
class="cmr-12">during the multilevel</span>
|
|
<span
|
|
class="cmr-12">build phase involves</span>
|
|
<ul class="itemize1">
|
|
<li class="itemize"><span
|
|
class="cmr-12">the definition of the index subspaces Ω</span><sub><span
|
|
class="cmmi-8">i</span></sub><sup><span
|
|
class="cmmi-8">k</span></sup> <span
|
|
class="cmr-12">and of the corresponding operators</span>
|
|
<span
|
|
class="cmmi-12">R</span><sub><span
|
|
class="cmmi-8">i</span></sub><sup><span
|
|
class="cmmi-8">k</span></sup> <span
|
|
class="cmr-12">(and </span><span
|
|
class="cmmi-12">P</span><sub>
|
|
<span
|
|
class="cmmi-8">i</span></sub><sup><span
|
|
class="cmmi-8">k</span></sup><span
|
|
class="cmr-12">);</span>
|
|
</li>
|
|
<li class="itemize"><span
|
|
class="cmr-12">the computation of the submatrices </span><span
|
|
class="cmmi-12">A</span><sub><span
|
|
class="cmmi-8">i</span></sub><sup><span
|
|
class="cmmi-8">k</span></sup><span
|
|
class="cmr-12">;</span>
|
|
</li>
|
|
<li class="itemize"><span
|
|
class="cmr-12">the computation of their inverses (usually approximated through some form</span>
|
|
<span
|
|
class="cmr-12">of incomplete factorization).</span></li></ul>
|
|
|
|
|
|
|
|
<!--l. 253--><p class="noindent" ><span
|
|
class="cmr-12">The computation of </span><span
|
|
class="cmmi-12">z</span><sup><span
|
|
class="cmmi-8">k</span></sup> <span
|
|
class="cmr-12">= </span><span
|
|
class="cmmi-12">M</span><sub>
|
|
<span
|
|
class="cmmi-8">AS</span></sub><sup><span
|
|
class="cmmi-8">k</span></sup><span
|
|
class="cmmi-12">w</span><sup><span
|
|
class="cmmi-8">k</span></sup><span
|
|
class="cmr-12">, with </span><span
|
|
class="cmmi-12">w</span><sup><span
|
|
class="cmmi-8">k</span></sup> <span
|
|
class="cmsy-10x-x-120">∈ </span><span
|
|
class="msbm-10x-x-120">ℝ</span><sup><span
|
|
class="cmmi-8">n</span><sub><span
|
|
class="cmmi-6">k</span></sub></sup><span
|
|
class="cmr-12">, during the multilevel application</span>
|
|
<span
|
|
class="cmr-12">phase, requires</span>
|
|
<ul class="itemize1">
|
|
<li class="itemize"><span
|
|
class="cmr-12">the restriction of </span><span
|
|
class="cmmi-12">w</span><sup><span
|
|
class="cmmi-8">k</span></sup> <span
|
|
class="cmr-12">to the subspaces </span><span
|
|
class="msbm-10x-x-120">ℝ</span><sup><span
|
|
class="cmmi-8">n</span><sub><span
|
|
class="cmmi-6">k,i</span></sub></sup><span
|
|
class="cmr-12">, i.e.</span><span
|
|
class="cmr-12"> </span><span
|
|
class="cmmi-12">w</span><sub>
|
|
<span
|
|
class="cmmi-8">i</span></sub><sup><span
|
|
class="cmmi-8">k</span></sup> <span
|
|
class="cmr-12">= </span><span
|
|
class="cmmi-12">R</span><sub>
|
|
<span
|
|
class="cmmi-8">i</span></sub><sup><span
|
|
class="cmmi-8">k</span></sup><span
|
|
class="cmmi-12">w</span><sup><span
|
|
class="cmmi-8">k</span></sup><span
|
|
class="cmr-12">;</span>
|
|
</li>
|
|
<li class="itemize"><span
|
|
class="cmr-12">the computation of the vectors </span><span
|
|
class="cmmi-12">z</span><sub><span
|
|
class="cmmi-8">i</span></sub><sup><span
|
|
class="cmmi-8">k</span></sup> <span
|
|
class="cmr-12">= (</span><span
|
|
class="cmmi-12">A</span><sub>
|
|
<span
|
|
class="cmmi-8">i</span></sub><sup><span
|
|
class="cmmi-8">k</span></sup><span
|
|
class="cmr-12">)</span><sup><span
|
|
class="cmsy-8">-</span><span
|
|
class="cmr-8">1</span></sup><span
|
|
class="cmmi-12">w</span><sub>
|
|
<span
|
|
class="cmmi-8">i</span></sub><sup><span
|
|
class="cmmi-8">k</span></sup><span
|
|
class="cmr-12">;</span>
|
|
</li>
|
|
<li class="itemize"><span
|
|
class="cmr-12">the prolongation and the sum of the previous vectors, i.e.</span><span
|
|
class="cmr-12"> </span><span
|
|
class="cmmi-12">z</span><sup><span
|
|
class="cmmi-8">k</span></sup> <span
|
|
class="cmr-12">=</span>
|
|
<span
|
|
class="cmex-10x-x-120">∑</span>
|
|
<sub><span
|
|
class="cmmi-8">i</span><span
|
|
class="cmr-8">=1</span></sub><sup><span
|
|
class="cmmi-8">m</span><sub><span
|
|
class="cmmi-6">k</span></sub></sup><span
|
|
class="cmmi-12">P</span><sub>
|
|
<span
|
|
class="cmmi-8">i</span></sub><sup><span
|
|
class="cmmi-8">k</span></sup><span
|
|
class="cmmi-12">z</span><sub>
|
|
<span
|
|
class="cmmi-8">i</span></sub><sup><span
|
|
class="cmmi-8">k</span></sup><span
|
|
class="cmr-12">.</span></li></ul>
|
|
<!--l. 262--><p class="noindent" ><span
|
|
class="cmr-12">Variants of the classical AS method, which use modifications of the restriction and</span>
|
|
<span
|
|
class="cmr-12">prolongation operators, are also implemented in MLD2P4. Among them, the Restricted</span>
|
|
<span
|
|
class="cmr-12">AS (RAS) preconditioner usually outperforms the classical AS preconditioner in terms</span>
|
|
<span
|
|
class="cmr-12">of convergence rate and of computation and communication time on parallel</span>
|
|
<span
|
|
class="cmr-12">distributed-memory computers, and is therefore the most widely used among the AS</span>
|
|
<span
|
|
class="cmr-12">preconditioners</span><span
|
|
class="cmr-12"> </span><span class="cite"><span
|
|
class="cmr-12">[</span><a
|
|
href="userhtmlli4.html#XCAI_SARKIS"><span
|
|
class="cmr-12">6</span></a><span
|
|
class="cmr-12">]</span></span><span
|
|
class="cmr-12">.</span>
|
|
<!--l. 270--><p class="indent" > <span
|
|
class="cmr-12">Direct solvers based on sparse LU factorizations, implemented in the third-party</span>
|
|
<span
|
|
class="cmr-12">libraries reported in Section</span><span
|
|
class="cmr-12"> </span><a
|
|
href="userhtmlsu2.html#x9-80003.2"><span
|
|
class="cmr-12">3.2</span><!--tex4ht:ref: sec:third-party --></a><span
|
|
class="cmr-12">, can be applied as coarsest-level solvers by</span>
|
|
<span
|
|
class="cmr-12">MLD2P4. Native inexact solvers based on incomplete LU factorizations, as well as</span>
|
|
<span
|
|
class="cmr-12">Jacobi, hybrid (forward) Gauss-Seidel, and block Jacobi preconditioners are</span>
|
|
<span
|
|
class="cmr-12">also available. Direct solvers usually lead to more effective preconditioners in</span>
|
|
<span
|
|
class="cmr-12">terms of algorithmic scalability; however, this does not guarantee parallel</span>
|
|
<span
|
|
class="cmr-12">efficiency.</span>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
<!--l. 1--><div class="crosslinks"><p class="noindent"><span
|
|
class="cmr-12">[</span><a
|
|
href="userhtmlsu7.html" ><span
|
|
class="cmr-12">prev</span></a><span
|
|
class="cmr-12">] [</span><a
|
|
href="userhtmlsu7.html#tailuserhtmlsu7.html" ><span
|
|
class="cmr-12">prev-tail</span></a><span
|
|
class="cmr-12">] [</span><a
|
|
href="userhtmlsu8.html" ><span
|
|
class="cmr-12">front</span></a><span
|
|
class="cmr-12">] [</span><a
|
|
href="userhtmlse4.html#userhtmlsu8.html" ><span
|
|
class="cmr-12">up</span></a><span
|
|
class="cmr-12">] </span></p></div>
|
|
<!--l. 1--><p class="indent" > <a
|
|
id="tailuserhtmlsu8.html"></a>
|
|
</body></html>
|