<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<html >
<head><title>General Overview</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<meta name="generator" content="TeX4ht (https://tug.org/tex4ht/)">
<meta name="originator" content="TeX4ht (https://tug.org/tex4ht/)">
<!-- html,3 -->
<meta name="src" content="userhtml.tex">
<link rel="stylesheet" type="text/css" href="userhtml.css">
</head><body
>
<!--l. 1--><div class="crosslinks"><p class="noindent"><span
class="cmr-12">[</span><a
href="userhtmlse2.html" ><span
class="cmr-12">next</span></a><span
class="cmr-12">] [</span><a
href="userhtmlli2.html" ><span
class="cmr-12">prev</span></a><span
class="cmr-12">] [</span><a
href="userhtmlli2.html#tailuserhtmlli2.html" ><span
class="cmr-12">prev-tail</span></a><span
class="cmr-12">] [</span><a
href="#tailuserhtmlse1.html"><span
class="cmr-12">tail</span></a><span
class="cmr-12">] [</span><a
href="userhtml.html#userhtmlse1.html" ><span
class="cmr-12">up</span></a><span
class="cmr-12">] </span></p></div>
<h3 class="sectionHead"><span class="titlemark"><span
class="cmr-12">1 </span></span> <a
id="x4-30001"></a><span
class="cmr-12">General Overview</span></h3>
<!--l. 5--><p class="noindent" ><span
class="cmr-12">The </span><span
class="cmcsc-10x-x-120">A<span
class="small-caps">l</span><span
class="small-caps">g</span><span
class="small-caps">e</span><span
class="small-caps">b</span><span
class="small-caps">r</span><span
class="small-caps">a</span><span
class="small-caps">i</span><span
class="small-caps">c</span> M<span
class="small-caps">u</span><span
class="small-caps">l</span><span
class="small-caps">t</span><span
class="small-caps">i</span>G<span
class="small-caps">r</span><span
class="small-caps">i</span><span
class="small-caps">d</span> P<span
class="small-caps">r</span><span
class="small-caps">e</span><span
class="small-caps">c</span><span
class="small-caps">o</span><span
class="small-caps">n</span><span
class="small-caps">d</span><span
class="small-caps">i</span><span
class="small-caps">t</span><span
class="small-caps">i</span><span
class="small-caps">o</span><span
class="small-caps">n</span><span
class="small-caps">e</span><span
class="small-caps">r</span><span
class="small-caps">s</span> P<span
class="small-caps">a</span><span
class="small-caps">c</span><span
class="small-caps">k</span><span
class="small-caps">a</span><span
class="small-caps">g</span><span
class="small-caps">e</span> <span
class="small-caps">b</span><span
class="small-caps">a</span><span
class="small-caps">s</span><span
class="small-caps">e</span><span
class="small-caps">d</span> <span
class="small-caps">o</span><span
class="small-caps">n</span> PSBLAS</span>
<span
class="cmcsc-10x-x-120">(AMG4PSBLAS) </span><span
class="cmr-12">provides parallel Algebraic MultiGrid (AMG) preconditioners (see,</span>
<span
class="cmr-12">e.g., </span><span class="cite"><span
class="cmr-12">[</span><a
href="userhtmlli4.html#XBriggs2000"><span
class="cmr-12">3</span></a><span
class="cmr-12">,</span><span
class="cmr-12">&#x00A0;</span><a
href="userhtmlli4.html#XStuben_01"><span
class="cmr-12">27</span></a><span
class="cmr-12">]</span></span><span
class="cmr-12">), to be used in the iterative solution of linear systems,</span>
<table
class="equation"><tr><td>
<center class="math-display" >
<img
src="userhtml0x.png" alt="Ax = b,
" class="math-display" ><a
id="x4-3001r1"></a></center></td><td class="equation-label"><span
class="cmr-12">(1)</span></td></tr></table>
<!--l. 11--><p class="nopar" >
<span
class="cmr-12">where </span><span
class="cmmi-12">A </span><span
class="cmr-12">is a square, real or complex, sparse symmetric positive definite (s.p.d.)</span>
<span
class="cmr-12">matrix.</span>
<!--l. 19--><p class="indent" > <span
class="cmr-12">The preconditioners implemented in AMG4PSBLAS are obtained by combining three</span>
<span
class="cmr-12">different types of AMG cycles with smoothers and coarsest-level solvers. The V-, W-,</span>
<span
class="cmr-12">and a version of a Krylov-type cycle (K-cycle)</span><span
class="cmr-12">&#x00A0;</span><span class="cite"><span
class="cmr-12">[</span><a
href="userhtmlli4.html#XBriggs2000"><span
class="cmr-12">3</span></a><span
class="cmr-12">,</span><span
class="cmr-12">&#x00A0;</span><a
href="userhtmlli4.html#XNotay2008"><span
class="cmr-12">23</span></a><span
class="cmr-12">]</span></span> <span
class="cmr-12">are available, which can be</span>
<span
class="cmr-12">combined with Jacobi, hybrid forward/backward Gauss-Seidel, block-Jacobi, and</span>
<span
class="cmr-12">additive Schwarz smoothers. Also </span><span
class="cmmi-12">&#x2113;</span><sub><span
class="cmr-8">1</span></sub> <span
class="cmr-12">versions of Jacobi, block-Jacobi and Gauss-Seidel</span>
<span
class="cmr-12">smoothers are available. An algebraic approach is used to generate a hierarchy of</span>
<span
class="cmr-12">coarse-level matrices and operators, without explicitly using any information</span>
<span
class="cmr-12">on the geometry of the original problem, e.g., the discretization of a PDE.</span>
<span
class="cmr-12">To this end, two different coarsening strategies, based on aggregation, are</span>
<span
class="cmr-12">available:</span>
<ul class="itemize1">
<li class="itemize"><span
class="cmr-12">a decoupled version of the well-known smoothed aggregation procedure</span>
<span
class="cmr-12">proposed in</span><span
class="cmr-12">&#x00A0;</span><span class="cite"><span
class="cmr-12">[</span><a
href="userhtmlli4.html#XBREZINA_VANEK"><span
class="cmr-12">2</span></a><span
class="cmr-12">,</span><span
class="cmr-12">&#x00A0;</span><a
href="userhtmlli4.html#XVANEK_MANDEL_BREZINA"><span
class="cmr-12">29</span></a><span
class="cmr-12">]</span></span><span
class="cmr-12">, and already included in the previous versions of the</span>
<span
class="cmr-12">package</span><span
class="cmr-12">&#x00A0;</span><span class="cite"><span
class="cmr-12">[</span><a
href="userhtmlli4.html#XBDDF2007"><span
class="cmr-12">10</span></a><span
class="cmr-12">,</span><span
class="cmr-12">&#x00A0;</span><a
href="userhtmlli4.html#XMLD2P4_TOMS"><span
class="cmr-12">9</span></a><span
class="cmr-12">]</span></span><span
class="cmr-12">;</span>
</li>
<li class="itemize"><span
class="cmr-12">the first parallel implementation of a coupled version of Coarsening based</span>
<span
class="cmr-12">on Compatible Weighted Matching introduced in</span><span
class="cmr-12">&#x00A0;</span><span class="cite"><span
class="cmr-12">[</span><a
href="userhtmlli4.html#XDV2013"><span
class="cmr-12">30</span></a><span
class="cmr-12">,</span><span
class="cmr-12">&#x00A0;</span><a
href="userhtmlli4.html#XDFV2018"><span
class="cmr-12">31</span></a><span
class="cmr-12">]</span></span> <span
class="cmr-12">and described in</span>
<span
class="cmr-12">detail in</span><span
class="cmr-12">&#x00A0;</span><span class="cite"><span
class="cmr-12">[</span><a
href="userhtmlli4.html#XDDF2020"><span
class="cmr-12">11</span></a><span
class="cmr-12">]</span></span><span
class="cmr-12">.</span></li></ul>
<!--l. 32--><p class="indent" > <span
class="cmr-12">Either exact or approximate solvers can be used on the coarsest-level system.</span>
<span
class="cmr-12">Specifically, different sparse LU factorizations from external packages, native</span>
<span
class="cmr-12">incomplete LU and approximate inverse factorizations, weighted Jacobi, hybrid</span>
<span
class="cmr-12">Gauss-Seidel, block-Jacobi solvers, and recursive calls to preconditioned Krylov</span>
<span
class="cmr-12">methods are available. All the smoothers can also be used as one-level</span>
<span
class="cmr-12">preconditioners.</span>
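<p class="indent" > <span
class="cmr-12">As a rough illustration of how the ingredients described above fit together, the</span>
<span
class="cmr-12">following Python sketch applies a two-level cycle to a small s.p.d. tridiagonal matrix. It</span>
<span
class="cmr-12">is a serial toy example under simplifying assumptions, not the AMG4PSBLAS implementation:</span>
<span
class="cmr-12">all function names are hypothetical, matrices are stored densely, only two levels are used,</span>
<span
class="cmr-12">and plain (unsmoothed) pairwise aggregation stands in for the coarsening strategies listed</span>
<span
class="cmr-12">above. Weighted Jacobi is used as smoother and Gaussian elimination as exact coarsest-level</span>
<span
class="cmr-12">solver.</span>
<pre>
```python
# Two-level sketch: weighted Jacobi smoothing, pairwise aggregation,
# exact coarsest-level solve. Illustrative only; none of these names
# belong to the AMG4PSBLAS or PSBLAS interfaces.

def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def residual(A, x, b):
    return [bi - yi for bi, yi in zip(b, matvec(A, x))]

def jacobi(A, x, b, omega=2.0 / 3.0, sweeps=1):
    # Weighted Jacobi: x = x + omega * D^{-1} (b - A x)
    for _ in range(sweeps):
        r = residual(A, x, b)
        x = [xi + omega * ri / A[i][i] for i, (xi, ri) in enumerate(zip(x, r))]
    return x

def coarse_matrix(A, agg, nc):
    # Galerkin product P^T A P for a piecewise-constant prolongator P,
    # where agg[i] is the index of the aggregate containing unknown i
    Ac = [[0.0] * nc for _ in range(nc)]
    for i, row in enumerate(A):
        for j, a in enumerate(row):
            Ac[agg[i]][agg[j]] += a
    return Ac

def solve_dense(A, b):
    # Gaussian elimination without pivoting (adequate for s.p.d. matrices)
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for k in range(n):
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def two_level_cycle(A, x, b, agg, nc):
    x = jacobi(A, x, b)                               # pre-smoothing
    r = residual(A, x, b)
    rc = [0.0] * nc
    for i, ri in enumerate(r):                        # restriction: rc = P^T r
        rc[agg[i]] += ri
    ec = solve_dense(coarse_matrix(A, agg, nc), rc)   # coarsest-level solve
    x = [xi + ec[agg[i]] for i, xi in enumerate(x)]   # prolongation and correction
    return jacobi(A, x, b)                            # post-smoothing

def norm(v):
    return sum(vi * vi for vi in v) ** 0.5

n = 8
# Test matrix: tridiagonal with 2 on the diagonal and -1 off the diagonal
A = [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
     for i in range(n)]
b = [1.0] * n
agg = [i // 2 for i in range(n)]   # aggregates {0,1}, {2,3}, ...
x = [0.0] * n
history = [norm(residual(A, x, b))]
for _ in range(10):
    x = two_level_cycle(A, x, b, agg, n // 2)
    history.append(norm(residual(A, x, b)))
```
</pre>
<p class="indent" > <span
class="cmr-12">Here the cycle is used as a stationary iteration, and </span><span
class="cmr-12">history </span><span
class="cmr-12">collects the residual 2-norm before the first cycle and after each of the ten cycles.</span>
<span
class="cmr-12">Replacing the call to </span><span
class="cmr-12">solve_dense </span><span
class="cmr-12">with a few Jacobi sweeps on the coarse matrix would turn the exact coarsest-level</span>
<span
class="cmr-12">solver into an approximate one.</span>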
<!--l. 36--><p class="indent" > <span
class="cmr-12">AMG4PSBLAS is written in Fortran</span><span
class="cmr-12">&#x00A0;2003, following an object-oriented design</span>
<span
class="cmr-12">through the exploitation of features such as abstract data type creation, type</span>
<span
class="cmr-12">extension, function overloading, and dynamic memory management. The parallel</span>
<span
class="cmr-12">implementation is based on a Single Program Multiple Data (SPMD) paradigm.</span>
<span
class="cmr-12">Single and double precision implementations of AMG4PSBLAS are available</span>
<span
class="cmr-12">for both the real and the complex case, which can be used through a single</span>
<span
class="cmr-12">interface.</span>
<!--l. 46--><p class="indent" > <span
class="cmr-12">AMG4PSBLAS has been designed to implement scalable and easy-to-use</span>
<span
class="cmr-12">multilevel preconditioners in the context of the PSBLAS (Parallel Sparse BLAS)</span>
<span
class="cmr-12">computational framework</span><span
class="cmr-12">&#x00A0;</span><span class="cite"><span
class="cmr-12">[</span><a
href="userhtmlli4.html#Xpsblas_00"><span
class="cmr-12">18</span></a><span
class="cmr-12">,</span><span
class="cmr-12">&#x00A0;</span><a
href="userhtmlli4.html#XPSBLAS3"><span
class="cmr-12">17</span></a><span
class="cmr-12">]</span></span><span
class="cmr-12">. PSBLAS provides basic linear algebra operators</span>
<span
class="cmr-12">and data management facilities for distributed sparse matrices, kernels for</span>
<span
class="cmr-12">sequential incomplete factorizations needed for the parallel block-Jacobi and</span>
<span
class="cmr-12">additive Schwarz smoothers, and parallel Krylov solvers which can be used with</span>
<span
class="cmr-12">the AMG4PSBLAS preconditioners. The choice of PSBLAS has been mainly</span>
<span
class="cmr-12">motivated by the need for a portable and efficient software infrastructure</span>
<span
class="cmr-12">implementing &#8220;de facto&#8221; standard parallel sparse linear algebra kernels, to</span>
<span
class="cmr-12">pursue goals such as performance, portability, modularity and extensibility</span>
<span
class="cmr-12">in the development of the preconditioner package. On the other hand, the</span>
<span
class="cmr-12">implementation of AMG4PSBLAS, which was driven by the need to face the exascale</span>
<span
class="cmr-12">challenge, has led to some important revisions and extensions of the PSBLAS</span>
<span
class="cmr-12">infrastructure. The inter-process communication required by AMG4PSBLAS</span>
<span
class="cmr-12">is encapsulated in the PSBLAS routines; therefore, AMG4PSBLAS can be</span>
<span
class="cmr-12">run on any parallel machine where PSBLAS implementations are available.</span>
<span
class="cmr-12">In the most recent version of PSBLAS (release 3.7), a plug-in for GPUs is</span>
<span
class="cmr-12">included; it provides CUDA versions of the main vector operations and of sparse</span>
<span
class="cmr-12">matrix-vector multiplication, so that Krylov methods coupled with AMG4PSBLAS</span>
<span
class="cmr-12">preconditioners relying on Jacobi and block-Jacobi smoothers with sparse</span>
<span
class="cmr-12">approximate inverses on the blocks can be efficiently executed on clusters of</span>
<span
class="cmr-12">GPUs.</span>
<!--l. 64--><p class="indent" > <span
class="cmr-12">AMG4PSBLAS has a layered and modular software architecture where three main</span>
<span
class="cmr-12">layers can be identified. The lower layer consists of the PSBLAS kernels, the middle</span>
<span
class="cmr-12">one implements the construction and application phases of the preconditioners, and the</span>
<span
class="cmr-12">upper one provides a uniform interface to all the preconditioners. This architecture</span>
<span
class="cmr-12">allows for different levels of use of the package: a few black-box routines at the upper</span>
<span
class="cmr-12">layer allow all users to easily build and apply any preconditioner available in</span>
<span
class="cmr-12">AMG4PSBLAS; facilities are also available allowing expert users to extend the set of</span>
<span
class="cmr-12">smoothers and solvers for building new versions of the preconditioners (see</span>
<span
class="cmr-12">Section</span><span
class="cmr-12">&#x00A0;</span><a
href="userhtmlse6.html#x25-290006"><span
class="cmr-12">6</span><!--tex4ht:ref: sec:adding --></a><span
class="cmr-12">).</span>
<!--l. 75--><p class="indent" > <span
class="cmr-12">This guide is organized as follows. General information on the distribution of the</span>
<span
class="cmr-12">source code is reported in Section</span><span
class="cmr-12">&#x00A0;</span><a
href="userhtmlse2.html#x5-40002"><span
class="cmr-12">2</span><!--tex4ht:ref: sec:distribution --></a><span
class="cmr-12">, while details on the configuration and installation</span>
<span
class="cmr-12">of the package are given in Section</span><span
class="cmr-12">&#x00A0;</span><a
href="userhtmlse3.html#x7-60003"><span
class="cmr-12">3</span><!--tex4ht:ref: sec:building --></a><span
class="cmr-12">. The basics for building and applying the</span>
<span
class="cmr-12">preconditioners with the Krylov solvers implemented in PSBLAS are reported</span>
<span
class="cmr-12">in</span><span
class="cmr-12">&#x00A0;Section</span><span
class="cmr-12">&#x00A0;</span><a
href="userhtmlse4.html#x13-120004"><span
class="cmr-12">4</span><!--tex4ht:ref: sec:started --></a><span
class="cmr-12">, where the Fortran codes of a few sample programs are also shown.</span>
<span
class="cmr-12">A reference guide for the user interface routines is provided in Section</span><span
class="cmr-12">&#x00A0;</span><a
href="userhtmlse5.html#x15-140005"><span
class="cmr-12">5</span><!--tex4ht:ref: sec:userinterface --></a><span
class="cmr-12">.</span>
<span
class="cmr-12">Information on the extension of the package through the addition of new</span>
<span
class="cmr-12">smoothers and solvers is reported in Section</span><span
class="cmr-12">&#x00A0;</span><a
href="userhtmlse6.html#x25-290006"><span
class="cmr-12">6</span><!--tex4ht:ref: sec:adding --></a><span
class="cmr-12">. The error handling mechanism</span>
<span
class="cmr-12">used by the package is briefly described in Section</span><span
class="cmr-12">&#x00A0;</span><a
href="userhtmlse7.html#x26-300007"><span
class="cmr-12">7</span><!--tex4ht:ref: sec:errors --></a><span
class="cmr-12">. The copyright terms</span>
<span
class="cmr-12">concerning the distribution and modification of AMG4PSBLAS are reported in</span>
<span
class="cmr-12">Appendix</span><span
class="cmr-12">&#x00A0;</span><a
href="userhtmlse8.html#x27-31000A"><span
class="cmr-12">A</span><!--tex4ht:ref: sec:license --></a><span
class="cmr-12">.</span>
<!--l. 1--><div class="crosslinks"><p class="noindent"><span
class="cmr-12">[</span><a
href="userhtmlse2.html" ><span
class="cmr-12">next</span></a><span
class="cmr-12">] [</span><a
href="userhtmlli2.html" ><span
class="cmr-12">prev</span></a><span
class="cmr-12">] [</span><a
href="userhtmlli2.html#tailuserhtmlli2.html" ><span
class="cmr-12">prev-tail</span></a><span
class="cmr-12">] [</span><a
href="userhtmlse1.html" ><span
class="cmr-12">front</span></a><span
class="cmr-12">] [</span><a
href="userhtml.html#userhtmlse1.html" ><span
class="cmr-12">up</span></a><span
class="cmr-12">] </span></p></div>
<!--l. 1--><p class="indent" > <a
id="tailuserhtmlse1.html"></a>
</body></html>