You cannot select more than 25 topics
Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
955 lines
49 KiB
HTML
955 lines
49 KiB
HTML
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
|
|
"http://www.w3.org/TR/html4/loose.dtd">
|
|
<html >
|
|
<head><title>Getting Started</title>
|
|
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
|
|
<meta name="generator" content="TeX4ht (https://tug.org/tex4ht/)">
|
|
<meta name="originator" content="TeX4ht (https://tug.org/tex4ht/)">
|
|
<!-- html,3 -->
|
|
<meta name="src" content="userhtml.tex">
|
|
<link rel="stylesheet" type="text/css" href="userhtml.css">
|
|
</head><body
|
|
>
|
|
<!--l. 1--><div class="crosslinks"><p class="noindent"><span
|
|
class="cmr-12">[</span><a
|
|
href="userhtmlse5.html" ><span
|
|
class="cmr-12">next</span></a><span
|
|
class="cmr-12">] [</span><a
|
|
href="userhtmlse3.html" ><span
|
|
class="cmr-12">prev</span></a><span
|
|
class="cmr-12">] [</span><a
|
|
href="userhtmlse3.html#tailuserhtmlse3.html" ><span
|
|
class="cmr-12">prev-tail</span></a><span
|
|
class="cmr-12">] [</span><a
|
|
href="#tailuserhtmlse4.html"><span
|
|
class="cmr-12">tail</span></a><span
|
|
class="cmr-12">] [</span><a
|
|
href="userhtml.html#userhtmlse4.html" ><span
|
|
class="cmr-12">up</span></a><span
|
|
class="cmr-12">] </span></p></div>
|
|
<h3 class="sectionHead"><span class="titlemark"><span
|
|
class="cmr-12">4 </span></span> <a
|
|
id="x7-130004"></a><span
|
|
class="cmr-12">Getting Started</span></h3>
|
|
<!--l. 5--><p class="noindent" ><span
|
|
class="cmr-12">This section describes the basics for building and applying AMG4PSBLAS one-level</span>
|
|
<span
|
|
class="cmr-12">and multilevel (i.e., AMG) preconditioners with the Krylov solvers included in</span>
|
|
<span
|
|
class="cmr-12">PSBLAS</span><span
|
|
class="cmr-12"> </span><span class="cite"><span
|
|
class="cmr-12">[</span><a
|
|
href="userhtmlli3.html#XPSBLASGUIDE"><span
|
|
class="cmr-12">21</span></a><span
|
|
class="cmr-12">]</span></span><span
|
|
class="cmr-12">.</span>
|
|
<!--l. 9--><p class="indent" > <span
|
|
class="cmr-12">The following steps are required:</span>
|
|
<ol class="enumerate1" >
|
|
<li
|
|
class="enumerate" id="x7-13002x1">
|
|
<!--l. 11--><p class="noindent" ><span
|
|
class="cmti-12">Declare the preconditioner data structure</span><span
|
|
class="cmr-12">. It is a derived data type,</span>
|
|
<span class="obeylines-h"><span class="verb"><span
|
|
class="cmtt-12">amg_</span></span></span><span
|
|
class="cmti-12">x</span><span class="obeylines-h"><span class="verb"><span
|
|
class="cmtt-12">prec_</span></span></span> <span class="obeylines-h"><span class="verb"><span
|
|
class="cmtt-12">type</span></span></span><span
|
|
class="cmr-12">, where </span><span
|
|
class="cmti-12">x </span><span
|
|
class="cmr-12">may be </span><span class="obeylines-h"><span class="verb"><span
|
|
class="cmtt-12">s</span></span></span><span
|
|
class="cmr-12">, </span><span class="obeylines-h"><span class="verb"><span
|
|
class="cmtt-12">d</span></span></span><span
|
|
class="cmr-12">, </span><span class="obeylines-h"><span class="verb"><span
|
|
class="cmtt-12">c</span></span></span> <span
|
|
class="cmr-12">or </span><span class="obeylines-h"><span class="verb"><span
|
|
class="cmtt-12">z</span></span></span><span
|
|
class="cmr-12">, according to the basic data</span>
|
|
<span
|
|
class="cmr-12">type of the sparse matrix (</span><span class="obeylines-h"><span class="verb"><span
|
|
class="cmtt-12">s</span></span></span> <span
|
|
class="cmr-12">= real single precision; </span><span class="obeylines-h"><span class="verb"><span
|
|
class="cmtt-12">d</span></span></span> <span
|
|
class="cmr-12">= real double precision;</span>
|
|
<span class="obeylines-h"><span class="verb"><span
|
|
class="cmtt-12">c</span></span></span> <span
|
|
class="cmr-12">= complex single precision; </span><span class="obeylines-h"><span class="verb"><span
|
|
class="cmtt-12">z</span></span></span> <span
|
|
class="cmr-12">= complex double precision). This data</span>
|
|
<span
|
|
class="cmr-12">structure is accessed by the user only through the AMG4PSBLAS routines,</span>
|
|
<span
|
|
class="cmr-12">following an object-oriented approach.</span>
|
|
</li>
|
|
<li
|
|
class="enumerate" id="x7-13004x2">
|
|
<!--l. 19--><p class="noindent" ><span
|
|
class="cmti-12">Allocate and initialize the preconditioner data structure, according to a</span>
|
|
<span
|
|
class="cmti-12">preconditioner type chosen by the user</span><span
|
|
class="cmr-12">. This is performed by the routine</span>
|
|
<code class="lstinline"><span style="color:#000000">init</span></code><span
|
|
class="cmr-12">, which also sets defaults for each preconditioner type selected by</span>
|
|
<span
|
|
class="cmr-12">the user. The preconditioner types and the defaults associated with them</span>
|
|
<span
|
|
class="cmr-12">are given in Table</span><span
|
|
class="cmr-12"> </span><a
|
|
href="#x7-13015r1"><span
|
|
class="cmr-12">1</span><!--tex4ht:ref: tab:precinit --></a><span
|
|
class="cmr-12">, where the strings used by </span><code class="lstinline"><span style="color:#000000">init</span></code> <span
|
|
class="cmr-12">to identify the</span>
|
|
<span
|
|
class="cmr-12">preconditioner types are also given. Note that these strings are valid also if</span>
|
|
<span
|
|
class="cmr-12">uppercase letters are substituted by corresponding lowercase ones.</span>
|
|
</li>
|
|
<li
|
|
class="enumerate" id="x7-13006x3">
|
|
<!--l. 29--><p class="noindent" ><span
|
|
class="cmti-12">Modify the selected preconditioner type, by properly setting preconditioner</span>
|
|
<span
|
|
class="cmti-12">parameters. </span><span
|
|
class="cmr-12">This is performed by the routine </span><code class="lstinline"><span style="color:#000000">set</span></code><span
|
|
class="cmr-12">. This routine must be</span>
|
|
<span
|
|
class="cmr-12">called if the user wants to modify the default values of the parameters</span>
|
|
<span
|
|
class="cmr-12">associated with the selected preconditioner type, to obtain a variant of that</span>
|
|
<span
|
|
class="cmr-12">preconditioner. Examples of use of </span><code class="lstinline"><span style="color:#000000">set</span></code> <span
|
|
class="cmr-12">are given in Section</span><span
|
|
class="cmr-12"> </span><a
|
|
href="#x7-140004.1"><span
|
|
class="cmr-12">4.1</span><!--tex4ht:ref: sec:examples --></a><span
|
|
class="cmr-12">; a complete</span>
|
|
<span
|
|
class="cmr-12">list of all the preconditioner parameters and their allowed and default values</span>
|
|
<span
|
|
class="cmr-12">is provided in Section</span><span
|
|
class="cmr-12"> </span><a
|
|
href="userhtmlse5.html#x8-160005"><span
|
|
class="cmr-12">5</span><!--tex4ht:ref: sec:userinterface --></a><span
|
|
class="cmr-12">, Tables</span><span
|
|
class="cmr-12"> </span><a
|
|
href="userhtmlse5.html#x8-18009r2"><span
|
|
class="cmr-12">2</span><!--tex4ht:ref: tab:p_cycle --></a><span
|
|
class="cmr-12">-</span><a
|
|
href="userhtmlse5.html#x8-18015r8"><span
|
|
class="cmr-12">8</span><!--tex4ht:ref: tab:p_smoother_1 --></a><span
|
|
class="cmr-12">.</span>
|
|
|
|
|
|
|
|
</li>
|
|
<li
|
|
class="enumerate" id="x7-13008x4">
|
|
<!--l. 40--><p class="noindent" ><span
|
|
class="cmti-12">Build the preconditioner for a given matrix</span><span
|
|
class="cmr-12">. If the selected preconditioner is</span>
|
|
<span
|
|
class="cmr-12">multilevel, then two steps must be performed, as specified next.</span>
|
|
<ol class="enumerate2" >
|
|
<li
|
|
class="enumerate" id="x7-13009x0">
|
|
<!--l. 43--><p class="noindent" ><span
|
|
class="cmti-12">Build the AMG hierarchy for a given matrix. </span><span
|
|
class="cmr-12">This is performed by the</span>
|
|
<span
|
|
class="cmr-12">routine </span><code class="lstinline"><span style="color:#000000">hierarchy_build</span></code><span
|
|
class="cmr-12">.</span>
|
|
</li>
|
|
<li
|
|
class="enumerate" id="x7-13010x0">
|
|
<!--l. 45--><p class="noindent" ><span
|
|
class="cmti-12">Build the preconditioner for a given matrix. </span><span
|
|
class="cmr-12">This is performed by the</span>
|
|
<span
|
|
class="cmr-12">routine </span><code class="lstinline"><span style="color:#000000">smoothers_build</span></code><span
|
|
class="cmr-12">.</span></li></ol>
|
|
<!--l. 48--><p class="noindent" ><span
|
|
class="cmr-12">If the selected preconditioner is one-level, it is built in a single step, performed by</span>
|
|
<span
|
|
class="cmr-12">the routine </span><code class="lstinline"><span style="color:#000000">bld</span></code><span
|
|
class="cmr-12">.</span>
|
|
</li>
|
|
<li
|
|
class="enumerate" id="x7-13012x5">
|
|
<!--l. 50--><p class="noindent" ><span
|
|
class="cmti-12">Apply the preconditioner at each iteration of a Krylov solver. </span><span
|
|
class="cmr-12">This is performed by</span>
|
|
<span
|
|
class="cmr-12">the method </span><code class="lstinline"><span style="color:#000000">apply</span></code><span
|
|
class="cmr-12">. When using the PSBLAS Krylov solvers, this step is</span>
|
|
<span
|
|
class="cmr-12">completely transparent to the user, since </span><code class="lstinline"><span style="color:#000000">apply</span></code> <span
|
|
class="cmr-12">is called by the PSBLAS routine</span>
|
|
<span
|
|
class="cmr-12">implementing the Krylov solver (</span><code class="lstinline"><span style="color:#000000">psb_krylov</span></code><span
|
|
class="cmr-12">).</span>
|
|
</li>
|
|
<li
|
|
class="enumerate" id="x7-13014x6">
|
|
<!--l. 54--><p class="noindent" ><span
|
|
class="cmti-12">Free the preconditioner data structure</span><span
|
|
class="cmr-12">. This is performed by the routine </span><code class="lstinline"><span style="color:#000000">free</span></code><span
|
|
class="cmr-12">.</span>
|
|
<span
|
|
class="cmr-12">This step is complementary to step 1 and should be performed when the</span>
|
|
<span
|
|
class="cmr-12">preconditioner is no more used.</span></li></ol>
|
|
<!--l. 59--><p class="indent" > <span
|
|
class="cmr-12">All the previous routines are available as methods of the preconditioner object. A</span>
|
|
<span
|
|
class="cmr-12">detailed description of them is given in Section</span><span
|
|
class="cmr-12"> </span><a
|
|
href="userhtmlse5.html#x8-160005"><span
|
|
class="cmr-12">5</span><!--tex4ht:ref: sec:userinterface --></a><span
|
|
class="cmr-12">. Examples showing the basic use of</span>
|
|
<span
|
|
class="cmr-12">AMG4PSBLAS are reported in Section</span><span
|
|
class="cmr-12"> </span><a
|
|
href="#x7-140004.1"><span
|
|
class="cmr-12">4.1</span><!--tex4ht:ref: sec:examples --></a><span
|
|
class="cmr-12">.</span>
|
|
<div class="table">
|
|
|
|
|
|
|
|
<!--l. 63--><p class="indent" > <a
|
|
id="x7-13015r1"></a><hr class="float"><div class="float"
|
|
>
|
|
|
|
|
|
|
|
<div class="center"
|
|
>
|
|
<!--l. 64--><p class="noindent" >
|
|
<div class="tabular"> <table id="TBL-1" class="tabular"
|
|
|
|
><colgroup id="TBL-1-1g"><col
|
|
id="TBL-1-1"></colgroup><colgroup id="TBL-1-2g"><col
|
|
id="TBL-1-2"></colgroup><colgroup id="TBL-1-3g"><col
|
|
id="TBL-1-3"></colgroup><tr
|
|
class="hline"><td><hr></td><td><hr></td><td><hr></td></tr><tr
|
|
style="vertical-align:baseline;" id="TBL-1-1-"><td style="white-space:nowrap; text-align:left;" id="TBL-1-1-1"
|
|
class="td11"><span
|
|
class="cmcsc-10x-x-109"><span
|
|
class="small-caps">t</span><span
|
|
class="small-caps">y</span><span
|
|
class="small-caps">p</span><span
|
|
class="small-caps">e</span> </span></td><td style="white-space:normal; text-align:left;" id="TBL-1-1-2"
|
|
class="td11"><!--l. 68--><p class="noindent" ><span
|
|
class="cmcsc-10x-x-109"><span
|
|
class="small-caps">s</span><span
|
|
class="small-caps">t</span><span
|
|
class="small-caps">r</span><span
|
|
class="small-caps">i</span><span
|
|
class="small-caps">n</span><span
|
|
class="small-caps">g</span></span> </td><td style="white-space:normal; text-align:left;" id="TBL-1-1-3"
|
|
class="td11"><!--l. 68--><p class="noindent" ><span
|
|
class="cmcsc-10x-x-109"><span
|
|
class="small-caps">d</span><span
|
|
class="small-caps">e</span><span
|
|
class="small-caps">f</span><span
|
|
class="small-caps">a</span><span
|
|
class="small-caps">u</span><span
|
|
class="small-caps">l</span><span
|
|
class="small-caps">t</span> <span
|
|
class="small-caps">p</span><span
|
|
class="small-caps">r</span><span
|
|
class="small-caps">e</span><span
|
|
class="small-caps">c</span><span
|
|
class="small-caps">o</span><span
|
|
class="small-caps">n</span><span
|
|
class="small-caps">d</span><span
|
|
class="small-caps">i</span><span
|
|
class="small-caps">t</span><span
|
|
class="small-caps">i</span><span
|
|
class="small-caps">o</span><span
|
|
class="small-caps">n</span><span
|
|
class="small-caps">e</span><span
|
|
class="small-caps">r</span></span> </td></tr><tr
|
|
class="hline"><td><hr></td><td><hr></td><td><hr></td></tr><tr
|
|
style="vertical-align:baseline;" id="TBL-1-2-"><td style="white-space:nowrap; text-align:left;" id="TBL-1-2-1"
|
|
class="td11">No preconditioner </td><td style="white-space:normal; text-align:left;" id="TBL-1-2-2"
|
|
class="td11"><!--l. 69--><p class="noindent" ><code class="lstinline"><span style="color:#000000">’</span><span style="color:#000000">NONE</span><span style="color:#000000">’</span></code> </td><td style="white-space:normal; text-align:left;" id="TBL-1-2-3"
|
|
class="td11"><!--l. 69--><p class="noindent" >Considered to use the PSBLAS Krylov
|
|
solvers with no preconditioner. </td>
|
|
</tr><tr
|
|
class="hline"><td><hr></td><td><hr></td><td><hr></td></tr><tr
|
|
style="vertical-align:baseline;" id="TBL-1-3-"><td style="white-space:nowrap; text-align:left;" id="TBL-1-3-1"
|
|
class="td11">Diagonal </td><td style="white-space:normal; text-align:left;" id="TBL-1-3-2"
|
|
class="td11"><!--l. 71--><p class="noindent" ><code class="lstinline"><span style="color:#000000">’</span><span style="color:#000000">DIAG</span><span style="color:#000000">’</span></code>,
|
|
<code class="lstinline"><span style="color:#000000">’</span><span style="color:#000000">JACOBI</span><span style="color:#000000">’</span></code>,
|
|
<code class="lstinline"><span style="color:#000000">’</span><span style="color:#000000">L1</span><span style="color:#000000">-</span><span style="color:#000000">JACOBI</span><span style="color:#000000">’</span></code> </td><td style="white-space:normal; text-align:left;" id="TBL-1-3-3"
|
|
class="td11"><!--l. 71--><p class="noindent" >Diagonal preconditioner. For any zero
|
|
diagonal entry of the matrix to be
|
|
preconditioned, the corresponding entry
|
|
of the preconditioner is set to 1. </td>
|
|
</tr><tr
|
|
class="hline"><td><hr></td><td><hr></td><td><hr></td></tr><tr
|
|
style="vertical-align:baseline;" id="TBL-1-4-"><td style="white-space:nowrap; text-align:left;" id="TBL-1-4-1"
|
|
class="td11">Gauss-Seidel </td><td style="white-space:normal; text-align:left;" id="TBL-1-4-2"
|
|
class="td11"><!--l. 74--><p class="noindent" ><code class="lstinline"><span style="color:#000000">’</span><span style="color:#000000">GS</span><span style="color:#000000">’</span></code>,
|
|
<code class="lstinline"><span style="color:#000000">’</span><span style="color:#000000">L1</span><span style="color:#000000">-</span><span style="color:#000000">GS</span><span style="color:#000000">’</span></code> </td><td style="white-space:normal; text-align:left;" id="TBL-1-4-3"
|
|
class="td11"><!--l. 74--><p class="noindent" >Hybrid Gauss-Seidel (forward), that is,
|
|
global block Jacobi with Gauss-Seidel as
|
|
local solver. </td>
|
|
</tr><tr
|
|
class="hline"><td><hr></td><td><hr></td><td><hr></td></tr><tr
|
|
style="vertical-align:baseline;" id="TBL-1-5-"><td style="white-space:nowrap; text-align:left;" id="TBL-1-5-1"
|
|
class="td11">Symmetrized Gauss-Seidel</td><td style="white-space:normal; text-align:left;" id="TBL-1-5-2"
|
|
class="td11"><!--l. 77--><p class="noindent" ><code class="lstinline"><span style="color:#000000">’</span><span style="color:#000000">FBGS</span><span style="color:#000000">’</span></code>,
|
|
<code class="lstinline"><span style="color:#000000">’</span><span style="color:#000000">L1</span><span style="color:#000000">-</span><span style="color:#000000">FBGS</span><span style="color:#000000">’</span></code> </td><td style="white-space:normal; text-align:left;" id="TBL-1-5-3"
|
|
class="td11"><!--l. 77--><p class="noindent" >Symmetrized hybrid Gauss-Seidel, that
|
|
is, forward Gauss-Seidel followed by
|
|
backward Gauss-Seidel. </td>
|
|
</tr><tr
|
|
class="hline"><td><hr></td><td><hr></td><td><hr></td></tr><tr
|
|
style="vertical-align:baseline;" id="TBL-1-6-"><td style="white-space:nowrap; text-align:left;" id="TBL-1-6-1"
|
|
class="td11">Block Jacobi </td><td style="white-space:normal; text-align:left;" id="TBL-1-6-2"
|
|
class="td11"><!--l. 80--><p class="noindent" ><code class="lstinline"><span style="color:#000000">’</span><span style="color:#000000">BJAC</span><span style="color:#000000">’</span></code>,
|
|
<code class="lstinline"><span style="color:#000000">’</span><span style="color:#000000">L1</span><span style="color:#000000">-</span><span style="color:#000000">BJAC</span><span style="color:#000000">’</span></code> </td><td style="white-space:normal; text-align:left;" id="TBL-1-6-3"
|
|
class="td11"><!--l. 80--><p class="noindent" >Block-Jacobi with ILU(0) on the local
|
|
blocks. </td>
|
|
</tr><tr
|
|
class="hline"><td><hr></td><td><hr></td><td><hr></td></tr><tr
|
|
style="vertical-align:baseline;" id="TBL-1-7-"><td style="white-space:nowrap; text-align:left;" id="TBL-1-7-1"
|
|
class="td11">Additive Schwarz </td><td style="white-space:normal; text-align:left;" id="TBL-1-7-2"
|
|
class="td11"><!--l. 81--><p class="noindent" ><code class="lstinline"><span style="color:#000000">’</span><span style="color:#000000">AS</span><span style="color:#000000">’</span></code> </td><td style="white-space:normal; text-align:left;" id="TBL-1-7-3"
|
|
class="td11"><!--l. 81--><p class="noindent" >Additive Schwarz (AS), with overlap 1
|
|
and ILU(0) on the local blocks. </td>
|
|
</tr><tr
|
|
class="hline"><td><hr></td><td><hr></td><td><hr></td></tr><tr
|
|
style="vertical-align:baseline;" id="TBL-1-8-"><td style="white-space:nowrap; text-align:left;" id="TBL-1-8-1"
|
|
class="td11">Multilevel </td><td style="white-space:normal; text-align:left;" id="TBL-1-8-2"
|
|
class="td11"><!--l. 83--><p class="noindent" ><code class="lstinline"><span style="color:#000000">’</span><span style="color:#000000">ML</span><span style="color:#000000">’</span></code> </td><td style="white-space:normal; text-align:left;" id="TBL-1-8-3"
|
|
class="td11"><!--l. 83--><p class="noindent" >V-cycle with one hybrid
|
|
forward Gauss-Seidel (GS) sweep as
|
|
pre-smoother and one hybrid backward
|
|
GS sweep as post-smoother, decoupled
|
|
smoothed aggregation as coarsening
|
|
algorithm, and LU (plus triangular solve)
|
|
as coarsest-level solver. See the default
|
|
values in Tables <a
|
|
href="userhtmlse5.html#x8-18009r2">2<!--tex4ht:ref: tab:p_cycle --></a>-<a
|
|
href="userhtmlse5.html#x8-18015r8">8<!--tex4ht:ref: tab:p_smoother_1 --></a> for further details of
|
|
the preconditioner. </td>
|
|
</tr><tr
|
|
class="hline"><td><hr></td><td><hr></td><td><hr></td></tr><tr
|
|
style="vertical-align:baseline;" id="TBL-1-9-"><td style="white-space:nowrap; text-align:left;" id="TBL-1-9-1"
|
|
class="td11"> </td></tr></table> </div>
|
|
<br /> <div class="caption"
|
|
><span class="id">Table 1: </span><span
|
|
class="content">Preconditioner types, corresponding strings and default choices. </span></div><!--tex4ht:label?: x7-13015r1 -->
|
|
</div>
|
|
|
|
|
|
|
|
</div><hr class="endfloat" />
|
|
</div>
|
|
<!--l. 98--><p class="indent" > <span
|
|
class="cmr-12">Note that the module </span><code class="lstinline"><span style="color:#000000">amg_prec_mod</span></code><span
|
|
class="cmr-12">, containing the definition of the preconditioner</span>
|
|
<span
|
|
class="cmr-12">data type and the interfaces to the routines of AMG4PSBLAS, must be used</span>
|
|
<span
|
|
class="cmr-12">in any program calling such routines. The modules </span><code class="lstinline"><span style="color:#000000">psb_base_mod</span></code><span
|
|
class="cmr-12">, for the</span>
|
|
<span
|
|
class="cmr-12">sparse matrix and communication descriptor data types, and </span><code class="lstinline"><span style="color:#000000">psb_krylov_mod</span></code><span
|
|
class="cmr-12">,</span>
|
|
<span
|
|
class="cmr-12">for interfacing with the Krylov solvers, must be also used (see Section</span><span
|
|
class="cmr-12"> </span><a
|
|
href="#x7-140004.1"><span
|
|
class="cmr-12">4.1</span><!--tex4ht:ref: sec:examples --></a><span
|
|
class="cmr-12">).</span>
|
|
<br
|
|
class="newline" />
|
|
<!--l. 105--><p class="indent" > <span
|
|
class="cmbx-12">Remark 1. </span><span
|
|
class="cmr-12">Coarsest-level solvers based on the LU factorization, such as those</span>
|
|
<span
|
|
class="cmr-12">implemented in UMFPACK, MUMPS, SuperLU, and SuperLU</span><span
|
|
class="cmr-12">_Dist, usually lead to</span>
|
|
<span
|
|
class="cmr-12">smaller numbers of preconditioned Krylov iterations than inexact solvers, when the</span>
|
|
<span
|
|
class="cmr-12">linear system comes from a standard discretization of basic scalar elliptic PDE</span>
|
|
<span
|
|
class="cmr-12">problems. However, this does not necessarily correspond to the shortest execution time</span>
|
|
<span
|
|
class="cmr-12">on parallel</span><span
|
|
class="cmr-12"> computers.</span>
|
|
<h4 class="subsectionHead"><span class="titlemark"><span
|
|
class="cmr-12">4.1 </span></span> <a
|
|
id="x7-140004.1"></a><span
|
|
class="cmr-12">Examples</span></h4>
|
|
<!--l. 116--><p class="noindent" ><span
|
|
class="cmr-12">The code reported in Figure</span><span
|
|
class="cmr-12"> </span><a
|
|
href="#x7-14001r1"><span
|
|
class="cmr-12">1</span><!--tex4ht:ref: fig:ex1 --></a> <span
|
|
class="cmr-12">shows how to set and apply the default multilevel</span>
|
|
<span
|
|
class="cmr-12">preconditioner available in the real double precision version of AMG4PSBLAS</span>
|
|
<span
|
|
class="cmr-12">(see Table</span><span
|
|
class="cmr-12"> </span><a
|
|
href="#x7-13015r1"><span
|
|
class="cmr-12">1</span><!--tex4ht:ref: tab:precinit --></a><span
|
|
class="cmr-12">). This preconditioner is chosen by simply specifying </span><code class="lstinline"><span style="color:#000000">’</span><span style="color:#000000">ML</span><span style="color:#000000">’</span></code> <span
|
|
class="cmr-12">as the</span>
|
|
<span
|
|
class="cmr-12">second argument of </span><code class="lstinline"><span style="color:#000000">P</span><span style="color:#000000">%</span><span style="color:#000000">init</span></code> <span
|
|
class="cmr-12">(a call to </span><code class="lstinline"><span style="color:#000000">P</span><span style="color:#000000">%</span><span style="color:#000000">set</span></code> <span
|
|
class="cmr-12">is not needed) and is applied</span>
|
|
<span
|
|
class="cmr-12">with the CG solver provided by PSBLAS (the matrix of the system to be</span>
|
|
<span
|
|
class="cmr-12">solved is assumed to be positive definite). As previously observed, the modules</span>
|
|
<code class="lstinline"><span style="color:#000000">psb_base_mod</span></code><span
|
|
class="cmr-12">, </span><code class="lstinline"><span style="color:#000000">amg_prec_mod</span></code> <span
|
|
class="cmr-12">and </span><code class="lstinline"><span style="color:#000000">psb_krylov_mod</span></code> <span
|
|
class="cmr-12">must be used by the example</span>
|
|
<span
|
|
class="cmr-12">program.</span>
|
|
<!--l. 126--><p class="indent" > <span
|
|
class="cmr-12">The part of the code dealing with reading and assembling the sparse matrix and the</span>
|
|
<span
|
|
class="cmr-12">right-hand side vector and the deallocation of the relevant data structures, performed</span>
|
|
<span
|
|
class="cmr-12">through the PSBLAS routines for sparse matrix and vector management,</span>
|
|
<span
|
|
class="cmr-12">is not reported here for the sake of conciseness. The complete code can be</span>
|
|
<span
|
|
class="cmr-12">found in the example program file </span><span class="obeylines-h"><span class="verb"><span
|
|
class="cmtt-12">amg_dexample_ml.f90</span></span></span><span
|
|
class="cmr-12">, in the directory</span>
|
|
<span class="obeylines-h"><span class="verb"><span
|
|
class="cmtt-12">samples/simple/file</span></span></span><span class="obeylines-h"><span class="verb"><span
|
|
class="cmtt-12">read</span></span></span> <span
|
|
class="cmr-12">of the AMG4PSBLAS implementation (see Section</span><span
|
|
class="cmr-12"> </span><a
|
|
href="userhtmlse3.html#x6-120003.5"><span
|
|
class="cmr-12">3.5</span><!--tex4ht:ref: sec:ex_and_test --></a><span
|
|
class="cmr-12">). A</span>
|
|
<span
|
|
class="cmr-12">sample test problem along with the relevant input data is available in</span>
|
|
<span class="obeylines-h"><span class="verb"><span
|
|
class="cmtt-12">samples/simple/fileread/runs</span></span></span><span
|
|
class="cmr-12">. For details on the use of the PSBLAS routines, see</span>
|
|
<span
|
|
class="cmr-12">the PSBLAS User’s Guide</span><span
|
|
class="cmr-12"> </span><span class="cite"><span
|
|
class="cmr-12">[</span><a
|
|
href="userhtmlli3.html#XPSBLASGUIDE"><span
|
|
class="cmr-12">21</span></a><span
|
|
class="cmr-12">]</span></span><span
|
|
class="cmr-12">.</span>
|
|
<!--l. 138--><p class="indent" > <span
|
|
class="cmr-12">The setup and application of the default multilevel preconditioner for the real single</span>
|
|
<span
|
|
class="cmr-12">precision and the complex, single and double precision, versions are obtained</span>
|
|
<span
|
|
class="cmr-12">with straightforward modifications of the previous example (see Section</span><span
|
|
class="cmr-12"> </span><a
|
|
href="userhtmlse5.html#x8-160005"><span
|
|
class="cmr-12">5</span><!--tex4ht:ref: sec:userinterface --></a> <span
|
|
class="cmr-12">for</span>
|
|
<span
|
|
class="cmr-12">details). If these versions are installed, the corresponding codes are available in</span>
|
|
<span class="obeylines-h"><span class="verb"><span
|
|
class="cmtt-12">samples/simple/file</span></span></span><span class="obeylines-h"><span class="verb"><span
|
|
class="cmtt-12">read</span></span></span><span
|
|
class="cmr-12">.</span>
|
|
|
|
|
|
|
|
<!--l. 144--><p class="indent" > <a
|
|
id="x7-14001r1"></a><hr class="float"><div class="float"
|
|
>
|
|
|
|
|
|
|
|
<div class="center"
|
|
>
|
|
<!--l. 145--><p class="noindent" >
|
|
|
|
|
|
|
|
<div class="minipage"><pre class="verbatim" id="verbatim-7">
|
|
  use psb_base_mod
|
|
  use amg_prec_mod
|
|
  use psb_krylov_mod
|
|
... ...
|
|
!
|
|
! sparse matrix
|
|
  type(psb_dspmat_type) :: A
|
|
! sparse matrix descriptor
|
|
  type(psb_desc_type)   :: desc_A
|
|
! preconditioner
|
|
  type(amg_dprec_type)  :: P
|
|
! right-hand side and solution vectors
|
|
  type(psb_d_vect_type) :: b, x
|
|
... ...
|
|
!
|
|
! initialize the parallel environment
|
|
  call psb_init(ctxt)
|
|
  call psb_info(ctxt,iam,np)
|
|
... ...
|
|
!
|
|
! read and assemble the spd matrix A and the right-hand side b
|
|
! using PSBLAS routines for sparse matrix / vector management
|
|
... ...
|
|
!
|
|
! initialize the default multilevel preconditioner, i.e. V-cycle
|
|
! with basic smoothed aggregation, 1 hybrid forward/backward
|
|
! GS sweep as pre/post-smoother and UMFPACK as coarsest-level
|
|
! solver
|
|
  call P%init(ctxt,’ML’,info)
|
|
!
|
|
! build the preconditioner
|
|
  call P%hierarchy_build(A,desc_A,info)
|
|
  call P%smoothers_build(A,desc_A,info)
|
|
|
|
!
|
|
! set the solver parameters and the initial guess
|
|
  ... ...
|
|
!
|
|
! solve Ax=b with preconditioned FCG
|
|
  call psb_krylov(’FCG’,A,P,b,x,tol,desc_A,info)
|
|
  ... ...
|
|
!
|
|
! deallocate the preconditioner
|
|
  call P%free(info)
|
|
!
|
|
! deallocate other data structures
|
|
  ... ...
|
|
!
|
|
! exit the parallel environment
|
|
  call psb_exit(ctxt)
|
|
  stop
|
|
</pre>
|
|
<!--l. 255--><p class="nopar" > </div>
|
|
|
|
|
|
|
|
<br /> <div class="caption"
|
|
><span class="id">Listing 1: </span><span
|
|
class="content">setup and application of the default multilevel preconditioner (example 1).
|
|
</span></div><!--tex4ht:label?: x7-14001r1 -->
|
|
</div>
|
|
|
|
|
|
|
|
</div><hr class="endfloat" />
|
|
<!--l. 264--><p class="indent" > <span
|
|
class="cmr-12">Different versions of the multilevel preconditioner can be obtained by changing the</span>
|
|
<span
|
|
class="cmr-12">default values of the preconditioner parameters. The code reported in Figure</span><span
|
|
class="cmr-12"> </span><a
|
|
href="#x7-14002r2"><span
|
|
class="cmr-12">2</span><!--tex4ht:ref: fig:ex2 --></a> <span
|
|
class="cmr-12">shows</span>
|
|
<span
|
|
class="cmr-12">how to set a V-cycle preconditioner which applies 1 block-Jacobi sweep as pre-</span>
|
|
<span
|
|
class="cmr-12">and post-smoother, and solves the coarsest-level system with 8 block-Jacobi</span>
|
|
<span
|
|
class="cmr-12">sweeps. Note that the ILU(0) factorization (plus triangular solve) is used as</span>
|
|
<span
|
|
class="cmr-12">local solver for the block-Jacobi sweeps, since this is the default associated</span>
|
|
<span
|
|
class="cmr-12">with block-Jacobi and set by</span><span
|
|
class="cmr-12"> </span><code class="lstinline"><span style="color:#000000">P</span><span style="color:#000000">%</span><span style="color:#000000">init</span></code><span
|
|
class="cmr-12">. Furthermore, specifying block-Jacobi as</span>
|
|
<span
|
|
class="cmr-12">coarsest-level solver implies that the coarsest-level matrix is distributed among</span>
|
|
<span
|
|
class="cmr-12">the processes. Figure</span><span
|
|
class="cmr-12"> </span><a
|
|
href="#x7-14003r3"><span
|
|
class="cmr-12">3</span><!--tex4ht:ref: fig:ex3 --></a> <span
|
|
class="cmr-12">shows how to set a W-cycle preconditioner using the</span>
|
|
<span
|
|
class="cmr-12">Coarsening based on Compatible Weighted Matching, aggregates of size at</span>
|
|
<span
|
|
class="cmr-12">most 8 and smoothed prolongators. It applies 2 hybrid Gauss-Seidel sweeps as</span>
|
|
<span
|
|
class="cmr-12">pre- and post-smoother, and solves the coarsest-level system with the parallel</span>
|
|
<span
|
|
class="cmr-12">flexible Conjugate Gradient method (KRM) coupled with the block-Jacobi</span>
|
|
<span
|
|
class="cmr-12">preconditioner having ILU(0) on the blocks. Default parameters are used for stopping</span>
|
|
<span
|
|
class="cmr-12">criterion of the coarsest solver. Note that, also in this case, specifying KRM as</span>
|
|
<span
|
|
class="cmr-12">coarsest-level solver implies that the coarsest-level matrix is distributed among the</span>
|
|
<span
|
|
class="cmr-12">processes.</span>
|
|
<!--l. 291--><p class="indent" > <span
|
|
class="cmr-12">The code fragments shown in Figures</span><span
|
|
class="cmr-12"> </span><a
|
|
href="#x7-14002r2"><span
|
|
class="cmr-12">2</span><!--tex4ht:ref: fig:ex2 --></a> <span
|
|
class="cmr-12">and </span><a
|
|
href="#x7-14003r3"><span
|
|
class="cmr-12">3</span><!--tex4ht:ref: fig:ex3 --></a> <span
|
|
class="cmr-12">are included in the example program</span>
|
|
<span
|
|
class="cmr-12">file </span><span class="obeylines-h"><span class="verb"><span
|
|
class="cmtt-12">amg_dexample_ml.f90</span></span></span> <span
|
|
class="cmr-12">too.</span>
|
|
<!--l. 294--><p class="indent" > <span
|
|
class="cmr-12">Finally, Figure</span><span
|
|
class="cmr-12"> </span><a
|
|
href="#x7-14004r4"><span
|
|
class="cmr-12">4</span><!--tex4ht:ref: fig:ex4 --></a> <span
|
|
class="cmr-12">shows the setup of a one-level additive Schwarz preconditioner,</span>
|
|
<span
|
|
class="cmr-12">i.e., RAS with overlap 2. Note also that a Krylov method different from CG</span>
|
|
<span
|
|
class="cmr-12">must be used to solve the preconditioned system, since the preconditione in</span>
|
|
<span
|
|
class="cmr-12">nonsymmetric. The corresponding example program is available in the file</span>
|
|
<span class="obeylines-h"><span class="verb"><span
|
|
class="cmtt-12">amg_dexample_1lev.f90</span></span></span><span
|
|
class="cmr-12">.</span>
|
|
<!--l. 301--><p class="indent" > <span
|
|
class="cmr-12">For all the previous preconditioners, example programs where the sparse matrix</span>
|
|
<span
|
|
class="cmr-12">and the right-hand side are generated by discretizing a PDE with Dirichlet</span>
|
|
<span
|
|
class="cmr-12">boundary conditions are also available in the directory </span><span class="obeylines-h"><span class="verb"><span
|
|
class="cmtt-12">samples/simple/pdegen</span></span></span><span
|
|
class="cmr-12">.</span>
|
|
|
|
|
|
|
|
<!--l. 304--><p class="indent" > <a
|
|
id="x7-14002r2"></a><hr class="float"><div class="float"
|
|
>
|
|
|
|
|
|
|
|
<div class="center"
|
|
>
|
|
<!--l. 318--><p class="noindent" >
|
|
<div class="minipage"><pre class="verbatim" id="verbatim-8">
|
|
... ...
|
|
! build a V-cycle preconditioner with 1 block-Jacobi sweep (with
|
|
! ILU(0) on the blocks) as pre- and post-smoother, and 8  block-Jacobi
|
|
! sweeps (with ILU(0) on the blocks) as coarsest-level solver
|
|
  call P%init(ctxt,’ML’,info)
|
|
  call P%set(’SMOOTHER_TYPE’,’BJAC’,info)
|
|
  call P%set(’COARSE_SOLVE’,’BJAC’,info)
|
|
  call P%set(’COARSE_SWEEPS’,8,info)
|
|
  call P%hierarchy_build(A,desc_A,info)
|
|
  call P%smoothers_build(A,desc_A,info)
|
|
... ...
|
|
</pre>
|
|
<!--l. 333--><p class="nopar" > </div></div>
|
|
<br /><div class="caption"
|
|
><span class="id">Listing 2: </span><span
|
|
class="content">setup of a multilevel preconditioner based on the default decoupled coarsening</span></div><!--tex4ht:label?: x7-14002r2 -->
|
|
|
|
|
|
|
|
</div><hr class="endfloat" />
|
|
|
|
|
|
|
|
<!--l. 340--><p class="indent" > <a
|
|
id="x7-14003r3"></a><hr class="float"><div class="float"
|
|
>
|
|
|
|
|
|
|
|
<div class="center"
|
|
>
|
|
<!--l. 362--><p class="noindent" >
|
|
<div class="minipage"><pre class="verbatim" id="verbatim-9">
|
|
... ...
|
|
! build a W-cycle preconditioner with 2 hybrid Gauss-Seidel sweeps
|
|
! as pre- and post-smoother, a distributed coarsest
|
|
! matrix, and MUMPS as coarsest-level solver
|
|
  call P%init(ctxt,’ML’,info)
|
|
  call P%set(’PAR_AGGR_ALG’,’COUPLED’,info)
|
|
  call P%set(’AGGR_TYPE’,’MATCHBOXP’,info)
|
|
  call P%set(’AGGR_SIZE’,8,info)
|
|
  call P%set(’ML_CYCLE’,’WCYCLE’,info)
|
|
  call P%set(’SMOOTHER_TYPE’,’FBGS’,info)
|
|
  call P%set(’SMOOTHER_SWEEPS’,2,info)
|
|
  call P%set(’COARSE_SOLVE’,’KRM’,info)
|
|
  call P%set(’COARSE_MAT’,’DIST’,info)
|
|
  call P%set(’KRM_METHOD’,’FCG’,info)
|
|
  call P%hierarchy_build(A,desc_A,info)
|
|
  call P%smoothers_build(A,desc_A,info)
|
|
... ...
|
|
</pre>
|
|
<!--l. 383--><p class="nopar" > </div></div>
|
|
<br /> <div class="caption"
|
|
><span class="id">Listing 3: </span><span
|
|
class="content">setup of a multilevel preconditioner based on the coupled coarsening using
|
|
weighted matching</span></div><!--tex4ht:label?: x7-14003r3 -->
|
|
|
|
|
|
|
|
</div><hr class="endfloat" />
|
|
|
|
|
|
|
|
<!--l. 390--><p class="indent" > <a
|
|
id="x7-14004r4"></a><hr class="float"><div class="float"
|
|
>
|
|
|
|
|
|
|
|
<div class="center"
|
|
>
|
|
<!--l. 402--><p class="noindent" >
|
|
<div class="minipage"><pre class="verbatim" id="verbatim-10">
|
|
... ...
|
|
! set RAS with overlap 2 and ILU(0) on the local blocks
|
|
  call P%init(ctxt,’AS’,info)
|
|
  call P%set(’SUB_OVR’,2,info)
|
|
  call P%bld(A,desc_A,info)
|
|
... ...
|
|
! solve Ax=b with preconditioned BiCGSTAB
|
|
  call psb_krylov(’BICGSTAB’,A,P,b,x,tol,desc_A,info)
|
|
</pre>
|
|
<!--l. 414--><p class="nopar" > </div></div>
|
|
<br /> <div class="caption"
|
|
><span class="id">Listing 4: </span><span
|
|
class="content">setup of a one-level Schwarz preconditioner.</span></div><!--tex4ht:label?: x7-14004r4 -->
|
|
|
|
|
|
|
|
</div><hr class="endfloat" />
|
|
<h4 class="subsectionHead"><span class="titlemark"><span
|
|
class="cmr-12">4.2 </span></span> <a
|
|
id="x7-150004.2"></a><span
|
|
class="cmr-12">GPU example</span></h4>
|
|
<!--l. 426--><p class="noindent" ><span
|
|
class="cmr-12">The code discussed here shows how to set up a program exploiting the combined GPU</span>
|
|
<span
|
|
class="cmr-12">capabilities of PSBLAS and AMG4PSBLAS. The code example is available in the</span>
|
|
<span
|
|
class="cmr-12">source distribution directory </span><span class="obeylines-h"><span class="verb"><span
|
|
class="cmtt-12">amg4psblas/examples/gpu</span></span></span><span
|
|
class="cmr-12">.</span>
|
|
<!--l. 431--><p class="indent" > <span
|
|
class="cmr-12">First of all, we need to include the appropriate modules and declare some auxiliary</span>
|
|
<span
|
|
class="cmr-12">variables:</span>
|
|
|
|
|
|
|
|
<!--l. 433--><p class="indent" > <a
|
|
id="x7-15001r5"></a><hr class="float"><div class="float"
|
|
>
|
|
|
|
|
|
|
|
<div class="center"
|
|
>
|
|
<!--l. 452--><p class="noindent" >
|
|
<div class="minipage"><pre class="verbatim" id="verbatim-11">
|
|
program amg_dexample_gpu
|
|
  use psb_base_mod
|
|
  use amg_prec_mod
|
|
  use psb_krylov_mod
|
|
  use psb_util_mod
|
|
  use psb_gpu_mod
|
|
  use data_input
|
|
  use amg_d_pde_mod
|
|
  implicit none
|
|
  .......
|
|
  ! GPU variables
|
|
  type(psb_d_hlg_sparse_mat) :: agmold
|
|
  type(psb_d_vect_gpu)       :: vgmold
|
|
  type(psb_i_vect_gpu)       :: igmold
|
|
|
|
 
|
|
</pre>
|
|
<!--l. 471--><p class="nopar" > </div></div>
|
|
<br /> <div class="caption"
|
|
><span class="id">Listing 5: </span><span
|
|
class="content">setup of a GPU-enabled test program part one.</span></div><!--tex4ht:label?: x7-15001r5 -->
|
|
|
|
|
|
|
|
</div><hr class="endfloat" />
|
|
<!--l. 478--><p class="indent" > <span
|
|
class="cmr-12">In this particular example we are choosing to employ a </span><span class="obeylines-h"><span class="verb"><span
|
|
class="cmtt-12">HLG</span></span></span> <span
|
|
class="cmr-12">data structure for</span>
|
|
<span
|
|
class="cmr-12">sparse matrices on GPUs; for more information please refer to the PSBLAS-EXT users’</span>
|
|
<span
|
|
class="cmr-12">guide.</span>
|
|
<!--l. 482--><p class="indent" > <span
|
|
class="cmr-12">We then have to initialize the GPU environment, and pass the appropriate MOLD</span>
|
|
<span
|
|
class="cmr-12">variables to the build methods (see also the PSBLAS and PSBLAS-EXT users’</span>
|
|
<span
|
|
class="cmr-12">guides).</span>
|
|
|
|
|
|
|
|
<!--l. 485--><p class="indent" > <a
|
|
id="x7-15002r6"></a><hr class="float"><div class="float"
|
|
>
|
|
|
|
|
|
|
|
<div class="center"
|
|
>
|
|
<!--l. 501--><p class="noindent" >
|
|
<div class="minipage"><pre class="verbatim" id="verbatim-12">
|
|
  call psb_init(ctxt)
|
|
  call psb_info(ctxt,iam,np)
|
|
  !
|
|
  ! BEWARE: if you have NGPUS  per node, the default is to
|
|
  ! attach to mod(IAM,NGPUS)
|
|
  !
|
|
  call psb_gpu_init(ictxt)
|
|
  ......
|
|
  t1 = psb_wtime()
|
|
  call prec%smoothers_build(a,desc_a,info, amold=agmold, vmold=vgmold, imold=igmold)
|
|
|
|
 
|
|
</pre>
|
|
<!--l. 516--><p class="nopar" > </div></div>
|
|
<br /> <div class="caption"
|
|
><span class="id">Listing 6: </span><span
|
|
class="content">setup of a GPU-enabled test program part two.</span></div><!--tex4ht:label?: x7-15002r6 -->
|
|
|
|
|
|
|
|
</div><hr class="endfloat" />
|
|
<!--l. 523--><p class="indent" > <span
|
|
class="cmr-12">Finally, we convert the input matrix, the descriptor and the vectors to use a</span>
|
|
<span
|
|
class="cmr-12">GPU-enabled internal storage format. We then preallocate the preconditioner</span>
|
|
<span
|
|
class="cmr-12">workspace before entering the Krylov method. At the end of the code, we close the</span>
|
|
<span
|
|
class="cmr-12">GPU environment</span>
|
|
|
|
|
|
|
|
<!--l. 527--><p class="indent" > <a
|
|
id="x7-15003r7"></a><hr class="float"><div class="float"
|
|
>
|
|
|
|
|
|
|
|
<div class="center"
|
|
>
|
|
<!--l. 557--><p class="noindent" >
|
|
<div class="minipage"><pre class="verbatim" id="verbatim-13">
|
|
  call desc_a%cnv(mold=igmold)
|
|
  call a%cscnv(info,mold=agmold)
|
|
  call psb_geasb(x,desc_a,info,mold=vgmold)
|
|
  call psb_geasb(b,desc_a,info,mold=vgmold)
|
|
|
|
  !
|
|
  ! iterative method parameters
|
|
  !
|
|
  call psb_barrier(ctxt)
|
|
  call prec%allocate_wrk(info)
|
|
  t1 = psb_wtime()
|
|
  call psb_krylov(s_choice%kmethd,a,prec,b,x,s_choice%eps,&
|
|
       & desc_a,info,itmax=s_choice%itmax,iter=iter,err=err,itrace=s_choice%itrace,&
|
|
       & istop=s_choice%istopc,irst=s_choice%irst)
|
|
  call prec%deallocate_wrk(info)
|
|
  call psb_barrier(ctxt)
|
|
  tslv = psb_wtime() - t1
|
|
|
|
  ......
|
|
  call psb_gpu_exit()
|
|
  call psb_exit(ctxt)
|
|
  stop
|
|
|
|
 
|
|
</pre>
|
|
<!--l. 584--><p class="nopar" > </div></div>
|
|
<br /> <div class="caption"
|
|
><span class="id">Listing 7: </span><span
|
|
class="content">setup of a GPU-enabled test program part three.</span></div><!--tex4ht:label?: x7-15003r7 -->
|
|
|
|
|
|
|
|
</div><hr class="endfloat" />
|
|
<!--l. 592--><p class="indent" > <span
|
|
class="cmr-12">It is very important to employ smoothers and coarsest solvers that are suited to the</span>
|
|
<span
|
|
class="cmr-12">GPU, i.e. methods that do NOT employ triangular system solve kernels. Methods that</span>
|
|
<span
|
|
class="cmr-12">satisfy this constraint include:</span>
|
|
<ul class="itemize1">
|
|
<li class="itemize">
|
|
<!--l. 596--><p class="noindent" ><span class="obeylines-h"><span class="verb"><span
|
|
class="cmtt-12">JACOBI</span></span></span>
|
|
</li>
|
|
<li class="itemize">
|
|
<!--l. 597--><p class="noindent" ><span class="obeylines-h"><span class="verb"><span
|
|
class="cmtt-12">BJAC</span></span></span> <span
|
|
class="cmr-12">with the following methods on the local blocks:</span>
|
|
<ul class="itemize2">
|
|
<li class="itemize">
|
|
<!--l. 599--><p class="noindent" ><span class="obeylines-h"><span class="verb"><span
|
|
class="cmtt-12">INVK</span></span></span>
|
|
</li>
|
|
<li class="itemize">
|
|
<!--l. 600--><p class="noindent" ><span class="obeylines-h"><span class="verb"><span
|
|
class="cmtt-12">INVT</span></span></span>
|
|
</li>
|
|
<li class="itemize">
|
|
<!--l. 601--><p class="noindent" ><span class="obeylines-h"><span class="verb"><span
|
|
class="cmtt-12">AINV</span></span></span></li></ul>
|
|
</li></ul>
|
|
<!--l. 604--><p class="noindent" ><span
|
|
class="cmr-12">and their </span><span
|
|
class="cmmi-12">ℓ</span><sub><span
|
|
class="cmr-8">1</span></sub> <span
|
|
class="cmr-12">variants.</span>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
<!--l. 1--><div class="crosslinks"><p class="noindent"><span
|
|
class="cmr-12">[</span><a
|
|
href="userhtmlse5.html" ><span
|
|
class="cmr-12">next</span></a><span
|
|
class="cmr-12">] [</span><a
|
|
href="userhtmlse3.html" ><span
|
|
class="cmr-12">prev</span></a><span
|
|
class="cmr-12">] [</span><a
|
|
href="userhtmlse3.html#tailuserhtmlse3.html" ><span
|
|
class="cmr-12">prev-tail</span></a><span
|
|
class="cmr-12">] [</span><a
|
|
href="userhtmlse4.html" ><span
|
|
class="cmr-12">front</span></a><span
|
|
class="cmr-12">] [</span><a
|
|
href="userhtml.html#userhtmlse4.html" ><span
|
|
class="cmr-12">up</span></a><span
|
|
class="cmr-12">] </span></p></div>
|
|
<!--l. 1--><p class="indent" > <a
|
|
id="tailuserhtmlse4.html"></a>
|
|
</body></html>
|