< !DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
< html >
< head > < title > Getting Started< / title >
< meta http-equiv = "Content-Type" content = "text/html; charset=iso-8859-1" >
< meta name = "generator" content = "TeX4ht (https://tug.org/tex4ht/)" >
< meta name = "originator" content = "TeX4ht (https://tug.org/tex4ht/)" >
<!-- html,3 -->
< meta name = "src" content = "userhtml.tex" >
< link rel = "stylesheet" type = "text/css" href = "userhtml.css" >
< / head > < body
>
< h3 class = "sectionHead" > < span class = "titlemark" > < span
class="cmr-12">4 < / span > < / span > < a
id="x7-130004">< / a > < span
class="cmr-12">Getting Started< / span > < / h3 >
<!-- l. 5 --> < p class = "noindent" > < span
class="cmr-12">This section describes the basics for building and applying AMG4PSBLAS one-level< / span >
< span
class="cmr-12">and multilevel (i.e., AMG) preconditioners with the Krylov solvers included in< / span >
< span
class="cmr-12">PSBLAS< / span > < span
class="cmr-12">  < / span > < span class = "cite" > < span
class="cmr-12">[< / span > < a
href="userhtmlli3.html#XPSBLASGUIDE">< span
class="cmr-12">21< / span > < / a > < span
class="cmr-12">]< / span > < / span > < span
class="cmr-12">.< / span >
<!-- l. 9 --> < p class = "indent" > < span
class="cmr-12">The following steps are required:< / span >
< ol class = "enumerate1" >
< li
class="enumerate" id="x7-13002x1">
<!-- l. 11 --> < p class = "noindent" > < span
class="cmti-12">Declare the preconditioner data structure< / span > < span
class="cmr-12">. It is a derived data type,< / span >
< span class = "obeylines-h" > < span class = "verb" > < span
class="cmtt-12">amg_< / span > < / span > < / span > < span
class="cmti-12">x< / span > < span class = "obeylines-h" > < span class = "verb" > < span
class="cmtt-12">prec_< / span > < / span > < / span > < span class = "obeylines-h" > < span class = "verb" > < span
class="cmtt-12">type< / span > < / span > < / span > < span
class="cmr-12">, where < / span > < span
class="cmti-12">x < / span > < span
class="cmr-12">may be < / span > < span class = "obeylines-h" > < span class = "verb" > < span
class="cmtt-12">s< / span > < / span > < / span > < span
class="cmr-12">, < / span > < span class = "obeylines-h" > < span class = "verb" > < span
class="cmtt-12">d< / span > < / span > < / span > < span
class="cmr-12">, < / span > < span class = "obeylines-h" > < span class = "verb" > < span
class="cmtt-12">c< / span > < / span > < / span > < span
class="cmr-12">or < / span > < span class = "obeylines-h" > < span class = "verb" > < span
class="cmtt-12">z< / span > < / span > < / span > < span
class="cmr-12">, according to the basic data< / span >
< span
class="cmr-12">type of the sparse matrix (< / span > < span class = "obeylines-h" > < span class = "verb" > < span
class="cmtt-12">s< / span > < / span > < / span > < span
class="cmr-12">= real single precision; < / span > < span class = "obeylines-h" > < span class = "verb" > < span
class="cmtt-12">d< / span > < / span > < / span > < span
class="cmr-12">= real double precision;< / span >
< span class = "obeylines-h" > < span class = "verb" > < span
class="cmtt-12">c< / span > < / span > < / span > < span
class="cmr-12">= complex single precision; < / span > < span class = "obeylines-h" > < span class = "verb" > < span
class="cmtt-12">z< / span > < / span > < / span > < span
class="cmr-12">= complex double precision). This data< / span >
< span
class="cmr-12">structure is accessed by the user only through the AMG4PSBLAS routines,< / span >
< span
class="cmr-12">following an object-oriented approach.< / span >
< / li >
< li
class="enumerate" id="x7-13004x2">
<!-- l. 19 --> < p class = "noindent" > < span
class="cmti-12">Allocate and initialize the preconditioner data structure, according to a< / span >
< span
class="cmti-12">preconditioner type chosen by the user< / span > < span
class="cmr-12">. This is performed by the routine< / span >
< code class = "lstinline" > < span style = "color:#000000" > init< / span > < / code > < span
class="cmr-12">, which also sets defaults for each preconditioner type selected by< / span >
< span
class="cmr-12">the user. The preconditioner types and the defaults associated with them< / span >
< span
class="cmr-12">are given in Table< / span > < span
class="cmr-12">  < / span > < a
href="#x7-13015r1">< span
class="cmr-12">1< / span > <!-- tex4ht:ref: tab:precinit --> < / a > < span
class="cmr-12">, where the strings used by < / span > < code class = "lstinline" > < span style = "color:#000000" > init< / span > < / code > < span
class="cmr-12">to identify the< / span >
< span
class="cmr-12">preconditioner types are also given. Note that these strings are case-insensitive:< / span >
< span
class="cmr-12">uppercase letters may be replaced by the corresponding lowercase ones.< / span >
< / li >
< li
class="enumerate" id="x7-13006x3">
<!-- l. 29 --> < p class = "noindent" > < span
class="cmti-12">Modify the selected preconditioner type, by properly setting preconditioner< / span >
< span
class="cmti-12">parameters. < / span > < span
class="cmr-12">This is performed by the routine < / span > < code class = "lstinline" > < span style = "color:#000000" > set< / span > < / code > < span
class="cmr-12">. This routine must be< / span >
< span
class="cmr-12">called if the user wants to modify the default values of the parameters< / span >
< span
class="cmr-12">associated with the selected preconditioner type, to obtain a variant of that< / span >
< span
class="cmr-12">preconditioner. Examples of use of < / span > < code class = "lstinline" > < span style = "color:#000000" > set< / span > < / code > < span
class="cmr-12">are given in Section< / span > < span
class="cmr-12">  < / span > < a
href="#x7-140004.1">< span
class="cmr-12">4.1< / span > <!-- tex4ht:ref: sec:examples --> < / a > < span
class="cmr-12">; a complete< / span >
< span
class="cmr-12">list of all the preconditioner parameters and their allowed and default values< / span >
< span
class="cmr-12">is provided in Section< / span > < span
class="cmr-12">  < / span > < a
href="userhtmlse5.html#x8-160005">< span
class="cmr-12">5< / span > <!-- tex4ht:ref: sec:userinterface --> < / a > < span
class="cmr-12">, Tables< / span > < span
class="cmr-12">  < / span > < a
href="userhtmlse5.html#x8-18009r2">< span
class="cmr-12">2< / span > <!-- tex4ht:ref: tab:p_cycle --> < / a > < span
class="cmr-12">-< / span > < a
href="userhtmlse5.html#x8-18015r8">< span
class="cmr-12">8< / span > <!-- tex4ht:ref: tab:p_smoother_1 --> < / a > < span
class="cmr-12">.< / span >
< / li >
< li
class="enumerate" id="x7-13008x4">
<!-- l. 40 --> < p class = "noindent" > < span
class="cmti-12">Build the preconditioner for a given matrix< / span > < span
class="cmr-12">. If the selected preconditioner is< / span >
< span
class="cmr-12">multilevel, then two steps must be performed, as specified next.< / span >
< ol class = "enumerate2" >
< li
class="enumerate" id="x7-13009x0">
<!-- l. 43 --> < p class = "noindent" > < span
class="cmti-12">Build the AMG hierarchy for a given matrix. < / span > < span
class="cmr-12">This is performed by the< / span >
< span
class="cmr-12">routine < / span > < code class = "lstinline" > < span style = "color:#000000" > hierarchy_build< / span > < / code > < span
class="cmr-12">.< / span >
< / li >
< li
class="enumerate" id="x7-13010x0">
<!-- l. 45 --> < p class = "noindent" > < span
class="cmti-12">Build the preconditioner for a given matrix. < / span > < span
class="cmr-12">This is performed by the< / span >
< span
class="cmr-12">routine < / span > < code class = "lstinline" > < span style = "color:#000000" > smoothers_build< / span > < / code > < span
class="cmr-12">.< / span > < / li > < / ol >
<!-- l. 48 --> < p class = "noindent" > < span
class="cmr-12">If the selected preconditioner is one-level, it is built in a single step, performed by< / span >
< span
class="cmr-12">the routine < / span > < code class = "lstinline" > < span style = "color:#000000" > bld< / span > < / code > < span
class="cmr-12">.< / span >
< / li >
< li
class="enumerate" id="x7-13012x5">
<!-- l. 50 --> < p class = "noindent" > < span
class="cmti-12">Apply the preconditioner at each iteration of a Krylov solver. < / span > < span
class="cmr-12">This is performed by< / span >
< span
class="cmr-12">the method < / span > < code class = "lstinline" > < span style = "color:#000000" > apply< / span > < / code > < span
class="cmr-12">. When using the PSBLAS Krylov solvers, this step is< / span >
< span
class="cmr-12">completely transparent to the user, since < / span > < code class = "lstinline" > < span style = "color:#000000" > apply< / span > < / code > < span
class="cmr-12">is called by the PSBLAS routine< / span >
< span
class="cmr-12">implementing the Krylov solver (< / span > < code class = "lstinline" > < span style = "color:#000000" > psb_krylov< / span > < / code > < span
class="cmr-12">).< / span >
< / li >
< li
class="enumerate" id="x7-13014x6">
<!-- l. 54 --> < p class = "noindent" > < span
class="cmti-12">Free the preconditioner data structure< / span > < span
class="cmr-12">. This is performed by the routine < / span > < code class = "lstinline" > < span style = "color:#000000" > free< / span > < / code > < span
class="cmr-12">.< / span >
< span
class="cmr-12">This step is complementary to step 1 and should be performed when the< / span >
< span
class="cmr-12">preconditioner is no longer used.< / span > < / li > < / ol >
<!-- l. 59 --> < p class = "indent" > < span
class="cmr-12">All the previous routines are available as methods of the preconditioner object. A< / span >
< span
class="cmr-12">detailed description of them is given in Section< / span > < span
class="cmr-12">  < / span > < a
href="userhtmlse5.html#x8-160005">< span
class="cmr-12">5< / span > <!-- tex4ht:ref: sec:userinterface --> < / a > < span
class="cmr-12">. Examples showing the basic use of< / span >
< span
class="cmr-12">AMG4PSBLAS are reported in Section< / span > < span
class="cmr-12">  < / span > < a
href="#x7-140004.1">< span
class="cmr-12">4.1< / span > <!-- tex4ht:ref: sec:examples --> < / a > < span
class="cmr-12">.< / span >
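< p class = "indent" > < span
class="cmr-12">Step 2 above states that the preconditioner type strings are accepted in either case. The following stand-alone program is only a sketch of how such case-insensitive matching can be done in Fortran; it is not the code used by AMG4PSBLAS, and the names < / span > < code class = "lstinline" > < span style = "color:#000000" > to_upper< / span > < / code > < span
class="cmr-12">and < / span > < code class = "lstinline" > < span style = "color:#000000" > same_type< / span > < / code > < span
class="cmr-12">are hypothetical helpers introduced for illustration.< / span >
< div class = "minipage" > < pre class = "verbatim" >
program type_string_sketch
  implicit none
  ! each call prints T when the two strings name the same type
  print *, same_type('ml', 'ML')
  print *, same_type('L1-Jacobi', 'L1-JACOBI')
  print *, same_type('bjac', 'AS')
contains
  ! return a copy of s with the letters a-z converted to A-Z
  function to_upper(s) result(u)
    character(len=*), intent(in) :: s
    character(len=len(s)) :: u
    integer :: i, c
    u = s
    do i = 1, len(s)
      c = iachar(s(i:i))
      if (c .ge. iachar('a') .and. c .le. iachar('z')) u(i:i) = achar(c - 32)
    end do
  end function to_upper
  ! compare two type strings, ignoring case and trailing blanks
  logical function same_type(s1, s2)
    character(len=*), intent(in) :: s1, s2
    same_type = (to_upper(trim(s1)) == to_upper(trim(s2)))
  end function same_type
end program type_string_sketch
< / pre >
< / div >
< p class = "indent" > < span
class="cmr-12">The first two comparisons print T and the third prints F, since the strings in the last pair name different preconditioner types.< / span >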
< div class = "table" >
<!-- l. 63 --> < p class = "indent" > < a
id="x7-13015r1">< / a > < hr class = "float" > < div class = "float"
>
< div class = "center"
>
<!-- l. 64 --> < p class = "noindent" >
< div class = "tabular" > < table id = "TBL-1" class = "tabular"
>< colgroup id = "TBL-1-1g" > < col
id="TBL-1-1">< / colgroup > < colgroup id = "TBL-1-2g" > < col
id="TBL-1-2">< / colgroup > < colgroup id = "TBL-1-3g" > < col
id="TBL-1-3">< / colgroup > < tr
class="hline">< td > < hr > < / td > < td > < hr > < / td > < td > < hr > < / td > < / tr > < tr
style="vertical-align:baseline;" id="TBL-1-1-">< td style = "white-space:nowrap; text-align:left;" id = "TBL-1-1-1"
class="td11">< span
class="cmcsc-10x-x-109">< span
class="small-caps">t< / span > < span
class="small-caps">y< / span > < span
class="small-caps">p< / span > < span
class="small-caps">e< / span > < / span > < / td > < td style = "white-space:normal; text-align:left;" id = "TBL-1-1-2"
class="td11"><!-- l. 68 --> < p class = "noindent" > < span
class="cmcsc-10x-x-109">< span
class="small-caps">s< / span > < span
class="small-caps">t< / span > < span
class="small-caps">r< / span > < span
class="small-caps">i< / span > < span
class="small-caps">n< / span > < span
class="small-caps">g< / span > < / span > < / td > < td style = "white-space:normal; text-align:left;" id = "TBL-1-1-3"
class="td11"><!-- l. 68 --> < p class = "noindent" > < span
class="cmcsc-10x-x-109">< span
class="small-caps">d< / span > < span
class="small-caps">e< / span > < span
class="small-caps">f< / span > < span
class="small-caps">a< / span > < span
class="small-caps">u< / span > < span
class="small-caps">l< / span > < span
class="small-caps">t< / span > < span
class="small-caps">p< / span > < span
class="small-caps">r< / span > < span
class="small-caps">e< / span > < span
class="small-caps">c< / span > < span
class="small-caps">o< / span > < span
class="small-caps">n< / span > < span
class="small-caps">d< / span > < span
class="small-caps">i< / span > < span
class="small-caps">t< / span > < span
class="small-caps">i< / span > < span
class="small-caps">o< / span > < span
class="small-caps">n< / span > < span
class="small-caps">e< / span > < span
class="small-caps">r< / span > < / span > < / td > < / tr > < tr
class="hline">< td > < hr > < / td > < td > < hr > < / td > < td > < hr > < / td > < / tr > < tr
style="vertical-align:baseline;" id="TBL-1-2-">< td style = "white-space:nowrap; text-align:left;" id = "TBL-1-2-1"
class="td11">No preconditioner < / td > < td style = "white-space:normal; text-align:left;" id = "TBL-1-2-2"
class="td11"><!-- l. 69 --> < p class = "noindent" > < code class = "lstinline" > < span style = "color:#000000" > ’ < / span > < span style = "color:#000000" > NONE< / span > < span style = "color:#000000" > ’ < / span > < / code > < / td > < td style = "white-space:normal; text-align:left;" id = "TBL-1-2-3"
class="td11"><!-- l. 69 --> < p class = "noindent" > Provided so that the PSBLAS Krylov
solvers can be used with no preconditioner. < / td >
< / tr > < tr
class="hline">< td > < hr > < / td > < td > < hr > < / td > < td > < hr > < / td > < / tr > < tr
style="vertical-align:baseline;" id="TBL-1-3-">< td style = "white-space:nowrap; text-align:left;" id = "TBL-1-3-1"
class="td11">Diagonal < / td > < td style = "white-space:normal; text-align:left;" id = "TBL-1-3-2"
class="td11"><!-- l. 71 --> < p class = "noindent" > < code class = "lstinline" > < span style = "color:#000000" > ’ < / span > < span style = "color:#000000" > DIAG< / span > < span style = "color:#000000" > ’ < / span > < / code > ,
< code class = "lstinline" > < span style = "color:#000000" > ’ < / span > < span style = "color:#000000" > JACOBI< / span > < span style = "color:#000000" > ’ < / span > < / code > ,
< code class = "lstinline" > < span style = "color:#000000" > ’ < / span > < span style = "color:#000000" > L1< / span > < span style = "color:#000000" > -< / span > < span style = "color:#000000" > JACOBI< / span > < span style = "color:#000000" > ’ < / span > < / code > < / td > < td style = "white-space:normal; text-align:left;" id = "TBL-1-3-3"
class="td11"><!-- l. 71 --> < p class = "noindent" > Diagonal preconditioner. For any zero
diagonal entry of the matrix to be
preconditioned, the corresponding entry
of the preconditioner is set to  1. < / td >
< / tr > < tr
class="hline">< td > < hr > < / td > < td > < hr > < / td > < td > < hr > < / td > < / tr > < tr
style="vertical-align:baseline;" id="TBL-1-4-">< td style = "white-space:nowrap; text-align:left;" id = "TBL-1-4-1"
class="td11">Gauss-Seidel < / td > < td style = "white-space:normal; text-align:left;" id = "TBL-1-4-2"
class="td11"><!-- l. 74 --> < p class = "noindent" > < code class = "lstinline" > < span style = "color:#000000" > ’ < / span > < span style = "color:#000000" > GS< / span > < span style = "color:#000000" > ’ < / span > < / code > ,
< code class = "lstinline" > < span style = "color:#000000" > ’ < / span > < span style = "color:#000000" > L1< / span > < span style = "color:#000000" > -< / span > < span style = "color:#000000" > GS< / span > < span style = "color:#000000" > ’ < / span > < / code > < / td > < td style = "white-space:normal; text-align:left;" id = "TBL-1-4-3"
class="td11"><!-- l. 74 --> < p class = "noindent" > Hybrid Gauss-Seidel (forward), that is,
global block Jacobi with Gauss-Seidel as
local solver. < / td >
< / tr > < tr
class="hline">< td > < hr > < / td > < td > < hr > < / td > < td > < hr > < / td > < / tr > < tr
style="vertical-align:baseline;" id="TBL-1-5-">< td style = "white-space:nowrap; text-align:left;" id = "TBL-1-5-1"
class="td11">Symmetrized Gauss-Seidel< / td > < td style = "white-space:normal; text-align:left;" id = "TBL-1-5-2"
class="td11"><!-- l. 77 --> < p class = "noindent" > < code class = "lstinline" > < span style = "color:#000000" > ’ < / span > < span style = "color:#000000" > FBGS< / span > < span style = "color:#000000" > ’ < / span > < / code > ,
< code class = "lstinline" > < span style = "color:#000000" > ’ < / span > < span style = "color:#000000" > L1< / span > < span style = "color:#000000" > -< / span > < span style = "color:#000000" > FBGS< / span > < span style = "color:#000000" > ’ < / span > < / code > < / td > < td style = "white-space:normal; text-align:left;" id = "TBL-1-5-3"
class="td11"><!-- l. 77 --> < p class = "noindent" > Symmetrized hybrid Gauss-Seidel, that
is, forward Gauss-Seidel followed by
backward Gauss-Seidel. < / td >
< / tr > < tr
class="hline">< td > < hr > < / td > < td > < hr > < / td > < td > < hr > < / td > < / tr > < tr
style="vertical-align:baseline;" id="TBL-1-6-">< td style = "white-space:nowrap; text-align:left;" id = "TBL-1-6-1"
class="td11">Block Jacobi < / td > < td style = "white-space:normal; text-align:left;" id = "TBL-1-6-2"
class="td11"><!-- l. 80 --> < p class = "noindent" > < code class = "lstinline" > < span style = "color:#000000" > ’ < / span > < span style = "color:#000000" > BJAC< / span > < span style = "color:#000000" > ’ < / span > < / code > ,
< code class = "lstinline" > < span style = "color:#000000" > ’ < / span > < span style = "color:#000000" > L1< / span > < span style = "color:#000000" > -< / span > < span style = "color:#000000" > BJAC< / span > < span style = "color:#000000" > ’ < / span > < / code > < / td > < td style = "white-space:normal; text-align:left;" id = "TBL-1-6-3"
class="td11"><!-- l. 80 --> < p class = "noindent" > Block-Jacobi with ILU(0) on the local
blocks. < / td >
< / tr > < tr
class="hline">< td > < hr > < / td > < td > < hr > < / td > < td > < hr > < / td > < / tr > < tr
style="vertical-align:baseline;" id="TBL-1-7-">< td style = "white-space:nowrap; text-align:left;" id = "TBL-1-7-1"
class="td11">Additive Schwarz < / td > < td style = "white-space:normal; text-align:left;" id = "TBL-1-7-2"
class="td11"><!-- l. 81 --> < p class = "noindent" > < code class = "lstinline" > < span style = "color:#000000" > ’ < / span > < span style = "color:#000000" > AS< / span > < span style = "color:#000000" > ’ < / span > < / code > < / td > < td style = "white-space:normal; text-align:left;" id = "TBL-1-7-3"
class="td11"><!-- l. 81 --> < p class = "noindent" > Additive Schwarz (AS), with overlap  1
and ILU(0) on the local blocks. < / td >
< / tr > < tr
class="hline">< td > < hr > < / td > < td > < hr > < / td > < td > < hr > < / td > < / tr > < tr
style="vertical-align:baseline;" id="TBL-1-8-">< td style = "white-space:nowrap; text-align:left;" id = "TBL-1-8-1"
class="td11">Multilevel < / td > < td style = "white-space:normal; text-align:left;" id = "TBL-1-8-2"
class="td11"><!-- l. 83 --> < p class = "noindent" > < code class = "lstinline" > < span style = "color:#000000" > ’ < / span > < span style = "color:#000000" > ML< / span > < span style = "color:#000000" > ’ < / span > < / code > < / td > < td style = "white-space:normal; text-align:left;" id = "TBL-1-8-3"
class="td11"><!-- l. 83 --> < p class = "noindent" > V-cycle with one hybrid
forward Gauss-Seidel (GS) sweep as
pre-smoother and one hybrid backward
GS sweep as post-smoother, decoupled
smoothed aggregation as coarsening
algorithm, and LU (plus triangular solve)
as coarsest-level solver. See the default
values in Tables  < a
href="userhtmlse5.html#x8-18009r2">2<!-- tex4ht:ref: tab:p_cycle --> < / a > -< a
href="userhtmlse5.html#x8-18015r8">8<!-- tex4ht:ref: tab:p_smoother_1 --> < / a > for further details of
the preconditioner. < / td >
< / tr > < tr
class="hline">< td > < hr > < / td > < td > < hr > < / td > < td > < hr > < / td > < / tr > < tr
style="vertical-align:baseline;" id="TBL-1-9-">< td style = "white-space:nowrap; text-align:left;" id = "TBL-1-9-1"
class="td11"> < / td > < / tr > < / table > < / div >
< br / > < div class = "caption"
>< span class = "id" > Table  1: < / span > < span
class="content">Preconditioner types, corresponding strings and default choices. < / span > < / div > <!-- tex4ht:label?: x7 - 13015r1 -->
< / div >
< / div > < hr class = "endfloat" / >
< / div >
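< p class = "indent" > < span
class="cmr-12">Table 1 describes hybrid Gauss-Seidel as a global block Jacobi method with Gauss-Seidel as local solver. The following stand-alone program is a minimal serial sketch of one forward sweep of this kind. It assumes a small dense test matrix and emulates two processes, each owning two consecutive rows; it does not use the AMG4PSBLAS data structures or routines, and all names in it are chosen for illustration only.< / span >
< div class = "minipage" > < pre class = "verbatim" >
program hybrid_gs_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 4, nblk = 2
  ! first and last row owned by each (emulated) process
  integer, parameter :: first(nblk) = (/ 1, 3 /), last(nblk) = (/ 2, 4 /)
  real(dp) :: a(n,n), b(n), xold(n), xnew(n), s
  integer :: p, i, j

  ! test matrix: tridiagonal (-1, 4, -1); b = A*(1,1,1,1)
  a = 0.0_dp
  do i = 1, n
    a(i,i) = 4.0_dp
    if (i .gt. 1) a(i,i-1) = -1.0_dp
    if (i .lt. n) a(i,i+1) = -1.0_dp
  end do
  b = (/ 3.0_dp, 2.0_dp, 2.0_dp, 3.0_dp /)
  xold = 0.0_dp
  xnew = xold

  ! one hybrid forward sweep: Gauss-Seidel inside each block,
  ! Jacobi (old values) for the entries coupling different blocks
  do p = 1, nblk
    do i = first(p), last(p)
      s = b(i)
      do j = 1, n
        if (j .eq. i) cycle
        if (j .ge. first(p) .and. j .lt. i) then
          s = s - a(i,j)*xnew(j)   ! already updated in this block
        else
          s = s - a(i,j)*xold(j)   ! later in the block, or off-block
        end if
      end do
      xnew(i) = s / a(i,i)
    end do
  end do
  print '(4f10.6)', xnew
end program hybrid_gs_sketch
< / pre >
< / div >
< p class = "indent" > < span
class="cmr-12">Starting from a zero initial guess, the sweep gives 0.75, 0.6875, 0.5 and 0.875. The third entry shows the difference from a global Gauss-Seidel sweep: row 3 belongs to the second block, so it uses the old value of the second unknown rather than the value just computed in the first block.< / span >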
<!-- l. 98 --> < p class = "indent" > < span
class="cmr-12">Note that the module < / span > < code class = "lstinline" > < span style = "color:#000000" > amg_prec_mod< / span > < / code > < span
class="cmr-12">, containing the definition of the preconditioner< / span >
< span
class="cmr-12">data type and the interfaces to the routines of AMG4PSBLAS, must be used< / span >
< span
class="cmr-12">in any program calling such routines. The modules < / span > < code class = "lstinline" > < span style = "color:#000000" > psb_base_mod< / span > < / code > < span
class="cmr-12">, for the< / span >
< span
class="cmr-12">sparse matrix and communication descriptor data types, and < / span > < code class = "lstinline" > < span style = "color:#000000" > psb_krylov_mod< / span > < / code > < span
class="cmr-12">,< / span >
< span
class="cmr-12">for interfacing with the Krylov solvers, must also be used (see Section< / span > < span
class="cmr-12">  < / span > < a
href="#x7-140004.1">< span
class="cmr-12">4.1< / span > <!-- tex4ht:ref: sec:examples --> < / a > < span
class="cmr-12">).< / span >
< br
class="newline" />
<!-- l. 105 --> < p class = "indent" > < span
class="cmbx-12">Remark 1. < / span > < span
class="cmr-12">Coarsest-level solvers based on the LU factorization, such as those< / span >
< span
class="cmr-12">implemented in UMFPACK, MUMPS, SuperLU, and SuperLU< / span > < span
class="cmr-12">_Dist, usually lead to< / span >
< span
class="cmr-12">smaller numbers of preconditioned Krylov iterations than inexact solvers, when the< / span >
< span
class="cmr-12">linear system comes from a standard discretization of basic scalar elliptic PDE< / span >
< span
class="cmr-12">problems. However, this does not necessarily correspond to the shortest execution time< / span >
< span
class="cmr-12">on parallel< / span > < span
class="cmr-12">  computers.< / span >
< h4 class = "subsectionHead" > < span class = "titlemark" > < span
class="cmr-12">4.1 < / span > < / span > < a
id="x7-140004.1">< / a > < span
class="cmr-12">Examples< / span > < / h4 >
<!-- l. 116 --> < p class = "noindent" > < span
class="cmr-12">The code reported in Figure< / span > < span
class="cmr-12">  < / span > < a
href="#x7-14001r1">< span
class="cmr-12">1< / span > <!-- tex4ht:ref: fig:ex1 --> < / a > < span
class="cmr-12">shows how to set and apply the default multilevel< / span >
< span
class="cmr-12">preconditioner available in the real double precision version of AMG4PSBLAS< / span >
< span
class="cmr-12">(see Table< / span > < span
class="cmr-12">  < / span > < a
href="#x7-13015r1">< span
class="cmr-12">1< / span > <!-- tex4ht:ref: tab:precinit --> < / a > < span
class="cmr-12">). This preconditioner is chosen by simply specifying < / span > < code class = "lstinline" > < span style = "color:#000000" > ’ < / span > < span style = "color:#000000" > ML< / span > < span style = "color:#000000" > ’ < / span > < / code > < span
class="cmr-12">as the< / span >
< span
class="cmr-12">second argument of < / span > < code class = "lstinline" > < span style = "color:#000000" > P< / span > < span style = "color:#000000" > %< / span > < span style = "color:#000000" > init< / span > < / code > < span
class="cmr-12">(a call to < / span > < code class = "lstinline" > < span style = "color:#000000" > P< / span > < span style = "color:#000000" > %< / span > < span style = "color:#000000" > set< / span > < / code > < span
class="cmr-12">is not needed) and is applied< / span >
< span
class="cmr-12">with the flexible CG (FCG) solver provided by PSBLAS (the matrix of the system to be< / span >
< span
class="cmr-12">solved is assumed to be symmetric positive definite). As previously observed, the modules< / span >
< code class = "lstinline" > < span style = "color:#000000" > psb_base_mod< / span > < / code > < span
class="cmr-12">, < / span > < code class = "lstinline" > < span style = "color:#000000" > amg_prec_mod< / span > < / code > < span
class="cmr-12">and < / span > < code class = "lstinline" > < span style = "color:#000000" > psb_krylov_mod< / span > < / code > < span
class="cmr-12">must be used by the example< / span >
< span
class="cmr-12">program.< / span >
<!-- l. 126 --> < p class = "indent" > < span
class="cmr-12">The part of the code dealing with reading and assembling the sparse matrix and the< / span >
< span
class="cmr-12">right-hand side vector and the deallocation of the relevant data structures, performed< / span >
< span
class="cmr-12">through the PSBLAS routines for sparse matrix and vector management,< / span >
< span
class="cmr-12">is not reported here for the sake of conciseness. The complete code can be< / span >
< span
class="cmr-12">found in the example program file < / span > < span class = "obeylines-h" > < span class = "verb" > < span
class="cmtt-12">amg_dexample_ml.f90< / span > < / span > < / span > < span
class="cmr-12">, in the directory< / span >
< span class = "obeylines-h" > < span class = "verb" > < span
class="cmtt-12">samples/simple/fileread< / span > < / span > < / span > < span
class="cmr-12">of the AMG4PSBLAS implementation (see Section< / span > < span
class="cmr-12">  < / span > < a
href="userhtmlse3.html#x6-120003.5">< span
class="cmr-12">3.5< / span > <!-- tex4ht:ref: sec:ex_and_test --> < / a > < span
class="cmr-12">). A< / span >
< span
class="cmr-12">sample test problem along with the relevant input data is available in< / span >
< span class = "obeylines-h" > < span class = "verb" > < span
class="cmtt-12">samples/simple/fileread/runs< / span > < / span > < / span > < span
class="cmr-12">. For details on the use of the PSBLAS routines, see< / span >
< span
class="cmr-12">the PSBLAS User’s Guide< / span > < span
class="cmr-12">  < / span > < span class = "cite" > < span
class="cmr-12">[< / span > < a
href="userhtmlli3.html#XPSBLASGUIDE">< span
class="cmr-12">21< / span > < / a > < span
class="cmr-12">]< / span > < / span > < span
class="cmr-12">.< / span >
<!-- l. 138 --> < p class = "indent" > < span
class="cmr-12">The setup and application of the default multilevel preconditioner for the real single< / span >
< span
class="cmr-12">precision and the complex, single and double precision, versions are obtained< / span >
< span
class="cmr-12">with straightforward modifications of the previous example (see Section< / span > < span
class="cmr-12">  < / span > < a
href="userhtmlse5.html#x8-160005">< span
class="cmr-12">5< / span > <!-- tex4ht:ref: sec:userinterface --> < / a > < span
class="cmr-12">for< / span >
< span
class="cmr-12">details). If these versions are installed, the corresponding codes are available in< / span >
< span class = "obeylines-h" > < span class = "verb" > < span
class="cmtt-12">samples/simple/fileread< / span > < / span > < / span > < span
class="cmr-12">.< / span >
<!-- l. 144 --> < p class = "indent" > < a
id="x7-14001r1">< / a > < hr class = "float" > < div class = "float"
>
< div class = "center"
>
<!-- l. 145 --> < p class = "noindent" >
< div class = "minipage" > < pre class = "verbatim" id = "verbatim-7" >
    use  psb_base_mod
    use  amg_prec_mod
    use  psb_krylov_mod
...  ...
!
!  sparse  matrix
    type(psb_dspmat_type)  ::  A
!  sparse  matrix  descriptor
    type(psb_desc_type)      ::  desc_A
!  preconditioner
    type(amg_dprec_type)    ::  P
!  right-hand  side  and  solution  vectors
    type(psb_d_vect_type)  ::  b,  x
...  ...
!
!  initialize  the  parallel  environment
    call  psb_init(ctxt)
    call  psb_info(ctxt,iam,np)
...  ...
!
!  read  and  assemble  the  spd  matrix  A  and  the  right-hand  side  b
!  using  PSBLAS  routines  for  sparse  matrix  /  vector  management
...  ...
!
!  initialize  the  default  multilevel  preconditioner,  i.e.  V-cycle
!  with  basic  smoothed  aggregation,  1  hybrid  forward/backward
!  GS  sweep  as  pre/post-smoother  and  UMFPACK  as  coarsest-level
!  solver
    call  P%init(ctxt,'ML',info)
!
!  build  the  preconditioner
    call  P%hierarchy_build(A,desc_A,info)
    call  P%smoothers_build(A,desc_A,info)
!
!  set  the  solver  parameters  and  the  initial  guess
    ...  ...
!
!  solve  Ax=b  with  preconditioned  FCG
    call  psb_krylov('FCG',A,P,b,x,tol,desc_A,info)
    ...  ...
!
!  deallocate  the  preconditioner
    call  P%free(info)
!
!  deallocate  other  data  structures
    ...  ...
!
!  exit  the  parallel  environment
    call  psb_exit(ctxt)
    stop
< / pre >
<!-- l. 255 --> < p class = "nopar" > < / div >
< br / > < div class = "caption"
>< span class = "id" > Listing 1: < / span > < span
class="content">Setup and application of the default multilevel preconditioner (example 1).
< / span > < / div > <!-- tex4ht:label?: x7 - 14001r1 -->
< / div >
< / div > < hr class = "endfloat" / >
<!-- l. 264 --> < p class = "indent" > < span
class="cmr-12">Different versions of the multilevel preconditioner can be obtained by changing the< / span >
< span
class="cmr-12">default values of the preconditioner parameters. The code reported in Figure< / span > < span
class="cmr-12">  < / span > < a
href="#x7-14002r2">< span
class="cmr-12">2< / span > <!-- tex4ht:ref: fig:ex2 --> < / a > < span
class="cmr-12">shows< / span >
< span
class="cmr-12">how to set a V-cycle preconditioner which applies 1 block-Jacobi sweep as pre-< / span >
< span
class="cmr-12">and post-smoother, and solves the coarsest-level system with 8 block-Jacobi< / span >
< span
class="cmr-12">sweeps. Note that the ILU(0) factorization (plus triangular solve) is used as< / span >
< span
class="cmr-12">local solver for the block-Jacobi sweeps, since this is the default associated< / span >
< span
class="cmr-12">with block-Jacobi and set by< / span > < span
class="cmr-12">  < / span > < code class = "lstinline" > < span style = "color:#000000" > P< / span > < span style = "color:#000000" > %< / span > < span style = "color:#000000" > init< / span > < / code > < span
class="cmr-12">. Furthermore, specifying block-Jacobi as< / span >
< span
class="cmr-12">coarsest-level solver implies that the coarsest-level matrix is distributed among< / span >
< span
class="cmr-12">the processes. Figure< / span > < span
class="cmr-12">  < / span > < a
href="#x7-14003r3">< span
class="cmr-12">3< / span > <!-- tex4ht:ref: fig:ex3 --> < / a > < span
class="cmr-12">shows how to set a W-cycle preconditioner using the< / span >
< span
class="cmr-12">coarsening based on Compatible Weighted Matching, aggregates of size at< / span >
< span
class="cmr-12">most 8 and smoothed prolongators. It applies 2 hybrid Gauss-Seidel sweeps as< / span >
< span
class="cmr-12">pre- and post-smoother, and solves the coarsest-level system with the parallel< / span >
< span
class="cmr-12">flexible Conjugate Gradient method (KRM) coupled with the block-Jacobi< / span >
< span
class="cmr-12">preconditioner having ILU(0) on the blocks. Default parameters are used for the stopping< / span >
< span
class="cmr-12">criterion of the coarsest-level solver. Note that, also in this case, specifying KRM as< / span >
< span
class="cmr-12">coarsest-level solver implies that the coarsest-level matrix is distributed among the< / span >
< span
class="cmr-12">processes.< / span >
<!-- l. 291 --> < p class = "indent" > < span
class="cmr-12">The code fragments shown in Figures< / span > < span
class="cmr-12">  < / span > < a
href="#x7-14002r2">< span
class="cmr-12">2< / span > <!-- tex4ht:ref: fig:ex2 --> < / a > < span
class="cmr-12">and < / span > < a
href="#x7-14003r3">< span
class="cmr-12">3< / span > <!-- tex4ht:ref: fig:ex3 --> < / a > < span
class="cmr-12">are also included in the example program< / span >
< span
class="cmr-12">file < / span > < span class = "obeylines-h" > < span class = "verb" > < span
class="cmtt-12">amg_dexample_ml.f90< / span > < / span > < / span > < span
class="cmr-12">.< / span >
<!-- l. 294 --> < p class = "indent" > < span
class="cmr-12">Finally, Figure< / span > < span
class="cmr-12">  < / span > < a
href="#x7-14004r4">< span
class="cmr-12">4< / span > <!-- tex4ht:ref: fig:ex4 --> < / a > < span
class="cmr-12">shows the setup of a one-level additive Schwarz preconditioner,< / span >
< span
class="cmr-12">namely restricted additive Schwarz (RAS) with overlap 2. Note also that a Krylov method other than CG< / span >
< span
class="cmr-12">must be used to solve the preconditioned system, since the preconditioner is< / span >
< span
class="cmr-12">nonsymmetric. The corresponding example program is available in the file< / span >
< span class = "obeylines-h" > < span class = "verb" > < span
class="cmtt-12">amg_dexample_1lev.f90< / span > < / span > < / span > < span
class="cmr-12">.< / span >
<!-- l. 301 --> < p class = "indent" > < span
class="cmr-12">For all the previous preconditioners, example programs where the sparse matrix< / span >
< span
class="cmr-12">and the right-hand side are generated by discretizing a PDE with Dirichlet< / span >
< span
class="cmr-12">boundary conditions are also available in the directory < / span > < span class = "obeylines-h" > < span class = "verb" > < span
class="cmtt-12">samples/simple/pdegen< / span > < / span > < / span > < span
class="cmr-12">.< / span >
<!-- l. 304 --> < p class = "indent" > < a
id="x7-14002r2">< / a > < hr class = "float" > < div class = "float"
>
< div class = "center"
>
<!-- l. 318 --> < p class = "noindent" >
< div class = "minipage" > < pre class = "verbatim" id = "verbatim-8" >
... ...
! build a V-cycle preconditioner with 1 block-Jacobi sweep (with
! ILU(0) on the blocks) as pre- and post-smoother, and 8 block-Jacobi
! sweeps (with ILU(0) on the blocks) as coarsest-level solver
  call P%init(ctxt,'ML',info)
  call P%set('SMOOTHER_TYPE','BJAC',info)
  call P%set('COARSE_SOLVE','BJAC',info)
  call P%set('COARSE_SWEEPS',8,info)
  call P%hierarchy_build(A,desc_A,info)
  call P%smoothers_build(A,desc_A,info)
... ...
< / pre >
<!-- l. 333 --> < p class = "nopar" > < / div > < / div >
< br / > < div class = "caption"
>< span class = "id" > Listing 2: < / span > < span
class="content">setup of a multilevel preconditioner based on the default decoupled coarsening< / span > < / div > <!-- tex4ht:label?: x7 - 14002r2 -->
< / div > < hr class = "endfloat" / >
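< p class = "indent" > < span class = "cmr-12" >To make the notion of a sweep concrete, the following self-contained sketch applies 8 point-Jacobi sweeps to a small 1D Laplacian in plain Fortran. It does not call PSBLAS or AMG4PSBLAS: the matrix, its size and the program name are assumptions chosen for illustration, and point Jacobi stands in for block-Jacobi (where the library applies an ILU(0) solve on each local block, the sketch simply divides by the diagonal entry).< / span >
< div class = "minipage" > < pre class = "verbatim" >
! Illustrative sketch only: plain-Fortran point-Jacobi sweeps on a small
! 1D Laplacian; this is NOT the AMG4PSBLAS block-Jacobi implementation.
program jacobi_sweeps_sketch
  implicit none
  integer, parameter :: dp = kind(1.d0)
  integer, parameter :: n = 4, nsweeps = 8   ! hypothetical sizes
  real(dp) :: a(n,n), b(n), x(n), r(n)
  integer :: i, k

  ! A = tridiag(-1, 2, -1), b = 1, initial guess x = 0
  a = 0.0_dp
  do i = 1, n
    a(i,i) = 2.0_dp
    if (i /= 1) a(i,i-1) = -1.0_dp
    if (i /= n) a(i,i+1) = -1.0_dp
  end do
  b = 1.0_dp
  x = 0.0_dp

  do k = 1, nsweeps
    r = b - matmul(a, x)            ! residual with the current iterate
    do i = 1, n
      x(i) = x(i) + r(i) / a(i,i)   ! one sweep: x = x + D^{-1} r
    end do
    print '(a,i2,a,es10.3)', ' sweep ', k, '  residual norm = ', norm2(b - matmul(a, x))
  end do
end program jacobi_sweeps_sketch
< / pre >
< / div >
< p class = "indent" > < span class = "cmr-12" >The printed residual norm decreases at every sweep without reaching zero, which illustrates why a fixed number of sweeps acts as an approximate, rather than exact, coarsest-level solve.< / span >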
<!-- l. 340 --> < p class = "indent" > < a
id="x7-14003r3">< / a > < hr class = "float" > < div class = "float"
>
< div class = "center"
>
<!-- l. 362 --> < p class = "noindent" >
< div class = "minipage" > < pre class = "verbatim" id = "verbatim-9" >
... ...
! build a W-cycle preconditioner with 2 hybrid Gauss-Seidel sweeps
! as pre- and post-smoother, a distributed coarsest
! matrix, and preconditioned flexible CG (KRM) as coarsest-level solver
  call P%init(ctxt,'ML',info)
  call P%set('PAR_AGGR_ALG','COUPLED',info)
  call P%set('AGGR_TYPE','MATCHBOXP',info)
  call P%set('AGGR_SIZE',8,info)
  call P%set('ML_CYCLE','WCYCLE',info)
  call P%set('SMOOTHER_TYPE','FBGS',info)
  call P%set('SMOOTHER_SWEEPS',2,info)
  call P%set('COARSE_SOLVE','KRM',info)
  call P%set('COARSE_MAT','DIST',info)
  call P%set('KRM_METHOD','FCG',info)
  call P%hierarchy_build(A,desc_A,info)
  call P%smoothers_build(A,desc_A,info)
... ...
< / pre >
<!-- l. 383 --> < p class = "nopar" > < / div > < / div >
< br / > < div class = "caption"
>< span class = "id" > Listing 3: < / span > < span
class="content">setup of a multilevel preconditioner based on the coupled coarsening using
weighted matching< / span > < / div > <!-- tex4ht:label?: x7 - 14003r3 -->
< / div > < hr class = "endfloat" / >
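< p class = "indent" > < span class = "cmr-12" >The practical difference between the V-cycle and the W-cycle can be seen by counting how often each level is visited during one application of the preconditioner. The sketch below is plain Fortran and follows the textbook recursive definition (one coarse-level correction per level for the V-cycle, two for the W-cycle); the number of levels and all names are hypothetical, and the code is not taken from AMG4PSBLAS.< / span >
< div class = "minipage" > < pre class = "verbatim" >
! Illustrative sketch only: counts how often each level is visited by one
! application of a V-cycle (ncorr = 1) or a W-cycle (ncorr = 2).
program cycle_visits_sketch
  implicit none
  integer, parameter :: nlev = 4      ! hypothetical number of levels
  integer :: visits(nlev)

  visits = 0
  call mlcycle(1, 1)
  print '(a,4i4)', ' V-cycle visits per level: ', visits

  visits = 0
  call mlcycle(1, 2)
  print '(a,4i4)', ' W-cycle visits per level: ', visits

contains

  recursive subroutine mlcycle(lev, ncorr)
    integer, intent(in) :: lev, ncorr
    integer :: j
    visits(lev) = visits(lev) + 1
    if (lev == nlev) return           ! coarsest level: solve and go back up
    ! pre-smoothing would happen here
    do j = 1, ncorr
      call mlcycle(lev + 1, ncorr)    ! ncorr recursive coarse-level corrections
    end do
    ! post-smoothing would happen here
  end subroutine mlcycle
end program cycle_visits_sketch
< / pre >
< / div >
< p class = "indent" > < span class = "cmr-12" >With four levels the V-cycle visits every level once, whereas the W-cycle visits the levels 1, 2, 4 and 8 times from finest to coarsest; this suggests that the cost of the coarsest-level solver weighs more heavily in a W-cycle.< / span >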
<!-- l. 390 --> < p class = "indent" > < a
id="x7-14004r4">< / a > < hr class = "float" > < div class = "float"
>
< div class = "center"
>
<!-- l. 402 --> < p class = "noindent" >
< div class = "minipage" > < pre class = "verbatim" id = "verbatim-10" >
... ...
! set RAS with overlap 2 and ILU(0) on the local blocks
  call P%init(ctxt,'AS',info)
  call P%set('SUB_OVR',2,info)
  call P%bld(A,desc_A,info)
... ...
! solve Ax=b with preconditioned BiCGSTAB
  call psb_krylov('BICGSTAB',A,P,b,x,tol,desc_A,info)
< / pre >
<!-- l. 414 --> < p class = "nopar" > < / div > < / div >
< br / > < div class = "caption"
>< span class = "id" > Listing 4: < / span > < span
class="content">setup of a one-level Schwarz preconditioner.< / span > < / div > <!-- tex4ht:label?: x7 - 14004r4 -->
< / div > < hr class = "endfloat" / >
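< p class = "indent" > < span class = "cmr-12" >The remark that the RAS preconditioner is nonsymmetric can be checked by hand on a tiny case. The following plain-Fortran sketch assembles the RAS operator for a symmetric 3x3 matrix with two overlapping subdomains, {1,2} and {2,3}, where the first subdomain owns unknowns 1 and 2 and the second owns unknown 3. The partition, the exact local solves and all names are assumptions made for illustration; nothing here is AMG4PSBLAS code.< / span >
< div class = "minipage" > < pre class = "verbatim" >
! Illustrative sketch only: a 3x3 restricted additive Schwarz (RAS) operator
! built by hand, to show that it is nonsymmetric even for a symmetric matrix.
! The partition below is a hypothetical toy case, not AMG4PSBLAS code.
program ras_nonsymmetry_sketch
  implicit none
  integer, parameter :: dp = kind(1.d0)
  real(dp) :: aloc(2,2), ainv(2,2), minv(3,3), det
  integer :: i

  ! A = tridiag(-1, 2, -1) of order 3; both overlapping subdomains,
  ! {1,2} and {2,3}, see the same 2x2 local matrix
  aloc = reshape([2.0_dp, -1.0_dp, -1.0_dp, 2.0_dp], [2,2])
  det  = aloc(1,1)*aloc(2,2) - aloc(1,2)*aloc(2,1)
  ainv = reshape([aloc(2,2), -aloc(2,1), -aloc(1,2), aloc(1,1)], [2,2]) / det

  minv = 0.0_dp
  ! subdomain 1 reads unknowns {1,2} and owns both, so it writes rows 1 and 2
  minv(1:2,1:2) = ainv
  ! subdomain 2 reads unknowns {2,3} but owns only unknown 3, so it writes
  ! row 3 only: the restricted prolongation discards its update of unknown 2
  minv(3,2:3) = ainv(2,:)

  do i = 1, 3
    print '(3f8.4)', minv(i,:)
  end do
  print '(a,es10.3)', ' max |M - M^T| = ', maxval(abs(minv - transpose(minv)))
end program ras_nonsymmetry_sketch
< / pre >
< / div >
< p class = "indent" > < span class = "cmr-12" >Entry (3,2) of the assembled operator is nonzero while entry (2,3) is zero, so the operator differs from its transpose; this is the property that rules out CG and motivates a method such as BiCGSTAB.< / span >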
< h4 class = "subsectionHead" > < span class = "titlemark" > < span
class="cmr-12">4.2 < / span > < / span > < a
id="x7-150004.2">< / a > < span
class="cmr-12">GPU example< / span > < / h4 >
<!-- l. 426 --> < p class = "noindent" > < span
class="cmr-12">The code discussed here shows how to set up a program exploiting the combined GPU< / span >
< span
class="cmr-12">capabilities of PSBLAS and AMG4PSBLAS. The code example is available in the< / span >
< span
class="cmr-12">source distribution directory < / span > < span class = "obeylines-h" > < span class = "verb" > < span
class="cmtt-12">amg4psblas/examples/gpu< / span > < / span > < / span > < span
class="cmr-12">.< / span >
<!-- l. 431 --> < p class = "indent" > < span
class="cmr-12">First of all, we need to include the appropriate modules and declare some auxiliary< / span >
< span
class="cmr-12">variables:< / span >
<!-- l. 433 --> < p class = "indent" > < a
id="x7-15001r5">< / a > < hr class = "float" > < div class = "float"
>
< div class = "center"
>
<!-- l. 452 --> < p class = "noindent" >
< div class = "minipage" > < pre class = "verbatim" id = "verbatim-11" >
program amg_dexample_gpu
  use psb_base_mod
  use amg_prec_mod
  use psb_krylov_mod
  use psb_util_mod
  use psb_gpu_mod
  use data_input
  use amg_d_pde_mod
  implicit none
  .......
  ! GPU variables
  type(psb_d_hlg_sparse_mat) :: agmold
  type(psb_d_vect_gpu)       :: vgmold
  type(psb_i_vect_gpu)       :: igmold
 
< / pre >
<!-- l. 471 --> < p class = "nopar" > < / div > < / div >
< br / > < div class = "caption"
>< span class = "id" > Listing 5: < / span > < span
class="content">setup of a GPU-enabled test program part one.< / span > < / div > <!-- tex4ht:label?: x7 - 15001r5 -->
< / div > < hr class = "endfloat" / >
<!-- l. 478 --> < p class = "indent" > < span
class="cmr-12">In this particular example we choose to employ an < / span > < span class = "obeylines-h" > < span class = "verb" > < span
class="cmtt-12">HLG< / span > < / span > < / span > < span
class="cmr-12">data structure for< / span >
< span
class="cmr-12">sparse matrices on GPUs; for more information please refer to the PSBLAS-EXT users’ < / span >
< span
class="cmr-12">guide.< / span >
<!-- l. 482 --> < p class = "indent" > < span
class="cmr-12">We then have to initialize the GPU environment, and pass the appropriate MOLD< / span >
< span
class="cmr-12">variables to the build methods (see also the PSBLAS and PSBLAS-EXT users’ < / span >
< span
class="cmr-12">guides).< / span >
<!-- l. 485 --> < p class = "indent" > < a
id="x7-15002r6">< / a > < hr class = "float" > < div class = "float"
>
< div class = "center"
>
<!-- l. 501 --> < p class = "noindent" >
< div class = "minipage" > < pre class = "verbatim" id = "verbatim-12" >
  call psb_init(ctxt)
  call psb_info(ctxt,iam,np)
  !
  ! BEWARE: if you have NGPUS per node, the default is to
  ! attach to mod(IAM,NGPUS)
  !
  call psb_gpu_init(ctxt)
  ......
  t1 = psb_wtime()
  call prec%smoothers_build(a,desc_a,info, amold=agmold, vmold=vgmold, imold=igmold)
 
< / pre >
<!-- l. 516 --> < p class = "nopar" > < / div > < / div >
< br / > < div class = "caption"
>< span class = "id" > Listing 6: < / span > < span
class="content">setup of a GPU-enabled test program part two.< / span > < / div > <!-- tex4ht:label?: x7 - 15002r6 -->
< / div > < hr class = "endfloat" / >
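< p class = "indent" > < span class = "cmr-12" >The default attachment rule quoted in the listing comment, mod(IAM,NGPUS), can be tabulated with a few lines of plain Fortran. The sketch below makes no MPI or GPU call; the values of np and ngpus are hypothetical and only serve to show which processes would end up sharing a device.< / span >
< div class = "minipage" > < pre class = "verbatim" >
! Illustrative sketch only: tabulates the mod(IAM,NGPUS) rule quoted in the
! listing comment; np and ngpus are hypothetical values, and no GPU or MPI
! call is made here.
program gpu_attach_sketch
  implicit none
  integer, parameter :: np = 6, ngpus = 4
  integer :: iam

  do iam = 0, np - 1
    print '(a,i2,a,i2)', ' process ', iam, ' attaches to device ', mod(iam, ngpus)
  end do
end program gpu_attach_sketch
< / pre >
< / div >
< p class = "indent" > < span class = "cmr-12" >With these values processes 0 and 4 both map to device 0, and processes 1 and 5 to device 1, so running more processes than GPUs means that some devices are shared.< / span >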
<!-- l. 523 --> < p class = "indent" > < span
class="cmr-12">Finally, we convert the input matrix, the descriptor and the vectors to use a< / span >
< span
class="cmr-12">GPU-enabled internal storage format. We then preallocate the preconditioner< / span >
< span
class="cmr-12">workspace before entering the Krylov method. At the end of the code, we close the< / span >
< span
class="cmr-12">GPU environment.< / span >
<!-- l. 527 --> < p class = "indent" > < a
id="x7-15003r7">< / a > < hr class = "float" > < div class = "float"
>
< div class = "center"
>
<!-- l. 557 --> < p class = "noindent" >
< div class = "minipage" > < pre class = "verbatim" id = "verbatim-13" >
  call desc_a%cnv(mold=igmold)
  call a%cscnv(info,mold=agmold)
  call psb_geasb(x,desc_a,info,mold=vgmold)
  call psb_geasb(b,desc_a,info,mold=vgmold)
  !
  ! iterative method parameters
  !
  call psb_barrier(ctxt)
  call prec%allocate_wrk(info)
  t1 = psb_wtime()
  call psb_krylov(s_choice%kmethd,a,prec,b,x,s_choice%eps,&
       & desc_a,info,itmax=s_choice%itmax,iter=iter,err=err,itrace=s_choice%itrace,&
       & istop=s_choice%istopc,irst=s_choice%irst)
  call prec%deallocate_wrk(info)
  call psb_barrier(ctxt)
  tslv = psb_wtime() - t1
  ......
  call psb_gpu_exit()
  call psb_exit(ctxt)
  stop
 
< / pre >
<!-- l. 584 --> < p class = "nopar" > < / div > < / div >
< br / > < div class = "caption"
>< span class = "id" > Listing 7: < / span > < span
class="content">setup of a GPU-enabled test program part three.< / span > < / div > <!-- tex4ht:label?: x7 - 15003r7 -->
< / div > < hr class = "endfloat" / >
<!-- l. 592 --> < p class = "indent" > < span
class="cmr-12">It is very important to employ smoothers and coarsest-level solvers that are suited to the< / span >
< span
class="cmr-12">GPU, i.e., methods that do NOT employ triangular system solve kernels. Methods that< / span >
< span
class="cmr-12">satisfy this constraint include:< / span >
< ul class = "itemize1" >
< li class = "itemize" >
<!-- l. 596 --> < p class = "noindent" > < span class = "obeylines-h" > < span class = "verb" > < span
class="cmtt-12">JACOBI< / span > < / span > < / span >
< / li >
< li class = "itemize" >
<!-- l. 597 --> < p class = "noindent" > < span class = "obeylines-h" > < span class = "verb" > < span
class="cmtt-12">BJAC< / span > < / span > < / span > < span
class="cmr-12">with the following methods on the local blocks:< / span >
< ul class = "itemize2" >
< li class = "itemize" >
<!-- l. 599 --> < p class = "noindent" > < span class = "obeylines-h" > < span class = "verb" > < span
class="cmtt-12">INVK< / span > < / span > < / span >
< / li >
< li class = "itemize" >
<!-- l. 600 --> < p class = "noindent" > < span class = "obeylines-h" > < span class = "verb" > < span
class="cmtt-12">INVT< / span > < / span > < / span >
< / li >
< li class = "itemize" >
<!-- l. 601 --> < p class = "noindent" > < span class = "obeylines-h" > < span class = "verb" > < span
class="cmtt-12">AINV< / span > < / span > < / span > < / li > < / ul >
< / li > < / ul >
<!-- l. 604 --> < p class = "noindent" > < span
class="cmr-12">and their < / span > < span
class="cmmi-12">ℓ < / span > < sub > < span
class="cmr-8">1< / span > < / sub > < span
class="cmr-12">variants.< / span >
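< p class = "indent" > < span class = "cmr-12" >The reason for avoiding triangular solves can be illustrated in plain Fortran: forward substitution computes each unknown from the previous ones, whereas an explicit (approximate) inverse is applied as a matrix-vector product whose rows are independent. In the sketch below the inverse is obtained from a Neumann series, which is a stand-in chosen for brevity and is not the algorithm behind INVK, INVT or AINV; the matrix and all names are hypothetical.< / span >
< div class = "minipage" > < pre class = "verbatim" >
! Illustrative sketch only: contrasts a sequential triangular solve with the
! application of an explicit inverse as a matrix-vector product. The Neumann
! series is a stand-in chosen for brevity; it is NOT the algorithm behind
! INVK, INVT or AINV.
program approx_inverse_sketch
  implicit none
  integer, parameter :: dp = kind(1.d0)
  integer, parameter :: n = 4
  real(dp) :: l(n,n), nmat(n,n), pw(n,n), linv(n,n), b(n), y(n), z(n)
  integer :: i, k

  ! unit lower bidiagonal L = I - N, with N strictly lower triangular
  nmat = 0.0_dp
  do i = 2, n
    nmat(i,i-1) = 0.5_dp
  end do
  l = -nmat
  do i = 1, n
    l(i,i) = 1.0_dp
  end do
  b = 1.0_dp

  ! forward substitution: y(i) needs y(1:i-1), so rows are processed in order
  do i = 1, n
    y(i) = b(i) - dot_product(l(i,1:i-1), y(1:i-1))
  end do

  ! explicit inverse I + N + ... + N^(n-1) (exact here, since N^n = 0)
  linv = 0.0_dp
  do i = 1, n
    linv(i,i) = 1.0_dp
  end do
  pw = linv
  do k = 1, n - 1
    pw = matmul(pw, nmat)
    linv = linv + pw
  end do

  ! applying the inverse is a matrix-vector product: rows are independent
  z = matmul(linv, b)
  print '(a,es10.3)', ' max |y - z| = ', maxval(abs(y - z))
end program approx_inverse_sketch
< / pre >
< / div >
< p class = "indent" > < span class = "cmr-12" >Both approaches give the same vector up to rounding; the difference lies in the data dependencies, since the loop over rows in the forward substitution cannot be run in parallel while the rows of the matrix-vector product can.< / span >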
<!-- l. 1 --> < p class = "indent" > < a
id="tailuserhtmlse4.html">< / a >
< / body > < / html >