5 User Interface

The basic user interface of AMG4PBLAS consists of eight methods. The six methods init, set, build, hierarchy_build, smoothers_build and apply encapsulate all the functionalities for the setup and the application of any multilevel and one-level preconditioner implemented in the package. The method free deallocates the preconditioner data structure, while descr prints a description of the preconditioner setup by the user. For backward compatibility, methods are also accessible as stand-alone subroutines.

For each method, the same user interface is overloaded with respect to the real/complex and single/double precision data; arguments with appropriate data types must be passed to the method, i.e.,

A description of each method is given in the remainder of this section.

5.1 Method init

call p%init(contxt,ptype,info)

This method allocates and initializes the preconditioner p, according to the preconditioner type chosen by the user.

Arguments

contxt

type(psb_ctxt_type), intent(in).

The communication context.

ptype

character(len=*), intent(in) .

The type of preconditioner. Its values are specified in Table 1.

Note that strings are case insensitive.

info

integer, intent(out).

Error code. If no error, 0 is returned. See Section 7 for details.

5.2 Method set

call p%set(what,val,info [,ilev, ilmax, pos, idx])

This method sets the parameters defining the preconditioner p. More precisely, the parameter identified by what is assigned the value contained in val.

Arguments

what

character(len=*).

The parameter to be set. It can be specified through its name; the string is case-insensitive. See Tables 2-8.

val

integer or character(len=*) or real(psb_spk_) or real(psb_dpk_), intent(in).

The value of the parameter to be set. The list of allowed values and the corresponding data types is given in Tables 2-8. When the value is of type character(len=*), it is also treated as case insensitive.

info

integer, intent(out).

Error code. If no error, 0 is returned. See Section 7 for details.

ilev

integer, optional, intent(in).

For the multilevel preconditioner, the level at which the preconditioner parameter has to be set. The levels are numbered in increasing order starting from the finest one, i.e., level 1 is the finest level. If ilev is not present, the parameter identified by what is set at all levels that are appropriate (see Tables 2-8).

ilmax

integer, optional, intent(in).

For the multilevel preconditioner, when both ilev and ilmax are present, the settings are applied at all levels ilev:ilmax. When ilev is present but ilmax is not, then the default is ilmax=ilev. The levels are numbered in increasing order starting from the finest one, i.e., level 1 is the finest level.

pos

character(len=*), optional, intent(in).

Whether the other arguments apply only to the pre-smoother (PRE) or to the post-smoother (POST). If pos is not present, the other arguments are applied to both smoothers. If the preconditioner is one-level or the parameter identified by what does not concern the smoothers, pos is ignored.

idx

integer, optional, intent(in).

An auxiliary input argument that can be passed to the underlying objects.

A variety of preconditioners can be obtained by setting the appropriate preconditioner parameters. These parameters can be logically divided into four groups, i.e., parameters defining

  1. the type of multilevel cycle and how many cycles must be applied;

  2. the coarsening algorithm;

  3. the solver at the coarsest level (for multilevel preconditioners only);

  4. the smoother of the multilevel preconditioners, or the one-level preconditioner.

A list of the parameters that can be set, along with their allowed and default values, is given in Tables 2-8.

Remark 2. A smoother is usually obtained by combining two objects: a smoother (SMOOTHER_TYPE) and a local solver (SUB_SOLVE), as specified in Tables 7-9. For example, the block-Jacobi smoother using ILU(0) on the blocks is obtained by combining the block-Jacobi smoother object with the ILU(0) solver object. Similarly, the hybrid Gauss-Seidel smoother (see Note in Table 7) is obtained by combining the block-Jacobi smoother object with a single sweep of the Gauss-Seidel solver object, while the point-Jacobi smoother is the result of combining the block-Jacobi smoother object with a single sweep of the point-Jacobi solver object. In the same way are obtained the 1-versions of the smoothers. However, for simplicity, shortcuts are provided to set all versions of point-Jacobi, hybrid (forward) Gauss-Seidel, and hybrid backward Gauss-Seidel, i.e., the previous smoothers can be defined just by setting SMOOTHER_TYPE to certain specific values (see Tables 7), without the need to set SUB_SOLVE as well.

The smoother and solver objects are arranged in a hierarchical manner. When specifying a smoother object, its parameters, including the local solver, are set to their default values, and when a solver object is specified, its defaults are also set, overriding in both cases any previous settings even if explicitly specified. Therefore if the user sets a smoother, and wishes to use a solver different from the default one, the call to set the solver must come after the call to set the smoother.

Similar considerations apply to the point-Jacobi, Gauss-Seidel and block-Jacobi coarsest-level solvers, and shortcuts are available in this case too (see Table 5).

Remark 3. The polynomial-accelerated smoother described in Tables 7 and 9 redefines a sweep or iteration as corresponding to the degree of the polynomial used. Consequently, the SMOOTHER_SWEEPS option is overridden by the POLY_DEGREE option. This smoother is paired with a base smoother object, whose iterations are accelerated using the specified polynomial smoothing technique. By default, the 1-Jacobi smoother serves as the base smoother, offering theoretical guarantees on the resulting convergence factor [15, 27]. Alternative combinations are experimental and lack established guarantees.

Remark 4. Many of the coarsest-level solvers apply to a specific coarsest-matrix layout; therefore, setting the solver after the layout may change the layout to either distributed or replicated. Similarly, setting the layout after the solver may change the solver.

More precisely, UMFPACK and SuperLU require the coarsest-level matrix to be replicated, while SuperLU_Dist and KRM require it to be distributed. In these cases, setting the coarsest-level solver implies that the layout is redefined according to the solver, ovverriding any previous settings. MUMPS, point-Jacobi, hybrid Gauss-Seidel and block-Jacobi can be applied to replicated and distributed matrices, thus their choice does not modify any previously specified layout. It is worth noting that, when the matrix is replicated, the point-Jacobi, hybrid Gauss-Seidel and block-Jacobi solvers and their 1- versions reduce to the corresponding local solver objects (see Remark 2). For the point-Jacobi and Gauss-Seidel solvers, these objects correspond to a single point-Jacobi sweep and a single Gauss-Seidel sweep, respectively, which are very poor solvers.

On the other hand, the distributed layout can be used with any solver but UMFPACK and SuperLU; therefore, if any of these two solvers has already been selected, the coarsest-level solver is changed to block-Jacobi, with the previously chosen solver applied to the local blocks. Likewise, the replicated layout can be used with any solver but SuperLu_Dist and KRM; therefore, if SuperLu_Dist or KRM have been previously set, the coarsest-level solver is changed to the default sequential solver.

In a parallel setting with many cores, we suggest to the users to change the default coarsest solver for using the KRM choice, i.e. a parallel distributed iterative solution of the coarsest system based on Krylov methods.

Remark 4. The argument idx can be used to allow finer control for those solvers; for instance, by specifying the keyword MUMPS_IPAR_ENTRY and an appropriate value for idx, it is possible to set any entry in the MUMPS integer control array. See also Sec. 6.







what

data type

val

default

comments






ML_CYCLE

character(len=*)

VCYCLE

WCYCLE

KCYCLE

ADD

VCYCLE

Multilevel cycle: V-cycle, W-cycle, K-cycle, and additive composition.






CYCLE_SWEEPS

integer

Any integer

number 1

1

Number of multilevel cycles.







Table 2: Parameters defining the multilevel cycle and the number of cycles to be applied.


Note. The aggregation algorithm stops when at least one of the following criteria is met: the coarse size threshold,
the minimum coarsening ratio, or the maximum number of levels is reached.
Therefore, the actual number of levels may be smaller than the specified maximum number of levels.





what

data type

val

default

comments






MIN_COARSE_SIZE_PER_PROCESS

integer

Any number

> 0

200

Coarse size threshold per process. The aggregation stops if the global number of variables of the computed coarsest matrix is lower than or equal to this threshold multiplied by the number of processes (see Note).






MIN_COARSE_SIZE

integer

Any number

> 0

-1

Coarse size threshold. The aggregation stops if the global number of variables of the computed coarsest matrix is lower than or equal to this threshold (see Note). If negative, it is ignored in favour of the default for MIN_COARSE_SIZE_PER_PROCESS.






MIN_CR_RATIO

real

Any number

> 1

1.5

Minimum coarsening ratio. The aggregation stops if the ratio between the global matrix dimensions at two consecutive levels is lower than or equal to this threshold (see Note).






MAX_LEVS

integer

Any integer

number > 1

20

Maximum number of levels. The aggregation stops if the number of levels reaches this value (see Note).






PAR_AGGR_ALG

character(len=*)

’DEC’, ’SYMDEC’, ’COUPLED’

’DEC’

Parallel aggregation algorithm.

the SYMDEC option applies decoupled aggregation to the sparsity pattern of A + AT .






AGGR_TYPE

character(len=*)

SOC1, SOC2, MATCHBOXP

SOC1

Type of aggregation algorithm: currently, for the decoupled aggregation we implement two measures of strength of connection, the one by Vaněk, Mandel and Brezina [35], and the one by Gratton et al [24]. The coupled aggregation is based on a parallel version of the half-approximate matching implemented in the MatchBox-P software package [9].






AGGR_SIZE

integer

Any integer

power of 2, with aggr_size 2

4

Maximum size of aggregates when the coupled aggregation based on matching is applied. For aggressive coarsening with size of aggregate larger than 8 we recommend the use of smoothed prolongators. Used only with ’COUPLED’ and ’MATCHBOXP’






AGGR_PROL

character(len=*)

SMOOTHED, UNSMOOTHED

SMOOTHED

Prolongator used by the aggregation algorithm: smoothed or unsmoothed (i.e., tentative prolongator).












Table 3: Parameters defining the aggregation algorithm.


Note. Different thresholds at different levels, such as those used in [35, Section 5.1], can be easily set by invoking the rou-
tine set with the parameter ilev.





what

data type

val

default

comments






AGGR_ORD

character(len=*)

’NATURAL’

’DEGREE’

’NATURAL’

Initial ordering of indices for the decoupled aggregation algorithm: either natural ordering or sorted by descending degrees of the nodes in the matrix graph.






AGGR_THRESH

real(kind_parameter)

Any real

number  [0,1]

0.01

The threshold θ in the strength of connection algorithm. See also the note at the bottom of this table.






AGGR_FILTER

character(len=*)

’FILTER’

’NOFILTER’

’NOFILTER’

Matrix used in computing the smoothed prolongator: filtered or unfiltered.












Table 4: Parameters defining the aggregation algorithm (continued).


Note. Defaults for COARSE_SOLVE and COARSE_SUBSOLVE are chosen in the following order:
single precision version – MUMPS if installed, then SLU if installed, ILU otherwise;
double precision version – UMF if installed, then MUMPS if installed, then SLU if installed, ILU otherwise.
Note. Further options for coarse solvers are contained in Table 6.
For a first use it is suggested to use the default options obtained by simply selecting the solver type.





what

data type

val

default

comments






COARSE_MAT

character(len=*)

DIST

REPL

REPL

Coarsest matrix layout: distributed among the processes or replicated on each of them.






COARSE_SOLVE

character(len=*)

MUMPS

UMF

SLU

SLUDIST

ILU

JACOBI

GS

BJAC

KRM

L1-JACOBI

L1-BJAC

L1-FBGS

See Note.

Solver used at the coarsest level: sequential LU from MUMPS, UMFPACK, or SuperLU (plus triangular solve); distributed LU from MUMPS or SuperLU_Dist (plus triangular solve); point-Jacobi, hybrid Gauss-Seidel or block-Jacobi and related 1-versions; Krylov Method (flexible Conjugate Gradient) coupled with the block-Jacobi preconditioner with ILU(0) on the blocks. Note that UMF and SLU require the coarsest matrix to be replicated, SLUDIST, JACOBI, GS, BJAC and KRM require it to be distributed, MUMPS can be used with either a replicated or a distributed matrix. When any of the previous solvers is specified, the matrix layout is set to a default value which allows the use of the solver (see Remark 4, p. 21). Note also that UMFPACK and SuperLU_Dist are available only in double precision.






COARSE_SUBSOLVE

character(len=*)

ILU

ILUT

MILU

MUMPS

SLU

UMF

INVT

INVK

AINV

See Note.

Solver for the diagonal blocks of the coarsest matrix, in case the block Jacobi solver is chosen as coarsest-level solver: ILU(p), ILU(p,t), MILU(p), LU from MUMPS, SuperLU or UMFPACK (plus triangular solve), Approximate Inverses INVK(p,q), INVT(p1,p2,t1,t2) and AINV(t); note that approximate inverses are specifically suited for GPUs since they do not employ triangular system solve kernels, see [3]. Note that UMFPACK and SuperLU_Dist are available only in double precision.











what

data type

val

default

comments






COARSE_SWEEPS

integer

Any integer

number > 0

10

Number of sweeps when JACOBI, GS or BJAC is chosen as coarsest-level solver.






COARSE_FILLIN

integer

Any integer

number 0

0

Fill-in level p of the ILU factorizations and first fill-in for the approximate inverses.






COARSE_ILUTHRS

real(kind_parameter)

Any real

number 0

0

Drop tolerance t in the ILU(p,t) factorization and first drop-tolerance for the approximate inverses.












Table 5: Parameters defining the solver at the coarsest level (continued).







what

data type

val

default

comments






BJAC_STOP

character(len=*)

FALSE

TRUE

FALSE

Select whether to use a stopping criterion for the Block-Jacobi method used as a coarse solver.






BJAC_TRACE

character(len=*)

FALSE

TRUE

FALSE

Select whether to print a trace for the calculated residual for the Block-Jacobi method used as a coarse solver.






BJAC_ITRACE

integer

Any integer

> 0

-1

Number of iterations after which a trace is to be printed.






BJAC_RESCHECK

integer

Any integer

> 0

-1

Number of iterations after which a residual is to be calculated.






BJAC_STOPTOL

real(kind_parameter)

Any real

< 1

0

Tolerance for the stopping criterion on the residual.






KRM_METHOD

character(len=*)

CG

FCG

CGS

CGR

BICG

BICGSTAB

BICGSTABL

RGMRES

FCG

A string that defines the iterative method to be used when employing a Krylov method KRM as a coarse solver. CG the Conjugate Gradient method; FCG the Flexible Conjugate Gradient method; CGS the Conjugate Gradient Stabilized method; GCR the Generalized Conjugate Residual method; FCG the Flexible Conjugate Gradient method; BICG the Bi-Conjugate Gradient method; BICGSTAB the Bi-Conjugate Gradient Stabilized method; BICGSTABL the Bi-Conjugate Gradient Stabilized method with restarting; RGMRES the Generalized Minimal Residual method with restarting. Refer to the PSBLAS guide [21] for further information.






KRM_KPREC

character(len=*)

Table 1

BJAC

The one-level preconditioners from the Table 1 can be used for the coarse Krylov solver.






KRM_SUB_SOLVE

character(len=*)

Table 5

ILU

Solver for the diagonal blocks of the coarsest matrix preconditioner, in case the block Jacobi solver is chosen as KRM_KPREC: ILU(p), ILU(p,t), MILU(p), LU from MUMPS, SuperLU or UMFPACK (plus triangular solve), Approximate Inverses INVK(p,q), INVT(p1,p2,t1,t2) and AINV(t); The same caveat from Table 5 applies here.






KRM_GLOBAL

character(len=*)

TRUE, FALSE

FALSE

Choose between a global Krylov solver, all unknowns on a single node, or a distributed one. The default choice is the distributed solver.






KRM_EPS

real(kind_parameter)

Real < 1

10-6

The stopping tolerance.






KRM_IRST

integer

Integer

1

30

An integer specifying the restart parameter. This is employed for the BiCGSTABL or RGMRES methods, otherwise it is ignored.






KRM_ISTOPC

integer

Integers 1,2,3

2

If 1 then the method uses the normwise backward error in the infinity norm; if 2, the it uses the relative residual in the 2-norm; if 3 the relative residual reduction in the 2-norm is used instead; refer to the PSBLAS [21] guide for the details.






KRM_ITMAX

integer

Integer

1

40

The maximum number of iterations to perform.






KRM_ITRACE

integer

Integer

0

-1

If > 0 print out an informational message about convergence every KRM_ITRACE iterations. If = 0 print a message in case of convergence failure.






KRM_FILLIN

integer

Integer

0

0

Fill-in level p of the ILU factorizations and first fill-in for the approximate inverses.







Table 6: Additional parameters defining the solver at the coarsest level.







what

data type

val

default

comments






SMOOTHER_TYPE

character(len=*)

JACOBI

GS

BGS

BJAC

AS

L1-JACOBI

L1-BJAC

L1-FBGS

POLY

FBGS

Type of smoother used in the multilevel preconditioner: point-Jacobi, hybrid (forward) Gauss-Seidel, hybrid backward Gauss-Seidel, block-Jacobi, 1-Jacobi, 1–hybrid (forward) Gauss-Seidel, 1-point-Jacobi and Additive Schwarz, polynomial accelerators; see [15] and Remark 3 (p. 21).

It is ignored by one-level preconditioners.






SUB_SOLVE

character(len=*)

JACOBI GS

BGS

ILU

ILUT

MILU

MUMPS

SLU

UMF

INVT

INVK

AINV

GS and BGS for pre- and post-smoothers of multilevel preconditioners, respectively

ILU for block-Jacobi and Additive Schwarz one-level preconditioners

The local solver to be used with the smoother or one-level preconditioner (see Remark 2, page 24): point-Jacobi, hybrid (forward) Gauss-Seidel, hybrid backward Gauss-Seidel, ILU(p), ILU(p,t), MILU(p), LU from MUMPS, SuperLU or UMFPACK (plus triangular solve), Approximate Inverses INVK(p,q), INVT(p1,p2,t1,t2) and AINV(t); note that approximate inverses are specifically suited for GPUs since they do not employ triangular system solve kernels, see [3]. See Note for details on hybrid Gauss-Seidel.






SMOOTHER_SWEEPS

integer

Any integer

number 0

1

Number of sweeps of the smoother or one-level preconditioner. In the multilevel case, no pre-smother or post-smoother is used if this parameter is set to 0 together with pos=PRE or pos=POST, respectively. Is ignored if the smoother is POLY






POLY_DEGREE

integer

Any integer

number 1 and 30

1

Degree of the polynomial accelerator, is equal to the number of matrix-vector products performed by the smoother. Is ignored if the smoother is not POLY







Table 7: Parameters defining the smoother or the details of the one-level preconditioner.







what

data type

val

default

comments






SUB_OVR

integer

Any integer

number 0

1

Number of overlap layers, for Additive Schwarz only.

SUB_RESTR

character(len=*)

HALO

NONE

HALO

Type of restriction operator, for Additive Schwarz only: HALO for taking into account the overlap, NONE for neglecting it.

Note that HALO must be chosen for the classical Addditive Schwarz smoother and its RAS variant.






SUB_PROL

character(len=*)

SUM

NONE

NONE

Type of prolongation operator, for Additive Schwarz only: SUM for adding the contributions from the overlap, NONE for neglecting them.

Note that SUM must be chosen for the classical Additive Schwarz smoother, and NONE for its RAS variant.






SUB_FILLIN

integer

Any integer

number 0

0

Fill-in level p of the incomplete LU factorizations.






SUB_ILUTHRS

real(kind_parameter)

Any real number 0

0

Drop tolerance t in the ILU(p,t) factorization.






MUMPS_LOC_GLOB

character(len=*)

LOCAL_SOLVER

GLOBAL_SOLVER

GLOBAL_SOLVER

Whether MUMPS should be used as a distributed solver, or as a serial solver acting only on the part of the matrix local to each process.






MUMPS_IPAR_ENTRY

integer

Any integer number

0

Set an entry in the MUMPS integer control array, as chosen via the idx optional argument.






MUMPS_RPAR_ENTRY

real

Any real number

0

Set an entry in the MUMPS real control array, as chosen via the idx optional argument.







Table 8: Parameters defining the smoother or the details of the one-level preconditioner (continued).







what

data type

val

default

comments






POLY_VARIANT

character(len=*)

CHEB_4

CHEB_4_OPT

CHEB_1_OPT

CHEB_4

Select the type of polynomial accelerator. The CHEB_4 and CHEB_4_OPT types are those based on the Chebyshev polynomials of the 4th-kind described in [27]. The CHEB_1_OPT version is the one described in [15] and based on the Chebyshev polynomials of the 1st-kind.






POLY_RHO_ESTIMATE

character(len=*)

POLY_RHO_EST_POWER

POLY_RHO_EST_POWER

Algorithm for estimating the spectral radius of the smoother to which the polynomial acceleration is applied. The only implemented algorithm is the power method; see also the two following options.






POLY_RHO_ESTIMATE_ITERATIONS

integer

Any integer

number 1

20

Number of iterations for the spectral radius estimate.






POLY_RHO_BA

real(kind_parameter)

Any real

number (0,1]

1

Sets an estimate of the spectral radius of the base smoother to which the polynomial accelerator is applied.







Table 9: Parameters defining the smoother or the details of the one-level preconditioner (continued).

5.3 Method hierarchy_build

call p%hierarchy_build(a,desc_a,info)

This method builds the hierarchy of matrices and restriction/prolongation operators for the multilevel preconditioner p, according to the requirements made by the user through the methods init and set.

Arguments

a

type(psb_xspmat_type), intent(in).

The sparse matrix structure containing the local part of the matrix to be preconditioned. Note that x must be chosen according to the real/complex, single/double precision version of AMG4PSBLAS under use. See the PSBLAS User’s Guide for details [21].

desc_a

type(psb_desc_type), intent(in).

The communication descriptor of a. See the PSBLAS User’s Guide for details [21].

info

integer, intent(out).

Error code. If no error, 0 is returned. See Section 7 for details.

5.4 Method smoothers_build

call p%smoothers_build(a,desc_a,p,info[,amold,vmold,imold])

This method builds the smoothers and the coarsest-level solvers for the multilevel preconditioner p, according to the requirements made by the user through the methods init and set, and based on the aggregation hierarchy produced by a previous call to hierarchy_build (see Section 5.3).

Arguments

a

type(psb_xspmat_type), intent(in).

The sparse matrix structure containing the local part of the matrix to be preconditioned. Note that x must be chosen according to the real/complex, single/double precision version of AMG4PSBLAS under use. See the PSBLAS User’s Guide for details [21].

desc_a

type(psb_desc_type), intent(in).

The communication descriptor of a. See the PSBLAS User’s Guide for details [21].

info

integer, intent(out).

Error code. If no error, 0 is returned. See Section 7 for details.

amold

class(psb_x_base_sparse_mat), intent(in), optional.

The desired dynamic type for internal matrix components; this allows e.g. running on GPUs; it needs not be the same on all processes. See the PSBLAS User’s Guide for details [21].

vmold

class(psb_x_base_vect_type), intent(in), optional.

The desired dynamic type for internal vector components; this allows e.g. running on GPUs.

imold

class(psb_i_base_vect_type), intent(in), optional.

The desired dynamic type for internal integer vector components; this allows e.g. running on GPUs.

5.5 Method build

call p%build(a,desc_a,info[,amold,vmold,imold])

This method builds the preconditioner p according to the requirements made by the user through the methods init and set (see Sections 5.3 and 5.4 for multilevel preconditioners). It is mostly provided for backward compatibility; indeed, it is internally implemented by invoking the two previous methods hierarchy_build and smoothers_build, whose nomenclature would however be somewhat unnatural when dealing with simple one-level preconditioners.

Arguments

a

type(psb_xspmat_type), intent(in).

The sparse matrix structure containing the local part of the matrix to be preconditioned. Note that x must be chosen according to the real/complex, single/double precision version of AMG4PSBLAS under use. See the PSBLAS User’s Guide for details [21].

desc_a

type(psb_desc_type), intent(in).

The communication descriptor of a. See the PSBLAS User’s Guide for details [21].

info

integer, intent(out).

Error code. If no error, 0 is returned. See Section 7 for details.

amold

class(psb_x_base_sparse_mat), intent(in), optional.

The desired dynamic type for internal matrix components; this allows e.g. running on GPUs; it needs not be the same on all processes. See the PSBLAS User’s Guide for details [21].

vmold

class(psb_x_base_vect_type), intent(in), optional.

The desired dynamic type for internal vector components; this allows e.g. running on GPUs.

imold

class(psb_i_base_vect_type), intent(in), optional.

The desired dynamic type for internal integer vector components; this allows e.g. running on GPUs.

The method can be used to build multilevel preconditioners too.

5.6 Method apply

call p%apply(x,y,desc_a,info [,trans,work])

This method computes y = op(B-1) x, where B is a previously built preconditioner, stored into p, and op denotes the preconditioner itself or its transpose, according to the value of trans. Note that, when AMG4PSBLAS is used with a Krylov solver from PSBLAS, p%apply is called within the PSBLAS method psb_krylov and hence it is completely transparent to the user.

Arguments

x

type(kind_parameter), dimension(:), intent(in)—.

The local part of the vector x. Note that type and kind_parameter must be chosen according to the real/complex, single/double precision version of AMG4PSBLAS under use.

y

type(kind_parameter), dimension(:), intent(out)—.

The local part of the vector y. Note that type and kind_parameter must be chosen according to the real/complex, single/double precision version of AMG4PSBLAS under use.

desc_a

type(psb_desc_type), intent(in).

The communication descriptor associated to the matrix to be preconditioned.

info

integer, intent(out).

Error code. If no error, 0 is returned. See Section 7 for details.

trans

character(len=1), optional, intent(in).

If trans = N,n then op(B-1) = B-1; if trans = T,t then op(B-1) = B-T (transpose of B-1); if trans = C,c then op(B-1) = B-C (conjugate transpose of B-1).

work

type(kind_parameter), dimension(:), optional, target—.

Workspace. Its size should be at least 4 * psb_cd_get_local_ cols(desc_a) (see the PSBLAS User’s Guide). Note that type and kind_parameter must be chosen according to the real/complex, single/double precision version of AMG4PSBLAS under use.

5.7 Method free

call p%free(p,info)

This method deallocates the preconditioner data structure p.

Arguments

info

integer, intent(out).

Error code. If no error, 0 is returned. See Section 7 for details.

5.8 Method descr

call p%descr(info, [iout, root, verbosity])

This method prints a description of the preconditioner p to the standard output or to a file. It must be called after hierachy_build and smoothers_build, or build, have been called.

Arguments

info

integer, intent(out).

Error code. If no error, 0 is returned. See Section 7 for details.

iout

integer, intent(in), optional.

The id of the file where the preconditioner description will be printed; the default is the standard output.

root

integer, intent(in), optional.

The id of the process where the preconditioner description will be printed; the default is psb_root_.

verbosity

integer, intent(in), optional.

The verbosity level of the description. Default value is 0. For values higher than 0, it prints out further information, e.g., for a distributed multilevel preconditioner the size of the coarse matrices on every process.

5.9 Auxiliary Methods

Various functionalities are implemented as additional methods of the preconditioner object.

5.9.1 Method: dump

call p%dump(info[,istart,iend,prefix,head,ac,rp,smoother,solver,global_num])

Dump on file.

Arguments

info

integer, intent(out).

Error code. If no error, 0 is returned. See Section 7 for details.

amold

class(psb_x_base_sparse_mat), intent(in), optional.

The desired dynamic type for internal matrix components; this allows e.g. running on GPUs; it needs not be the same on all processes. See the PSBLAS User’s Guide for details [21].

5.9.2 Method: clone

call p%clone(pout,info)

Create a (deep) copy of the preconditioner object.

Arguments

pout

type(amg_xprec_type), intent(out).

The copy of the preconditioner data structure. Note that x must be chosen according to the real/complex, single/double precision version of AMG4PSBLAS under use.

info

integer, intent(out).

Error code. If no error, 0 is returned. See Section 7 for details.

5.9.3 Method: sizeof

sz = p%sizeof([global])

global

logical, optional.

Whether the global or local preconditioner memory occupation is desired. Default: .false..

Return memory footprint in bytes.

5.9.4 Method: allocate_wrk

call p%allocate_wrk(info[, vmold])

Allocate internal work vectors. Each application of the preconditioner uses a number of work vectors which are allocated internally as necessary; therefore allocation and deallocation of memory occurs multiple times during the execution of a Krylov method. In most cases this strategy is perfectly acceptable, but on some platforms, most notably GPUs, memory allocation is a slow operation, and the default behaviour would lead to a slowdown. This method allows to trade space for time by preallocating the internal workspace outside of the invocation of a Krylov method. When using GPUs or other specialized devices, the vmold argument is also necessary to ensure the internal work vectors are of the appropriate dynamic type to exploit the accelerator hardware; when allocation occurs internally this is taken care of based on the dynamic type of the x argument to the apply method.

Arguments

info

integer, intent(out).

Error code. If no error, 0 is returned. See Section 7 for details.

vmold

class(psb_x_base_vect_type), intent(in), optional.

The desired dynamic type for internal vector components; this allows e.g. running on GPUs.

5.9.5 Method: free_wrk

call p%free_wrk(info)

Deallocate internal work vectors.

Arguments

info

integer, intent(out).

Error code. If no error, 0 is returned. See Section 7 for details.