You cannot select more than 25 topics
Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
1014 lines
50 KiB
TeX
1014 lines
50 KiB
TeX
\section{User Interface\label{sec:userinterface}}
|
|
\markboth{\textsc{AMG4PSBLAS User's and Reference Guide}}
|
|
{\textsc{\ref{sec:userinterface} User Interface}}
|
|
|
|
The basic user interface of AMG4PBLAS consists of eight methods. The six
|
|
methods \fortinline|init|, \fortinline|set|, \fortinline|build|,
|
|
\fortinline|hierarchy_build|, \fortinline|smoothers_build| and \fortinline|apply|
|
|
encapsulate all the functionalities for the setup and the application
|
|
of any multilevel and one-level preconditioner implemented in the
|
|
package.
|
|
The method \fortinline|free| deallocates the preconditioner data structure, while
|
|
\fortinline|descr| prints a description of the preconditioner setup by the user.
|
|
For backward compatibility, methods are also accessible as
|
|
stand-alone subroutines.
|
|
|
|
For each method, the same user interface is overloaded with
|
|
respect to the real/\-com\-plex and single/double precision data;
|
|
arguments with appropriate data types must be passed to the method, i.e.,
|
|
\begin{itemize}
|
|
\item the sparse matrix data structure, containing the matrix to be
|
|
preconditioned, must be of type \verb|psb_|\emph{x}\verb|spmat_type|
|
|
with \emph{x} = \verb|s| for real single precision, \emph{x} =
|
|
\verb|d| for real double precision, \emph{x} = \verb|c| for complex
|
|
single precision, \emph{x} = \verb|z| for complex double precision;
|
|
\item the preconditioner data structure must be of type
|
|
\verb|amg_|\emph{x}\verb|prec_type|, with \emph{x} =
|
|
\verb|s|, \verb|d|, \verb|c|, \verb|z|, according to the sparse
|
|
matrix data structure;
|
|
\item the arrays containing the vectors $v$ and $w$ involved in
|
|
the preconditioner application $w=B^{-1}v$ must be of type
|
|
\verb|psb_|\emph{x}\verb|vect_type| with \emph{x} =
|
|
\verb|s|, \verb|d|, \verb|c|, \verb|z|, in a manner completely
|
|
analogous to the sparse matrix type;
|
|
\item real parameters defining the preconditioner must be declared
|
|
according to the precision of the sparse matrix and preconditioner
|
|
data structures (see Section~\ref{sec:precset}).
|
|
\end{itemize}
|
|
A description of each method is given in the remainder of this section.
|
|
|
|
\clearpage
|
|
|
|
\subsection{Method init\label{sec:precinit}}
|
|
|
|
\begin{center}
|
|
\fortinline|call p%init(contxt,ptype,info)|
|
|
\end{center}
|
|
|
|
\noindent
|
|
This method allocates and initializes the preconditioner
|
|
\fortinline|p|, according to the preconditioner type chosen by the user.
|
|
|
|
{\vskip1.5\baselineskip\noindent\large\bfseries Arguments} \smallskip
|
|
|
|
\begin{tabular}{p{1.2cm}p{12cm}}
|
|
|
|
\fortinline|contxt| & \fortinline|type(psb_ctxt_type), intent(in)|.\\
|
|
& The communication context.\\
|
|
\fortinline|ptype| & \fortinline|character(len=*), intent(in)|.\\
|
|
& The type of preconditioner. Its values are specified
|
|
in Table~\ref{tab:precinit}.\\
|
|
& Note that strings are case insensitive.\\
|
|
\fortinline|info| & \fortinline|integer, intent(out)|.\\
|
|
& Error code. If no error, 0 is returned. See Section~\ref{sec:errors} for details.\\
|
|
|
|
\end{tabular}
|
|
|
|
|
|
|
|
|
|
\clearpage
|
|
|
|
\subsection{Method set\label{sec:precset}}
|
|
|
|
\begin{center}
|
|
\fortinline|call p%set(what,val,info [,ilev, ilmax, pos, idx])|
|
|
\end{center}
|
|
|
|
\noindent
|
|
This method sets the parameters defining the preconditioner \fortinline|p|. More
|
|
precisely, the parameter identified by \fortinline|what| is assigned the value
|
|
contained in \fortinline|val|.
|
|
|
|
{\vskip1.5\baselineskip\noindent\large\bfseries Arguments} \smallskip
|
|
|
|
\begin{tabular}{p{1.2cm}p{12cm}}
|
|
\fortinline|what| & \fortinline|character(len=*)|. \\
|
|
& The parameter to be set. It can be specified through its name;
|
|
the string is case-insensitive. See
|
|
Tables~\ref{tab:p_cycle}-\ref{tab:p_smoother_1}.\\
|
|
\fortinline|val | & \fortinline|integer| \emph{or} \fortinline|character(len=*)| \emph{or}
|
|
\fortinline|real(psb_spk_)| \emph{or} \fortinline|real(psb_dpk_)|,
|
|
\fortinline|intent(in)|.\\
|
|
& The value of the parameter to be set. The list of allowed
|
|
values and the corresponding data types is given in
|
|
Tables~\ref{tab:p_cycle}-\ref{tab:p_smoother_1}.
|
|
When the value is of type \fortinline|character(len=*)|,
|
|
it is also treated as case insensitive.\\
|
|
\fortinline|info| & \fortinline|integer, intent(out)|.\\
|
|
& Error code. If no error, 0 is returned. See Section~\ref{sec:errors}
|
|
for details.\\
|
|
\fortinline|ilev| & \fortinline|integer, optional, intent(in)|.\\
|
|
& For the multilevel preconditioner, the level at which the
|
|
preconditioner parameter has to be set.
|
|
The levels are numbered in increasing
|
|
order starting from the finest one, i.e., level 1 is the finest level.
|
|
If \fortinline|ilev| is not present, the parameter identified by \fortinline|what|
|
|
is set at all levels that are appropriate (see
|
|
Tables~\ref{tab:p_cycle}-\ref{tab:p_smoother_1}).\\
|
|
\fortinline|ilmax| & \fortinline|integer, optional, intent(in)|.\\
|
|
& For the multilevel preconditioner, when both
|
|
\fortinline|ilev| and \fortinline|ilmax| are present, the settings
|
|
are applied at all levels \fortinline|ilev:ilmax|. When
|
|
\fortinline|ilev| is present but \fortinline|ilmax| is not, then
|
|
the default is \fortinline|ilmax=ilev|.
|
|
The levels are numbered in increasing
|
|
order starting from the finest one, i.e., level 1 is the finest level. \\
|
|
\fortinline|pos| & \fortinline|character(len=*), optional, intent(in)|.\\
|
|
& Whether the other arguments apply only to the pre-smoother (\fortinline|'PRE'|)
|
|
or to the post-smoother (\fortinline|'POST'|). If \fortinline|pos| is not present,
|
|
the other arguments are applied to both smoothers.
|
|
If the preconditioner is one-level or the parameter identified by \fortinline|what|
|
|
does not concern the smoothers, \fortinline|pos| is ignored.\\
|
|
\fortinline|idx| & \fortinline|integer, optional, intent(in)|.\\
|
|
& An auxiliary input argument that can be passed to the
|
|
underlying objects.
|
|
\end{tabular}
|
|
|
|
|
|
\noindent
|
|
%However, in this case the optional arguments \fortinline|ilev|,
|
|
%\fortinline|ilmax|, \fortinline|pos| and \fortinline|idx|
|
|
%cannot be used. \\
|
|
|
|
A variety of preconditioners can be obtained by setting the
|
|
appropriate preconditioner parameters. These parameters can be
|
|
logically divided into four groups, i.e., parameters defining
|
|
\begin{enumerate}
|
|
\item the type of multilevel cycle and how many cycles must be applied;
|
|
\item the coarsening algorithm;
|
|
\item the solver at the coarsest level (for multilevel
|
|
preconditioners only);
|
|
\item the smoother of the multilevel preconditioners, or the one-level
|
|
preconditioner.
|
|
\end{enumerate}
|
|
A list of the parameters that can be set, along with their allowed and
|
|
default values, is given in Tables~\ref{tab:p_cycle}-\ref{tab:p_smoother_1}.\\
|
|
|
|
\textbf{Remark 2.} A smoother is usually obtained by combining two objects:
|
|
a smoother (\fortinline|'SMOOTHER_TYPE'|) and a local solver (\fortinline|'SUB_SOLVE'|),
|
|
as specified in Tables~\ref{tab:p_smoother}-\ref{tab:p_smoother_1}.
|
|
For example, the block-Jacobi smoother using
|
|
ILU(0) on the blocks is obtained by combining the block-Jacobi smoother
|
|
object with the ILU(0) solver object. Similarly,
|
|
the hybrid Gauss-Seidel smoother (see Note in Table~\ref{tab:p_smoother})
|
|
is obtained by combining the block-Jacobi smoother object with a single sweep
|
|
of the Gauss-Seidel solver object, while the point-Jacobi smoother is the
|
|
result of combining the block-Jacobi smoother object with a single sweep
|
|
of the point-Jacobi solver object. In the same way are obtained the $\ell_1$-versions of
|
|
the smoothers. However, for simplicity, shortcuts are
|
|
provided to set all versions of point-Jacobi, hybrid (forward) Gauss-Seidel, and
|
|
hybrid backward Gauss-Seidel, i.e., the previous smoothers can be defined
|
|
just by setting \fortinline|'SMOOTHER_TYPE'| to certain specific
|
|
values (see Tables~\ref{tab:p_smoother}), without the need to set
|
|
\fortinline|'SUB_SOLVE'| as well.
|
|
|
|
The smoother and solver objects are arranged in a
|
|
hierarchical manner. When specifying a smoother object, its parameters,
|
|
including the local solver, are set to their default values, and when a solver
|
|
object is specified, its defaults are also set, overriding in both
|
|
cases any previous settings even if explicitly specified. Therefore if
|
|
the user sets a smoother, and wishes to use a solver
|
|
different from the default one, the call to set the solver must come
|
|
\emph{after} the call to set the smoother.
|
|
|
|
Similar considerations apply to the point-Jacobi, Gauss-Seidel and block-Jacobi
|
|
coarsest-level solvers, and shortcuts are available
|
|
in this case too (see Table~\ref{tab:p_coarse_1}). \\
|
|
|
|
\textbf{Remark 3.} Many of the coarsest-level solvers cannot be used
|
|
with both replicated and distributed coarsest-matrix layouts;
|
|
therefore, setting the solver after the layout may change the layout.
|
|
Similarly, setting the layout after the solver may change the solver.
|
|
|
|
More precisely, UMFPACK and SuperLU require the coarsest-level
|
|
matrix to be replicated, while SuperLU\_Dist and KRM require it to be distributed.
|
|
In these cases, setting the coarsest-level solver implies that
|
|
the layout is redefined according to the solver, ovverriding any
|
|
previous settings. MUMPS, point-Jacobi,
|
|
hybrid Gauss-Seidel and block-Jacobi can be applied to
|
|
replicated and distributed matrices, thus their choice
|
|
does not modify any previously specified layout.
|
|
It is worth noting that, when the matrix is replicated,
|
|
the point-Jacobi, hybrid Gauss-Seidel and block-Jacobi solvers and their $\ell_1-$ versions
|
|
reduce to the corresponding local solver objects (see Remark~2).
|
|
For the point-Jacobi and Gauss-Seidel solvers, these objects
|
|
correspond to a \emph{single} point-Jacobi sweep and a \emph{single}
|
|
Gauss-Seidel sweep, respectively, which are very poor solvers.
|
|
|
|
On the other hand, the distributed layout can be used with any solver
|
|
but UMFPACK and SuperLU; therefore, if any of these two solvers has already
|
|
been selected, the coarsest-level solver is changed to block-Jacobi,
|
|
with the previously chosen solver applied to the local blocks.
|
|
Likewise, the replicated layout can be used with any solver but SuperLu\_Dist and KRM;
|
|
therefore, if SuperLu\_Dist or KRM have been previously set, the coarsest-level
|
|
solver is changed to the default sequential solver.
|
|
|
|
In a parallel setting with many cores, we suggest to the users to change the default
|
|
coarsest solver for using the KRM choice, i.e. a parallel distributed iterative solution of the
|
|
coarsest system based on Krylov methods.
|
|
|
|
\textbf{Remark 4.} The argument \fortinline|idx| can be used to allow finer
|
|
control for those solvers; for instance, by specifying the keyword
|
|
\fortinline|'MUMPS_IPAR_ENTRY'| and an appropriate value for \fortinline|idx|, it is
|
|
possible to set any entry in the MUMPS integer control array.
|
|
See also Sec.~\ref{sec:adding}.
|
|
%The \verb|what,val| pairs described here are those of the predefined
|
|
%moother/solver objects; newly developed solvers may define new pairs
|
|
%according to their needs.
|
|
|
|
|
|
\bsideways
|
|
\begin{center}
|
|
%\begin{tabular}{|p{5cm}|l|p{2.4cm}|p{2.5cm}|p{5cm}|}
|
|
\begin{tabular}{|p{3.6cm}|l|p{2.4cm}|p{2.4cm}|p{7.2cm}|}
|
|
\hline
|
|
\fortinline|what| & \textsc{data type} & \fortinline|val| & \textsc{default} &
|
|
\textsc{comments} \\ \hline
|
|
\fortinline|'ML_CYCLE'| & \fortinline|character(len=*)|
|
|
& \fortinline|'VCYCLE'| \par \fortinline|'WCYCLE'| \par \fortinline|'KCYCLE'| \par \fortinline|'ADD'|
|
|
& \fortinline|'VCYCLE'|
|
|
&Multilevel cycle: V-cycle, W-cycle, K-cycle, and additive composition. \\ \hline
|
|
\fortinline|'CYCLE_SWEEPS'| & \fortinline|integer| &
|
|
Any integer \par number $\ge 1$ & 1 &
|
|
Number of multilevel cycles. \\ \hline
|
|
|
|
\end{tabular}
|
|
\end{center}
|
|
\caption{Parameters defining the multilevel cycle and the number of cycles to
|
|
be applied.
|
|
\label{tab:p_cycle}}
|
|
\esideways
|
|
|
|
\bsideways
|
|
\begin{center}
|
|
%\begin{tabular}{|p{5cm}|l|p{2.4cm}|p{2.5cm}|p{5cm}|}
|
|
\begin{tabular}{|p{5.7cm}|l|p{2.3cm}|p{2.5cm}|p{6.9cm}|}
|
|
\hline
|
|
\fortinline|what| & \textsc{data type} & \fortinline|val| & \textsc{default} &
|
|
\textsc{comments} \\ \hline
|
|
\fortinline|'MIN_COARSE_SIZE_PER_PROCESS'| & \fortinline|integer|
|
|
& Any number \par $> 0$
|
|
& $200$
|
|
& Coarse size threshold per process. The aggregation stops
|
|
if the global number of variables of the
|
|
computed coarsest matrix
|
|
is lower than or equal to this threshold
|
|
multiplied by the number of processes (see Note).
|
|
\\ \hline
|
|
\fortinline|'MIN_COARSE_SIZE'| & \fortinline|integer|
|
|
& Any number \par $> 0$
|
|
& -1
|
|
& Coarse size threshold. The aggregation stops
|
|
if the global number of variables of the
|
|
computed coarsest matrix
|
|
is lower than or equal to this threshold
|
|
(see Note). If negative, it is ignored in
|
|
favour of the default for
|
|
\fortinline|'MIN_COARSE_SIZE_PER_PROCESS'|.
|
|
\\ \hline
|
|
|
|
\fortinline|'MIN_CR_RATIO'| & \fortinline|real|
|
|
& Any number \par $> 1$
|
|
& 1.5
|
|
& Minimum coarsening ratio. The aggregation stops
|
|
if the ratio between the global matrix dimensions
|
|
at two consecutive levels is lower than or equal to this
|
|
threshold (see Note).\\ \hline
|
|
|
|
\fortinline|'MAX_LEVS'| & \fortinline|integer|
|
|
& Any integer \par number $> 1$
|
|
& 20
|
|
& Maximum number of levels. The aggregation stops
|
|
if the number of levels reaches this value (see Note). \\ \hline
|
|
\fortinline|'PAR_AGGR_ALG'| & \fortinline|character(len=*)| \hspace*{-3mm}
|
|
& \texttt{'DEC'}, \texttt{'SYMDEC'}, \texttt{'COUPLED'}
|
|
& \texttt{'DEC'}
|
|
& Parallel aggregation algorithm. \par the
|
|
\fortinline|SYMDEC| option applies decoupled
|
|
aggregation to the sparsity pattern
|
|
of $A+A^T$.\\\hline%
|
|
\ifpdf
|
|
\end{tabular}
|
|
\end{center}
|
|
\phantomcaption
|
|
\esideways
|
|
\bsideways
|
|
\ContinuedFloat
|
|
\begin{center}
|
|
\begin{tabular}{|p{5.7cm}|l|p{2.3cm}|p{2.5cm}|p{5.9cm}|}
|
|
\hline
|
|
\fortinline|what| & \textsc{data type} & \fortinline|val| & \textsc{default} &
|
|
\textsc{comments} \\ \hline
|
|
\fi
|
|
\fortinline|'AGGR_TYPE'| & \fortinline|character(len=*)| \hspace*{-3mm}
|
|
&\fortinline|'SOC1'|,
|
|
\fortinline|'SOC2'|,
|
|
\fortinline|'MATCHBOXP'|
|
|
& \fortinline|'SOC1'|
|
|
& Type of aggregation algorithm: currently,
|
|
for the decoupled aggregation we implement two measures of strength of
|
|
connection, the one by Van\v{e}k, Mandel
|
|
and Brezina~\cite{VANEK_MANDEL_BREZINA},
|
|
and the one by Gratton et al~\cite{GrHeJi:16}. The coupled
|
|
aggregation is based on a parallel version of the half-approximate
|
|
matching implemented in the MatchBox-P software package~\cite{MatchBoxP}.\\ \hline
|
|
|
|
\fortinline|'AGGR_SIZE'| & \fortinline|integer| \hspace*{-3mm}
|
|
& Any integer \par power of $2$, with $\texttt{aggr\_size} \ge 2$
|
|
& 4
|
|
& Maximum size of aggregates when the coupled aggregation based on
|
|
matching is applied. For aggressive coarsening with size of
|
|
aggregate larger than $8$ we recommend the use of smoothed prolongators.
|
|
Used only with \texttt{'COUPLED'} and \texttt{'MATCHBOXP'}\\ \hline
|
|
|
|
\fortinline|'AGGR_PROL'| & \fortinline|character(len=*)| \hspace*{-3mm}
|
|
& \fortinline|'SMOOTHED'|, \fortinline|'UNSMOOTHED'| & \fortinline|'SMOOTHED'|
|
|
& Prolongator used by the aggregation algorithm: smoothed or unsmoothed
|
|
(i.e., tentative prolongator). \\
|
|
\hline
|
|
\multicolumn{5}{|l|}{{\bfseries Note.} The aggregation algorithm stops when
|
|
at least one of the following criteria is met:
|
|
the coarse size threshold, } \\
|
|
\multicolumn{5}{|l|}{the minimum coarsening ratio, or the maximum number
|
|
of levels is reached.} \\
|
|
\multicolumn{5}{|l|}{Therefore, the actual number of levels may be smaller than the specified maximum number
|
|
of levels. } \\
|
|
\hline
|
|
\end{tabular}
|
|
\caption{Parameters defining the aggregation algorithm.
|
|
\label{tab:p_aggregation}}
|
|
\end{center}
|
|
\esideways
|
|
|
|
\bsideways
|
|
\begin{center}
|
|
%\begin{tabular}{|p{5cm}|l|p{2.4cm}|p{2.5cm}|p{5cm}|}
|
|
\begin{tabular}{|p{3.8cm}|l|p{2.5cm}|p{2.3cm}|p{6.6cm}|}
|
|
\hline
|
|
\fortinline|what| & \textsc{data type} & \fortinline|val| & \textsc{default} &
|
|
\textsc{comments} \\ \hline
|
|
|
|
\fortinline|'AGGR_ORD'| & \fortinline|character(len=*)|
|
|
& \texttt{'NATURAL'} \par \texttt{'DEGREE'}
|
|
& \texttt{'NATURAL'}
|
|
& Initial ordering of indices for the decoupled aggregation
|
|
algorithm: either natural ordering or sorted by
|
|
descending degrees of the nodes in the
|
|
matrix graph. \\ \hline
|
|
%Since aggregation is
|
|
%heuristic, results will be different.
|
|
|
|
\fortinline|'AGGR_THRESH'| & \fortinline|real(kind_parameter)|
|
|
& Any~real \par number~$\in [0, 1]$
|
|
& 0.01
|
|
& The threshold $\theta$ in the strength of
|
|
connection algorithm.
|
|
See also the note at the bottom of this table. \\ \hline
|
|
\fortinline|'AGGR_FILTER'|
|
|
& \fortinline|character(len=*)|
|
|
& \texttt{'FILTER'} \par \texttt{'NOFILTER'}
|
|
& \texttt{'NOFILTER'} &
|
|
Matrix used in computing the smoothed
|
|
prolongator: filtered or unfiltered. % (see~(\ref{eq:filtered}) in Section~\ref{sec:aggregation})
|
|
\\
|
|
\hline
|
|
\multicolumn{5}{|l|}{{\bfseries Note.} Different thresholds at different levels, such as
|
|
those used in \cite[Section~5.1]{VANEK_MANDEL_BREZINA}, can be easily set by
|
|
invoking the rou-} \\
|
|
\multicolumn{5}{|l|}{tine \texttt{set} with
|
|
the parameter \texttt{ilev}.} \\
|
|
\hline
|
|
\end{tabular}
|
|
\end{center}
|
|
\caption{Parameters defining the aggregation algorithm (continued).
|
|
\label{tab:p_aggregation_1}}
|
|
\esideways
|
|
|
|
\bsideways
|
|
\begin{center}
|
|
\begin{tabular}{|p{3.9cm}|l|p{1.9cm}|p{1.7cm}|p{8.6cm}|}
|
|
\hline
|
|
\fortinline|what| & \textsc{data type} & \fortinline|val| & \textsc{default} &
|
|
\textsc{comments} \\ \hline
|
|
\fortinline|'COARSE_MAT'| & \fortinline|character(len=*)|
|
|
& \fortinline|'DIST'| \par \fortinline|'REPL'|
|
|
& \fortinline|'REPL'|
|
|
& Coarsest matrix layout: distributed among the processes or
|
|
replicated on each of them. \\ \hline
|
|
\fortinline|'COARSE_SOLVE'| & \fortinline|character(len=*)|
|
|
& \fortinline|'MUMPS'| \par \fortinline|'UMF'| \par
|
|
\fortinline|'SLU'| \par \fortinline|'SLUDIST'| \par
|
|
\fortinline|'JACOBI'| \par \fortinline|'GS'| \par \fortinline|'BJAC'| \par \fortinline|'KRM'|
|
|
\fortinline|'L1-JACOBI'| \par \fortinline|'L1-BJAC'| \par \fortinline|'L1-FBGS'|
|
|
& See~Note.
|
|
& Solver used at the coarsest level: sequential
|
|
LU from MUMPS, UMFPACK, or SuperLU
|
|
(plus tri\-an\-gular solve);
|
|
distributed LU from MUMPS or SuperLU\_Dist
|
|
(plus triangular solve);
|
|
point-Jacobi, hybrid Gauss-Seidel or
|
|
block-Jacobi and related $\ell_1$-versions;
|
|
Krylov Method (flexible Conjugate Gradient) coupled with
|
|
the block-Jacobi preconditioner
|
|
with ILU(0) on the blocks.
|
|
Note that \texttt{UMF} and \texttt{SLU} require the coarsest
|
|
matrix to be replicated, \texttt{SLUDIST}, \texttt{JACOBI},
|
|
\texttt{GS}, \texttt{BJAC} and \texttt{KRM} require it to be
|
|
distributed, \texttt{MUMPS} can be used with either
|
|
a replicated or a distributed matrix. When any of the previous
|
|
solvers is specified, the matrix layout is set to a default
|
|
value which allows the use of the solver (see Remark 3, p.~24).
|
|
Note also that UMFPACK and SuperLU\_Dist
|
|
are available only in double precision. \\\hline%
|
|
\ifpdf
|
|
\end{tabular}
|
|
\end{center}
|
|
\phantomcaption
|
|
\esideways
|
|
\bsideways
|
|
\ContinuedFloat
|
|
\begin{center}
|
|
\begin{tabular}{|p{3.9cm}|l|p{1.7cm}|p{1.7cm}|p{8.6cm}|}
|
|
\hline
|
|
\fi
|
|
\fortinline|'COARSE_SUBSOLVE'| & \fortinline|character(len=*)|
|
|
& \fortinline|'ILU'| \par \fortinline|'ILUT'| \par \fortinline|'MILU'| \par
|
|
\fortinline|'MUMPS'| \par \fortinline|'SLU'| \par \fortinline|'UMF'| \par
|
|
\fortinline|'INVT'| \par \fortinline|'INVK'| \par \fortinline|'AINV'|
|
|
& See~Note.
|
|
& Solver for the diagonal blocks of the coarsest matrix,
|
|
in case the block Jacobi solver
|
|
is chosen as coarsest-level solver: ILU($p$), ILU($p,t$),
|
|
MILU($p$), LU from MUMPS, SuperLU or UMFPACK
|
|
(plus triangular solve), Approximate
|
|
Inverses INVK($p,q$), INVT($p_1,p2,t_1,t_2$) and
|
|
AINV($t$); note that approximate inverses
|
|
are specifically suited for GPUs since they
|
|
do not employ triangular system solve kernels, see~\cite{BERTACCINIFILIPPONE}.
|
|
Note that UMFPACK and SuperLU\_Dist
|
|
are available only in double precision. \\
|
|
\hline
|
|
\multicolumn{5}{|l|}{{\bfseries Note.} Defaults for \texttt{COARSE\_SOLVE} and
|
|
\texttt{COARSE\_SUBSOLVE} are chosen in the following order:} \\
|
|
\multicolumn{5}{|l|}{single precision version -- \texttt{MUMPS} if installed,
|
|
then \texttt{SLU} if installed,
|
|
\texttt{ILU} otherwise;}\\
|
|
\multicolumn{5}{|l|}{double precision version -- \texttt{UMF} if installed,
|
|
then \texttt{MUMPS} if installed, then \texttt{SLU} if
|
|
installed, \texttt{ILU} otherwise.}\\
|
|
\hline
|
|
%\end{tabular}
|
|
%\end{center}
|
|
%\caption{Parameters defining the coarse-space correction at the coarsest
|
|
%level.\label{tab:p_coarse}}
|
|
%\esideways
|
|
%
|
|
%\bsideways
|
|
%\begin{center}
|
|
%\begin{tabular}{|p{3.9cm}|l|p{2cm}|p{1.5cm}|p{7.5cm}|}
|
|
%\hline
|
|
\fortinline|what| & \textsc{data type} & \fortinline|val| & \textsc{default} &
|
|
\textsc{comments} \\ \hline
|
|
\fortinline|'COARSE_SWEEPS'| & \fortinline|integer|
|
|
& Any integer \par number $> 0$
|
|
& 10
|
|
& Number of sweeps when \fortinline|JACOBI|, \fortinline|GS| or \fortinline|BJAC|
|
|
is chosen as coarsest-level solver.\\ \hline
|
|
\fortinline|'COARSE_FILLIN'| & \fortinline|integer|
|
|
& Any integer \par number $\ge 0$
|
|
& 0
|
|
& Fill-in level $p$ of the ILU factorizations
|
|
and first fill-in for the approximate inverses. \\ \hline
|
|
\fortinline|'COARSE_ILUTHRS'|
|
|
& \fortinline|real(kind_parameter)|
|
|
& Any real \par number $\ge 0$
|
|
& 0
|
|
& Drop tolerance $t$ in the ILU($p,t$)
|
|
factorization and first drop-tolerance for the approximate inverses. \\
|
|
\hline
|
|
\multicolumn{5}{|l|}{{\bfseries Note.} Further options for coarse solvers are contained in Table~\ref{tab:p_coarse_2}.} \\
|
|
\multicolumn{5}{|l|}{For a first use it is suggested to use the default options obtained by simply selecting the solver type.} \\
|
|
\hline
|
|
\end{tabular}
|
|
\end{center}
|
|
\caption{Parameters defining the solver at the coarsest
|
|
level (continued).\label{tab:p_coarse_1}}
|
|
\esideways
|
|
|
|
\bsideways
|
|
\begin{center}
|
|
\begin{tabular}{|p{3.9cm}|l|p{2cm}|p{1.7cm}|p{7.7cm}|}
|
|
\hline
|
|
\fortinline|what| & \textsc{data type} & \fortinline|val| & \textsc{default} &
|
|
\textsc{comments} \\ \hline
|
|
\fortinline|'BJAC_STOP'| & \fortinline|character(len=*)| & \fortinline|'FALSE'| \par \fortinline|'TRUE'| & \fortinline|'FALSE'| & Select whether to use a stopping criterion for the Block-Jacobi method used as a coarse solver. \\ \hline
|
|
\fortinline|'BJAC_TRACE'| & \fortinline|character(len=*)| & \fortinline|'FALSE'| \par \fortinline|'TRUE'| & \fortinline|'FALSE'| & Select whether to print a trace for the calculated residual for the Block-Jacobi method used as a coarse solver. \\ \hline
|
|
\fortinline|'BJAC_ITRACE'| & \fortinline|integer| & Any integer\par $>0$ & -1 & Number of iterations after which a trace is to be printed. \\ \hline
|
|
\fortinline|'BJAC_RESCHECK'|& \fortinline|integer| & Any integer\par $>0$ & -1 & Number of iterations after which a residual is to be calculated. \\ \hline
|
|
\fortinline|'BJAC_STOPTOL'| & \fortinline|real(kind_parameter)| & Any real\par $<1$ & 0 & Tolerance for the stopping criterion on the residual. \\ \hline
|
|
\fortinline|'KRM_METHOD'| & \fortinline|character(len=*)| & \fortinline|'CG'| \par \fortinline|'FCG'| \par \fortinline|'CGS'| \par \fortinline|'CGR'| \par \fortinline|'BICG'| \par \fortinline|'BICGSTAB'| \par \fortinline|'BICGSTABL'| \par \fortinline|'RGMRES'| & \fortinline|'FCG'| & A string that defines the iterative method to be
|
|
used when employing a Krylov method \fortinline|'KRM'| as a coarse solver. \texttt{CG} the Conjugate Gradient method;
|
|
\texttt{FCG} the Flexible Conjugate Gradient method;
|
|
\texttt{CGS} the Conjugate Gradient Stabilized method;
|
|
\texttt{GCR} the Generalized Conjugate Residual method;
|
|
\texttt{FCG} the Flexible Conjugate Gradient method;
|
|
\texttt{BICG} the Bi-Conjugate Gradient method;
|
|
\texttt{BICGSTAB} the Bi-Conjugate Gradient Stabilized method;
|
|
\texttt{BICGSTABL} the Bi-Conjugate Gradient Stabilized method with restarting;
|
|
\texttt{RGMRES} the Generalized Minimal Residual method with restarting. Refer to the PSBLAS guide~\cite{PSBLASGUIDE} for further information. \\ \hline
|
|
\fortinline|'KRM_KPREC'| & \fortinline|character(len=*)| & Table~\ref{tab:precinit} & \fortinline|'BJAC'| & The one-level preconditioners from the Table~\ref{tab:precinit} can be used for the coarse Krylov solver.\\\hline%
|
|
\ifpdf
|
|
\end{tabular}
|
|
\end{center}
|
|
\phantomcaption
|
|
\esideways
|
|
\bsideways
|
|
\ContinuedFloat
|
|
\begin{center}
|
|
\begin{tabular}{|p{3.9cm}|l|p{1.7cm}|p{1.7cm}|p{8.6cm}|}
|
|
\hline
|
|
\fi
|
|
\fortinline|'KRM_SUB_SOLVE'| & \fortinline|character(len=*)| & Table~\ref{tab:p_coarse_1} & \fortinline|'ILU'| & Solver for the diagonal blocks of the coarsest matrix preconditioner,
|
|
in case the block Jacobi solver
|
|
is chosen as \fortinline|'KRM_KPREC'|: ILU($p$), ILU($p,t$),
|
|
MILU($p$), LU from MUMPS, SuperLU or UMFPACK
|
|
(plus triangular solve), Approximate
|
|
Inverses INVK($p,q$), INVT($p_1,p2,t_1,t_2$) and
|
|
AINV($t$); The same caveat from Table~\ref{tab:p_coarse_1} applies here. \\ \hline
|
|
\fortinline|'KRM_GLOBAL'| & \fortinline|character(len=*)| & \fortinline|'TRUE'|, \fortinline|'FALSE'| & \fortinline|'FALSE'| & Choose between a global Krylov solver, all unknowns on a single node, or a distributed one. The default choice is the distributed solver. \\ \hline
|
|
\fortinline|'KRM_EPS'| & \fortinline|real(kind_parameter)| & Real $< 1$ & $10^{-6}$ & The stopping tolerance. \\ \hline
|
|
\fortinline|'KRM_IRST'| & \fortinline|integer| & Integer \par $\ge 1$ & 30 & An integer specifying the restart parameter. This is employed for the \texttt{BiCGSTABL} or \texttt{RGMRES} methods, otherwise it is ignored. \\ \hline
|
|
\fortinline|'KRM_ISTOPC'| & \fortinline|integer| & Integers 1,2,3 & 2 & If \texttt{1} then the method uses the normwise backward error in the infinity
|
|
norm; if \texttt{2}, the it uses the relative residual in the 2-norm; if \texttt{3} the relative residual reduction in the 2-norm is used instead; refer to the PSBLAS~\cite{PSBLASGUIDE} guide for the details. \\ \hline
|
|
\fortinline|'KRM_ITMAX'| & \fortinline|integer| & Integer \par $\ge 1$ & 40 & The maximum number of iterations to perform. \\ \hline
|
|
\fortinline|'KRM_ITRACE'| & \fortinline|integer| & Integer \par $\ge 0$ & -1 & If $>0$ print out an informational message about
|
|
convergence every \fortinline|'KRM_ITRACE'| iterations. If $=0$ print a message in
|
|
case of convergence failure. \\ \hline
|
|
% \fortinline|'KRM_SUB_SOLVE'| & \fortinline|integer| & Integer \par $\ge 1$ & -1 & If $>0$ the number of iteration made by the solver for the diagonal blocks. \\ \hline
|
|
\fortinline|'KRM_FILLIN'| & \fortinline|integer| & Integer \par $\ge 0$
|
|
& 0
|
|
& Fill-in level $p$ of the ILU factorizations
|
|
and first fill-in for the approximate inverses. \\ \hline
|
|
\end{tabular}
|
|
\end{center}
|
|
\caption{Additional parameters defining the solver at the coarsest
|
|
level.\label{tab:p_coarse_2}}
|
|
\esideways
|
|
|
|
\bsideways
|
|
\begin{center}
|
|
\small
|
|
\begin{tabular}{|p{3.6cm}|l|p{1.9cm}|p{3.6cm}|p{6.5cm}|}
|
|
\hline
|
|
\fortinline|what| & \textsc{data type} & \fortinline|val| & \textsc{default} &
|
|
\textsc{comments} \\ \hline
|
|
|
|
\fortinline|'SMOOTHER_TYPE'| & \fortinline|character(len=*)|
|
|
& \fortinline|'JACOBI'| \par \fortinline|'GS'| \par \fortinline|'BGS'| \par \fortinline|'BJAC'|
|
|
\par \fortinline|'AS'| \par \fortinline|'L1-JACOBI'| \par \fortinline|'L1-BJAC'| \par \fortinline|'L1-FBGS'|
|
|
& \fortinline|'FBGS'|
|
|
& Type of smoother used in the multilevel preconditioner:
|
|
point-Jacobi, hybrid (forward) Gauss-Seidel,
|
|
hybrid backward Gauss-Seidel, block-Jacobi, $\ell_1$-Jacobi, $\ell_1$--hybrid (forward) Gauss-Seidel, $\ell_1$-point-Jacobi and
|
|
Additive Schwarz. \par
|
|
It is ignored by one-level preconditioners. \\ \hline
|
|
\fortinline|'SUB_SOLVE'| & \fortinline|character(len=*)|
|
|
& \fortinline|'JACOBI'|
|
|
\fortinline|'GS'| \par \fortinline|'BGS'| \par \fortinline|'ILU'| \par
|
|
\fortinline|'ILUT'| \par \fortinline|'MILU'| \par
|
|
\par \fortinline|'MUMPS'| \par
|
|
\fortinline|'SLU'| \par \fortinline|'UMF'|
|
|
\par \fortinline|'INVT'| \par \fortinline|'INVK'| \par \fortinline|'AINV'|
|
|
& \texttt{GS} and \texttt{BGS} for pre- and post-smoothers
|
|
of multilevel preconditioners, respectively \par
|
|
\texttt{ILU} for block-Jacobi and Additive Schwarz
|
|
one-level preconditioners
|
|
& The local solver to be used with the smoother or one-level
|
|
preconditioner (see Remark~2, page~24): point-Jacobi,
|
|
hybrid (forward) Gauss-Seidel, hybrid backward
|
|
Gauss-Seidel, ILU($p$), ILU($p,t$), MILU($p$),
|
|
LU from MUMPS, SuperLU or UMFPACK
|
|
(plus triangular solve), Approximate
|
|
Inverses INVK($p,q$), INVT($p_1,p2,t_1,t_2$) and
|
|
AINV($t$); note that approximate inverses
|
|
are specifically suited for GPUs since they
|
|
do not employ triangular system solve
|
|
kernels,
|
|
see~\cite{BERTACCINIFILIPPONE}. See Note
|
|
for details on hybrid
|
|
Gauss-Seidel. \\ \hline
|
|
\fortinline|'SMOOTHER_SWEEPS'| & \fortinline|integer|
|
|
& Any integer \par number~$\ge 0$
|
|
& 1
|
|
& Number of sweeps of the smoother or one-level preconditioner.
|
|
In the multilevel case, no pre-smother or
|
|
post-smoother is used if this parameter is set to 0
|
|
together with \fortinline|pos='PRE'| or \fortinline|pos='POST'|,
|
|
respectively. \\ \hline
|
|
\fortinline|'SUB_OVR'| & \fortinline|integer|
|
|
& Any integer \par number~$\ge 0$
|
|
& 1
|
|
& Number of overlap layers, for Additive Schwarz only. \\
|
|
\hline
|
|
\end{tabular}
|
|
\end{center}
|
|
\caption{Parameters defining the smoother or the details of the one-level preconditioner.
|
|
\label{tab:p_smoother}}
|
|
\esideways
|
|
|
|
\bsideways
|
|
\begin{center}
|
|
\small
|
|
\begin{tabular}{|p{3.2cm}|l|p{2.6cm}|p{2.6cm}|p{6.7cm}|}
|
|
\hline
|
|
\fortinline|what| & \textsc{data type} & \fortinline|val| & \textsc{default} &
|
|
\textsc{comments} \\ \hline
|
|
\fortinline|'SUB_RESTR'| & \fortinline|character(len=*)|
|
|
& \fortinline|'HALO'| \par \fortinline|'NONE'|
|
|
& \fortinline|'HALO'|
|
|
& Type of restriction operator, for Additive Schwarz only:
|
|
\texttt{HALO} for taking into account the overlap, \fortinline|'NONE'|
|
|
for neglecting it. \par
|
|
Note that \texttt{HALO} must be chosen for
|
|
the classical Addditive Schwarz smoother and its RAS variant.\\ \hline
|
|
\fortinline|'SUB_PROL'| & \fortinline|character(len=*)|
|
|
& \fortinline|'SUM'| \par \fortinline|'NONE'|
|
|
& \fortinline|'NONE'|
|
|
& Type of prolongation operator, for Additive Schwarz only:
|
|
\fortinline|'SUM'| for adding the contributions from the overlap, \fortinline|'NONE'|
|
|
for neglecting them. \par
|
|
Note that \fortinline|'SUM'| must be chosen for the classical Additive
|
|
Schwarz smoother, and \fortinline|'NONE'| for its RAS variant. \\ \hline
|
|
\fortinline|'SUB_FILLIN'| & \fortinline|integer|
|
|
& Any integer \par number~$\ge 0$
|
|
& 0
|
|
& Fill-in level $p$ of the incomplete LU factorizations. \\ \hline
|
|
\fortinline|'SUB_ILUTHRS'| & \fortinline|real(kind_parameter)|
|
|
& Any real number~$\ge 0$
|
|
& 0
|
|
& Drop tolerance $t$ in the ILU($p,t$) factorization. \\ \hline
|
|
\fortinline|'MUMPS_LOC_GLOB'| & \fortinline|character(len=*)|
|
|
& \fortinline|'LOCAL_SOLVER'| \par \fortinline|'GLOBAL_SOLVER'|
|
|
& \fortinline|'GLOBAL_SOLVER'|
|
|
& Whether MUMPS should be used as a
|
|
distributed solver, or as a serial solver
|
|
acting only on the part of the matrix local
|
|
to each process. \\ \hline
|
|
\fortinline|'MUMPS_IPAR_ENTRY'| & \fortinline|integer|
|
|
& Any integer number
|
|
& 0
|
|
& Set an entry in the MUMPS integer control array, as
|
|
chosen via the \fortinline|idx| optional argument. \\ \hline
|
|
\fortinline|'MUMPS_RPAR_ENTRY'| & \fortinline|real|
|
|
& Any real number
|
|
& 0
|
|
& Set an entry in the MUMPS real control array, as
|
|
chosen via the \fortinline|idx| optional argument. \\ %\hline
|
|
\hline
|
|
\end{tabular}
|
|
\end{center}
|
|
\caption{Parameters defining the smoother or the details of the one-level preconditioner
|
|
(continued).\label{tab:p_smoother_1}}
|
|
\esideways
|
|
|
|
|
|
\clearpage
|
|
|
|
\subsection{Method hierarchy\_build\label{sec:hier_bld}}
|
|
|
|
\begin{center}
|
|
\fortinline|call p%hierarchy_build(a,desc_a,info)|\\
|
|
\end{center}
|
|
|
|
\noindent
|
|
This method builds the hierarchy of matrices and restriction/prolongation
|
|
operators for the multilevel preconditioner \fortinline|p|, according to the requirements
|
|
made by the user through the methods \fortinline|init| and \fortinline|set|.
|
|
|
|
{\vskip1.5\baselineskip\noindent\large\bfseries Arguments} \smallskip
|
|
|
|
\begin{tabular}{p{1.2cm}p{12cm}}
|
|
\fortinline|a| & \fortinline|type(psb_|\emph{x}\fortinline|spmat_type), intent(in)|. \\
|
|
& The sparse matrix structure containing the local part of the
|
|
matrix to be preconditioned. Note that \emph{x} must be chosen according
|
|
to the \fortinline|real|/\fortinline|complex|,
|
|
single/double precision version of AMG4PSBLAS under use.
|
|
See the PSBLAS User's Guide for details \cite{PSBLASGUIDE}.\\
|
|
\fortinline|desc_a| & \fortinline|type(psb_desc_type), intent(in)|. \\
|
|
& The communication descriptor of \fortinline|a|. See the PSBLAS User's Guide for
|
|
details \cite{PSBLASGUIDE}.\\
|
|
\fortinline|info| & \fortinline|integer, intent(out)|.\\
|
|
& Error code. If no error, 0 is returned. See Section~\ref{sec:errors} for details.\\
|
|
\end{tabular}
|
|
|
|
|
|
\clearpage
|
|
|
|
\subsection{Method smoothers\_build\label{sec:smooth_bld}}
|
|
|
|
|
|
\begin{center}
|
|
\fortinline|call p%smoothers_build(a,desc_a,p,info[,amold,vmold,imold])|\\
|
|
\end{center}
|
|
|
|
\noindent
|
|
This method builds the smoothers and the coarsest-level solvers for the
|
|
multilevel preconditioner \fortinline|p|, according to the requirements made by
|
|
the user through the methods \fortinline|init| and \fortinline|set|, and based on the aggregation
|
|
hierarchy produced by a previous call to \fortinline|hierarchy_build|
|
|
(see Section~\ref{sec:hier_bld}).
|
|
|
|
{\vskip1.5\baselineskip\noindent\large\bfseries Arguments} \smallskip
|
|
|
|
\begin{tabular}{p{1.2cm}p{12cm}}
|
|
\fortinline|a| & \fortinline|type(psb_|\emph{x}\fortinline|spmat_type), intent(in)|. \\
|
|
& The sparse matrix structure containing the local part of the
|
|
matrix to be preconditioned. Note that \emph{x} must be chosen according
|
|
to the \fortinline|real|/\fortinline|complex|, single/double precision version of AMG4PSBLAS under use.
|
|
See the PSBLAS User's Guide for details \cite{PSBLASGUIDE}.\\
|
|
\fortinline|desc_a| & \fortinline|type(psb_desc_type), intent(in)|. \\
|
|
& The communication descriptor of \fortinline|a|. See the PSBLAS User's Guide for
|
|
details \cite{PSBLASGUIDE}.\\
|
|
\fortinline|info| & \fortinline|integer, intent(out)|.\\
|
|
& Error code. If no error, 0 is returned. See Section~\ref{sec:errors} for details.\\
|
|
\fortinline|amold| & \fortinline|class(psb_|\emph{x}\fortinline|_base_sparse_mat), intent(in), optional|. \\
|
|
& The desired dynamic type for internal matrix
|
|
components; this allows e.g. running on GPUs; it needs not be the
|
|
same on all processes. See the PSBLAS User's Guide for
|
|
details \cite{PSBLASGUIDE}. \\
|
|
\fortinline|vmold| & \fortinline|class(psb_|\emph{x}\fortinline|_base_vect_type), intent(in), optional|. \\
|
|
& The desired dynamic type for internal vector
|
|
components; this allows e.g. running on GPUs. \\
|
|
\fortinline|imold| & \fortinline|class(psb_i_base_vect_type), intent(in), optional|. \\
|
|
& The desired dynamic type for internal integer vector
|
|
components; this allows e.g. running on GPUs. \\
|
|
\end{tabular}
|
|
|
|
\clearpage
|
|
|
|
\subsection{Method build\label{sec:precbld}}
|
|
|
|
\begin{center}
|
|
\fortinline|call p%build(a,desc_a,info[,amold,vmold,imold])|\\
|
|
\end{center}
|
|
|
|
\noindent
|
|
This method builds the preconditioner \fortinline|p| according to the requirements
|
|
made by the user through the methods \fortinline|init| and \fortinline|set|
|
|
(see Sections~\ref{sec:hier_bld} and~\ref{sec:smooth_bld} for
|
|
multilevel preconditioners). It is mostly provided for backward
|
|
compatibility; indeed, it is internally implemented by invoking the
|
|
two previous methods \fortinline|hierarchy_build| and
|
|
\fortinline|smoothers_build|, whose nomenclature would however be somewhat
|
|
unnatural when dealing with simple one-level preconditioners.
|
|
|
|
{\vskip1.5\baselineskip\noindent\large\bfseries Arguments} \smallskip
|
|
|
|
\begin{tabular}{p{1.2cm}p{12cm}}
|
|
\fortinline|a| & \fortinline|type(psb_|\emph{x}\fortinline|spmat_type), intent(in)|. \\
|
|
& The sparse matrix structure containing the local part of the
|
|
matrix to be preconditioned. Note that \emph{x} must be chosen according
|
|
to the \fortinline|real|/\fortinline|complex|, single/double precision version of AMG4PSBLAS under use.
|
|
See the PSBLAS User's Guide for details \cite{PSBLASGUIDE}.\\
|
|
\fortinline|desc_a| & \fortinline|type(psb_desc_type), intent(in)|. \\
|
|
& The communication descriptor of \fortinline|a|. See the PSBLAS User's Guide for
|
|
details \cite{PSBLASGUIDE}.\\
|
|
\fortinline|info| & \fortinline|integer, intent(out)|.\\
|
|
& Error code. If no error, 0 is returned. See Section~\ref{sec:errors} for details.\\
|
|
\fortinline|amold| & \fortinline|class(psb_|\emph{x}\fortinline|_base_sparse_mat), intent(in), optional|. \\
|
|
& The desired dynamic type for internal matrix
|
|
components; this allows e.g. running on GPUs; it needs not be the
|
|
same on all processes. See the PSBLAS User's Guide for
|
|
details \cite{PSBLASGUIDE}. \\
|
|
\fortinline|vmold| & \fortinline|class(psb_|\emph{x}\fortinline|_base_vect_type), intent(in), optional|. \\
|
|
& The desired dynamic type for internal vector
|
|
components; this allows e.g. running on GPUs. \\
|
|
\fortinline|imold| & \fortinline|class(psb_i_base_vect_type), intent(in), optional|. \\
|
|
& The desired dynamic type for internal integer vector
|
|
components; this allows e.g. running on GPUs. \\
|
|
\end{tabular}
|
|
|
|
|
|
\noindent
|
|
The method can be used to build multilevel preconditioners too.
|
|
|
|
|
|
\clearpage
|
|
\subsection{Method apply\label{sec:precapply}}
|
|
|
|
\begin{center}
|
|
\fortinline|call p%apply(x,y,desc_a,info [,trans,work])|\\
|
|
\end{center}
|
|
|
|
\noindent
|
|
This method computes $y = op(B^{-1})\, x$, where $B$ is a previously built
|
|
preconditioner, stored into \fortinline|p|, and $op$
|
|
denotes the preconditioner itself or its transpose, according to
|
|
the value of \fortinline|trans|.
|
|
Note that, when AMG4PSBLAS is used with a Krylov solver from PSBLAS,
|
|
\fortinline|p%apply| is called within the PSBLAS method \fortinline|psb_krylov|
|
|
and hence it is completely transparent to the user.
|
|
|
|
{\vskip1.5\baselineskip\noindent\large\bfseries Arguments} \smallskip
|
|
|
|
\begin{tabular}{p{1.2cm}p{12cm}}
|
|
\fortinline|x| & \emph{type}\fortinline|(kind_parameter)|, dimension(:), intent(in)|.\\
|
|
& The local part of the vector $x$. Note that \emph{type} and
|
|
\emph{kind\_parameter} must be chosen according
|
|
to the \fortinline|real|/\fortinline|complex|, single/double precision version of AMG4PSBLAS under use.\\
|
|
\fortinline|y| & \emph{type}\fortinline|(kind_parameter)|, dimension(:), intent(out)|.\\
|
|
& The local part of the vector $y$. Note that \emph{type} and
|
|
\emph{kind\_parameter} must be chosen according
|
|
to the \fortinline|real|/\fortinline|complex|, single/double precision version of AMG4PSBLAS under use.\\
|
|
\fortinline|desc_a| & \fortinline|type(psb_desc_type), intent(in)|. \\
|
|
& The communication descriptor associated to the matrix to be
|
|
preconditioned.\\
|
|
\fortinline|info| & \fortinline|integer, intent(out)|.\\
|
|
& Error code. If no error, 0 is returned. See Section~\ref{sec:errors} for details.\\
|
|
\fortinline|trans| & \fortinline|character(len=1), optional, intent(in).|\\
|
|
& If \fortinline|trans| = \fortinline|'N','n'| then $op(B^{-1}) = B^{-1}$;
|
|
if \fortinline|trans| = \fortinline|'T','t'| then $op(B^{-1}) = B^{-T}$
|
|
(transpose of $B^{-1})$; if \fortinline|trans| = \fortinline|'C','c'| then $op(B^{-1}) = B^{-C}$
|
|
(conjugate transpose of $B^{-1})$.\\
|
|
\fortinline|work| & \emph{type}\fortinline|(kind_parameter)|, dimension(:), optional, target|.\\
|
|
& Workspace. Its size should be at
|
|
least \fortinline|4 * psb_cd_get_local_| \fortinline|cols(desc_a)| (see the PSBLAS User's Guide).
|
|
Note that \emph{type} and \emph{kind\_parameter} must be chosen according
|
|
to the \fortinline|real|/\fortinline|complex|, single/double precision version of AMG4PSBLAS under use.\\
|
|
\end{tabular}
|
|
|
|
|
|
\clearpage
|
|
|
|
\subsection{Method free\label{sec:precfree}}
|
|
|
|
\begin{center}
|
|
\fortinline|call p%free(p,info)|\\
|
|
\end{center}
|
|
|
|
\noindent
|
|
This method deallocates the preconditioner data structure \fortinline|p|.
|
|
|
|
{\vskip1.5\baselineskip\noindent\large\bfseries Arguments} \smallskip
|
|
|
|
\begin{tabular}{p{1.2cm}p{10.5cm}}
|
|
\fortinline|info| & \fortinline|integer, intent(out)|.\\
|
|
& Error code. If no error, 0 is returned. See Section~\ref{sec:errors} for details.\\
|
|
\end{tabular}
|
|
|
|
|
|
\clearpage
|
|
|
|
\subsection{Method descr\label{sec:precdescr}}
|
|
|
|
\begin{center}
|
|
\fortinline|call p%descr(info, [iout, root, verbosity])|\\
|
|
\end{center}
|
|
|
|
\noindent
|
|
This method prints a description of the preconditioner \fortinline|p| to the standard output or
|
|
to a file. It must be called after \fortinline|hierachy_build| and \fortinline|smoothers_build|,
|
|
or \fortinline|build|, have been called.
|
|
|
|
{\vskip1.5\baselineskip\noindent\large\bfseries Arguments} \smallskip
|
|
|
|
\begin{tabular}{p{2.2cm}p{11cm}}
|
|
\fortinline|info| & \fortinline|integer, intent(out)|.\\
|
|
& Error code. If no error, 0 is returned. See Section~\ref{sec:errors} for details.\\
|
|
\fortinline|iout| & \fortinline|integer, intent(in), optional|.\\
|
|
& The id of the file where the preconditioner description
|
|
will be printed; the default is the standard output.\\
|
|
\fortinline|root| & \fortinline|integer, intent(in), optional|.\\
|
|
& The id of the process where the preconditioner description
|
|
will be printed; the default is \fortinline|psb_root_|.\\
|
|
\fortinline|verbosity| & \fortinline|integer, intent(in), optional|.\\
|
|
& The verbosity level of the description. Default value
|
|
is 0. For values higher than 0, it prints out further
|
|
information, e.g., for a distributed multilevel preconditioner
|
|
the size of the coarse matrices on every process.\\
|
|
\end{tabular}
|
|
|
|
|
|
\subsection{Auxiliary Methods\label{sec:auxil}}
|
|
Various functionalities are implemented as additional methods of the
|
|
preconditioner object.
|
|
|
|
\subsubsection{Method: dump}
|
|
|
|
\begin{center}
|
|
\fortinline|call p%dump(info[,istart,iend,prefix,head,ac,rp,smoother,solver,global_num])|\\
|
|
\end{center}
|
|
|
|
\noindent
|
|
Dump on file.
|
|
|
|
{\vskip1.5\baselineskip\noindent\large\bfseries Arguments} \smallskip
|
|
|
|
\begin{tabular}{p{1.2cm}p{12cm}}
|
|
\fortinline|info| & \fortinline|integer, intent(out)|.\\
|
|
& Error code. If no error, 0 is returned. See Section~\ref{sec:errors} for details.\\
|
|
\fortinline|amold| & \fortinline|class(psb_|\emph{x}\fortinline|_base_sparse_mat), intent(in), optional|. \\
|
|
& The desired dynamic type for internal matrix
|
|
components; this allows e.g. running on GPUs; it needs not be the
|
|
same on all processes. See the PSBLAS User's Guide for
|
|
details \cite{PSBLASGUIDE}. \\
|
|
\end{tabular}
|
|
|
|
|
|
\subsubsection{Method: clone}
|
|
|
|
\begin{center}
|
|
\fortinline|call p%clone(pout,info)|\\
|
|
\end{center}
|
|
|
|
\noindent
|
|
Create a (deep) copy of the preconditioner object.
|
|
|
|
{\vskip1.5\baselineskip\noindent\large\bfseries Arguments} \smallskip
|
|
|
|
\begin{tabular}{p{1.2cm}p{12cm}}
|
|
\fortinline|pout| & \fortinline|type(amg_|\emph{x}\fortinline|prec_type), intent(out)|.\\
|
|
& The copy of the preconditioner data structure. Note
|
|
that \emph{x} must be chosen according
|
|
to the \fortinline|real|/\fortinline|complex|, single/double precision version of AMG4PSBLAS under use.\\
|
|
\fortinline|info| & \fortinline|integer, intent(out)|.\\
|
|
& Error code. If no error, 0 is returned. See Section~\ref{sec:errors} for details.\\
|
|
\end{tabular}
|
|
|
|
|
|
|
|
\subsubsection{Method: sizeof}
|
|
|
|
\begin{center}
|
|
\fortinline|sz = p%sizeof([global])|\\
|
|
\end{center}
|
|
|
|
\begin{tabular}{p{1.2cm}p{12cm}}
|
|
\fortinline|global| & \fortinline|logical, optional|.\\
|
|
& Whether the global or local preconditioner memory
|
|
occupation is
|
|
desired. Default: \fortinline|.false.|.\\
|
|
\end{tabular}
|
|
\noindent
|
|
Return memory footprint in bytes.
|
|
|
|
\subsubsection{Method: allocate\_wrk}
|
|
|
|
\begin{center}
|
|
\fortinline|call p%allocate_wrk(info[, vmold])|\\
|
|
\end{center}
|
|
|
|
\noindent
|
|
Allocate internal work vectors. Each application of the preconditioner
|
|
uses a number of work vectors which are allocated internally as
|
|
necessary; therefore allocation and deallocation of memory occurs
|
|
multiple times during the execution of a Krylov method. In most cases
|
|
this strategy is perfectly acceptable, but
|
|
on some platforms, most notably GPUs, memory allocation is
|
|
a slow operation, and the default behaviour would lead to a
|
|
slowdown. This method allows to trade space for time by preallocating
|
|
the internal workspace outside of the invocation of a Krylov
|
|
method. When using GPUs or other specialized devices, the \fortinline|vmold|
|
|
argument is also necessary to ensure the internal work vectors are of
|
|
the appropriate dynamic type to exploit the accelerator hardware; when
|
|
allocation occurs internally this is taken care of based on the dynamic
|
|
type of the \fortinline|x| argument to the \fortinline|apply| method.
|
|
|
|
{\vskip1.5\baselineskip\noindent\large\bfseries Arguments} \smallskip
|
|
|
|
\begin{tabular}{p{1.2cm}p{12cm}}
|
|
\fortinline|info| & \fortinline|integer, intent(out)|.\\
|
|
& Error code. If no error, 0 is returned. See Section~\ref{sec:errors} for details.\\
|
|
\fortinline|vmold| & \fortinline|class(psb_|\emph{x}\fortinline|_base_vect_type), intent(in), optional|. \\
|
|
& The desired dynamic type for internal vector
|
|
components; this allows e.g. running on GPUs. \\
|
|
\end{tabular}
|
|
|
|
|
|
|
|
\subsubsection{Method: free\_wrk}
|
|
|
|
\begin{center}
|
|
\fortinline|call p%free_wrk(info)|\\
|
|
\end{center}
|
|
|
|
\noindent
|
|
Deallocate internal work vectors.
|
|
|
|
{\vskip1.5\baselineskip\noindent\large\bfseries Arguments} \smallskip
|
|
|
|
\begin{tabular}{p{1.2cm}p{12cm}}
|
|
\fortinline|info| & \fortinline|integer, intent(out)|.\\
|
|
& Error code. If no error, 0 is returned. See Section~\ref{sec:errors} for details.\\
|
|
\end{tabular}
|
|
|
|
|
|
|
|
|
|
|
|
%%% Local Variables:
|
|
%%% mode: latex
|
|
%%% TeX-master: "userguide"
|
|
%%% End:
|