|
|
|
\section{General Overview\label{sec:overview}}
|
|
|
|
\markboth{\textsc{AMG4PSBLAS User's and Reference Guide}}
|
|
|
|
{\textsc{\ref{sec:overview} General Overview}}
|
|
|
|
|
|
|
|
The \textsc{Algebraic MultiGrid Preconditioners Package based on
|
|
|
|
PSBLAS} (\textsc{AMG\-4\-PSBLAS}) provides parallel Algebraic MultiGrid (AMG) preconditioners (see, e.g., \cite{Briggs2000,Stuben_01}),
|
|
|
|
to be used in the iterative solution of linear systems,
|
|
|
|
\begin{equation}
|
|
|
|
Ax=b,
|
|
|
|
\label{system1}
|
|
|
|
\end{equation}
|
|
|
|
where $A$ is a square, real or complex, sparse symmetric positive definite (s.p.d) matrix.
|
|
|
|
%
|
|
|
|
%\textbf{NOTA: Caso non simmetrico, aggregazione con $(A+A^T)$ fatta!
|
|
|
|
%Dovremmo implementare uno smoothed prolongator
|
|
|
|
%adeguato e fare qualcosa di consistente anche con 1-lev Schwarz.}
|
|
|
|
%
|
|
|
|
|
|
|
|
The preconditioners implemented in AMG4PSBLAS are obtained by combining
|
|
|
|
3 different types of AMG cycles with smoothers and coarsest-level
|
|
|
|
solvers. Available multigrid cycles include the V-, W-, and a version of a Krylov-type cycle
|
|
|
|
(K-cycle)~\cite{Briggs2000,Notay2008}; they can be
|
|
|
|
combined with Jacobi hybrid
|
|
|
|
%\footnote{see Note 2 in Table~\ref{tab:p_coarse}, p.~28.}
|
|
|
|
forward/backward Gauss-Seidel, block-Jacobi, and additive Schwarz
|
|
|
|
smoothers. The Jacobi, block-Jacobi and
|
|
|
|
Gauss-Seidel smoothers are also available in the $\ell_1$ version.
|
|
|
|
|
|
|
|
An algebraic approach is used to generate a hierarchy of
|
|
|
|
coarse-level matrices and operators, without explicitly using any information on the
|
|
|
|
geometry of the original problem, e.g., the discretization of a PDE. To this end,
|
|
|
|
two different coarsening strategies, based on aggregation, are available:
|
|
|
|
\begin{itemize}
|
|
|
|
\item a decoupled version of the smoothed aggregation procedure
|
|
|
|
proposed in~\cite{BREZINA_VANEK,VANEK_MANDEL_BREZINA}, and already
|
|
|
|
included in the previous versions of the
|
|
|
|
package~\cite{BDDF2007,MLD2P4_TOMS};
|
|
|
|
\item a coupled, parallel implementation of the Coarsening based on
|
|
|
|
Compatible Weighted Matching introduced in~\cite{DV2013,DFV2018}
|
|
|
|
and described in detail in~\cite{DDF2020};
|
|
|
|
\end{itemize}
|
|
|
|
Either exact or approximate solvers can be used on the coarsest-level
|
|
|
|
system. We provide interfaces to various sparse LU factorizations from external
|
|
|
|
packages, native incomplete LU and approximate inverse factorizations,
|
|
|
|
weighted Jacobi, hybrid Gauss-Seidel, block-Jacobi solvers and
|
|
|
|
a recursive call to preconditioned Krylov methods; all
|
|
|
|
smoothers can be also exploited as one-level preconditioners.
|
|
|
|
|
|
|
|
AMG4PSBLAS is written in Fortran~2003, following an
|
|
|
|
object-oriented design through the exploitation of features
|
|
|
|
such as abstract data type creation, type extension, functional overloading, and
|
|
|
|
dynamic memory management.
|
|
|
|
The parallel implementation is based on a Single Program Multiple Data
|
|
|
|
(SPMD) paradigm. Single and
|
|
|
|
double precision implementations of AMG4PSBLAS are available for both the
|
|
|
|
real and the complex case, which can be used through a single
|
|
|
|
interface.
|
|
|
|
|
|
|
|
AMG4PSBLAS has been designed to implement scalable and easy-to-use
|
|
|
|
multilevel preconditioners in the context of the PSBLAS (Parallel Sparse BLAS)
|
|
|
|
computational framework~\cite{psblas_00,PSBLAS3}. PSBLAS provides basic linear algebra
|
|
|
|
operators and data management facilities for distributed sparse matrices,
|
|
|
|
kernels for sequential incomplete factorizations needed for the
|
|
|
|
parallel block-Jacobi and additive Schwarz smoothers, and
|
|
|
|
parallel Krylov solvers which can be used with the AMG4PSBLAS preconditioners.
|
|
|
|
The choice of PSBLAS has been mainly motivated by the need of having
|
|
|
|
a portable and efficient software infrastructure implementing ``de facto'' standard
|
|
|
|
parallel sparse linear algebra kernels, to pursue goals such as performance,
|
|
|
|
portability, modularity ed extensibility in the development of the preconditioner
|
|
|
|
package. On the other hand, the implementation of AMG4PSBLAS, which
|
|
|
|
was driven by the need to face the exascale challenge, has led to some
|
|
|
|
important revisions and extentions of the PSBLAS infrastructure.
|
|
|
|
The inter-process comunication required by AMG4PSBLAS is encapsulated
|
|
|
|
in the PSBLAS routines;
|
|
|
|
therefore, AMG4PSBLAS can be run on any parallel machine where PSBLAS
|
|
|
|
implementations are available. In the most recent version of PSBLAS
|
|
|
|
(release 3.7), a plug-in for GPU is included; it includes CUDA
|
|
|
|
versions of main vector operations and of sparse matrix-vector
|
|
|
|
multiplication, so that Krylov methods coupled with AMG4PBLAS
|
|
|
|
preconditioners relying on Jacobi and block-Jacobi smoothers with
|
|
|
|
sparse approximate inverses on the blocks can be efficiently executed
|
|
|
|
on cluster of GPUs.
|
|
|
|
|
|
|
|
AMG4PSBLAS has a layered and modular software architecture where three main layers can be
|
|
|
|
identified. The lower layer consists of the PSBLAS kernels, the middle one implements
|
|
|
|
the construction and application phases of the preconditioners, and the upper one
|
|
|
|
provides a uniform interface to all the preconditioners.
|
|
|
|
This architecture allows for different levels of use of the package:
|
|
|
|
few black-box routines at the upper layer allow all users to easily
|
|
|
|
build and apply any preconditioner available in AMG4PSBLAS;
|
|
|
|
facilities are also available allowing expert users to extend the set of smoothers
|
|
|
|
and solvers for building new versions of the preconditioners (see
|
|
|
|
Section~\ref{sec:adding}).
|
|
|
|
|
|
|
|
This guide is organized as follows. General information on the distribution of the source
|
|
|
|
code is reported in Section~\ref{sec:distribution}, while details on the configuration
|
|
|
|
and installation of the package are given in Section~\ref{sec:building}. The basics for building and applying the
|
|
|
|
preconditioners with the Krylov solvers implemented in PSBLAS are reported
|
|
|
|
in~Section~\ref{sec:started}, where the Fortran codes of a few sample programs
|
|
|
|
are also shown. A reference guide for the user interface routines is provided
|
|
|
|
in Section~\ref{sec:userinterface}. Information on the extension of the package
|
|
|
|
through the addition of new smoothers and solvers is reported in Section~\ref{sec:adding}.
|
|
|
|
The error handling mechanism used by the package
|
|
|
|
is briefly described in Section~\ref{sec:errors}. The copyright terms concerning the
|
|
|
|
distribution and modification of AMG4PSBLAS are reported in Appendix~\ref{sec:license}.
|
|
|
|
|
|
|
|
%%% Local Variables:
|
|
|
|
%%% mode: latex
|
|
|
|
%%% TeX-master: "userguide"
|
|
|
|
%%% End:
|