mld2p4:
docs/pdf/Makefile docs/pdf/abstract.tex docs/pdf/advanced.tex docs/pdf/background.tex docs/pdf/bibliography.tex docs/pdf/building.tex docs/pdf/conventions.tex docs/pdf/distribution.tex docs/pdf/errors.tex docs/pdf/gettingstarted.tex docs/pdf/highlevelview.tex docs/pdf/listofroutines.tex docs/pdf/overview.tex docs/pdf/userguide.tex docs/userguide.pdf New documentation, first set-up.stopcriterion
parent
7f1858a775
commit
9eeef87a3a
@ -0,0 +1,19 @@
|
|||||||
|
\begin{abstract}
|
||||||
|
\emph{MLD2P4 (Multi-Level Domain Decomposition Parallel Preconditioners Package based on
PSBLAS)} is a package of parallel algebraic multi-level preconditioners.
It implements various versions of one-level additive and of multi-level additive
and hybrid Schwarz algorithms. In the multi-level case, a purely algebraic approach
is applied to generate coarse-level corrections, so that no geometric background is needed
concerning the matrix to be preconditioned. The matrix is required to be square, real or complex, with a symmetric sparsity pattern \textbf{Should we not also consider the nonsymmetric case,
with $(A+A^T)/2$?}.
|
||||||
|
|
||||||
|
MLD2P4 has been designed to provide scalable and easy-to-use preconditioners in the
|
||||||
|
context of the PSBLAS (Parallel Sparse Basic Linear Algebra Subprograms)
|
||||||
|
computational framework and can be used in conjunction with the Krylov solvers
available in this framework. MLD2P4 enables the user to easily specify different aspects
of a generic algebraic multilevel Schwarz preconditioner, thus making it possible to search
for the ``best'' preconditioner for the problem at hand. The package has been designed
|
||||||
|
employing object-oriented techniques, using Fortran 95 and MPI, with interfaces to
|
||||||
|
additional external libraries such as UMFPACK, SuperLU and SuperLU\_Dist, that
|
||||||
|
can be exploited in building multi-level preconditioners.
|
||||||
|
\end{abstract}
|
@ -0,0 +1,12 @@
|
|||||||
|
\section{Advanced Use}\label{sec:advanced}
|
||||||
|
|
||||||
|
- MLD2P4 software architecture \\
|
||||||
|
- preconditioner data structure (``detailed'' description) + possibility of setting the individual
levels separately (a possibility only hinted at in the earlier description of precset) \\
- description of the medium-level routines (with an introduction to the potential for extension (?) offered
by this software layer) \\
|
||||||
|
|
||||||
|
%%% Local Variables:
|
||||||
|
%%% mode: latex
|
||||||
|
%%% TeX-master: "userguide"
|
||||||
|
%%% End:
|
@ -0,0 +1,291 @@
|
|||||||
|
\section{Multi-level Domain Decomposition Background\label{sec:background}}
|
||||||
|
|
||||||
|
\emph{Domain Decomposition} (DD) preconditioners, coupled with Krylov iterative
|
||||||
|
solvers, are widely used in the parallel solution of large and sparse linear systems.
|
||||||
|
These preconditioners are based on the divide and conquer technique: the matrix
|
||||||
|
to be preconditioned is divided into submatrices, a ``local linear system''
|
||||||
|
involving each submatrix is (approximately) solved, and the local solutions are used
|
||||||
|
to build a preconditioner for the whole original matrix. This process
|
||||||
|
often corresponds to dividing a physical domain associated with the original matrix
into subdomains, e.g.\ in a PDE discretization, to (approximately) solving the
|
||||||
|
subproblems corresponding to the subdomains and to building an approximate
|
||||||
|
solution of the original problem from the local solutions
|
||||||
|
\cite{Cai_Widlund_92,dd1_94,dd2_96}.
|
||||||
|
|
||||||
|
\emph{Additive Schwarz} preconditioners are DD preconditioners using overlapping
|
||||||
|
submatrices, i.e.\ with some common rows, to couple the local information
|
||||||
|
related to the submatrices (see, e.g., \cite{dd2_96}).
|
||||||
|
The main motivations for choosing Additive Schwarz preconditioners are their
|
||||||
|
intrinsic parallelism and good \textbf{(saying ``good'' is a bit strong, since
right afterwards we state that convergence depends on the number of submatrices)}
convergence properties. A drawback of these
|
||||||
|
preconditioners is that the number of iterations of the preconditioned solvers
|
||||||
|
generally grows with the number of submatrices. This may be a serious limitation
|
||||||
|
on parallel computers, since the number of submatrices usually matches the number
|
||||||
|
of available processors. Optimal convergence rates, i.e.\ iteration numbers
|
||||||
|
independent of the number of submatrices, can be obtained by correcting the
|
||||||
|
preconditioner through a suitable approximation of the original linear system
|
||||||
|
in a coarse space, which globally couples the information related to the single
|
||||||
|
submatrices.
|
||||||
|
|
||||||
|
\emph{Two-level Schwarz} preconditioners are obtained
|
||||||
|
by combining basic (one-level) Schwarz preconditioners with coarse-level
|
||||||
|
corrections. In this context, the one-level preconditioner is often
|
||||||
|
called the smoother. Different two-level preconditioners are obtained by varying the
choice of the smoother, of the coarse-level correction and of the
|
||||||
|
way they are combined \cite{dd2_96}. The same reasoning can be applied starting
|
||||||
|
from the coarse-level system, i.e.\ a coarse-space correction can be built
|
||||||
|
from this system, thus obtaining \emph{multi-level} preconditioners.
|
||||||
|
|
||||||
|
It is worth noting that optimal preconditioners do not necessarily correspond
|
||||||
|
to minimum execution times. Indeed, to obtain effective multilevel preconditioners
|
||||||
|
a tradeoff between optimality of convergence and the cost of building and applying
|
||||||
|
the coarse-space corrections must be achieved. The choice of the number of levels,
|
||||||
|
i.e.\ of the coarse-space corrections, also affects the effectiveness of the
|
||||||
|
preconditioners. A further goal is to obtain convergence rates that are as insensitive
as possible to variations in the matrix coefficients.
|
||||||
|
|
||||||
|
Two main approaches can be used to build coarse-space corrections. The geometric approach
|
||||||
|
applies coarsening strategies based on the knowledge of some physical grid associated
|
||||||
|
to the matrix and requires the user to define grid transfer operators from the fine
|
||||||
|
to the coarse levels and vice versa. This may prove difficult for complex geometries;
|
||||||
|
furthermore, suitable one-level preconditioners may be required to get efficient
|
||||||
|
interplay between fine and coarse levels, e.g.\ when matrices with highly varying coefficients
|
||||||
|
are considered. The algebraic approach builds coarse-space corrections using only matrix
|
||||||
|
information. It performs a fully automatic coarsening and enforces the interplay between
|
||||||
|
the fine and coarse levels by suitably choosing the coarse space and the coarse-to-fine
|
||||||
|
interpolation \cite{StubenGMD69_99}.
|
||||||
|
|
||||||
|
MLD2P4 uses a purely algebraic approach for building the sequence of coarse matrices
starting from the original matrix. The algebraic approach is based on the \emph{smoothed
aggregation} algorithm \cite{BREZINA_VANEK,VANEK_MANDEL_BREZINA}. A decoupled version
of this algorithm is implemented, where the smoothed aggregation is applied locally
to each submatrix \cite{Tuminaro_Tong_00}. In the next two subsections we provide
a brief description of the multi-level Schwarz preconditioners and of the smoothed
aggregation technique as implemented in MLD2P4. For further details the user
|
||||||
|
is referred to \cite{para_04,apnum_07,aaecc_07,dd2_96}.
|
||||||
|
|
||||||
|
|
||||||
|
\subsection{Multi-level Schwarz Preconditioners\label{sec:multilevel}}
|
||||||
|
|
||||||
|
The multi-level preconditioners implemented in MLD2P4 are obtained by combining
|
||||||
|
Additive Schwarz preconditioners with coarse-space corrections; therefore
|
||||||
|
we first provide a sketch of the Additive Schwarz preconditioners.
|
||||||
|
|
||||||
|
Given a linear system
|
||||||
|
\[ Ax=b, \]
|
||||||
|
where $A=(a_{ij}) \in \Re^{n \times n}$ is a
|
||||||
|
nonsingular sparse matrix with a symmetric non-zero pattern,
|
||||||
|
let $G=(W,E)$ be the adjacency graph of $A$, where $W=\{1, 2, \ldots, n\}$
|
||||||
|
and $E=\{(i,j) : a_{ij} \neq 0\}$ are the vertex set and the edge set of $G$,
|
||||||
|
respectively. Two vertices are called adjacent if there is an edge connecting
|
||||||
|
them. For any integer $\delta > 0$, a $\delta$-overlap
|
||||||
|
partition of $W$ can be defined recursively as follows.
|
||||||
|
Given a 0-overlap (or non-overlapping) partition of $W$,
|
||||||
|
i.e.\ a set of $m$ disjoint nonempty sets $W_i^0 \subset W$ such that
|
||||||
|
$\cup_{i=1}^m W_i^0 = W$, a $\delta$-overlap
|
||||||
|
partition of $W$ is obtained by considering the sets
|
||||||
|
$W_i^\delta \supset W_i^{\delta-1}$, obtained by including the vertices that
|
||||||
|
are adjacent to any vertex in $W_i^{\delta-1}$.
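
As a small illustration of this definition (the matrix and the partition are chosen
here only as an example and are not taken from the package), let $A$ be a $6 \times 6$
tridiagonal matrix with nonzero entries $a_{ii}$ and $a_{i,i\pm 1}$, so that vertex $i$
is adjacent only to itself and to vertices $i-1$ and $i+1$. Starting from the 0-overlap
partition with $m=2$
\[
W_1^0=\{1,2,3\}, \qquad W_2^0=\{4,5,6\},
\]
one step of the recursion adds vertex 4 to the first set and vertex 3 to the second one,
and a second step adds vertices 5 and 2, respectively:
\[
\begin{array}{ll}
W_1^1=\{1,2,3,4\}, & W_2^1=\{3,4,5,6\}, \\
W_1^2=\{1,2,3,4,5\}, & W_2^2=\{2,3,4,5,6\}.
\end{array}
\]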
|
||||||
|
|
||||||
|
Let $n_i^\delta$ be the size of $W_i^\delta$ and $R_i^{\delta} \in
|
||||||
|
\Re^{n_i^\delta \times n}$ the restriction operator that maps
|
||||||
|
a vector $v \in \Re^n$ onto the vector $v_i^{\delta} \in \Re^{n_i^\delta}$
|
||||||
|
containing the components of $v$ corresponding to the vertices in
|
||||||
|
$W_i^\delta$. The transpose of $R_i^{\delta}$ is a
|
||||||
|
prolongation operator from $\Re^{n_i^\delta}$ to $\Re^n$.
|
||||||
|
The matrix $A_i^\delta=R_i^\delta A (R_i^\delta)^T \in
|
||||||
|
\Re^{n_i^\delta \times n_i^\delta}$ can be considered
|
||||||
|
as a restriction of $A$ corresponding to the set $W_i^{\delta}$.
|
||||||
|
|
||||||
|
The \emph{classical one-level AS} preconditioner is defined by
|
||||||
|
\[
|
||||||
|
M_{AS}^{-1}= \sum_{i=1}^m (R_i^{\delta})^T
|
||||||
|
(A_i^\delta)^{-1} R_i^{\delta},
|
||||||
|
\]
|
||||||
|
where $A_i^\delta$ is assumed to be nonsingular. Its application
|
||||||
|
to a vector $v \in \Re^n$ within a Krylov solver requires the following
|
||||||
|
three steps:
|
||||||
|
\begin{enumerate}
|
||||||
|
\item restriction of $v$ as $v_i = R_i^{\delta} v$, $i=1,\ldots,m$;
|
||||||
|
\item (approximate) solution of the linear systems $A_i^\delta w_i = v_i$,
|
||||||
|
$i=1,\ldots,m$;
|
||||||
|
\item prolongation and sum of the $w_i$'s, i.e.\ $w = \sum_{i=1}^m (R_i^{\delta})^T w_i$.
|
||||||
|
\end{enumerate}
|
||||||
|
A variant of the classical AS preconditioner that outperforms it
|
||||||
|
in terms of convergence rate and of computation and communication
|
||||||
|
time on parallel distributed-memory computers is the so-called \emph{Restricted AS
|
||||||
|
(RAS)} preconditioner~\cite{Cai_Sarkis,Efstathiou_Gander}. It
|
||||||
|
is obtained by zeroing the components of $w_i$ corresponding to the
|
||||||
|
overlapping vertices when applying the prolongation. Therefore,
|
||||||
|
RAS differs from classical AS by the prolongation operator $(R_i^{\delta})^T$,
which is replaced by $(\tilde{R}_i^0)^T \in \Re^{n \times n_i^\delta}$,
where $\tilde{R}_i^0$ is obtained by zeroing the rows of $R_i^\delta$
corresponding to the vertices in $W_i^\delta \backslash W_i^0$:
|
||||||
|
\[
|
||||||
|
M_{RAS}^{-1}= \sum_{i=1}^m (\tilde{R}_i^0)^T
|
||||||
|
(A_i^\delta)^{-1} R_i^{\delta}.
|
||||||
|
\]
|
||||||
|
Analogously, the AS variant called \emph{AS with Harmonic extension (ASH)}
|
||||||
|
is defined by
|
||||||
|
\[ M_{ASH}^{-1}= \sum_{i=1}^m (R_i^{\delta})^T
|
||||||
|
(A_i^\delta)^{-1} \tilde{R}_i^0.
|
||||||
|
\]
|
||||||
|
We note that for $\delta=0$ the three variants of the AS preconditioner are
|
||||||
|
all equal to the block-Jacobi preconditioner.
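
To make the difference among the three variants concrete, consider, purely as an
illustrative example, $n=6$ and a subdomain with $W_1^0=\{1,2,3\}$ and
$W_1^1=\{1,2,3,4\}$, so that vertex 4 is the only overlapping vertex. Then
\[
R_1^1 = \left( \begin{array}{cccccc}
1 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0
\end{array} \right), \qquad
\tilde{R}_1^0 = \left( \begin{array}{cccccc}
1 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0
\end{array} \right).
\]
For a local solution $w_1=(\alpha_1,\alpha_2,\alpha_3,\alpha_4)^T$, the classical AS
prolongation gives $(R_1^1)^T w_1 = (\alpha_1,\alpha_2,\alpha_3,\alpha_4,0,0)^T$, while
the RAS prolongation gives $(\tilde{R}_1^0)^T w_1 = (\alpha_1,\alpha_2,\alpha_3,0,0,0)^T$,
i.e.\ the value computed at the overlapping vertex 4 is discarded. In ASH the zeroing
acts instead on the restriction: $\tilde{R}_1^0 v = (v_1,v_2,v_3,0)^T$.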
|
||||||
|
|
||||||
|
As already observed, the convergence rate of the one-level Schwarz
|
||||||
|
preconditioned iterative solvers deteriorates as the number $m$ of partitions
|
||||||
|
of $W$ increases \cite{dd1_94,dd2_96}. To reduce the dependence
of the number of iterations on the degree of parallelism we may
introduce a global coupling among the overlapping partitions by defining
a coarse-space approximation $A_C$ of the matrix $A$.
In a purely algebraic setting, $A_C$ is usually built with
|
||||||
|
a Galerkin approach. Given a set $W_C$ of \emph{coarse vertices},
|
||||||
|
with size $n_C$, and a suitable restriction operator
|
||||||
|
$R_C \in \Re^{n_C \times n}$, $A_C$ is defined as
|
||||||
|
\[
|
||||||
|
A_C=R_C A R_C^T
|
||||||
|
\]
|
||||||
|
and the coarse-level correction matrix to be combined with a generic
|
||||||
|
one-level AS preconditioner $M_{1L}$ is obtained as
|
||||||
|
\[
|
||||||
|
M_{C}^{-1}= R_C^T A_C^{-1} R_C,
|
||||||
|
\]
|
||||||
|
where $A_C$ is assumed to be nonsingular. The application of $M_{C}^{-1}$
|
||||||
|
to a vector $v$ corresponds to a restriction, a solution and
|
||||||
|
a prolongation step; the solution step, involving the matrix $A_C$,
|
||||||
|
may also be carried out approximately.
|
||||||
|
|
||||||
|
The combination of $M_{C}$ and $M_{1L}$ may be
|
||||||
|
performed in either an additive or a multiplicative framework.
|
||||||
|
In the former case, the \emph{two-level additive} Schwarz preconditioner
|
||||||
|
is obtained:
|
||||||
|
\[
|
||||||
|
M_{2LA}^{-1} = M_{C}^{-1} + M_{1L}^{-1}.
|
||||||
|
\]
|
||||||
|
Applying $M_{2LA}^{-1}$ to a vector $v$ within a Krylov solver
|
||||||
|
corresponds to applying $M_{C}^{-1}$
|
||||||
|
and $M_{1L}^{-1}$ to $v$ independently and then summing up
|
||||||
|
the results.
|
||||||
|
|
||||||
|
In the multiplicative case, the combination can be
|
||||||
|
performed by first applying the smoother $M_{1L}^{-1}$ and then
|
||||||
|
the coarse-level correction operator $M_{C}^{-1}$:
|
||||||
|
\[
|
||||||
|
\begin{array}{l}
|
||||||
|
w = M_{1L}^{-1} v, \\
|
||||||
|
z = w + M_{C}^{-1} (v-Aw);
|
||||||
|
\end{array}
|
||||||
|
\]
|
||||||
|
this corresponds to the following \emph{two-level hybrid pre-smoothed}
|
||||||
|
Schwarz preconditioner:
|
||||||
|
\[
|
||||||
|
M_{2LH-PRE}^{-1} = M_{C}^{-1} + \left( I - M_{C}^{-1}A \right) M_{1L}^{-1}.
|
||||||
|
\]
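The expression of $M_{2LH-PRE}^{-1}$ follows by substituting the first step into the
second one:
\[
\begin{array}{rcl}
z & = & M_{1L}^{-1} v + M_{C}^{-1} \left( v - A M_{1L}^{-1} v \right) \\
  & = & \left[ M_{C}^{-1} + \left( I - M_{C}^{-1} A \right) M_{1L}^{-1} \right] v .
\end{array}
\]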
|
||||||
|
On the other hand, by applying the smoother after the coarse-level correction,
|
||||||
|
i.e.\ by computing
|
||||||
|
\[
|
||||||
|
\begin{array}{l}
|
||||||
|
w = M_{C}^{-1} v , \\
|
||||||
|
z = w + M_{1L}^{-1} (v-Aw) ,
|
||||||
|
\end{array}
|
||||||
|
\]
|
||||||
|
the \emph{two-level hybrid post-smoothed}
|
||||||
|
Schwarz preconditioner is obtained:
|
||||||
|
\[
|
||||||
|
M_{2LH-POST}^{-1} = M_{1L}^{-1} + \left( I - M_{1L}^{-1}A \right) M_{C}^{-1}.
|
||||||
|
\]
|
||||||
|
A further variant of the two-level hybrid preconditioner is obtained by applying
|
||||||
|
the smoother before and after the coarse-level correction. In this case, the
|
||||||
|
preconditioner is symmetric if $A$, $M_{1L}$ and $M_{C}$ are symmetric.
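
A compact way to compare the hybrid variants, added here as a side remark that can be
checked directly from the formulas above, is to look at the matrices $I - M^{-1}A$:
\[
\begin{array}{rcl}
I - M_{2LH-PRE}^{-1} A & = & (I - M_{C}^{-1}A)\,(I - M_{1L}^{-1}A), \\
I - M_{2LH-POST}^{-1} A & = & (I - M_{1L}^{-1}A)\,(I - M_{C}^{-1}A).
\end{array}
\]
Assuming that the variant with both pre- and post-smoothing is computed as
$w = M_{1L}^{-1}v$, $y = w + M_{C}^{-1}(v-Aw)$, $z = y + M_{1L}^{-1}(v-Ay)$
(the name $M_{2LH-SYM}$ is introduced here only for illustration), the same argument gives
\[
I - M_{2LH-SYM}^{-1} A = (I - M_{1L}^{-1}A)\,(I - M_{C}^{-1}A)\,(I - M_{1L}^{-1}A),
\]
and the palindromic structure of this product is what makes the preconditioner symmetric
when $A$, $M_{1L}$ and $M_{C}$ are symmetric.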
|
||||||
|
|
||||||
|
As previously noted, on parallel computers the number of submatrices usually matches
the number of available processors. When the size of the system to be preconditioned
is very large, the use of many processors, i.e.\ of many small submatrices, often
leads to a large coarse-level system, whose solution may be computationally expensive.
On the other hand, the use of few processors often leads to local submatrices that
|
||||||
|
are too expensive to be processed on single processors, because of memory and/or
|
||||||
|
computing requirements. Therefore, it seems natural to use a recursive approach,
|
||||||
|
in which the coarse-level correction is re-applied starting from the current
|
||||||
|
coarse-level system. The corresponding preconditioners are called \emph{multi-level}.
|
||||||
|
One more reason for the multi-level approach is that it may significantly
|
||||||
|
reduce the computational cost of preconditioning with respect to the two-level case
|
||||||
|
(see \cite[Chapter 3]{dd2_96}). Additive and hybrid multilevel preconditioners
|
||||||
|
are obtained as direct extensions of the two-level counterparts. Other combinations
|
||||||
|
of the smoothers and coarse-level corrections are possible, leading to variants
|
||||||
|
of the previous algorithms. For a detailed description of them, the reader is
referred to \cite[Chapter 3]{dd2_96}.
\textbf{In my opinion, an algorithmic description of a multilevel preconditioner would be needed here
as an example, e.g.\ the hybrid one with pre-smoothing, along the lines
of the description in Figure 1 of the Trilinos ML 4.0 guide. WHAT DO YOU THINK?}
|
||||||
|
|
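As an illustration of the recursive construction described above, the application of a
multi-level hybrid pre-smoothed preconditioner can be sketched as follows. The notation
is an assumption made only for this sketch: $nlev$ is the number of levels, $A_1=A$,
$M_l$ and $R_l$ denote the smoother and the restriction operator at level $l$, and
$A_{l+1}=R_l A_l R_l^T$. The result of applying the preconditioner to $v$ is obtained
by calling the following procedure with $l=1$ and $v_1=v$:
\[
\begin{array}{l}
\mbox{procedure } \mathrm{ML}(l, v_l): \\
\quad \mbox{if } l = nlev \mbox{ then} \\
\quad\quad z_l = A_l^{-1} v_l \quad \mbox{(possibly approximately)} \\
\quad \mbox{else} \\
\quad\quad w_l = M_l^{-1} v_l \\
\quad\quad v_{l+1} = R_l \left( v_l - A_l w_l \right) \\
\quad\quad z_{l+1} = \mathrm{ML}(l+1, v_{l+1}) \\
\quad\quad z_l = w_l + R_l^T z_{l+1} \\
\quad \mbox{end if} \\
\quad \mbox{return } z_l
\end{array}
\]
For $nlev=2$ this reduces to the two-level hybrid pre-smoothed preconditioner
$M_{2LH-PRE}^{-1}$, with $M_1=M_{1L}$, $R_1=R_C$ and $A_2=A_C$.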
||||||
|
|
||||||
|
\subsection{Smoothed Aggregation\label{sec:aggregation}}
|
||||||
|
|
||||||
|
To define the restriction operator $R_C$, which is used to compute
|
||||||
|
the coarse-level matrix $A_C$, MLD2P4 uses the \emph{smoothed aggregation}
|
||||||
|
algorithm described in \cite{BREZINA_VANEK,VANEK_MANDEL_BREZINA}.
|
||||||
|
The basic idea of this algorithm is to build a coarse set of vertices
|
||||||
|
$W_C$ by suitably grouping the vertices of $W$ into disjoint subsets
|
||||||
|
(aggregates), and to define the coarse-to-fine space transfer operator $R_C^T$ by
|
||||||
|
applying a suitable smoother to a simple piecewise constant
|
||||||
|
prolongation operator, to improve the quality of the coarse-space correction.
|
||||||
|
|
||||||
|
Three main steps can be identified in the smoothed aggregation procedure:
|
||||||
|
\begin{itemize}
|
||||||
|
\item coarsening of the vertex set $W$, to obtain $W_C$;
|
||||||
|
\item construction of the prolongator $R_C^T$;
|
||||||
|
\item application of $R_C$ and $R_C^T$ to build $A_C$.
|
||||||
|
\end{itemize}
|
||||||
|
|
||||||
|
To perform the coarsening step, we have implemented the aggregation algorithm sketched
|
||||||
|
in \cite{apnum_07}. According to \cite{BREZINA_VANEK}, a modification of this algorithm
has actually been considered,
|
||||||
|
in which each aggregate $N_r$ is made of vertices of $W$ that are \emph{strongly coupled}
|
||||||
|
to a certain root vertex $r \in W$, i.e.\
|
||||||
|
\[ N_r = \left\{s \in W: |a_{rs}| \geq \theta \sqrt{|a_{rr}a_{ss}|} \right\} \]
|
||||||
|
for a given $\theta \in [0,1]$.
|
||||||
|
Since the previous algorithm has a sequential nature, a \emph{decoupled} version of
|
||||||
|
it has been chosen, where each processor $i$ independently applies the algorithm to
|
||||||
|
the set of vertices $W_i^0$ assigned to it in the initial data distribution. This
|
||||||
|
version is embarrassingly parallel, since it does not require any data communication.
|
||||||
|
On the other hand, it may produce non-uniform aggregates near boundary vertices,
|
||||||
|
i.e.\ near vertices adjacent to vertices assigned to other processors, and is strongly
|
||||||
|
dependent on the number of processors and on the initial partitioning of the matrix $A$.
|
||||||
|
Nevertheless, this algorithm has been chosen for the implementation in MLD2P4,
|
||||||
|
since it has been shown to produce good results in practice \cite{Tuminaro_Tong_00}.
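
As a small illustration of the strong-coupling criterion (the matrix is chosen here only
as an example), let $A$ be the tridiagonal matrix with $a_{rr}=2$ and $a_{r,r\pm 1}=-1$.
Then $\sqrt{|a_{rr}a_{ss}|}=2$ for all $r$ and $s$, so a neighbour $s=r\pm 1$ belongs to
$N_r$ if and only if $1 \geq 2\theta$, while a vertex $s$ with $a_{rs}=0$ belongs to $N_r$
only for $\theta=0$. Hence, for an interior vertex $r$,
\[
N_r = \left\{ \begin{array}{ll}
\{r-1,\,r,\,r+1\} & \quad \mbox{if} \; 0 < \theta \leq 1/2, \\
\{r\} & \quad \mbox{if} \; 1/2 < \theta \leq 1,
\end{array} \right.
\]
where $r$ itself always belongs to $N_r$ because $|a_{rr}| \geq \theta\, |a_{rr}|$ for
$\theta \leq 1$. Larger values of $\theta$ therefore produce smaller aggregates.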
|
||||||
|
|
||||||
|
The prolongator $P_C=R_C^T$ is built starting from a \emph{tentative prolongator}
$P \in \Re^{n \times n_C}$ which, denoting by $V^j_C$ the $j$-th aggregate, is defined as
|
||||||
|
\begin{equation}
|
||||||
|
P=(p_{ij}), \quad p_{ij}=
|
||||||
|
\left\{ \begin{array}{ll}
|
||||||
|
1 & \quad \mbox{if} \; i \in V^j_C \\
|
||||||
|
0 & \quad \mbox{otherwise}
|
||||||
|
\end{array} \right. .
|
||||||
|
\label{eq:tent_prol}
|
||||||
|
\end{equation}
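For instance, assuming $n=6$ and the two aggregates $V_C^1=\{1,2,3\}$ and
$V_C^2=\{4,5,6\}$ (an illustrative choice, not taken from the package),
(\ref{eq:tent_prol}) gives
\[
P = \left( \begin{array}{cc}
1 & 0 \\ 1 & 0 \\ 1 & 0 \\ 0 & 1 \\ 0 & 1 \\ 0 & 1
\end{array} \right),
\]
i.e.\ each column of $P$ is the indicator vector of one aggregate, and a coarse vector
is prolonged to a vector that is constant on each aggregate.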
|
||||||
|
$P_C$ is obtained by
|
||||||
|
applying to $P$ a smoother $S \in \Re^{n \times n}$:
|
||||||
|
\begin{equation}
|
||||||
|
P_C = S P,
|
||||||
|
\label{eq:smoothed_prol}
|
||||||
|
\end{equation}
|
||||||
|
in order to remove oscillatory components from the range of the prolongator
|
||||||
|
and hence to improve the convergence properties of the multi-level
|
||||||
|
Schwarz method \cite{BREZINA_VANEK,StubenGMD69_99}.
|
||||||
|
A simple choice for $S$ is the damped Jacobi smoother:
|
||||||
|
\begin{equation}
|
||||||
|
S = I - \omega D^{-1} A ,
|
||||||
|
\label{eq:jac_smoother}
|
||||||
|
\end{equation}
|
||||||
|
where the value of $\omega$ can be chosen
|
||||||
|
using some estimate of the spectral radius of $D^{-1}A$ \cite{BREZINA_VANEK}.
\textbf{Mention the filtering of $A$ in the smoothing, noting however that it has not been
implemented?}
|
||||||
|
|
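As a worked example of (\ref{eq:smoothed_prol}) and (\ref{eq:jac_smoother}), under
assumptions made here only for illustration, let $A$ be the $6 \times 6$ tridiagonal
matrix with $a_{ii}=2$ and $a_{i,i\pm 1}=-1$, let $P$ be the tentative prolongator
associated with the aggregates $\{1,2,3\}$ and $\{4,5,6\}$, and take $\omega=2/3$
(in practice $\omega$ would be chosen from an estimate of the spectral radius of
$D^{-1}A$). Then $D=2I$, the smoother $S = I - \frac{1}{3}A$ is the tridiagonal matrix
whose nonzero entries are all equal to $1/3$, and
\[
P_C = S P = \left( \begin{array}{cc}
2/3 & 0 \\ 1 & 0 \\ 2/3 & 1/3 \\ 1/3 & 2/3 \\ 0 & 1 \\ 0 & 2/3
\end{array} \right), \qquad
A_C = P_C^T A P_C = \left( \begin{array}{rr}
8/9 & -1/3 \\ -1/3 & 8/9
\end{array} \right).
\]
The columns of $P_C$ now overlap across the boundary between the two aggregates
(rows 3 and 4), which is the effect of the smoothing step.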
||||||
|
%%% Local Variables:
|
||||||
|
%%% mode: latex
|
||||||
|
%%% TeX-master: "userguide"
|
||||||
|
%%% End:
|
@ -0,0 +1,152 @@
|
|||||||
|
\begin{thebibliography}{99}
|
||||||
|
|
||||||
|
%
|
||||||
|
\bibitem{PARA04FOREST}
G.~Bella, S.~Filippone, A.~De Maio and M.~Testa,
{\em A Simulation Model for Forest Fires},
in J.~Dongarra, K.~Madsen, J.~Wasniewski, editors,
Proceedings of PARA~04 Workshop on State of the Art
in Scientific Computing, Lecture Notes in Computer Science, 3732,
Springer, Berlin, 2005.
|
||||||
|
%
|
||||||
|
\bibitem{aaecc_07}
A.~Buttari, D.~di Serafino, P.~D'Ambra and S.~Filippone,
{\em 2LEV-D2P4: a Package of High-Performance Preconditioners},
Applicable Algebra in Engineering, Communication and Computing,
18(3), pp.~223--239, 2007.
|
||||||
|
%Published online: 13 February 2007, {\tt http://dx.doi.org/10.1007/s00200-007-0035-z}
|
||||||
|
%
|
||||||
|
\bibitem{apnum_07}
P.~D'Ambra, S.~Filippone and D.~di Serafino,
{\em On the Development of PSBLAS-based Parallel Two-level Schwarz Preconditioners},
Applied Numerical Mathematics, Elsevier Science,
57(11-12), pp.~1181--1196, 2007.
|
||||||
|
%published online 3 February 2007, {\tt
|
||||||
|
% http://dx.doi.org/10.1016/j.apnum.2007.01.006}
|
||||||
|
|
||||||
|
%% \bibitem{DOUGLAS}
|
||||||
|
%% R.E.~Bank and C.C.~Douglas,
|
||||||
|
%% {\em SMMP: Sparse Matrix Multiplication Package},
|
||||||
|
%% Advances in Computational Mathematics, 1993, 1, 127-137.
|
||||||
|
%% (See also {\tt http://www.mgnet.org/~douglas/ccd-codes.html})
|
||||||
|
%
|
||||||
|
%
|
||||||
|
\bibitem{para_04}
|
||||||
|
A.~Buttari, P.~D'Ambra, D.~di Serafino and S.~Filippone,
|
||||||
|
{\em Extending PSBLAS to Build Parallel Schwarz Preconditioners},
|
||||||
|
in J.~Dongarra, K.~Madsen, J.~Wasniewski, editors,
|
||||||
|
Proceedings of PARA~04 Workshop on State of the Art
|
||||||
|
in Scientific Computing, pp.~593--602, Lecture Notes in Computer Science,
|
||||||
|
Springer, 2005.
|
||||||
|
%
|
||||||
|
%% \bibitem{CAI_SAAD}
|
||||||
|
%% X.~C.~Cai and Y.~Saad,
|
||||||
|
%% {\em Overlapping Domain Decomposition Algorithms for General Sparse Matrices},
|
||||||
|
%% Numerical Linear Algebra with Applications, 3(3), pp.~221--237, 1996.
|
||||||
|
%% %
|
||||||
|
%% \bibitem{CAI_SARKIS}
|
||||||
|
%% X.C.~Cai and M.~Sarkis,
|
||||||
|
%% {\em A Restricted Additive Schwarz Preconditioner for General Sparse Linear Systems},
|
||||||
|
%% SIAM Journal on Scientific Computing, 21(2), pp.~792--797, 1999.
|
||||||
|
%
|
||||||
|
\bibitem{Cai_Widlund_92}
|
||||||
|
X.C.~Cai and O.~B.~Widlund,
|
||||||
|
{\em Domain Decomposition Algorithms for Indefinite Elliptic Problems},
|
||||||
|
SIAM Journal on Scientific and Statistical Computing, 13(1), pp.~243--258, 1992.
|
||||||
|
%
|
||||||
|
\bibitem{dd1_94}
|
||||||
|
T.~Chan and T.~Mathew,
|
||||||
|
{\em Domain Decomposition Algorithms},
|
||||||
|
in A.~Iserles, editor, Acta Numerica 1994, pp.~61--143, 1994.
|
||||||
|
Cambridge University Press.
|
||||||
|
%% %
|
||||||
|
%% \bibitem{UMFPACK}
|
||||||
|
%% T.A.~Davis,
|
||||||
|
%% {\em Algorithm 832: UMFPACK - an Unsymmetric-pattern Multifrontal
|
||||||
|
%% Method with a Column Pre-ordering Strategy},
|
||||||
|
%% ACM Transactions on Mathematical Software, 30, pp.~196--199, 2004.
|
||||||
|
%% (See also {\tt http://www.cise.ufl.edu/~davis/})
|
||||||
|
%% %
|
||||||
|
%% \bibitem{SUPERLU}
|
||||||
|
%% J.W.~Demmel, S.C.~Eisenstat, J.R.~Gilbert, X.S.~Li and J.W.H.~Liu,
|
||||||
|
%% A supernodal approach to sparse partial pivoting,
|
||||||
|
%% SIAM Journal on Matrix Analysis and Applications, 20(3), pp.~720--755, 1999.
|
||||||
|
%
|
||||||
|
\bibitem{BLACS}
|
||||||
|
J.~J.~Dongarra and R.~C.~Whaley,
|
||||||
|
{\em A User's Guide to the BLACS v.~1.1},
|
||||||
|
Lapack Working Note 94, Tech.\ Rep.\ UT-CS-95-281, University of
|
||||||
|
Tennessee, March 1995 (updated May 1997).
|
||||||
|
%
|
||||||
|
\bibitem{sblas_97}
|
||||||
|
I.~Duff, M.~Marrone, G.~Radicati and C.~Vittoli,
|
||||||
|
{\em Level 3 Basic Linear Algebra Subprograms for Sparse Matrices:
|
||||||
|
a User Level Interface},
|
||||||
|
ACM Transactions on Mathematical Software, 23(3), pp.~379--401, 1997.
|
||||||
|
%
|
||||||
|
\bibitem{sblas_02}
|
||||||
|
I.~Duff, M.~Heroux and R.~Pozo,
|
||||||
|
{\em An Overview of the Sparse Basic Linear
|
||||||
|
Algebra Subprograms: the New Standard from the BLAS Technical Forum},
|
||||||
|
ACM Transactions on Mathematical Software, 28(2), pp.~239--267, 2002.
|
||||||
|
%
|
||||||
|
\bibitem{psblas_00}
|
||||||
|
S.~Filippone and M.~Colajanni,
|
||||||
|
{\em PSBLAS: A Library for Parallel Linear Algebra
|
||||||
|
Computation on Sparse Matrices},
|
||||||
|
\newblock
|
||||||
|
ACM Transactions on Mathematical Software, 26(4), pp.~527--550, 2000.
|
||||||
|
%
|
||||||
|
\bibitem{KIVA3PSBLAS}
|
||||||
|
S.~Filippone, P.~D'Ambra, M.~Colajanni,
|
||||||
|
{\em Using a Parallel Library of Sparse Linear Algebra in a Fluid Dynamics
|
||||||
|
Applications Code on Linux Clusters},
|
||||||
|
in G.~Joubert, A.~Murli, F.~Peters, M.~Vanneschi, editors,
|
||||||
|
Parallel Computing - Advances \& Current Issues,
|
||||||
|
pp.~441--448, Imperial College Press, 2002.
|
||||||
|
%
|
||||||
|
\bibitem{METIS}
|
||||||
|
Karypis, G. and Kumar, V.,
|
||||||
|
{\em {METIS}: Unstructured Graph Partitioning and Sparse Matrix
|
||||||
|
Ordering System}.
|
||||||
|
Minneapolis, MN 55455: University of Minnesota, Department of
|
||||||
|
Computer Science, 1995.
|
||||||
|
Internet Address: {\verb|http://www.cs.umn.edu/~karypis|}.
|
||||||
|
\bibitem{BLAS1}
|
||||||
|
C.~Lawson, R.~Hanson, D.~Kincaid and F.~Krogh,
{\em Basic Linear Algebra Subprograms for Fortran Usage},
ACM Transactions on Mathematical Software, 5(3), pp.~308--323, 1979.
|
||||||
|
|
||||||
|
\bibitem{machiels}
|
||||||
|
L.~Machiels and M.~Deville,
{\em Fortran 90: An Entry to Object-Oriented Programming for the Solution
of Partial Differential Equations},
ACM Transactions on Mathematical Software, 23, pp.~32--49, 1997.
|
||||||
|
\bibitem{metcalf}
|
||||||
|
{Metcalf, M., Reid, J. and Cohen, M.}
|
||||||
|
{\em Fortran 95/2003 explained.}
|
||||||
|
{Oxford University Press}, 2004.
|
||||||
|
|
||||||
|
\bibitem{dd2_96}
|
||||||
|
B.~Smith, P.~Bjorstad and W.~Gropp,
|
||||||
|
{\em Domain Decomposition: Parallel Multilevel Methods for Elliptic
|
||||||
|
Partial Differential Equations},
|
||||||
|
Cambridge University Press, 1996.
|
||||||
|
|
||||||
|
\bibitem{MPI1}
|
||||||
|
M.~Snir, S.~Otto, S.~Huss-Lederman, D.~Walker and J.~Dongarra,
|
||||||
|
{\em MPI: The Complete Reference. Volume 1 - The MPI Core}, second edition,
|
||||||
|
MIT Press, 1998.
|
||||||
|
%
|
||||||
|
\bibitem{BREZINA_VANEK}
|
||||||
|
M.~Brezina and P.~Van{\v e}k,
|
||||||
|
{\em A Black-Box Iterative Solver Based on a Two-Level Schwarz Method},
|
||||||
|
Computing, 1999, 63, 233-263.
|
||||||
|
%
|
||||||
|
%
|
||||||
|
\bibitem{VANEK_MANDEL_BREZINA}
|
||||||
|
P.~Van{\v e}k, J.~Mandel and M.~Brezina,
|
||||||
|
{\em Algebraic Multigrid by Smoothed Aggregation for Second and Fourth Order Elliptic Problems},
|
||||||
|
Computing, 1996, 56, 179-196.
|
||||||
|
%
|
||||||
|
|
||||||
|
\end{thebibliography}
|
@ -0,0 +1,7 @@
|
|||||||
|
\section{Configuring and Building MLD2P4\label{sec:configuring}}
|
||||||
|
- use of GNU autoconf and automake \\
- required base software (MPI, BLACS, BLAS, PSBLAS - specify versions) \\
- optional software (UMFPACK, SuperLU, SuperLUdist - specify versions and configure options) \\
- operating systems and compilers on which MLD2P4 has been successfully built \\
- are configuration options for debugging or profiling planned? \\
- directory tree \\
|
@ -0,0 +1,6 @@
|
|||||||
|
\section{Notational Conventions\label{sec:conventions}}
|
||||||
|
- typefaces used in the guide (see the recent ML guide and the Aztec guide) \\
- naming conventions for routines (difference between high-level and medium-level),
data structures,\\
modules, constants, etc. (see the PSBLAS guide) \\
- real and complex versions\\
|
@ -0,0 +1,41 @@
|
|||||||
|
\section{Code Distribution\label{sec:distribution}}
|
||||||
|
|
||||||
|
MLD2P4 is freely distributable under the following copyright
|
||||||
|
terms:
|
||||||
|
\begin{verbatim}
|
||||||
|
MLD2P4 version 1.0
|
||||||
|
MultiLevel Domain Decomposition Parallel Preconditioners Package
|
||||||
|
based on PSBLAS (Parallel Sparse BLAS version 2.3)
|
||||||
|
|
||||||
|
(C) Copyright 2008
|
||||||
|
|
||||||
|
Salvatore Filippone University of Rome Tor Vergata
|
||||||
|
Alfredo Buttari University of Rome Tor Vergata
|
||||||
|
Pasqua D'Ambra ICAR-CNR, Naples
|
||||||
|
Daniela di Serafino Second University of Naples
|
||||||
|
|
||||||
|
|
||||||
|
Redistribution and use in source and binary forms, with or without
|
||||||
|
modification, are permitted provided that the following conditions
|
||||||
|
are met:
|
||||||
|
1. Redistributions of source code must retain the above copyright
|
||||||
|
notice, this list of conditions and the following disclaimer.
|
||||||
|
2. Redistributions in binary form must reproduce the above copyright
|
||||||
|
notice, this list of conditions, and the following disclaimer in the
|
||||||
|
documentation and/or other materials provided with the distribution.
|
||||||
|
3. The name of the MLD2P4 group or the names of its contributors may
|
||||||
|
not be used to endorse or promote products derived from this
|
||||||
|
software without specific written permission.
|
||||||
|
|
||||||
|
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
|
||||||
|
``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
|
||||||
|
TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
|
||||||
|
PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE MLD2P4 GROUP OR ITS CONTRIBUTORS
|
||||||
|
BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
|
||||||
|
CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
|
||||||
|
SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
|
||||||
|
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
|
||||||
|
CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
|
||||||
|
ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
|
||||||
|
POSSIBILITY OF SUCH DAMAGE.
|
||||||
|
\end{verbatim}
|
@ -0,0 +1,9 @@
|
|||||||
|
\section{Error Handling}\label{sec:errors}
|
||||||
|
|
||||||
|
Error handling
|
||||||
|
- Brief description, with a reference to the PSBLAS guide
|
||||||
|
|
||||||
|
%%% Local Variables:
|
||||||
|
%%% mode: latex
|
||||||
|
%%% TeX-master: "userguide"
|
||||||
|
%%% End:
|
@ -0,0 +1,224 @@
|
|||||||
|
\section{Getting Started\label{sec:started}}
|
||||||
|
|
||||||
|
We describe the basics for building and applying MLD2P4 one-level and multi-level
|
||||||
|
Schwarz preconditioners with the Krylov solvers included in PSBLAS \cite{}.
|
||||||
|
The following five steps are required:
|
||||||
|
\begin{enumerate}
|
||||||
|
\item \emph{Allocate and initialize the preconditioner data structure, according to
|
||||||
|
a preconditioner type chosen by the user}. This is performed by the routine
|
||||||
|
\verb|mld_precinit|, which also sets a default preconditioner for each preconditioner
|
||||||
|
type selected by the user. The default preconditioner associated with each preconditioner
type is listed in Table~\ref{tab:precinit}; the string used by \verb|mld_precinit|
to identify each preconditioner type is also given. The preconditioner data structure is
the derived data type \verb|mld_prec_type|, which is accessible to the user only
through the MLD2P4 routines.
|
||||||
|
\item \emph{Choose a specific variant of the selected preconditioner type, by setting
|
||||||
|
the preconditioner parameters.} This is performed by the routine \verb|mld_precset|.
|
||||||
|
A few examples concerning the use of \verb|mld_precset| are given in
|
||||||
|
Section~\ref{sec:examples}; a complete list of all the
|
||||||
|
preconditioner parameters and their allowed values is provided in
|
||||||
|
Section~\ref{sec:highlevel}.
|
||||||
|
\item \emph{Build the preconditioner for a given matrix.} This is performed by
|
||||||
|
the routine \verb|mld_precbld|.
|
||||||
|
\item \emph{Apply the preconditioner at each iteration of a Krylov solver.}
|
||||||
|
This is performed by the routine \verb|mld_precaply|. When using the PSBLAS Krylov solvers,
|
||||||
|
this step is completely transparent to the user, since \verb|mld_precaply| is called
|
||||||
|
by the PSBLAS routine implementing the Krylov solver (\verb|psb_krylov|).
|
||||||
|
\item \emph{Deallocate the preconditioner data structure}. This is performed by
|
||||||
|
the routine \verb|mld_precfree|. This step is complementary to step 1 and should
|
||||||
|
be performed when the preconditioner is no longer used.
|
||||||
|
\end{enumerate}
|
||||||
|
A detailed description of the above routines is given in Section~\ref{sec:highlevel}.
|
||||||
|
|
||||||
|
Note that the Fortran 95 module \verb|mld_prec_mod| must be used in the program
|
||||||
|
calling the MLD2P4 routines. Furthermore, to apply MLD2P4 with the Krylov solvers
|
||||||
|
from PSBLAS, the module \verb|psb_krylov_mod| must be used too.
|
||||||
|
|
||||||
|
Two simple example programs showing the (basic) use of MLD2P4 are reported in
|
||||||
|
Section~\ref{sec:examples}.
|
||||||
|
|
||||||
|
\begin{table}[th]
|
||||||
|
{
|
||||||
|
\begin{center}
|
||||||
|
\begin{tabular}{|l|l|p{6.7cm}|}
|
||||||
|
\hline
|
||||||
|
Type & String & Default preconditioner \\ \hline
|
||||||
|
No preconditioner &'NOPREC'& (Considered only to use the PSBLAS
|
||||||
|
Krylov solvers with no preconditioner.) \\
|
||||||
|
Diagonal & 'DIAG' & --- \\
|
||||||
|
Block Jacobi & 'BJAC' & ILU(0) on the local blocks.\\
|
||||||
|
Additive Schwarz & 'AS' & Restricted Additive Schwarz (RAS),
|
||||||
|
with overlap 1 and ILU(0) on the local blocks. \\
|
||||||
|
Multilevel &'ML' & Multi-level hybrid preconditioner (additive on the
|
||||||
|
same level and multiplicative through the levels),
|
||||||
|
with post-smoothing only. Number of levels: 2;
|
||||||
|
post-smoother: block-Jacobi preconditioner, with ILU(0)
|
||||||
|
on the local blocks; coarsest matrix: distributed among the
|
||||||
|
processors; coarse-level solver: 4 sweeps of the
|
||||||
|
block-Jacobi solver, with ILU(0) on the blocks. \\
|
||||||
|
\hline
|
||||||
|
\end{tabular}
|
||||||
|
\end{center}
|
||||||
|
}
|
||||||
|
\caption{Preconditioner types and default choices.\label{tab:precinit}}
|
||||||
|
\end{table}
|
||||||
|
|
||||||
|
\subsection{Examples\label{sec:examples}}
|
||||||
|
|
||||||
|
The simple code reported below shows how to set and apply the MLD2P4 default multi-level
preconditioner, i.e.\ the two-level hybrid post-smoothed Schwarz preconditioner, using
block-Jacobi with ILU(0) on the blocks as basic preconditioner,
a coarse matrix distributed among the processors, and four block-Jacobi sweeps with ILU(0)
on the blocks as approximate coarse-level solver. The choice of this preconditioner is made
by simply specifying \verb|'ML'| as the second argument of \verb|mld_precinit|
(a call to \verb|mld_precset| is not needed).
The preconditioner is applied within the BiCGSTAB solver provided by PSBLAS.
|
||||||
|
|
||||||
|
The part of the code concerning the
reading and assembling of the sparse matrix and the right-hand side vector, performed
through the PSBLAS routines for sparse matrix and vector management, is not reported
here for brevity. Other statements concerning the use of PSBLAS are omitted too.
The complete code can be found in the example program file \verb|example_2lev_default.f90|
in the directory \textbf{XXXXXX (TO BE SPECIFIED).} Note that the modules \verb|psb_base_mod|
and \verb|psb_util_mod| at the beginning of the code are required by PSBLAS.
For details on the use of the PSBLAS routines, see the PSBLAS User's Guide \cite{}.
|
||||||
|
|
||||||
|
\begin{verbatim}
|
||||||
|
use psb_base_mod
|
||||||
|
use psb_util_mod
|
||||||
|
use mld_prec_mod
|
||||||
|
use psb_krylov_mod
|
||||||
|
... ...
|
||||||
|
!
|
||||||
|
! sparse matrix
|
||||||
|
type(psb_dspmat_type) :: A
|
||||||
|
! sparse matrix descriptor
|
||||||
|
type(psb_desc_type) :: DESC_A
|
||||||
|
! preconditioner
|
||||||
|
type(mld_prec_type) :: PRE
|
||||||
|
... ...
|
||||||
|
!
|
||||||
|
! initialize the parallel environment
|
||||||
|
call psb_init(ictxt)
|
||||||
|
call psb_info(ictxt,iam,np)
|
||||||
|
... ...
|
||||||
|
!
|
||||||
|
! read and assemble the matrix A and the right-hand
|
||||||
|
! side b using PSBLAS routines for sparse matrix /
|
||||||
|
! vector management
|
||||||
|
... ...
|
||||||
|
!
|
||||||
|
! initialize the default multi-level preconditioner
|
||||||
|
! (two-level hybrid post-smoothed Schwarz)
|
||||||
|
call mld_precinit(PRE,'ML',info)
|
||||||
|
!
|
||||||
|
! build the preconditioner
|
||||||
|
call mld_precbld(A,DESC_A,PRE,info)
|
||||||
|
!
|
||||||
|
! set the solver parameters and the initial guess
|
||||||
|
... ...
|
||||||
|
!
|
||||||
|
! solve Ax=b with preconditioned BiCGSTAB
|
||||||
|
call psb_krylov('BICGSTAB',A,PRE,b,x,tol,DESC_A,info)
|
||||||
|
... ...
|
||||||
|
!
|
||||||
|
! cleanup the preconditioner
|
||||||
|
call mld_precfree(PRE,info)
|
||||||
|
!
|
||||||
|
! cleanup other data structures
|
||||||
|
... ...
|
||||||
|
!
|
||||||
|
! exit the parallel environment
|
||||||
|
call psb_exit(ictxt)
|
||||||
|
stop
|
||||||
|
\end{verbatim}
|
||||||
|
|
||||||
|
|
||||||
|
\textbf{MODIFY ALL OF THE FOLLOWING PART:\\
- only the statements that differ from the previous example (essentially the setting of the preconditioner, possibly with several calls to precset);\\
- keep the remark on the explicit specification of the number of levels;\\
- refer to the next section for an accurate description of all the parameters;\\
- keep the remark about legacy PSBLAS users.}\\
|
||||||
|
|
||||||
|
In the following we describe the general procedure for setting and building one of the
MLD2P4 preconditioners.
The user must first prepare the preconditioner data structure by calling the routine
\verb|mld_precinit|. The input arguments of this routine include a string, which defines
the preconditioner type, and an optional integer, which specifies
the number of levels in the case of a multi-level preconditioner.
If the optional argument is not present and a multi-level preconditioner has been chosen,
a two-level preconditioner is set. Conversely, the integer argument is ignored if the
preconditioner type is not multi-level.
Table~\ref{tab:precinit} reports the possible choices for the preconditioner type
and the related default preconditioners.
|
||||||
|
|
||||||
|
|
||||||
|
The user of MLD2P4 may set many parameters for one-level and multi-level Schwarz
preconditioners, in order to define a preconditioner different from the default one.
The parameters are set through the routine \verb|mld_precset|. The APIs of
\verb|mld_precinit| and \verb|mld_precset|, as well as the complete
list of the parameters that can be set, with the corresponding allowed values, are reported
in Section~\ref{sec:highlevel}. The code below sets up
a three-level hybrid post-smoothed Schwarz preconditioner, using RAS with overlap 1 as local
preconditioner, with ILU(0) on the local blocks, a distributed coarse matrix, and four
block-Jacobi sweeps with the UMFPACK LU
factorization on the blocks as coarse-matrix solver. Note that, for the multi-level
preconditioners, the levels are numbered in increasing
order starting from the finest one, i.e.\ level 1 is the finest level.
For more details, see the test program \verb|example2.f90| in xxxx (test directory).\\[0.5cm]
|
||||||
|
|
||||||
|
\begin{verbatim}
|
||||||
|
use psb_base_mod
|
||||||
|
use psb_util_mod
|
||||||
|
use mld_prec_mod
|
||||||
|
use psb_krylov_mod
|
||||||
|
... ...
|
||||||
|
!
|
||||||
|
! sparse matrix
|
||||||
|
type(psb_dspmat_type) :: A
|
||||||
|
! sparse matrix descriptor
|
||||||
|
type(psb_desc_type) :: DESC_A
|
||||||
|
! preconditioner data
|
||||||
|
type(mld_dprec_type) :: PRE
|
||||||
|
... ...
|
||||||
|
!
|
||||||
|
! initialization of the parallel environment
|
||||||
|
|
||||||
|
call psb_init(ictxt)
|
||||||
|
call psb_info(ictxt,iam,np)
|
||||||
|
... ...
|
||||||
|
! read and assemble the matrix A and the right-hand
|
||||||
|
! side vector b using PSBLAS routines for sparse
|
||||||
|
! matrix/vector management
|
||||||
|
... ...
|
||||||
|
! prepare the three-level hybrid post-smoothed Schwarz
|
||||||
|
! using RAS with overlap 1 as local preconditioner
|
||||||
|
!
|
||||||
|
call mld_precinit(PRE,'ML',info,nlev=3)
|
||||||
|
call mld_precset(PRE,mld_n_ovr_,1,info,ilev=1)
call mld_precset(PRE,mld_sub_restr_,psb_halo_,info,ilev=1)
! NOTE: "PSB_HALO_" IS REALLY UGLY; WE SHOULD HAVE CONSTANTS WITH THE MLD PREFIX!
|
||||||
|
!
|
||||||
|
! build preconditioner
|
||||||
|
call mld_precbld(A,DESC_A,PRE,info)
|
||||||
|
!
|
||||||
|
! set solver parameters and initial guess
|
||||||
|
... ...
|
||||||
|
! solve Ax=b with preconditioned BiCGSTAB
|
||||||
|
|
||||||
|
call psb_krylov('BICGSTAB',A,PRE,b,x,tol,DESC_A,info)
|
||||||
|
... ...
|
||||||
|
!
|
||||||
|
! cleanup storage and exit
|
||||||
|
!
|
||||||
|
call mld_precfree(PRE,info)
|
||||||
|
!
|
||||||
|
call psb_gefree(b,DESC_A,info)
|
||||||
|
call psb_gefree(x,DESC_A,info)
|
||||||
|
call psb_spfree(A,DESC_A,info)
|
||||||
|
call psb_cdfree(DESC_A,info)
|
||||||
|
!
|
||||||
|
call psb_exit(ictxt)
|
||||||
|
stop
|
||||||
|
|
||||||
|
\end{verbatim}
|
||||||
|
|
||||||
|
{\bf Remark for users with PSBLAS-based legacy codes:} when MLD2P4 is installed, a user with
a PSBLAS-based legacy code that calls the base preconditioners included in PSBLAS (NOPREC, DIAG
and BJAC) can use the same preconditioners without changes to the code, provided that
the program includes \verb|psb_prec_mod|.
|
||||||
|
|
||||||
|
%%% Local Variables:
|
||||||
|
%%% mode: latex
|
||||||
|
%%% TeX-master: "userguide"
|
||||||
|
%%% End:
|
@ -0,0 +1,279 @@
|
|||||||
|
\section{High-Level User Interface\label{sec:highlevel}}
|
||||||
|
|
||||||
|
At the upper layer of MLD2P4, five black-box routines encapsulate all the functionalities
for the construction and the application of any of the multi-level preconditioners.
In the following we give the details of these routines. Note that four versions of each
routine are available, depending on the data types involved: real single precision,
real double precision, complex single precision and complex double precision.
|
||||||
|
|
||||||
|
\subsection{Preconditioner Setup and Building}\label{sec:setup}
|
||||||
|
|
||||||
|
The setup of an MLD2P4 preconditioner is obtained by using the \verb|mld_precinit| routine, which
|
||||||
|
allocates and initializes the preconditioner data structure.
|
||||||
|
The API of this routine as well as the description of the arguments is reported in Fig.~\ref{fig:prcinit}.
|
||||||
|
Note that the allowed values for the \verb|ptype| argument are reported in Table~\ref{tab:precinit} (Sec. \ref{sec:started}).
|
||||||
|
%
|
||||||
|
\begin{figure}[h]
|
||||||
|
\begin{center}
|
||||||
|
{\small
|
||||||
|
\begin{verbatim}
|
||||||
|
mld_precinit(p,ptype,info,nlev)
|
||||||
|
|
||||||
|
Arguments:
|
||||||
|
p type(mld_dprec_type), input/output.
|
||||||
|
The preconditioner data structure.
|
||||||
|
ptype character, input. The type of preconditioner.
|
||||||
|
info integer, output. Error code.
|
||||||
|
nlev integer, optional, input.
|
||||||
|
The number of levels of the multilevel preconditioner.
|
||||||
|
If nlev is not present and ptype=`ML'/`ml',
|
||||||
|
then nlev=2 is assumed.
|
||||||
|
Otherwise, nlev is ignored.
|
||||||
|
\end{verbatim}
|
||||||
|
}
|
||||||
|
\end{center}
|
||||||
|
\caption{API of the routine for preconditioner allocation and initialization.\label{fig:prcinit}}
|
||||||
|
\end{figure}
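
As an illustration of the optional \verb|nlev| argument, the following minimal sketch
requests a three-level preconditioner and checks the returned error code. The subroutine
name \verb|init_three_level| is hypothetical, and the convention that a nonzero
\verb|info| signals an error is an assumption that should be checked against the
PSBLAS guide.

\begin{verbatim}
  ! Sketch only: init_three_level is a hypothetical helper,
  ! not an MLD2P4 routine.
  subroutine init_three_level(p, info)
    use psb_base_mod
    use mld_prec_mod
    implicit none
    type(mld_dprec_type), intent(inout) :: p
    integer, intent(out)                :: info

    ! request a multi-level preconditioner with three levels;
    ! if nlev were omitted, two levels would be assumed
    call mld_precinit(p, 'ML', info, nlev=3)

    ! assumption: info /= 0 means that the call failed
    if (info /= 0) then
      write(*,*) 'mld_precinit failed, info = ', info
    end if
  end subroutine init_three_level
\end{verbatim}

For a preconditioner type other than \verb|'ML'|, the \verb|nlev| argument would
simply be ignored, as stated in Fig.~\ref{fig:prcinit}.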
|
||||||
|
%
|
||||||
|
%
|
||||||
|
\begin{figure}[h]
|
||||||
|
\begin{center}
|
||||||
|
{\small
|
||||||
|
\begin{verbatim}
|
||||||
|
mld_precfree(p,info)
|
||||||
|
|
||||||
|
Arguments:
|
||||||
|
p - type(mld_dprec_type), input/output.
|
||||||
|
The preconditioner data structure to be deallocated.
|
||||||
|
info - integer, output.
|
||||||
|
Error code.
|
||||||
|
\end{verbatim}
|
||||||
|
}
|
||||||
|
\end{center}
|
||||||
|
\caption{API of the routine for preconditioner deallocation.\label{fig:prcfree}}
|
||||||
|
\end{figure}
|
||||||
|
|
||||||
|
The twin routine for deallocating the preconditioner data structure is \verb|mld_precfree|,
whose API is reported in Fig.~\ref{fig:prcfree}.
As mentioned in Section~\ref{sec:multilevel}, a multi-level preconditioner is a combination
of coarse-level corrections and one-level preconditioners (or smoothers).
Different combinations of these components, together with different types of one-level
preconditioners and different algorithms for building and applying the coarse-level
corrections, allow the user to define different multi-level preconditioners.
By setting some parameters through the routine \verb|mld_precset| (see
Section~\ref{sec:list}), the user of MLD2P4 may specify the type of multi-level framework
(additive or multiplicative), details on the
aggregation algorithm, the type of one-level preconditioner and the way it is applied
(as pre-smoother, post-smoother or both), the coarsest matrix storage
(distributed or replicated), and the type of solver to be employed at the coarsest level,
with related details.
The API of this routine is reported in Fig.~\ref{fig:prcset}.
|
||||||
|
%
|
||||||
|
\begin{figure}[h]
|
||||||
|
\begin{center}
|
||||||
|
{\small
|
||||||
|
\begin{verbatim}
|
||||||
|
mld_precset(p,what,val,info,ilev)
|
||||||
|
|
||||||
|
Arguments:
|
||||||
|
p - type(mld_dprec_type), input/output.
|
||||||
|
The preconditioner data structure.
|
||||||
|
what - integer, input.
|
||||||
|
The number identifying the parameter to be set.
|
||||||
|
A mnemonic constant has been associated to each of these
|
||||||
|
numbers.
|
||||||
|
val - integer/character, input.
|
||||||
|
The value of the parameter to be set.
|
||||||
|
info - integer, output.
|
||||||
|
Error code.
|
||||||
|
ilev - integer, optional, input.
|
||||||
|
For the multilevel preconditioner, the level at which the
|
||||||
|
preconditioner parameter has to be set.
|
||||||
|
If ilev is not present, the parameter identified by 'what'
|
||||||
|
is set at all the appropriate levels.
|
||||||
|
\end{verbatim}
|
||||||
|
}
|
||||||
|
\end{center}
|
||||||
|
\caption{API of the routine for preconditioner setup.\label{fig:prcset}}
|
||||||
|
\end{figure}
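
The role of the optional \verb|ilev| argument can be sketched as follows. The code
assumes that \verb|p| has already been initialized by \verb|mld_precinit| with type
\verb|'ML'|; the subroutine name \verb|choose_variant| is hypothetical, the parameter
names and strings are those listed in Section~\ref{sec:list}, and the test on
\verb|info| assumes that a nonzero value signals an error.

\begin{verbatim}
  ! Sketch only: choose_variant is a hypothetical helper,
  ! not an MLD2P4 routine.
  subroutine choose_variant(p, info)
    use psb_base_mod
    use mld_prec_mod
    implicit none
    type(mld_dprec_type), intent(inout) :: p
    integer, intent(out)                :: info

    ! ilev absent: the parameter is set at all the
    ! appropriate levels
    call mld_precset(p, mld_ml_type_, 'MULT', info)
    if (info /= 0) return
    call mld_precset(p, mld_smooth_pos_, 'POST', info)
    if (info /= 0) return

    ! ilev=1: the overlap is set at the finest level only
    call mld_precset(p, mld_n_ovr_, 1, info, ilev=1)
  end subroutine choose_variant
\end{verbatim}

Calls of this kind are placed between \verb|mld_precinit| and \verb|mld_precbld|.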
|
||||||
|
%
|
||||||
|
Finally, to build a preconditioner according to the choices made through the routines
\verb|mld_precinit| and \verb|mld_precset|,
the user of MLD2P4 must call the \verb|mld_precbld| routine, whose API is reported in Figure~\ref{fig:prcbld}.
|
||||||
|
%
|
||||||
|
\begin{figure}[h]
|
||||||
|
\begin{center}
|
||||||
|
{\small
|
||||||
|
\begin{verbatim}
|
||||||
|
mld_precbld(a,desc_a,p,info)
|
||||||
|
|
||||||
|
Arguments:
|
||||||
|
a - type(psb_dspmat_type).
|
||||||
|
The sparse matrix structure containing the local part of the
|
||||||
|
matrix to be preconditioned.
|
||||||
|
desc_a - type(psb_desc_type), input.
|
||||||
|
The communication descriptor of a.
|
||||||
|
p - type(mld_dprec_type), input/output.
|
||||||
|
The preconditioner data structure containing the local part
|
||||||
|
of the preconditioner to be built.
|
||||||
|
info - integer, output.
|
||||||
|
Error code.
|
||||||
|
\end{verbatim}
|
||||||
|
}
|
||||||
|
\end{center}
|
||||||
|
\caption{API of the routine for preconditioner building.\label{fig:prcbld}}
|
||||||
|
\end{figure}
|
||||||
|
|
||||||
|
\subsubsection{List of the preconditioner parameters\label{sec:list}}
|
||||||
|
|
||||||
|
In the following we report the list of possible parameters to be set through the \verb|mld_precset| routine,
|
||||||
|
in order to choose the type of multi-level preconditioner. The parameters are classified depending on their scope.
|
||||||
|
Note that for character data both uppercase and lowercase strings are allowed.
|
||||||
|
\begin{table}[h]
{\small
\begin{tabular}{ll}
Parameter (\verb|what|) & Allowed values (\verb|val|)\\
\verb|mld_ml_type_| & 'ADD', 'MULT'\\
 & Define the type of multi-level preconditioner.\\
\verb|mld_prec_type_| & 'DIAG', 'BJAC', 'AS' \\
 & Define the smoother at a certain level.\\
\verb|mld_smooth_pos_| & 'PRE', 'POST', 'BOTH'\\
 & Define the way to apply the smoother.\\
\end{tabular}
\caption{Parameters for preconditioner type.\label{tab:prec_type}}
}
\end{table}
|
||||||
|
|
||||||
|
In order to build a coarse matrix from a fine one, this version of MLD2P4 implements the
smoothed aggregation algorithm described in Section~\ref{sec:aggregation}. However, since for
nonsymmetric problems the definition of a correct smoothing procedure is still an open
problem~\cite{lin}, the user
may also choose a nonsmoothed aggregation technique, where the prolongator from
the coarse-space to the fine-space vertices is the simple piecewise constant interpolation
operator (the tentative prolongator) defined in Section~\ref{sec:aggregation}.
The coarsening scheme takes into account possible anisotropic features of the problem, by using
a threshold for dropping matrix coefficients during the aggregation process.
The parallel implementation of the coarsening algorithm is based on a decoupled approach, where
each process applies the coarsening scheme
to its own local data. For matrices with a nonsymmetric sparsity pattern, the decoupled scheme
can be applied to the matrix $A+A^T$.
Table~\ref{tab:aggr_type} lists the parameters that the user can specify for the aggregation algorithm.
|
||||||
|
\begin{table}[h]
{\small
\begin{tabular}{ll}
Parameter & Allowed values \\
(\verb|what|) & (\verb|val|)\\
\verb|mld_aggr_alg_| & 'DEC', 'SYMDEC'\\
 & Define the aggregation scheme.\\
 & Currently, only decoupled aggregation is available\\
 & (if 'SYMDEC' is set, the symmetric part of the matrix is considered).\\
\verb|mld_aggr_kind_| & 'SMOOTH', 'RAW'\\
 & Define the type of aggregation technique (smoothed or nonsmoothed).\\
\verb|mld_aggr_thresh_| & Dropping threshold in aggregation.\\
 & Default 0.0\\
\verb|mld_aggr_eig_| & THE STRING CORRESPONDING TO mldmaxnorm IS NOT DEFINED\\
 & Define the algorithm to evaluate the maximum eigenvalue\\
 & of $D^{-1}A$ for smoothed aggregation. Currently, only the A-norm of the\\
 & matrix is available.\\
\end{tabular}
\caption{Parameters for aggregation type.\label{tab:aggr_type}}
}
\end{table}
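
As a reminder of what the parameters in Table~\ref{tab:aggr_type} control, the two kinds
of aggregation can be written in the form commonly used in the smoothed aggregation
literature. This is only a sketch: the notation and the value of the damping parameter
$\omega$ are assumptions, and the definitions in Section~\ref{sec:aggregation} remain
the reference. Let $\tilde{P}$ be the tentative (piecewise constant) prolongator,
$D$ the diagonal of $A$, and $\rho$ an estimate of the maximum eigenvalue of $D^{-1}A$.
Then
\begin{eqnarray*}
  P   & = & \tilde{P}
      \qquad \mbox{(nonsmoothed aggregation, 'RAW')}, \\
  P   & = & (I - \omega D^{-1} A)\, \tilde{P},
      \quad \omega = \frac{4}{3\rho}
      \qquad \mbox{(smoothed aggregation, 'SMOOTH')}, \\
  A_C & = & P^T A P
      \qquad \mbox{(coarse matrix)}.
\end{eqnarray*}
In this form, \verb|mld_aggr_eig_| selects how $\rho$ is estimated, while
\verb|mld_aggr_thresh_| affects which coefficients of $A$ are taken into account when
the aggregates defining $\tilde{P}$ are built.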
|
||||||
|
|
||||||
|
Some options are available for the system involving the coarsest matrix.
|
||||||
|
Indeed, this matrix can be replicated or distributed among the processors.
|
||||||
|
In the former case, various versions of incomplete LU (ILU) factorizations of the
|
||||||
|
coarsest matrix are available in order to solve the coarsest system.
|
||||||
|
In the current version of MLD2P4, the following factorizations are available~\cite{saad}:
|
||||||
|
\begin{description}
|
||||||
|
\item[ILU(k):] ILU factorization with fill-in level $k$;
|
||||||
|
\item[MILU(k):] modified ILU factorization with fill-in level $k$;
|
||||||
|
\item[ILU(k,t):] ILU with threshold $t$ and $k$ additional entries in each row of the L and U factors with respect to the initial sparsity pattern.
|
||||||
|
\end{description}
|
||||||
|
Furthermore, interfaces to UMFPACK~\cite{UMFPACK}, version 4.4, and to SuperLU~\cite{SUPERLU}, version 3.0, are also available to deal
with the coarsest system, when the coarsest matrix is replicated among the processors.
On the other hand, to solve the coarsest-level system when the coarsest matrix is distributed,
a block-Jacobi routine has been developed. It uses the different versions of ILU or the LU
factorization on the diagonal blocks of the coarse matrix held by the processors. When the
coarsest matrix is distributed, an interface to SuperLU\_Dist~\cite{SUPERLUDIST}, version 2.0,
is also available for distributed sparse factorization and solve.
See Table~\ref{tab:coarse_mat} for details.
|
||||||
|
\begin{table}[h]
{\small
\begin{tabular}{ll}
Parameter & Allowed values\\
(\verb|what|) & (\verb|val|)\\
\verb|mld_coarse_mat_| & 'DISTR', 'REPL' \\
 & Coarse matrix: distributed or replicated.\\
\verb|mld_coarse_solve_| & 'ILU', 'MILU', 'ILUT', 'SLU', 'UMF', 'SLUDIST', BJAC????\\
 & Available coarse solvers.\\
 & Only SLUDIST and BJAC can be used when the coarse matrix is distributed.\\
\verb|mld_coarse_BJAC_sweeps_| & (mldcoarsesweeps IS NOT GOOD) Number of block-Jacobi sweeps when BJAC is used as coarsest solver.\\
\verb|mld_coarse_fill_in_| & Level of fill-in in the MILU and ILU factorizations.\\
 & AND THE THRESHOLD FOR ILUT? \\
\end{tabular}
\caption{Parameters for coarsest matrix solver.\label{tab:coarse_mat}}
}
\end{table}
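
The constraint linking the storage of the coarsest matrix and the coarse solver can be
sketched as follows. The subroutine name \verb|set_coarse_solver| and its logical argument
are hypothetical; the parameter names and strings are taken from
Table~\ref{tab:coarse_mat}, some of which are still provisional, and a nonzero
\verb|info| is assumed to signal an error.

\begin{verbatim}
  ! Sketch only: set_coarse_solver is a hypothetical helper,
  ! not an MLD2P4 routine.
  subroutine set_coarse_solver(p, replicated, info)
    use psb_base_mod
    use mld_prec_mod
    implicit none
    type(mld_dprec_type), intent(inout) :: p
    logical, intent(in)                 :: replicated
    integer, intent(out)                :: info

    if (replicated) then
      ! replicated coarsest matrix: a sequential LU
      ! factorization (here UMFPACK) can be used
      call mld_precset(p, mld_coarse_mat_, 'REPL', info)
      if (info == 0) &
        call mld_precset(p, mld_coarse_solve_, 'UMF', info)
    else
      ! distributed coarsest matrix: only a distributed
      ! solver (here SuperLU_Dist) or block-Jacobi applies
      call mld_precset(p, mld_coarse_mat_, 'DISTR', info)
      if (info == 0) &
        call mld_precset(p, mld_coarse_solve_, 'SLUDIST', info)
    end if
  end subroutine set_coarse_solver
\end{verbatim}

Both choices require that MLD2P4 has been built with the corresponding external library.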
|
||||||
|
|
||||||
|
When a Schwarz algorithm is considered as smoother at a certain level or as one-level preconditioner, the user may set many parameters
|
||||||
|
in order to choose the additive Schwarz variant (AS, RAS, ASH), the number of overlaps and the local solver.
|
||||||
|
All the parameters are reported in Table \ref{tab:schwarz_type}.
|
||||||
|
\begin{table}[h]
{\small
\begin{tabular}{ll}
Parameter & Allowed values\\
(\verb|what|) & (\verb|val|)\\
\verb|mld_n_ovr_| & Number of overlaps \\
\verb|mld_sub_restr_| & 'HALO', 'NONE'\\
\verb|mld_sub_prol_| & 'SUM', 'NONE'\\
\verb|mld_sub_solve_| & 'ILU', 'MILU', 'ILUT', 'SLU', 'UMF'\\
\verb|mld_sub_ren_| & THE STRINGS ARE MISSING\\
\verb|mld_sub_fill_in_| & Level of fill-in in the local diagonal blocks, when ILU-type factorizations are used\\
\end{tabular}
\caption{Parameters for Schwarz smoother/preconditioner type.\label{tab:schwarz_type}}
}
\end{table}
|
||||||
|
It is worth noting that the classical AS method corresponds to the pair of values 'HALO' and 'SUM' of the argument \verb|val|,
for the values \verb|mld_sub_restr_| and \verb|mld_sub_prol_| of the argument \verb|what|, respectively. The RAS method corresponds to
the pair 'HALO' and 'NONE', and the ASH method corresponds to the pair 'NONE' and 'SUM'.
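
For reference, the three variants can be written in a common notation. This is a sketch
in the notation usual in the Schwarz literature, which is assumed here and may differ
from that used elsewhere in this guide. Let $m$ be the number of subdomains, $R_i$ the
restriction of a global vector to the $i$-th overlapping subdomain, $\tilde{R}_i$ the
operator obtained from $R_i$ by setting to zero the rows associated with the overlap
vertices, and $A_i = R_i A R_i^T$. Then
\begin{eqnarray*}
  M_{AS}^{-1}  & = & \sum_{i=1}^{m} R_i^T A_i^{-1} R_i , \\
  M_{RAS}^{-1} & = & \sum_{i=1}^{m} \tilde{R}_i^T A_i^{-1} R_i , \\
  M_{ASH}^{-1} & = & \sum_{i=1}^{m} R_i^T A_i^{-1} \tilde{R}_i .
\end{eqnarray*}
Hence AS uses the overlap both in the restriction and in the prolongation, RAS neglects
it in the prolongation, and ASH neglects it in the restriction. In practice $A_i^{-1}$
is replaced by the local solver selected through \verb|mld_sub_solve_|.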
|
||||||
|
|
||||||
|
\subsection{Preconditioner Application} \label{sec:application}
|
||||||
|
|
||||||
|
Once the preconditioner has been built, it may be applied at each iteration
|
||||||
|
of a Krylov solver by calling the routine \verb|mld_precaply| (CHANGE THE ROUTINE NAME IN THE SOFTWARE, AVOIDING THE UNDERSCORE),
|
||||||
|
whose API is shown in Figure~\ref{fig:prcaply}.
|
||||||
|
This routine computes $y = op(M^{-1})\, x$, where $M$ is the previously built
|
||||||
|
preconditioner, stored in the \verb|prec| data structure, and $op$
|
||||||
|
denotes the matrix itself or its transpose, according to the value of \verb|trans|.
|
||||||
|
Note that this routine is called within the Krylov solvers available in the PSBLAS library (see the PSBLAS User's Guide for details);
therefore, its use is generally transparent to the MLD2P4 user.
|
||||||
|
%
|
||||||
|
\begin{figure}[h]
|
||||||
|
\begin{center}
|
||||||
|
{\small
|
||||||
|
\begin{verbatim}
|
||||||
|
mld_precaply(prec,x,y,desc_data,info,trans,work)
|
||||||
|
|
||||||
|
Arguments:
|
||||||
|
prec - type(mld_dprec_type), input.
|
||||||
|
The preconditioner data structure containing the local part
|
||||||
|
of the preconditioner to be applied.
|
||||||
|
x - real(psb_dpk_), dimension(:), input.
|
||||||
|
The local part of the vector X in Y := op(M^(-1)) * X.
|
||||||
|
y - real(psb_dpk_), dimension(:), output.
|
||||||
|
The local part of the vector Y in Y := op(M^(-1)) * X.
|
||||||
|
desc_data - type(psb_desc_type), input.
|
||||||
|
The communication descriptor associated to the matrix to be
|
||||||
|
preconditioned.
|
||||||
|
info - integer, output.
|
||||||
|
Error code.
|
||||||
|
trans - character(len=1), optional.
|
||||||
|
If trans='N','n' then op(M^(-1)) = M^(-1);
|
||||||
|
if trans='T','t' then op(M^(-1)) = M^(-T) (transpose of M^(-1)).
|
||||||
|
work - real(psb_dpk_), dimension (:), optional, target.
|
||||||
|
Workspace. Its size must be at
|
||||||
|
least 4*psb_cd_get_local_cols(desc_data).
|
||||||
|
\end{verbatim}
|
||||||
|
}
|
||||||
|
\end{center}
|
||||||
|
\caption{API of the routine for preconditioner application.\label{fig:prcaply}}
|
||||||
|
\end{figure}
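
When the preconditioner has to be applied outside a PSBLAS Krylov solver, a direct call
may look like the following sketch. The wrapper name \verb|apply_prec| is hypothetical.
The vectors are declared \verb|intent(inout)| in the wrapper as a conservative
assumption, although Fig.~\ref{fig:prcaply} lists \verb|x| as input and \verb|y| as
output; the optional \verb|work| argument is omitted.

\begin{verbatim}
  ! Sketch only: apply_prec is a hypothetical wrapper,
  ! not an MLD2P4 routine.
  subroutine apply_prec(prec, x, y, desc_a, info)
    use psb_base_mod
    use mld_prec_mod
    implicit none
    type(mld_dprec_type), intent(in)    :: prec
    real(psb_dpk_),       intent(inout) :: x(:), y(:)
    type(psb_desc_type),  intent(in)    :: desc_a
    integer,              intent(out)   :: info

    ! y := M^(-1) x; trans='T' would apply M^(-T) instead
    call mld_precaply(prec, x, y, desc_a, info, trans='N')
  end subroutine apply_prec
\end{verbatim}

Here \verb|x| and \verb|y| hold the local parts of the corresponding distributed vectors.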
|
||||||
|
|
||||||
|
%%% Local Variables:
|
||||||
|
%%% mode: latex
|
||||||
|
%%% TeX-master: "userguide"
|
||||||
|
%%% End:
|
@ -0,0 +1,10 @@
|
|||||||
|
\section{List of Routines}\label{sec:routines}
|
||||||
|
|
||||||
|
List (in alphabetical order) of all the routines, with a reference (hyperlink and page number) to the description
of each one in a previous section
(a kind of index, pointing to the routines described earlier in the respective sections)
|
||||||
|
|
||||||
|
%%% Local Variables:
|
||||||
|
%%% mode: latex
|
||||||
|
%%% TeX-master: "userguide"
|
||||||
|
%%% End:
|
@ -0,0 +1,62 @@
|
|||||||
|
\section{General Overview\label{sec:overview}}
|
||||||
|
|
||||||
|
The \emph{Multi-Level Domain Decomposition Parallel Preconditioners Package based on
|
||||||
|
PSBLAS (MLD2P4}) provides various versions of multi-level Schwarz preconditioners~\cite{DD2},
|
||||||
|
to be used in the iterative solutions of sparse linear systems $Ax=b$, where
|
||||||
|
$A$ is a square, real or complex, sparse matrix with a symmetric sparsity pattern.
|
||||||
|
\textbf{But didn't we say that, if the sparsity pattern is not symmetric,
we work on $(A+A^T)/2$? Does this hold only for the aggregation? We should do
something consistent with one-level Schwarz too.}
|
||||||
|
Both additive and hybrid preconditioners, i.e.\ multiplicative among the levels
|
||||||
|
and additive inside a level, are implemented; the basic additive Schwarz preconditioners
|
||||||
|
are obtained by considering only one level. A purely algebraic approach is used to
|
||||||
|
generate a sequence of coarse-level corrections to a basic preconditioner, without
|
||||||
|
explicitly using any information on the geometry of the original problem (e.g.\ the
|
||||||
|
discretization of a PDE). The smoothed aggregation technique is applied
|
||||||
|
as algebraic coarsening strategy~\cite{}.
|
||||||
|
|
||||||
|
The package is written in Fortran~95, using object-oriented techniques,
|
||||||
|
and is based on a distributed-memory parallel programming paradigm. \textbf{SALVATORE,
could you add a couple of lines on the choice of Fortran 95 and on the easy interfacing
with legacy codes, without repeating what is said below about the choice of PSBLAS?}
Single and double precision implementations of MLD2P4 are available for both the
real and the complex case, and can be used through a single interface.
\textbf{SALVATORE, does everything work?}
|
||||||
|
|
||||||
|
MLD2P4 has been designed to implement scalable and easy-to-use multilevel preconditioners
|
||||||
|
in the context of the PSBLAS (Parallel Sparse BLAS) computational framework~\cite{}.
|
||||||
|
PSBLAS is a library originally developed to address the parallel implementation of
|
||||||
|
iterative solvers for sparse linear systems, by providing basic linear algebra
operators and data management facilities for distributed sparse matrices; it
also includes parallel Krylov solvers, built on top of the basic PSBLAS kernels.
|
||||||
|
The preconditioners available in MLD2P4 can be used with these Krylov solvers.
|
||||||
|
The choice of PSBLAS has been mainly motivated by the need for
a portable and efficient software infrastructure implementing ``de facto'' standard
parallel sparse linear algebra kernels, to pursue goals such as performance,
portability, modularity and extensibility in the development of the preconditioner
package. On the other hand, the implementation of MLD2P4 has led to some
revisions and extensions of the PSBLAS kernels, leading to the
recent PSBLAS 2.0 version~\cite{}. The inter-process communication required
by MLD2P4 is encapsulated into the PSBLAS routines, except in a few cases where
MPI~\cite{} is explicitly called. Therefore, MLD2P4 can be run on any parallel
machine where PSBLAS and MPI implementations are available.
|
||||||
|
|
||||||
|
MLD2P4 has a layered and modular software architecture where three main layers can be identified. The lower layer consists of the PSBLAS kernels, the middle one implements
|
||||||
|
the construction and application phases of the preconditioners, and the upper one
|
||||||
|
provides a uniform and easy-to-use interface to all the preconditioners.
|
||||||
|
This architecture allows for different levels of use of the package:
|
||||||
|
a few black-box routines at the upper level allow non-expert users to easily
build any preconditioner available in MLD2P4 and to apply it within a PSBLAS Krylov solver.
On the other hand, the routines of the middle and lower layers can be used and extended
|
||||||
|
by expert users to build new versions of multi-level Schwarz preconditioners.\\
|
||||||
|
|
||||||
|
\textbf{Organization of the guide:\\
say that for the moment we do not
also provide the documentation of the middle layer, but we will do so later\\}
|
||||||
|
|
||||||
|
\textbf{Highlight the keywords that characterize our package}
|
||||||
|
|
||||||
|
%%% Local Variables:
|
||||||
|
%%% mode: latex
|
||||||
|
%%% TeX-master: "userguide"
|
||||||
|
%%% End:
|