mld2p4:
docs/pdf/Makefile docs/pdf/abstract.tex docs/pdf/advanced.tex docs/pdf/background.tex docs/pdf/bibliography.tex docs/pdf/building.tex docs/pdf/conventions.tex docs/pdf/distribution.tex docs/pdf/errors.tex docs/pdf/gettingstarted.tex docs/pdf/highlevelview.tex docs/pdf/listofroutines.tex docs/pdf/overview.tex docs/pdf/userguide.tex docs/userguide.pdf New documentation, partial fixes.stopcriterion
parent
9eeef87a3a
commit
001f6693b8
@ -1,19 +1,19 @@
\begin{abstract}
\emph{MLD2P4 (Multi-Level Domain Decomposition Parallel Preconditioners Package based on
PSBLAS)} is a package of parallel algebraic multi-level preconditioners.
It implements various versions of one-level additive and of multi-level additive
and hybrid Schwarz algorithms. In the multi-level case, a purely algebraic approach
is applied to generate coarse-level corrections, so that no geometric background is needed
concerning the matrix to be preconditioned. The matrix is required to be square, real or complex, with a symmetric sparsity pattern \textbf{Don't we also consider the nonsymmetric case,
using $(A+A^T)/2$?}.

MLD2P4 has been designed to provide scalable and easy-to-use preconditioners in the
context of the PSBLAS (Parallel Sparse Basic Linear Algebra Subprograms)
computational framework and can be used in conjunction with the Krylov solvers
available in this framework. MLD2P4 enables the user to easily specify different aspects
of a generic algebraic multilevel Schwarz preconditioner, thus making it possible to search
for the ``best'' preconditioner for the problem at hand. The package has been designed
employing object-oriented techniques, using Fortran 95 and MPI, with interfaces to
additional external libraries such as UMFPACK, SuperLU and SuperLU\_Dist, which
can be exploited in building multi-level preconditioners.
\end{abstract}
|
@ -1,12 +1,12 @@
\section{Advanced Use}\label{sec:advanced}

- MLD2P4 software architecture \\
- preconditioner data structure (``detailed'' description) + possibility of setting
the individual levels separately (only hinted at in the previous description of precset) \\
- description of the medium-level routines (with an introduction to the possibilities for
extension (?) offered by this software layer) \\

%%% Local Variables:
%%% mode: latex
%%% TeX-master: "userguide"
%%% End:
@ -1,291 +1,291 @@
\section{Multi-level Domain Decomposition Background\label{sec:background}}

\emph{Domain Decomposition} (DD) preconditioners, coupled with Krylov iterative
solvers, are widely used in the parallel solution of large and sparse linear systems.
These preconditioners are based on the divide-and-conquer technique: the matrix
to be preconditioned is divided into submatrices, a ``local linear system''
involving each submatrix is (approximately) solved, and the local solutions are used
to build a preconditioner for the whole original matrix. This process
often corresponds to dividing a physical domain associated with the original matrix
into subdomains, e.g.\ in a PDE discretization, to (approximately) solving the
subproblems corresponding to the subdomains and to building an approximate
solution of the original problem from the local solutions
\cite{Cai_Widlund_92,dd1_94,dd2_96}.

\emph{Additive Schwarz} preconditioners are DD preconditioners using overlapping
submatrices, i.e.\ with some common rows, to couple the local information
related to the submatrices (see, e.g., \cite{dd2_96}).
The main motivations for choosing Additive Schwarz preconditioners are their
intrinsic parallelism and good \textbf{(saying ``good'' is a bit strong, since
right afterwards we say that convergence depends on the number of submatrices)}
convergence properties. A drawback of these
preconditioners is that the number of iterations of the preconditioned solvers
generally grows with the number of submatrices. This may be a serious limitation
on parallel computers, since the number of submatrices usually matches the number
of available processors. Optimal convergence rates, i.e.\ iteration numbers
independent of the number of submatrices, can be obtained by correcting the
preconditioner through a suitable approximation of the original linear system
in a coarse space, which globally couples the information related to the individual
submatrices.

\emph{Two-level Schwarz} preconditioners are obtained
by combining basic (one-level) Schwarz preconditioners with coarse-level
corrections. In this context, the one-level preconditioner is often
called the smoother. Different two-level preconditioners are obtained by varying the
choice of the smoother, of the coarse-level correction and of the
way they are combined \cite{dd2_96}. The same reasoning can be applied starting
from the coarse-level system, i.e.\ a coarse-space correction can be built
from this system, thus obtaining \emph{multi-level} preconditioners.

It is worth noting that optimal preconditioners do not necessarily correspond
to minimum execution times. Indeed, to obtain effective multilevel preconditioners
a tradeoff between optimality of convergence and the cost of building and applying
the coarse-space corrections must be achieved. The choice of the number of levels,
i.e.\ of the coarse-space corrections, also affects the effectiveness of the
preconditioners. A further goal is to obtain convergence rates that are as insensitive
as possible to variations in the matrix coefficients.

Two main approaches can be used to build coarse-space corrections. The geometric approach
applies coarsening strategies based on the knowledge of some physical grid associated
with the matrix and requires the user to define grid transfer operators from the fine
to the coarse levels and vice versa. This may be difficult for complex geometries;
furthermore, suitable one-level preconditioners may be required to get efficient
interplay between fine and coarse levels, e.g.\ when matrices with highly varying coefficients
are considered. The algebraic approach builds coarse-space corrections using only matrix
information. It performs a fully automatic coarsening and enforces the interplay between
the fine and coarse levels by suitably choosing the coarse space and the coarse-to-fine
interpolation \cite{StubenGMD69_99}.

MLD2P4 uses a purely algebraic approach for building the sequence of coarse matrices
starting from the original matrix. The algebraic approach is based on the \emph{smoothed
aggregation} algorithm \cite{Brezina_Vanek_,Vanek_Mandel_Brezina_}. A decoupled version
of this algorithm is implemented, where the smoothed aggregation is applied locally
to each submatrix \cite{Tuminaro_Tong_00}. In the next two subsections we provide
a brief description of the multi-level Schwarz preconditioners and of the smoothed
aggregation technique as implemented in MLD2P4. For further details the user
is referred to \cite{para_04,apnum_07,aaecc_07,dd2_96}.


\subsection{Multi-level Schwarz Preconditioners\label{sec:multilevel}}

The multi-level preconditioners implemented in MLD2P4 are obtained by combining
Additive Schwarz preconditioners with coarse-space corrections; therefore
we first provide a sketch of the Additive Schwarz preconditioners.

Given a linear system
\[ Ax=b, \]
where $A=(a_{ij}) \in \Re^{n \times n}$ is a
nonsingular sparse matrix with a symmetric non-zero pattern,
let $G=(W,E)$ be the adjacency graph of $A$, where $W=\{1, 2, \ldots, n\}$
and $E=\{(i,j) : a_{ij} \neq 0\}$ are the vertex set and the edge set of $G$,
respectively. Two vertices are called adjacent if there is an edge connecting
them. For any integer $\delta > 0$, a $\delta$-overlap
partition of $W$ can be defined recursively as follows.
Given a 0-overlap (or non-overlapping) partition of $W$,
i.e.\ a set of $m$ disjoint nonempty sets $W_i^0 \subset W$ such that
$\cup_{i=1}^m W_i^0 = W$, a $\delta$-overlap
partition of $W$ is obtained by considering the sets
$W_i^\delta \supset W_i^{\delta-1}$, obtained by including the vertices that
are adjacent to any vertex in $W_i^{\delta-1}$.
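As an illustration of the recursive definition (a small example introduced
here, not taken from the cited references), let $n=6$ and let $A$ be tridiagonal,
so that the edges of $G$ connect consecutive vertices only. Starting from the
0-overlap partition $W_1^0=\{1,2,3\}$, $W_2^0=\{4,5,6\}$, one step of the
recursion gives
\[
  W_1^1=\{1,2,3,4\}, \qquad W_2^1=\{3,4,5,6\},
\]
and a second step gives
\[
  W_1^2=\{1,2,3,4,5\}, \qquad W_2^2=\{2,3,4,5,6\}.
\]
Each increment of $\delta$ adds one layer of adjacent vertices to every subset.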

Let $n_i^\delta$ be the size of $W_i^\delta$ and $R_i^{\delta} \in
\Re^{n_i^\delta \times n}$ the restriction operator that maps
a vector $v \in \Re^n$ onto the vector $v_i^{\delta} \in \Re^{n_i^\delta}$
containing the components of $v$ corresponding to the vertices in
$W_i^\delta$. The transpose of $R_i^{\delta}$ is a
prolongation operator from $\Re^{n_i^\delta}$ to $\Re^n$.
The matrix $A_i^\delta=R_i^\delta A (R_i^\delta)^T \in
\Re^{n_i^\delta \times n_i^\delta}$ can be considered
as a restriction of $A$ corresponding to the set $W_i^{\delta}$.
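For instance (an illustrative case, with sizes chosen here for brevity), if $n=4$
and $W_i^\delta=\{1,2,3\}$, then
\[
  R_i^\delta = \left( \begin{array}{cccc}
    1 & 0 & 0 & 0 \\
    0 & 1 & 0 & 0 \\
    0 & 0 & 1 & 0
  \end{array} \right), \qquad
  R_i^\delta v = \left( \begin{array}{c} v_1 \\ v_2 \\ v_3 \end{array} \right),
\]
and $A_i^\delta = R_i^\delta A (R_i^\delta)^T$ is the leading $3 \times 3$
principal submatrix of $A$.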

The \emph{classical one-level AS} preconditioner is defined by
\[
M_{AS}^{-1}= \sum_{i=1}^m (R_i^{\delta})^T
(A_i^\delta)^{-1} R_i^{\delta},
\]
where $A_i^\delta$ is assumed to be nonsingular. Its application
to a vector $v \in \Re^n$ within a Krylov solver requires the following
three steps:
\begin{enumerate}
\item restriction of $v$ as $v_i = R_i^{\delta} v$, $i=1,\ldots,m$;
\item (approximate) solution of the linear systems $A_i^\delta w_i = v_i$,
$i=1,\ldots,m$;
\item prolongation and sum of the $w_i$'s, i.e.\ $w = \sum_{i=1}^m (R_i^{\delta})^T w_i$.
\end{enumerate}
A variant of the classical AS preconditioner that outperforms it
in terms of convergence rate and of computation and communication
time on parallel distributed-memory computers is the so-called \emph{Restricted AS
(RAS)} preconditioner~\cite{Cai_Sarkis,Efstathiou_Gander}. It
is obtained by zeroing the components of $w_i$ corresponding to the
overlapping vertices when applying the prolongation. Therefore,
RAS differs from classical AS by the prolongation operator $(R_i^{\delta})^T$,
which is substituted by $(\tilde{R}_i^0)^T$,
where $\tilde{R}_i^0 \in \Re^{n_i^\delta \times n}$ is obtained by zeroing the rows of $R_i^\delta$
corresponding to the vertices in $W_i^\delta \backslash W_i^0$:
\[
M_{RAS}^{-1}= \sum_{i=1}^m (\tilde{R}_i^0)^T
(A_i^\delta)^{-1} R_i^{\delta}.
\]
Analogously, the AS variant called \emph{AS with Harmonic extension (ASH)}
is defined by
\[ M_{ASH}^{-1}= \sum_{i=1}^m (R_i^{\delta})^T
(A_i^\delta)^{-1} \tilde{R}_i^0.
\]
We note that for $\delta=0$ the three variants of the AS preconditioner are
all equal to the block-Jacobi preconditioner.
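This can be checked directly from the definitions. For $\delta=0$ the set
$W_i^\delta \backslash W_i^0$ is empty, so that $\tilde{R}_i^0 = R_i^0$ and the
three formulas coincide:
\[
  M_{AS}^{-1} = M_{RAS}^{-1} = M_{ASH}^{-1} =
  \sum_{i=1}^m (R_i^0)^T (A_i^0)^{-1} R_i^0 .
\]
If, for the sake of illustration, the vertices are numbered so that those in
$W_1^0$ come first, followed by those in $W_2^0$, and so on, this sum is the
block-diagonal matrix
\[
  \left( \begin{array}{ccc}
    (A_1^0)^{-1} & & \\
    & \ddots & \\
    & & (A_m^0)^{-1}
  \end{array} \right),
\]
i.e.\ the inverse of the block-diagonal part of $A$.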

As already observed, the convergence rate of the one-level Schwarz
preconditioned iterative solvers deteriorates as the number $m$ of partitions
of $W$ increases \cite{dd1_94,dd2_96}. To reduce the dependency
of the number of iterations on the degree of parallelism we may
introduce a global coupling among the overlapping partitions by defining
a coarse-space approximation $A_C$ of the matrix $A$.
In a purely algebraic setting, $A_C$ is usually built with
a Galerkin approach. Given a set $W_C$ of \emph{coarse vertices},
with size $n_C$, and a suitable restriction operator
$R_C \in \Re^{n_C \times n}$, $A_C$ is defined as
\[
A_C=R_C A R_C^T
\]
and the coarse-level correction matrix to be combined with a generic
one-level AS preconditioner $M_{1L}$ is obtained as
\[
M_{C}^{-1}= R_C^T A_C^{-1} R_C,
\]
where $A_C$ is assumed to be nonsingular. The application of $M_{C}^{-1}$
to a vector $v$ corresponds to a restriction, a solution and
a prolongation step; the solution step, involving the matrix $A_C$,
may also be carried out approximately.
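In analogy with the one-level case, these three steps can be written as follows
(the symbols $v_C$ and $w_C$ are introduced here only for illustration):
\begin{enumerate}
\item restriction of $v$ to the coarse space, $v_C = R_C v$;
\item (approximate) solution of the coarse-level system $A_C w_C = v_C$;
\item prolongation of $w_C$, i.e.\ $w = R_C^T w_C$.
\end{enumerate}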

The combination of $M_{C}$ and $M_{1L}$ may be
performed in either an additive or a multiplicative framework.
In the former case, the \emph{two-level additive} Schwarz preconditioner
is obtained:
\[
M_{2LA}^{-1} = M_{C}^{-1} + M_{1L}^{-1}.
\]
Applying $M_{2LA}^{-1}$ to a vector $v$ within a Krylov solver
corresponds to applying $M_{C}^{-1}$
and $M_{1L}^{-1}$ to $v$ independently and then summing up
the results.

In the multiplicative case, the combination can be
performed by first applying the smoother $M_{1L}^{-1}$ and then
the coarse-level correction operator $M_{C}^{-1}$:
\[
\begin{array}{l}
w = M_{1L}^{-1} v, \\
z = w + M_{C}^{-1} (v-Aw);
\end{array}
\]
this corresponds to the following \emph{two-level hybrid pre-smoothed}
Schwarz preconditioner:
\[
M_{2LH-PRE}^{-1} = M_{C}^{-1} + \left( I - M_{C}^{-1}A \right) M_{1L}^{-1}.
\]
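The expression of $M_{2LH-PRE}^{-1}$ follows by substituting
$w = M_{1L}^{-1} v$ into the formula for $z$:
\[
\begin{array}{rcl}
z & = & M_{1L}^{-1} v + M_{C}^{-1} \left( v - A M_{1L}^{-1} v \right) \\
  & = & \left[ M_{C}^{-1} + \left( I - M_{C}^{-1} A \right) M_{1L}^{-1} \right] v .
\end{array}
\]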
On the other hand, by applying the smoother after the coarse-level correction,
i.e.\ by computing
\[
\begin{array}{l}
w = M_{C}^{-1} v , \\
z = w + M_{1L}^{-1} (v-Aw) ,
\end{array}
\]
the \emph{two-level hybrid post-smoothed}
Schwarz preconditioner is obtained:
\[
M_{2LH-POST}^{-1} = M_{1L}^{-1} + \left( I - M_{1L}^{-1}A \right) M_{C}^{-1}.
\]
A further variant of the two-level hybrid preconditioner is obtained by applying
the smoother before and after the coarse-level correction. In this case, the
preconditioner is symmetric if $A$, $M_{1L}$ and $M_{C}$ are symmetric.
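A sketch of this variant, written in the same form as the previous ones, is
given by the three steps
\[
\begin{array}{l}
w = M_{1L}^{-1} v, \\
y = w + M_{C}^{-1} (v-Aw), \\
z = y + M_{1L}^{-1} (v-Ay),
\end{array}
\]
where the intermediate vector $y$ and the name $M_{2LH-SYM}$ used in the
following are introduced here for illustration only, and the same smoother is
assumed to be applied before and after the coarse-level correction.
Eliminating $w$ and $y$ gives
\[
I - M_{2LH-SYM}^{-1} A = \left( I - M_{1L}^{-1}A \right)
\left( I - M_{C}^{-1}A \right) \left( I - M_{1L}^{-1}A \right),
\]
or, equivalently,
\[
M_{2LH-SYM}^{-1} = A^{-1} - \left( I - M_{1L}^{-1}A \right)
\left( A^{-1} - M_{C}^{-1} \right) \left( I - A M_{1L}^{-1} \right).
\]
When $A$, $M_{1L}$ and $M_{C}$ are symmetric, the middle factor is symmetric
and the two outer factors are the transpose of each other, which confirms the
symmetry of the preconditioner.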

As previously noted, on parallel computers the number of submatrices usually matches
the number of available processors. When the size of the system to be preconditioned
is very large, the use of many processors, i.e.\ of many small submatrices, often
leads to a large coarse-level system, whose solution may be computationally expensive.
On the other hand, the use of few processors often leads to local submatrices that
are too expensive to be processed on single processors, because of memory and/or
computing requirements. Therefore, it seems natural to use a recursive approach,
in which the coarse-level correction is re-applied starting from the current
coarse-level system. The corresponding preconditioners are called \emph{multi-level}.
One more reason for the multi-level approach is that it may significantly
reduce the computational cost of preconditioning with respect to the two-level case
(see \cite[Chapter 3]{dd2_96}). Additive and hybrid multilevel preconditioners
are obtained as direct extensions of the two-level counterparts. Other combinations
of the smoothers and coarse-level corrections are possible, leading to variants
of the previous algorithms. For a detailed description of them, the reader is
referred to \cite[Chapter 3]{dd2_96}.
\textbf{In my opinion an algorithmic description of a multilevel preconditioner would be
needed here as an example, e.g.\ the hybrid one with pre-smoothing, along the lines
of the description in Figure 1 of the Trilinos ML 4.0 guide. WHAT DO YOU THINK?}
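
As an example, the application of a hybrid pre-smoothed multi-level
preconditioner can be sketched recursively. The notation is an assumption made
here for illustration and is not MLD2P4 notation: $\mathit{nlev}$ is the number of
levels, $A^1=A$, $M^k$ and $R^k$ are the smoother and the restriction operator
at level $k$, and $A^{k+1} = R^k A^k (R^k)^T$. The coarsest system is solved
exactly in this sketch, although an approximate solution may be used as well.
\[
\begin{array}{l}
\mbox{procedure } z = \mathrm{ML}(k, v) \\
\quad \mbox{if } k = \mathit{nlev} \mbox{ then} \\
\qquad z = (A^{k})^{-1} v \\
\quad \mbox{else} \\
\qquad w = (M^{k})^{-1} v \\
\qquad r = R^{k} \left( v - A^{k} w \right) \\
\qquad c = \mathrm{ML}(k+1, r) \\
\qquad z = w + (R^{k})^T c \\
\quad \mbox{endif}
\end{array}
\]
The preconditioner is applied by calling $\mathrm{ML}(1,v)$. For
$\mathit{nlev}=2$ the procedure reduces to the two steps defining the two-level
hybrid pre-smoothed preconditioner.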


\subsection{Smoothed Aggregation\label{sec:aggregation}}

To define the restriction operator $R_C$, which is used to compute
the coarse-level matrix $A_C$, MLD2P4 uses the \emph{smoothed aggregation}
algorithm described in \cite{Brezina_Vanek_,Vanek_Mandel_Brezina_}.
The basic idea of this algorithm is to build a coarse set of vertices
$W_C$ by suitably grouping the vertices of $W$ into disjoint subsets
(aggregates), and to define the coarse-to-fine space transfer operator $R_C^T$ by
applying a suitable smoother to a simple piecewise constant
prolongation operator, to improve the quality of the coarse-space correction.

Three main steps can be identified in the smoothed aggregation procedure:
\begin{itemize}
\item coarsening of the vertex set $W$, to obtain $W_C$;
\item construction of the prolongator $R_C^T$;
\item application of $R_C$ and $R_C^T$ to build $A_C$.
\end{itemize}

To perform the coarsening step, we have implemented the aggregation algorithm sketched
in \cite{apnum_07}. Following \cite{brezina_vanek}, a modification of this algorithm
is actually used,
in which each aggregate $N_r$ is made of vertices of $W$ that are \emph{strongly coupled}
to a certain root vertex $r \in W$, i.e.\
\[ N_r = \left\{s \in W: |a_{rs}| \geq \theta \sqrt{|a_{rr}a_{ss}|} \right\} \]
for a given $\theta \in [0,1]$.
Since the previous algorithm has a sequential nature, a \emph{decoupled} version of
it has been chosen, where each processor $i$ independently applies the algorithm to
the set of vertices $W_i^0$ assigned to it in the initial data distribution. This
version is embarrassingly parallel, since it does not require any data communication.
On the other hand, it may produce non-uniform aggregates near boundary vertices,
i.e.\ near vertices adjacent to vertices assigned to other processors, and is strongly
dependent on the number of processors and on the initial partitioning of the matrix $A$.
Nevertheless, this algorithm has been chosen for the implementation in MLD2P4,
since it has been shown to produce good results in practice \cite{Tuminaro_Tong_00}.
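
As a small numerical illustration of the strong-coupling criterion (the matrix
and the value of $\theta$ are chosen here only as an example), let
\[
A = \left( \begin{array}{rrr}
 4 & -2 & -0.1 \\
 -2 & 4 & -1 \\
 -0.1 & -1 & 4
\end{array} \right), \qquad \theta = 0.3 .
\]
Since $\sqrt{|a_{rr}a_{ss}|}=4$ for all $r$ and $s$, the threshold is
$\theta \sqrt{|a_{rr}a_{ss}|} = 1.2$. For the root vertex $r=1$ we have
$|a_{11}|=4$, $|a_{12}|=2$ and $|a_{13}|=0.1$, hence $N_1=\{1,2\}$:
vertex 3 is only weakly coupled to vertex 1 and is not included in $N_1$.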

The prolongator $P_C=R_C^T$ is built starting from a \emph{tentative prolongator}
$P \in \Re^{n \times n_C}$, defined as
\begin{equation}
P=(p_{ij}), \quad p_{ij}=
\left\{ \begin{array}{ll}
1 & \quad \mbox{if} \; i \in V^j_C \\
0 & \quad \mbox{otherwise}
\end{array} \right. ,
\label{eq:tent_prol}
\end{equation}
where $V^j_C$ denotes the $j$-th aggregate.
$P_C$ is obtained by
applying to $P$ a smoother $S \in \Re^{n \times n}$:
\begin{equation}
P_C = S P,
\label{eq:smoothed_prol}
\end{equation}
in order to remove oscillatory components from the range of the prolongator
and hence to improve the convergence properties of the multi-level
Schwarz method \cite{Brezina_Vanek_,StubenGMD69_99}.
A simple choice for $S$ is the damped Jacobi smoother:
\begin{equation}
S = I - \omega D^{-1} A ,
\label{eq:jac_smoother}
\end{equation}
where $D$ is the diagonal of $A$ and the value of $\omega$ can be chosen
using some estimate of the spectral radius of $D^{-1}A$ \cite{Brezina_Vanek}.
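For example (a worked case with values chosen here for illustration only), let
$A$ be the $4 \times 4$ tridiagonal matrix with diagonal entries equal to $2$ and
off-diagonal entries equal to $-1$, let $D$ be the diagonal of $A$, let the
aggregates be $\{1,2\}$ and $\{3,4\}$, and take $\omega = 2/3$. Then $D^{-1}A$
has diagonal entries $1$ and off-diagonal entries $-1/2$, so that
\[
P = \left( \begin{array}{cc} 1 & 0 \\ 1 & 0 \\ 0 & 1 \\ 0 & 1 \end{array} \right),
\quad
S = \frac{1}{3} \left( \begin{array}{cccc}
1 & 1 & 0 & 0 \\ 1 & 1 & 1 & 0 \\ 0 & 1 & 1 & 1 \\ 0 & 0 & 1 & 1
\end{array} \right),
\quad
P_C = S P = \frac{1}{3} \left( \begin{array}{cc}
2 & 0 \\ 2 & 1 \\ 1 & 2 \\ 0 & 2
\end{array} \right).
\]
The columns of $P$ have disjoint supports, whereas the smoothing makes the
columns of $P_C$ overlap across the boundary between the two aggregates.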
\textbf{Mention the filtering of $A$ in the smoothing, noting however that it has not been
implemented?}

%%% Local Variables:
%%% mode: latex
%%% TeX-master: "userguide"
%%% End:
|
||||
\section{Multi-level Domain Decomposition Background\label{sec:background}}
|
||||
|
||||
\emph{Domain Decomposition} (DD) preconditioners, coupled with Krylov iterative
|
||||
solvers, are widely used in the parallel solution of large and sparse linear systems.
|
||||
These preconditioners are based on the divide and conquer technique: the matrix
|
||||
to be preconditioned is divided into submatrices, a ``local linear system''
|
||||
involving each submatrix is (approximately) solved, and the local solutions are used
|
||||
to build a preconditioner for the whole original matrix. This process
|
||||
often corresponds to dividing a physical domain associated to the original matrix
|
||||
into subdomains, e.g. in a PDE discretization, to (approximately) solving the
|
||||
subproblems corresponding to the subdomains and to building an approximate
|
||||
solution of the original problem from the local solutions
|
||||
\cite{Cai_Widlund_92,dd1_94,dd2_96}.
|
||||
|
||||
\emph{Additive Schwarz} preconditioners are DD preconditioners using overlapping
|
||||
submatrices, i.e.\ with some common rows, to couple the local information
|
||||
related to the submatrices (see, e.g., \cite{dd2_96}).
|
||||
The main motivations for choosing Additive Schwarz preconditioners are their
|
||||
intrinsic parallelism and good \textbf{(dire good e' un po' "`forte"', dato che
|
||||
subito dopo diciamo che la convergenza dipende dal numero di sottomatrici)}
|
||||
convergence properties. A drawback of these
|
||||
preconditioners is that the number of iterations of the preconditioned solvers
|
||||
generally grows with the number of submatrices. This may be a serious limitation
|
||||
on parallel computers, since the number of submatrices usually matches the number
|
||||
of available processors. Optimal convergence rates, i.e.\ iteration numbers
|
||||
independent of the number of submatrices, can be obtained by correcting the
|
||||
preconditioner through a suitable approximation of the original linear system
|
||||
in a coarse space, which globally couples the information related to the single
|
||||
submatrices.
|
||||
|
||||
\emph{Two-level Schwarz} preconditioners are obtained
|
||||
by combining basic (one-level) Schwarz preconditioners with coarse-level
|
||||
corrections. In this context, the one-level preconditioner is often
|
||||
called smoother. Different two-level preconditioners are obtained by varying the
|
||||
choice of the smoother, of the coarse-level correction and the
|
||||
way they are combined \cite{dd2_96}. The same reasoning can be applied starting
|
||||
from the coarse-level system, i.e.\ a coarse-space correction can be built
|
||||
from this system, thus obtaining \emph{multi-level} preconditioners.
|
||||
|
||||
It is worth noting that optimal preconditioners do not necessarily correspond
|
||||
to minimum execution times. Indeed, to obtain effective multilevel preconditioners
|
||||
a tradeoff between optimality of convergence and the cost of building and applying
|
||||
the coarse-space corrections must be achieved. The choice of the number of levels,
|
||||
i.e.\ of the coarse-space corrections, also affects the effectiveness of the
|
||||
preconditioners. One more goal is to get convergence rates as less sensitive
|
||||
as possible to variations in the matrix coefficients.
|
||||
|
||||
Two main approaches can be used to build coarse-space corrections. The geometric approach
|
||||
applies coarsening strategies based on the knowledge of some physical grid associated
|
||||
to the matrix and requires the user to define grid transfer operators from the fine
|
||||
to the coarse levels and vice versa. This may result difficult for complex geometries;
|
||||
furthermore, suitable one-level preconditioners may be required to get efficient
|
||||
interplay between fine and coarse levels, e.g.\ when matrices with highly varying coefficients
|
||||
are considered. The algebraic approach builds coarse-space corrections using only matrix
|
||||
information. It performs a fully automatic coarsening and enforces the interplay between
|
||||
the fine and coarse levels by suitably choosing the coarse space and the coarse-to-fine
|
||||
interpolation \cite{StubenGMD69_99}.
|
||||
|
||||
MLD2P4 uses a pure algebraic approach for building the sequence of coarse matrices
|
||||
starting from the original matrix. The algebraic approach is based on the \emph{smoothed
|
||||
aggregation} algorithm \cite{Brezina_Vanek_,Vanek_Mandel_Brezina_}. A decoupled version
|
||||
of this algorithm is implemented, where the smoothed aggregation is applied locally
|
||||
to each submatrix \cite{Tuminaro_Tong_00}. In the next two subsections we provide
|
||||
a brief description of the multi-level Schwarz preconditioners and on the smoothed
|
||||
aggregation technique as implemented in MLD2P4. For further details the user
|
||||
is referred to \cite{para_04,apnum_07,aaecc_07,dd2_96}.
|
||||
|
||||
|
||||
\subsection{Multi-level Schwarz Preconditioners\label{sec:multilevel}}
|
||||
|
||||
The Multilevel preconditioners implemented in MLD2P4 are obtained by combining
|
||||
Additive Schwarz preconditioners with coarse-space corrections; therefore
|
||||
we first provide a sketch of the Additive Schwarz preconditioners.
|
||||
|
||||
Given a linear system
|
||||
\[ Ax=b, \]
|
||||
where $A=(a_{ij}) \in \Re^{n \times n}$ is a
|
||||
nonsingular sparse matrix with a symmetric non-zero pattern,
|
||||
let $G=(W,E)$ be the adjacency graph of $A$, where $W=\{1, 2, \ldots, n\}$
|
||||
and $E=\{(i,j) : a_{ij} \neq 0\}$ are the vertex set and the edge set of $G$,
|
||||
respectively. Two vertices are called adjacent if there is an edge connecting
|
||||
them. For any integer $\delta > 0$, a $\delta$-overlap
|
||||
partition of $W$ can be defined recursively as follows.
|
||||
Given a 0-overlap (or non-overlapping) partition of $W$,
|
||||
i.e.\ a set of $m$ disjoint nonempty sets $W_i^0 \subset W$ such that
|
||||
$\cup_{i=1}^m W_i^0 = W$, a $\delta$-overlap
|
||||
partition of $W$ is obtained by considering the sets
|
||||
$W_i^\delta \supset W_i^{\delta-1}$, obtained by including the vertices that
|
||||
are adjacent to any vertex in $W_i^{\delta-1}$.
|
||||
|
||||
Let $n_i^\delta$ be the size of $W_i^\delta$ and $R_i^{\delta} \in
|
||||
\Re^{n_i^\delta \times n}$ the restriction operator that maps
|
||||
a vector $v \in \Re^n$ onto the vector $v_i^{\delta} \in \Re^{n_i^\delta}$
|
||||
containing the components of $v$ corresponding to the vertices in
|
||||
$W_i^\delta$. The transpose of $R_i^{\delta}$ is a
|
||||
prolongation operator from $\Re^{n_i^\delta}$ to $\Re^n$.
|
||||
The matrix $A_i^\delta=R_i^\delta A (R_i^\delta)^T \in
|
||||
\Re^{n_i^\delta \times n_i^\delta}$ can be considered
|
||||
as a restriction of $A$ corresponding to the set $W_i^{\delta}$.
|
||||
|
||||
The \emph{classical one-level AS} preconditioner is defined by
|
||||
\[
|
||||
M_{AS}^{-1}= \sum_{i=1}^m (R_i^{\delta})^T
|
||||
(A_i^\delta)^{-1} R_i^{\delta},
|
||||
\]
|
||||
where $A_i^\delta$ is assumed to be nonsingular. Its application
|
||||
to a vector $v \in \Re^n$ within a Krylov solver requires the following
|
||||
three steps:
|
||||
\begin{enumerate}
|
||||
\item restriction of $v$ as $v_i = R_i^{\delta} v$, $i=1,\ldots,m$;
|
||||
\item (approximate) solution of the linear systems $A_i^\delta w_i = v_i$,
|
||||
$i=1,\ldots,m$;
|
||||
\item prolongation and sum of the $w_i$'s, i.e. $w = \sum_{i=1}^m (R_i^{\delta})^T w_i$.
|
||||
\end{enumerate}
|
||||
A variant of the classical AS preconditioner that outperforms it
|
||||
in terms of both convergence rate and of computation and communication
|
||||
time on parallel distributed-memory computers is the so-called \emph{Restricted AS
|
||||
(RAS)} preconditioner~\cite{Cai_Sarkis,Efstathiou_Gander}. It
|
||||
is obtained by zeroing the components of $w_i$ corresponding to the
|
||||
overlapping vertices when applying the prolongation. Therefore,
|
||||
RAS differs from classical AS by the prolongation operator $(R_i^{\delta})^T$,
|
||||
which is substituted by $(\tilde{R}_i^0)^T \in \Re^{n_i^\delta \times n}$,
|
||||
where $\tilde{R}_i^0$ obtained by zeroing the rows of $R_i^\delta$
|
||||
corresponding to the vertices in $W_i^\delta \backslash W_i^0$:
|
||||
\[
|
||||
M_{RAS}^{-1}= \sum_{i=1}^m (\tilde{R}_i^0)^T
|
||||
(A_i^\delta)^{-1} R_i^{\delta}.
|
||||
\]
|
||||
Analogously, the AS variant called \emph{AS with Harmonic extension (ASH)}
|
||||
is defined by
|
||||
\[ M_{ASH}^{-1}= \sum_{i=1}^m (R_i^{\delta})^T
|
||||
(A_i^\delta)^{-1} \tilde{R}_i^0.
|
||||
\]
|
||||
We note that for $\delta=0$ the three variants of the AS preconditioner are
|
||||
all equal to the block-Jacobi preconditioner.
|
||||
|
||||
As already observed, the convergence rate of the one-level Schwarz
|
||||
preconditioned iterative solvers deteriorates as the number $m$ of partitions
|
||||
of $W$ increases \cite{dd1_94,dd2_96}. To reduce the dependency
|
||||
of the number of iterations on the degree of parallelism we may
|
||||
introduce a global coupling among the overlapping partitions by defining
|
||||
a coarse-space approximation $A_C$ of the matrix $A$.
|
||||
In a pure algebraic setting, $A_C$ is usually built with
|
||||
a Galerkin approach. Given a set $W_C$ of \emph{coarse vertices},
|
||||
with size $n_C$, and a suitable restriction operator
|
||||
$R_C \in \Re^{n_C \times n}$, $A_C$ is defined as
|
||||
\[
|
||||
A_C=R_C A R_C^T
|
||||
\]
|
||||
and the coarse-level correction matrix to be combined with a generic
|
||||
one-level AS preconditioner $M_{1L}$ is obtained as
|
||||
\[
|
||||
M_{C}^{-1}= R_C^T A_C^{-1} R_C,
|
||||
\]
|
||||
where $A_C$ is assumed to be nonsingular. The application of $M_{C}^{-1}$
|
||||
to a vector $v$ corresponds to a restriction, a solution and
|
||||
a prolongation step; the solution step, involving the matrix $A_C$,
|
||||
may be carried out also approximately.
|
||||
|
||||
The combination of $M_{C}$ and $M_{1L}$ may be
|
||||
performed in either an additive or a multiplicative framework.
|
||||
In the former case, the \emph{two-level additive} Schwarz preconditioner
|
||||
is obtained:
|
||||
\[
|
||||
M_{2LA}^{-1} = M_{C}^{-1} + M_{1L}^{-1}.
|
||||
\]
|
||||
Applying $M_{2L-A}^{-1}$ to a vector $v$ within a Krylov solver
|
||||
corresponds to applying $M_{C}^{-1}$
|
||||
and $M_{1L}^{-1}$ to $v$ independently and then summing up
|
||||
the results.
|
||||
|
||||
In the multiplicative case, the combination can be
|
||||
performed by first applying the smoother $M_{1L}^{-1}$ and then
|
||||
the coarse-level correction operator $M_{C}^{-1}$:
|
||||
\[
|
||||
\begin{array}{l}
|
||||
w = M_{1L}^{-1} v, \\
|
||||
z = w + M_{C}^{-1} (v-Aw);
|
||||
\end{array}
|
||||
\]
|
||||
this corresponds to the following \emph{two-level hybrid pre-smoothed}
|
||||
Schwarz preconditioner:
|
||||
\[
|
||||
M_{2LH-PRE}^{-1} = M_{C}^{-1} + \left( I - M_{C}^{-1}A \right) M_{1L}^{-1}.
|
||||
\]
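
The expression of $M_{2LH-PRE}^{-1}$ can be checked by eliminating $w$ from
the two steps above:
\[
z = M_{1L}^{-1} v + M_{C}^{-1} \left( v - A M_{1L}^{-1} v \right)
  = \left[ M_{C}^{-1} + \left( I - M_{C}^{-1}A \right) M_{1L}^{-1} \right] v .
\]
Equivalently, in terms of error propagation,
\[
I - M_{2LH-PRE}^{-1} A = \left( I - M_{C}^{-1}A \right) \left( I - M_{1L}^{-1}A \right),
\]
where the rightmost factor corresponds to the operator that is applied first.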
|
||||
On the other hand, by applying the smoother after the coarse-level correction,
|
||||
i.e.\ by computing
|
||||
\[
|
||||
\begin{array}{l}
|
||||
w = M_{C}^{-1} v , \\
|
||||
z = w + M_{1L}^{-1} (v-Aw) ,
|
||||
\end{array}
|
||||
\]
|
||||
the \emph{two-level hybrid post-smoothed}
|
||||
Schwarz preconditioner is obtained:
|
||||
\[
|
||||
M_{2LH-POST}^{-1} = M_{1L}^{-1} + \left( I - M_{1L}^{-1}A \right) M_{C}^{-1}.
|
||||
\]
|
||||
One more variant of two-level hybrid preconditioner is obtained by applying
|
||||
the smoother before and after the coarse-level correction. In this case, the
|
||||
preconditioner is symmetric if $A$, $M_{1L}$ and $M_{C}$ are symmetric.
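
A sketch of this variant, in the same notation as above, is the following;
the auxiliary vector $y$ and the name $M_{2LH-SYM}$ are introduced here only
for illustration:
\[
\begin{array}{l}
w = M_{1L}^{-1} v, \\
y = w + M_{C}^{-1} (v-Aw), \\
z = y + M_{1L}^{-1} (v-Ay).
\end{array}
\]
Eliminating $w$ and $y$ yields
\[
I - M_{2LH-SYM}^{-1} A = \left( I - M_{1L}^{-1}A \right)
\left( I - M_{C}^{-1}A \right) \left( I - M_{1L}^{-1}A \right),
\]
whose palindromic structure explains the symmetry property stated above.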
|
||||
|
||||
As previously noted, on parallel computers the number of submatrices usually matches
the number of available processors. When the size of the system to be preconditioned
is very large, the use of many processors, i.e.\ of many small submatrices, often
leads to a large coarse-level system, whose solution may be computationally expensive.
On the other hand, the use of few processors often leads to local submatrices that
|
||||
are too expensive to be processed on single processors, because of memory and/or
|
||||
computing requirements. Therefore, it seems natural to use a recursive approach,
|
||||
in which the coarse-level correction is re-applied starting from the current
|
||||
coarse-level system. The corresponding preconditioners are called \emph{multi-level}.
|
||||
One more reason for the multi-level approach is that it may significantly
|
||||
reduce the computational cost of preconditioning with respect to the two-level case
|
||||
(see \cite[Chapter 3]{dd2_96}). Additive and hybrid multilevel preconditioners
|
||||
are obtained as direct extensions of the two-level counterparts. Other combinations
|
||||
of the smoothers and coarse-level corrections are possible, leading to variants
|
||||
of the previous algorithms. For a detailed description of them, the reader is
|
||||
referred to \cite[Chapter 3]{dd2_96}.
|
||||
\textbf{In my opinion, an algorithmic description of a multilevel preconditioner is needed
here as an example, e.g.\ the hybrid one with pre-smoothing, along the lines
of the description in Figure 1 of the Trilinos ML 4.0 guide. WHAT DO YOU THINK?}
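
As an illustration, the application of a multi-level hybrid pre-smoothed
preconditioner to a vector $v$ may be sketched as follows. The notation is an
assumption made only for this sketch: the levels are numbered
$l=1,\ldots,\mathit{nlev}$, with $A^{(1)}=A$; $R^{(l)}$ is the restriction operator
from level $l$ to level $l+1$; $A^{(l+1)}=R^{(l)}A^{(l)}(R^{(l)})^T$; and
$M^{(l)}$ is the one-level preconditioner (smoother) used at level $l$.
\[
\begin{array}{l}
y^{(1)} = v; \\
\mbox{for } l=1,\ldots,\mathit{nlev}-1 \\
\quad w^{(l)} = (M^{(l)})^{-1} y^{(l)}, \\
\quad y^{(l+1)} = R^{(l)} \left( y^{(l)} - A^{(l)} w^{(l)} \right); \\
w^{(\mathit{nlev})} = (A^{(\mathit{nlev})})^{-1} y^{(\mathit{nlev})}; \\
\mbox{for } l=\mathit{nlev}-1,\ldots,1 \\
\quad w^{(l)} = w^{(l)} + (R^{(l)})^T w^{(l+1)}; \\
z = w^{(1)}.
\end{array}
\]
For $\mathit{nlev}=2$, with $R^{(1)}=R_C$, $M^{(1)}=M_{1L}$ and $A^{(2)}=A_C$, this
reduces to the two-level hybrid pre-smoothed scheme described above; as in
that case, the solution step at the coarsest level may be carried out approximately.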
|
||||
|
||||
|
||||
\subsection{Smoothed Aggregation\label{sec:aggregation}}
|
||||
|
||||
To define the restriction operator $R_C$, which is used to compute
|
||||
the coarse-level matrix $A_C$, MLD2P4 uses the \emph{smoothed aggregation}
|
||||
algorithm described in \cite{BREZINA_VANEK,VANEK_MANDEL_BREZINA}.
|
||||
The basic idea of this algorithm is to build a coarse set of vertices
|
||||
$W_C$ by suitably grouping the vertices of $W$ into disjoint subsets
|
||||
(aggregates), and to define the coarse-to-fine space transfer operator $R_C^T$ by
|
||||
applying a suitable smoother to a simple piecewise constant
|
||||
prolongation operator, to improve the quality of the coarse-space correction.
|
||||
|
||||
Three main steps can be identified in the smoothed aggregation procedure:
|
||||
\begin{itemize}
|
||||
\item coarsening of the vertex set $W$, to obtain $W_C$;
|
||||
\item construction of the prolongator $R_C^T$;
|
||||
\item application of $R_C$ and $R_C^T$ to build $A_C$.
|
||||
\end{itemize}
|
||||
|
||||
To perform the coarsening step, we have implemented the aggregation algorithm sketched
in \cite{apnum_07}. Following \cite{BREZINA_VANEK}, we have actually adopted a modification of this algorithm,
|
||||
in which each aggregate $N_r$ is made of vertices of $W$ that are \emph{strongly coupled}
|
||||
to a certain root vertex $r \in W$, i.e.\
|
||||
\[ N_r = \left\{s \in W: |a_{rs}| \geq \theta \sqrt{|a_{rr}a_{ss}|} \right\} \]
|
||||
for a given $\theta \in [0,1]$.
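
As a small worked example, with values chosen only for illustration, consider
$a_{rr}=a_{ss}=4$ and $a_{rs}=-1$. Then
\[
|a_{rs}| = 1, \qquad \theta \sqrt{|a_{rr}a_{ss}|} = 4\,\theta ,
\]
so that $s \in N_r$ if and only if $\theta \leq 0.25$. For example, $s$ is
strongly coupled to $r$ for $\theta=0.1$, but not for $\theta=0.5$. In general,
increasing $\theta$ can only shrink $N_r$, and thus tends to produce smaller aggregates.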
|
||||
Since the previous algorithm has a sequential nature, a \emph{decoupled} version of
|
||||
it has been chosen, where each processor $i$ independently applies the algorithm to
|
||||
the set of vertices $W_i^0$ assigned to it in the initial data distribution. This
|
||||
version is embarrassingly parallel, since it does not require any data communication.
|
||||
On the other hand, it may produce non-uniform aggregates near boundary vertices,
|
||||
i.e.\ near vertices adjacent to vertices in other processors, and is strongly
|
||||
dependent on the number of processors and on the initial partitioning of the matrix $A$.
|
||||
Nevertheless, this algorithm has been chosen for the implementation in MLD2P4,
|
||||
since it has been shown to produce good results in practice \cite{Tuminaro_Tong_00}.
|
||||
|
||||
The prolongator $P_C=R_C^T$ is built starting from a \emph{tentative prolongator}
|
||||
$P \in \Re^{n \times n_C}$, defined as
|
||||
\begin{equation}
|
||||
P=(p_{ij}), \quad p_{ij}=
|
||||
\left\{ \begin{array}{ll}
|
||||
1 & \quad \mbox{if} \; i \in V^j_C \\
|
||||
0 & \quad \mbox{otherwise}
|
||||
\end{array} \right. .
|
||||
\label{eq:tent_prol}
|
||||
\end{equation}
|
||||
$P_C$ is obtained by
|
||||
applying to $P$ a smoother $S \in \Re^{n \times n}$:
|
||||
\begin{equation}
|
||||
P_C = S P,
|
||||
\label{eq:smoothed_prol}
|
||||
\end{equation}
|
||||
in order to remove oscillatory components from the range of the prolongator
|
||||
and hence to improve the convergence properties of the multi-level
|
||||
Schwarz method \cite{BREZINA_VANEK,StubenGMD69_99}.
|
||||
A simple choice for $S$ is the damped Jacobi smoother:
|
||||
\begin{equation}
|
||||
S = I - \omega D^{-1} A ,
|
||||
\label{eq:jac_smoother}
|
||||
\end{equation}
|
||||
where $D$ is the diagonal of $A$ and the value of $\omega$ can be chosen
using some estimate of the spectral radius of $D^{-1}A$ \cite{BREZINA_VANEK}.
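
Combining (\ref{eq:tent_prol}), (\ref{eq:smoothed_prol}) and (\ref{eq:jac_smoother}),
the smoothed prolongator and the corresponding coarse-level matrix can be written as
\[
P_C = P - \omega D^{-1} A P, \qquad A_C = R_C A R_C^T = P_C^T A P_C .
\]
As a hedged example of how $\omega$ may be selected, a choice often found in
the smoothed aggregation literature is
\[
\omega = \frac{4}{3 \rho}, \qquad \rho(D^{-1}A) \leq \rho \leq \| D^{-1}A \|_\infty ,
\]
where $\rho$ is any available upper estimate of the spectral radius of $D^{-1}A$;
the infinity norm is one inexpensive estimate of this kind. This is given here
as an illustration only, and is not meant as a statement of the value used in MLD2P4.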
|
||||
\textbf{Should we mention the filtering of $A$ in the smoothing, noting however that it
has not been implemented?}
|
||||
|
||||
%%% Local Variables:
|
||||
%%% mode: latex
|
||||
%%% TeX-master: "userguide"
|
||||
%%% End:
|
||||
|
@ -1,152 +1,152 @@
|
||||
\begin{thebibliography}{99}
|
||||
|
||||
%
|
||||
\bibitem{PARA04FOREST}
|
||||
Bella, G., Filippone, S., De Maio, A., Testa, M.:
|
||||
A Simulation Model for Forest Fires.
|
||||
In: Dongarra, J., Madsen, K., Wasniewski, J. (eds.):
|
||||
Proceedings of PARA~04 Workshop on State of the Art
|
||||
in Scientific Computing. Lecture Notes in Computer Science, 3732. Berlin:
|
||||
Springer, 2005
|
||||
%
|
||||
\bibitem{aaecc_07} A. Buttari, D. di Serafino, P. D'Ambra, S. Filippone,\newblock
|
||||
2LEV-D2P4: a package of high-performance preconditioners,\newblock
|
||||
Applicable Algebra in Engineering, Communications and Computing,
|
||||
Volume 18, Number 3, May, 2007, pp. 223-239
|
||||
%Published online: 13 February 2007, {\tt http://dx.doi.org/10.1007/s00200-007-0035-z}
|
||||
%
|
||||
\bibitem{apnum_07} P. D'Ambra, S. Filippone, D. di Serafino,\newblock
|
||||
On the Development of PSBLAS-based Parallel Two-level Schwarz Preconditioners
|
||||
\newblock
|
||||
Applied Numerical Mathematics, Elsevier Science,
|
||||
Volume 57, Issues 11-12, November-December 2007, Pages 1181-1196.
|
||||
%published online 3 February 2007, {\tt
|
||||
% http://dx.doi.org/10.1016/j.apnum.2007.01.006}
|
||||
|
||||
%% \bibitem{DOUGLAS}
|
||||
%% R.E.~Bank and C.C.~Douglas,
|
||||
%% {\em SMMP: Sparse Matrix Multiplication Package},
|
||||
%% Advances in Computational Mathematics, 1993, 1, 127-137.
|
||||
%% (See also {\tt http://www.mgnet.org/~douglas/ccd-codes.html})
|
||||
%
|
||||
%
|
||||
\bibitem{para_04}
|
||||
A.~Buttari, P.~D'Ambra, D.~di Serafino and S.~Filippone,
|
||||
{\em Extending PSBLAS to Build Parallel Schwarz Preconditioners},
|
||||
in J.~Dongarra, K.~Madsen, J.~Wasniewski, editors,
|
||||
Proceedings of PARA~04 Workshop on State of the Art
|
||||
in Scientific Computing, pp.~593--602, Lecture Notes in Computer Science,
|
||||
Springer, 2005.
|
||||
%
|
||||
%% \bibitem{CAI_SAAD}
|
||||
%% X.~C.~Cai and Y.~Saad,
|
||||
%% {\em Overlapping Domain Decomposition Algorithms for General Sparse Matrices},
|
||||
%% Numerical Linear Algebra with Applications, 3(3), pp.~221--237, 1996.
|
||||
%% %
|
||||
%% \bibitem{CAI_SARKIS}
|
||||
%% X.C.~Cai and M.~Sarkis,
|
||||
%% {\em A Restricted Additive Schwarz Preconditioner for General Sparse Linear Systems},
|
||||
%% SIAM Journal on Scientific Computing, 21(2), pp.~792--797, 1999.
|
||||
%
|
||||
\bibitem{Cai_Widlund_92}
|
||||
X.C.~Cai and O.~B.~Widlund,
|
||||
{\em Domain Decomposition Algorithms for Indefinite Elliptic Problems},
|
||||
SIAM Journal on Scientific and Statistical Computing, 13(1), pp.~243--258, 1992.
|
||||
%
|
||||
\bibitem{dd1_94}
|
||||
T.~Chan and T.~Mathew,
|
||||
{\em Domain Decomposition Algorithms},
|
||||
in A.~Iserles, editor, Acta Numerica 1994, pp.~61--143, 1994.
|
||||
Cambridge University Press.
|
||||
%% %
|
||||
%% \bibitem{UMFPACK}
|
||||
%% T.A.~Davis,
|
||||
%% {\em Algorithm 832: UMFPACK - an Unsymmetric-pattern Multifrontal
|
||||
%% Method with a Column Pre-ordering Strategy},
|
||||
%% ACM Transactions on Mathematical Software, 30, pp.~196--199, 2004.
|
||||
%% (See also {\tt http://www.cise.ufl.edu/~davis/})
|
||||
%% %
|
||||
%% \bibitem{SUPERLU}
|
||||
%% J.W.~Demmel, S.C.~Eisenstat, J.R.~Gilbert, X.S.~Li and J.W.H.~Liu,
|
||||
%% A supernodal approach to sparse partial pivoting,
|
||||
%% SIAM Journal on Matrix Analysis and Applications, 20(3), pp.~720--755, 1999.
|
||||
%
|
||||
\bibitem{BLACS}
|
||||
J.~J.~Dongarra and R.~C.~Whaley,
|
||||
{\em A User's Guide to the BLACS v.~1.1},
|
||||
Lapack Working Note 94, Tech.\ Rep.\ UT-CS-95-281, University of
|
||||
Tennessee, March 1995 (updated May 1997).
|
||||
%
|
||||
\bibitem{sblas_97}
|
||||
I.~Duff, M.~Marrone, G.~Radicati and C.~Vittoli,
|
||||
{\em Level 3 Basic Linear Algebra Subprograms for Sparse Matrices:
|
||||
a User Level Interface},
|
||||
ACM Transactions on Mathematical Software, 23(3), pp.~379--401, 1997.
|
||||
%
|
||||
\bibitem{sblas_02}
|
||||
I.~Duff, M.~Heroux and R.~Pozo,
|
||||
{\em An Overview of the Sparse Basic Linear
|
||||
Algebra Subprograms: the New Standard from the BLAS Technical Forum},
|
||||
ACM Transactions on Mathematical Software, 28(2), pp.~239--267, 2002.
|
||||
%
|
||||
\bibitem{psblas_00}
|
||||
S.~Filippone and M.~Colajanni,
|
||||
{\em PSBLAS: A Library for Parallel Linear Algebra
|
||||
Computation on Sparse Matrices},
|
||||
\newblock
|
||||
ACM Transactions on Mathematical Software, 26(4), pp.~527--550, 2000.
|
||||
%
|
||||
\bibitem{KIVA3PSBLAS}
|
||||
S.~Filippone, P.~D'Ambra, M.~Colajanni,
|
||||
{\em Using a Parallel Library of Sparse Linear Algebra in a Fluid Dynamics
|
||||
Applications Code on Linux Clusters},
|
||||
in G.~Joubert, A.~Murli, F.~Peters, M.~Vanneschi, editors,
|
||||
Parallel Computing - Advances \& Current Issues,
|
||||
pp.~441--448, Imperial College Press, 2002.
|
||||
%
|
||||
\bibitem{METIS}
|
||||
Karypis, G. and Kumar, V.,
|
||||
{\em {METIS}: Unstructured Graph Partitioning and Sparse Matrix
|
||||
Ordering System}.
|
||||
Minneapolis, MN 55455: University of Minnesota, Department of
|
||||
Computer Science, 1995.
|
||||
Internet Address: {\verb|http://www.cs.umn.edu/~karypis|}.
|
||||
\bibitem{BLAS1}
|
||||
Lawson, C., Hanson, R., Kincaid, D. and Krogh, F.,
|
||||
Basic {L}inear {A}lgebra {S}ubprograms for {F}ortran usage,
|
||||
{ACM Trans. Math. Softw.} vol.~{5}, 308--323, 1979.
|
||||
|
||||
\bibitem{machiels}
|
||||
{Machiels, L. and Deville, M.}
|
||||
{\em Fortran 90: An entry to object-oriented programming for the solution
|
||||
of partial differential equations.}
|
||||
{ACM Trans. Math. Softw.} vol.~{23}, 32--49.
|
||||
\bibitem{metcalf}
|
||||
{Metcalf, M., Reid, J. and Cohen, M.}
|
||||
{\em Fortran 95/2003 explained.}
|
||||
{Oxford University Press}, 2004.
|
||||
|
||||
\bibitem{dd2_96}
|
||||
B.~Smith, P.~Bjorstad and W.~Gropp,
|
||||
{\em Domain Decomposition: Parallel Multilevel Methods for Elliptic
|
||||
Partial Differential Equations},
|
||||
Cambridge University Press, 1996.
|
||||
|
||||
\bibitem{MPI1}
|
||||
M.~Snir, S.~Otto, S.~Huss-Lederman, D.~Walker and J.~Dongarra,
|
||||
{\em MPI: The Complete Reference. Volume 1 - The MPI Core}, second edition,
|
||||
MIT Press, 1998.
|
||||
%
|
||||
\bibitem{BREZINA_VANEK}
|
||||
M.~Brezina and P.~Van{\v e}k,
|
||||
{\em A Black-Box Iterative Solver Based on a Two-Level Schwarz Method},
|
||||
Computing, 1999, 63, 233-263.
|
||||
%
|
||||
%
|
||||
\bibitem{VANEK_MANDEL_BREZINA}
|
||||
P.~Van{\v e}k, J.~Mandel and M.~Brezina,
|
||||
{\em Algebraic Multigrid by Smoothed Aggregation for Second and Fourth Order Elliptic Problems},
|
||||
Computing, 1996, 56, 179-196.
|
||||
%
|
||||
|
||||
\end{thebibliography}
|
@ -1,7 +1,7 @@
|
||||
\section{Configuring and Building MLD2P4\label{sec:configuring}}
|
||||
- use of GNU autoconf and automake \\
- required base software (MPI, BLACS, BLAS, PSBLAS - specify versions) \\
- optional software (UMFPACK, SuperLU, SuperLUdist - specify versions and configure options) \\
- operating systems and compilers on which MLD2P4 has been successfully built \\
- are configuration options for debugging or profiling provided? \\
- directory tree \\
|
||||
|
@ -1,6 +1,6 @@
|
||||
\section{Notational Conventions\label{sec:conventions}}
|
||||
- typefaces used in the guide (see the recent ML guide and the Aztec guide) \\
- naming conventions for routines (difference between high-level and medium-level),
data structures,\\
modules, constants, etc. (see the PSBLAS guide) \\
- real and complex versions\\
|
@ -1,41 +1,42 @@
|
||||
\section{Code Distribution\label{sec:distribution}}
|
||||
|
||||
MLD2P4 is freely distributable under the following copyright
|
||||
terms: {\small
|
||||
\begin{verbatim}
|
||||
MLD2P4 version 1.0
|
||||
MultiLevel Domain Decomposition Parallel Preconditioners Package
|
||||
based on PSBLAS (Parallel Sparse BLAS version 2.3)
|
||||
|
||||
(C) Copyright 2008
|
||||
|
||||
Salvatore Filippone University of Rome Tor Vergata
|
||||
Alfredo Buttari University of Rome Tor Vergata
|
||||
Pasqua D'Ambra ICAR-CNR, Naples
|
||||
Daniela di Serafino Second University of Naples
|
||||
|
||||
|
||||
Redistribution and use in source and binary forms, with or without
|
||||
modification, are permitted provided that the following conditions
|
||||
are met:
|
||||
1. Redistributions of source code must retain the above copyright
|
||||
notice, this list of conditions and the following disclaimer.
|
||||
2. Redistributions in binary form must reproduce the above copyright
|
||||
notice, this list of conditions, and the following disclaimer in the
|
||||
documentation and/or other materials provided with the distribution.
|
||||
3. The name of the MLD2P4 group or the names of its contributors may
|
||||
not be used to endorse or promote products derived from this
|
||||
software without specific written permission.
|
||||
|
||||
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
|
||||
``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
|
||||
TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
|
||||
PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE MLD2P4 GROUP OR ITS CONTRIBUTORS
|
||||
BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
|
||||
CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
|
||||
SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
|
||||
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
|
||||
CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
|
||||
ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
|
||||
POSSIBILITY OF SUCH DAMAGE.
|
||||
\end{verbatim}
|
||||
}
|
@ -1,9 +1,9 @@
|
||||
\section{Error Handling}\label{sec:errors}
|
||||
|
||||
Error handling
- Brief description, with a pointer to the PSBLAS guide
|
||||
|
||||
%%% Local Variables:
|
||||
%%% mode: latex
|
||||
%%% TeX-master: "userguide"
|
||||
%%% End:
|
||||
|
@ -1,224 +1,231 @@
|
||||
\section{Getting Started\label{sec:started}}
|
||||
|
||||
We describe the basics for building and applying MLD2P4 one-level and multi-level
|
||||
Schwarz preconditioners with the Krylov solvers included in PSBLAS \cite{}.
|
||||
The following five steps are required:
|
||||
\begin{enumerate}
|
||||
\item \emph{Allocate and initialize the preconditioner data structure, according to
|
||||
a preconditioner type chosen by the user}. This is performed by the routine
|
||||
\verb|mld_precinit|, which also sets a default preconditioner for each preconditioner
|
||||
type selected by the user. The default preconditioner associated with each preconditioner
type is listed in Table~\ref{tab:precinit}; the string used by \verb|mld_precinit|
to identify each preconditioner type is also given. The preconditioner data structure is
the derived data type \verb|mld_prec_type|, which is accessible to the user only
|
||||
through the MLD2P4 routines.
|
||||
\item \emph{Choose a specific variant of the selected preconditioner type, by setting
|
||||
the preconditioner parameters.} This is performed by the routine \verb|mld_precset|.
|
||||
A few examples concerning the use of \verb|mld_precset| are given in
|
||||
Section~\ref{sec:examples}; a complete list of all the
|
||||
preconditioner parameters and their allowed values is provided in
|
||||
Section~\ref{sec:highlevel}.
|
||||
\item \emph{Build the preconditioner for a given matrix.} This is performed by
|
||||
the routine \verb|mld_precbld|.
|
||||
\item \emph{Apply the preconditioner at each iteration of a Krylov solver.}
|
||||
This is performed by the routine \verb|mld_precaply|. When using the PSBLAS Krylov solvers,
|
||||
this step is completely transparent to the user, since \verb|mld_precaply| is called
|
||||
by the PSBLAS routine implementing the Krylov solver (\verb|psb_krylov|).
|
||||
\item \emph{Deallocate the preconditioner data structure}. This is performed by
|
||||
the routine \verb|mld_precfree|. This step is complementary to step 1 and should
|
||||
be performed when the preconditioner is no longer used.
|
||||
\end{enumerate}
|
||||
A detailed description of the above routines is given in Section~\ref{sec:highlevel}.
|
||||
|
||||
Note that the Fortran 95 module \verb|mld_prec_mod| must be used in the program
|
||||
calling the MLD2P4 routines. Furthermore, to apply MLD2P4 with the Krylov solvers
|
||||
from PSBLAS, the module \verb|psb_krylov_mod| must be used too.
|
||||
|
||||
Two simple example programs showing the (basic) use of MLD2P4 are reported in
|
||||
Section~\ref{sec:examples}.
|
||||
|
||||
\begin{table}[th]
|
||||
{
|
||||
\begin{center}
|
||||
\begin{tabular}{|l|l|p{6.7cm}|}
|
||||
\hline
|
||||
Type & String & Default preconditioner \\ \hline
|
||||
No preconditioner &'NOPREC'& (Considered only to use the PSBLAS
|
||||
Krylov solvers with no preconditioner.) \\
|
||||
Diagonal & 'DIAG' & --- \\
|
||||
Block Jacobi & 'BJAC' & ILU(0) on the local blocks.\\
|
||||
Additive Schwarz & 'AS' & Restricted Additive Schwarz (RAS),
|
||||
with overlap 1 and ILU(0) on the local blocks. \\
|
||||
Multilevel &'ML' & Multi-level hybrid preconditioner (additive on the
|
||||
same level and multiplicative through the levels),
|
||||
with post-smoothing only. Number of levels: 2;
|
||||
post-smoother: block-Jacobi preconditioner, with ILU(0)
|
||||
on the local blocks; coarsest matrix: distributed among the
|
||||
processors; coarse-level solver: 4 sweeps of the
|
||||
block-Jacobi solver, with ILU(0) on the blocks. \\
|
||||
\hline
|
||||
\end{tabular}
|
||||
\end{center}
|
||||
}
|
||||
\caption{Preconditioner types and default choices.\label{tab:precinit}}
|
||||
\end{table}
|
||||
|
||||
\subsection{Examples\label{sec:examples}}
|
||||
|
||||
The simple code reported below shows how to set and apply the MLD2P4 default multi-level
|
||||
preconditioner, i.e.\ the two-level hybrid post-smoothed Schwarz preconditioner, using block-Jacobi with ILU(0) on the blocks as basic preconditioner,
|
||||
a coarse matrix distributed among the processors, and four block-Jacobi sweeps with ILU(0) on the blocks as approximate coarse-level solver. The choice of this preconditioner is made
|
||||
by simply specifying \verb|'ML'| as second argument of \verb|mld_precinit|
|
||||
(a call to \verb|mld_precset| is not needed).
|
||||
The preconditioner is applied within the BiCGSTAB solver provided by PSBLAS.
|
||||
|
||||
The part of the code concerning the
|
||||
reading and assembling of the sparse matrix and the right-hand side vector, performed
|
||||
through the PSBLAS routines for sparse matrix and vector management, is not reported
|
||||
here for brevity. Other statements concerning the use of PSBLAS are omitted too.
The complete code can be found in the example program file \verb|example_2lev_default.f90|
in the directory \textbf{XXXXXX (TO BE SPECIFIED).} Note that the modules \verb|psb_base_mod|
|
||||
and \verb|psb_util_mod| at the beginning of the code are required by PSBLAS.
|
||||
For details on the use of the PSBLAS routines, see the PSBLAS User's Guide \cite{}.
|
||||
|
||||
\begin{verbatim}
|
||||
use psb_base_mod
|
||||
use psb_util_mod
|
||||
use mld_prec_mod
|
||||
use psb_krylov_mod
|
||||
... ...
|
||||
!
|
||||
! sparse matrix
|
||||
type(psb_dspmat_type) :: A
|
||||
! sparse matrix descriptor
|
||||
type(psb_desc_type) :: DESC_A
|
||||
! preconditioner
|
||||
type(mld_prec_type) :: PRE
|
||||
... ...
|
||||
!
|
||||
! initialize the parallel environment
|
||||
call psb_init(ictxt)
|
||||
call psb_info(ictxt,iam,np)
|
||||
... ...
|
||||
!
|
||||
! read and assemble the matrix A and the right-hand
|
||||
! side b using PSBLAS routines for sparse matrix /
|
||||
! vector management
|
||||
... ...
|
||||
!
|
||||
! initialize the default multi-level preconditioner
|
||||
! (two-level hybrid post-smoothed Schwarz)
|
||||
call mld_precinit(PRE,'ML',info)
|
||||
!
|
||||
! build the preconditioner
|
||||
call psb_precbld(A,PRE,DESC_A,info)
|
||||
!
|
||||
! set the solver parameters and the initial guess
|
||||
... ...
|
||||
!
|
||||
! solve Ax=b with preconditioned BiCGSTAB
|
||||
call psb_krylov('BICGSTAB',A,PRE,b,x,tol,DESC_A,info)
|
||||
... ...
|
||||
!
|
||||
! cleanup the preconditioner
|
||||
call mld_precfree(PRE,info)
|
||||
!
|
||||
! cleanup other data structures
|
||||
... ...
|
||||
!
|
||||
! exit the parallel environment
|
||||
call psb_exit(ictxt)
|
||||
stop
|
||||
\end{verbatim}
|
||||
|
||||
|
||||
\textbf{MODIFY THE WHOLE PART THAT FOLLOWS:\\
- only the statements that differ from the previous example (essentially the setting of the preconditioner, possibly with several calls to precset);\\
- keep the remark on the explicit specification of the number of levels;\\
- refer to the next section for an accurate description of all the parameters;\\
- keep the remark on the old PSBLAS users.}\\
|
||||
|
||||
In the following we describe the general procedure for setting and building one of the MLD2P4 preconditioners.
The user first has to prepare the preconditioner data structure by calling the routine \verb|mld_precinit|. The input parameters
of this routine include a string, which defines the preconditioner type, and an optional integer
specifying the number of levels in the case of a multi-level preconditioner.
If the optional parameter is absent and a multi-level preconditioner has been chosen,
a two-level preconditioner is set; the integer parameter is ignored if the preconditioner type is not multi-level.
Table~\ref{tab:precinit} lists the possible choices for the preconditioner type
and the related default preconditioners.
|
||||
|
||||
|
||||
The user of MLD2P4 may set many parameters of the one-level and multi-level Schwarz preconditioners, in order
to define a preconditioner different from the default one. The parameters
are set through the routine \verb|mld_precset|. The APIs of \verb|mld_precinit| and \verb|mld_precset|, as well as the complete
list of the parameters that can be set, with the corresponding allowed values, are reported in Section~\ref{sec:highlevel}. The code below sets up
a three-level hybrid post-smoothed Schwarz preconditioner, using RAS with overlap 1 as local preconditioner,
with ILU(0) on the local blocks, a distributed coarse matrix, and four block-Jacobi sweeps with the UMFPACK LU
factorization on the blocks as coarse-matrix solver. Note that, for the multi-level preconditioners, the levels are numbered in increasing
order starting from the finest one, i.e.\ level 1 is the finest level.
For more details, see the test program \verb|example2.f90| in xxxx (test directory).\\[0.5cm]
|
||||
|
||||
\begin{verbatim}
|
||||
use psb_base_mod
|
||||
use psb_util_mod
|
||||
use mld_prec_mod
|
||||
use psb_krylov_mod
|
||||
... ...
|
||||
!
|
||||
! sparse matrix
|
||||
type(psb_dspmat_type) :: A
|
||||
! sparse matrix descriptor
|
||||
type(psb_desc_type) :: DESC_A
|
||||
! preconditioner data
|
||||
type(mld_dprec_type) :: PRE
|
||||
... ...
|
||||
!
|
||||
! initialization of the parallel environment
|
||||
|
||||
call psb_init(ictxt)
|
||||
call psb_info(ictxt,iam,np)
|
||||
... ...
|
||||
! read and assemble the matrix A and the right-hand
|
||||
! side vector b using PSBLAS routines for sparse
|
||||
! matrix/vector management
|
||||
... ...
|
||||
! prepare the three-level hybrid post-smoothed Schwarz
|
||||
! using RAS with overlap 1 as local preconditioner
|
||||
!
|
||||
call mld_precinit(PRE,'ML',info,nlev=3)
|
||||
call mld_precset(PRE,mld_n_ovr_,1,info,ilev=1)
|
||||
call mld_precset(PRE,mld_sub_restr_,psb_halo_,info,ilev=1)
|
||||
! NOTE: "psb_halo_" is really ugly; constants should have the MLD prefix!
|
||||
!
|
||||
! build preconditioner
|
||||
call psb_precbld(A,PRE,DESC_A,info)
|
||||
!
|
||||
! set solver parameters and initial guess
|
||||
... ...
|
||||
! solve Ax=b with preconditioned BiCGSTAB
|
||||
|
||||
call psb_krylov('BICGSTAB',A,PRE,b,x,tol,DESC_A,info)
|
||||
... ...
|
||||
!
|
||||
! cleanup storage and exit
|
||||
!
|
||||
call mld_precfree(PRE,info)
|
||||
!
|
||||
call psb_gefree(b,DESC_A,info)
|
||||
call psb_gefree(x,DESC_A,info)
|
||||
call psb_spfree(A,DESC_A,info)
|
||||
call psb_cdfree(DESC_A,info)
|
||||
!
|
||||
call psb_exit(ictxt)
|
||||
stop
|
||||
|
||||
\end{verbatim}
|
||||
|
||||
{\bf Remark for users with PSBLAS-based legacy codes:} when MLD2P4 is installed, a user with a PSBLAS-based legacy code
calling the base preconditioners included in PSBLAS (NOPREC, DIAG and BJAC) can use the same preconditioners without changing the code, provided that
the program includes the file \verb|psb_prec_mod|.
|
||||
|
||||
%%% Local Variables:
|
||||
%%% mode: latex
|
||||
%%% TeX-master: "userguide"
|
||||
%%% End:
|
||||
\section{Getting Started\label{sec:started}}
|
||||
|
||||
We describe the basics for building and applying MLD2P4 one-level and multi-level
|
||||
Schwarz preconditioners with the Krylov solvers included in PSBLAS \cite{}.
|
||||
The following steps are required:
|
||||
\begin{enumerate}
|
||||
\item \emph{Declare the preconditioner data structure}. It is a derived data type,
|
||||
\verb|mld_|\emph{x}\verb|prec_type|, where \emph{x} may be \verb|s|, \verb|d|, \verb|c|
|
||||
or \verb|z|, according to the basic data type of the sparse matrix
|
||||
(\verb|s| = real single precision; \verb|d| = real double precision;
|
||||
\verb|c| = complex single precision; \verb|z| = complex double precision).
|
||||
This data structure is accessed by the user only through the MLD2P4 routines,
|
||||
following an object-oriented approach.
|
||||
\item \emph{Allocate and initialize the preconditioner data structure, according to
|
||||
a preconditioner type chosen by the user}. This is performed by the routine
|
||||
\verb|mld_precinit|, which also sets a default preconditioner for each preconditioner
|
||||
type selected by the user. The default preconditioner associated with each preconditioner
|
||||
type is listed in Table~\ref{tab:precinit}; the string used by \verb|mld_precinit|
|
||||
to identify each preconditioner type is also given.
|
||||
\item \emph{Choose a specific preconditioner within the selected preconditioner type, by setting
|
||||
the preconditioner parameters.} This is performed by the routine \verb|mld_precset|.
|
||||
A few examples concerning the use of \verb|mld_precset| are given in
|
||||
Section~\ref{sec:examples}; a complete list of all the
|
||||
preconditioner parameters and their allowed values is provided in
|
||||
Section~\ref{sec:highlevel}.
|
||||
\item \emph{Build the preconditioner for a given matrix.} This is performed by
|
||||
the routine \verb|mld_precbld|.
|
||||
\item \emph{Apply the preconditioner at each iteration of a Krylov solver.}
|
||||
This is performed by the routine \verb|mld_precaply|. When using the PSBLAS Krylov solvers,
|
||||
this step is completely transparent to the user, since \verb|mld_precaply| is called
|
||||
by the PSBLAS routine implementing the Krylov solver (\verb|psb_krylov|).
|
||||
\item \emph{Deallocate the preconditioner data structure}. This is performed by
|
||||
the routine \verb|mld_precfree|. This step is complementary to step 2 and should
|
||||
be performed when the preconditioner is no longer used.
|
||||
\end{enumerate}
|
||||
A detailed description of the above routines is given in Section~\ref{sec:highlevel}.
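
The steps above can be sketched as follows for a one-level preconditioner. This is only an illustrative fragment, not a complete program: it assumes double precision data, and that the matrix \verb|A|, its descriptor \verb|DESC_A|, the vectors \verb|b| and \verb|x| and the tolerance \verb|tol| have already been set up through PSBLAS. The build and solver calls are written in the same form used in the examples of Section~\ref{sec:examples}; the exact argument lists are those documented in Section~\ref{sec:highlevel}.

\begin{verbatim}
  use psb_base_mod
  use mld_prec_mod
  use psb_krylov_mod
  ... ...
! step 1: declare the preconditioner (double precision)
  type(mld_dprec_type) :: PRE
  ... ...
! step 2: allocate and initialize; 'AS' selects the
! default additive Schwarz preconditioner (RAS with
! overlap 1 and ILU(0) on the local blocks)
  call mld_precinit(PRE,'AS',info)
! step 3: optional calls to mld_precset would go here
! step 4: build the preconditioner for the matrix A
! (call written as in the examples of this guide)
  call psb_precbld(A,PRE,DESC_A,info)
! step 5: the preconditioner is applied inside the solver
  call psb_krylov('BICGSTAB',A,PRE,b,x,tol,DESC_A,info)
! step 6: deallocate the preconditioner
  call mld_precfree(PRE,info)
\end{verbatim}

Since no parameter is changed in step 3, the default choices listed in Table~\ref{tab:precinit} for the \verb|'AS'| type are used.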
|
||||
|
||||
Note that the Fortran 95 module \verb|mld_prec_mod| must be used in the program
|
||||
calling the MLD2P4 routines. Furthermore, to apply MLD2P4 with the Krylov solvers
|
||||
from PSBLAS, the module \verb|psb_krylov_mod| must be used too.
|
||||
|
||||
Examples showing the basic use of MLD2P4 are reported in Section~\ref{sec:examples}.
|
||||
|
||||
\begin{table}[th]
|
||||
{
|
||||
\begin{center}
|
||||
\begin{tabular}{|l|l|p{6.7cm}|}
|
||||
\hline
|
||||
Type & String & Default preconditioner \\ \hline
|
||||
No preconditioner &\verb|'NOPREC'|& (Considered only to use the PSBLAS
|
||||
Krylov solvers with no preconditioner.) \\
|
||||
Diagonal & \verb|'DIAG'| & --- \\
|
||||
Block Jacobi & \verb|'BJAC'| & Block Jacobi with ILU(0) on the local blocks.\\
|
||||
Additive Schwarz & \verb|'AS'| & Restricted Additive Schwarz (RAS),
|
||||
with overlap 1 and ILU(0) on the local blocks. \\
|
||||
Multilevel &\verb|'ML'| & Multi-level hybrid preconditioner (additive on the
|
||||
same level and multiplicative through the levels),
|
||||
with post-smoothing only. Number of levels: 2;
|
||||
post-smoother: block-Jacobi preconditioner with ILU(0)
|
||||
on the local blocks; coarsest matrix: distributed among the
|
||||
processors; coarse-level solver: 4 sweeps of the
|
||||
block-Jacobi solver, with ILU(0) on the blocks. \\
|
||||
\hline
|
||||
\end{tabular}
|
||||
\end{center}
|
||||
}
|
||||
\caption{Preconditioner types and default choices.\label{tab:precinit}}
|
||||
\end{table}
|
||||
|
||||
\subsection{Examples\label{sec:examples}}
|
||||
|
||||
The code reported below shows how to set and apply the MLD2P4 default multi-level
|
||||
preconditioner, i.e.\ the two-level hybrid post-smoothed Schwarz preconditioner,
|
||||
using block-Jacobi with ILU(0) on the blocks as basic preconditioner,
|
||||
a coarse matrix distributed among the processors, and four block-Jacobi
|
||||
sweeps with ILU(0) on the blocks as approximate coarse-level solver.
|
||||
The choice of this preconditioner is made
|
||||
by simply specifying \verb|'ML'| as second argument of \verb|mld_precinit|
|
||||
(a call to \verb|mld_precset| is not needed).
|
||||
The preconditioner is applied within the BiCGSTAB solver provided by PSBLAS.
|
||||
|
||||
The part of the code concerning the
|
||||
reading and assembling of the sparse matrix and the right-hand side vector, performed
|
||||
through the PSBLAS routines for sparse matrix and vector management, is not reported
|
||||
here for brevity. Other statements concerning the use of PSBLAS are omitted as well.
|
||||
The complete code can be found in the example program file \verb|example_2lev_default.f90|
|
||||
in the directory \textbf{XXXXXX (TO BE SPECIFIED).} Note that the modules \verb|psb_base_mod|
|
||||
and \verb|psb_util_mod| at the beginning of the code are required by PSBLAS.
|
||||
For details on the use of the PSBLAS routines, see the PSBLAS User's Guide \cite{}.
|
||||
|
||||
\begin{verbatim}
|
||||
use psb_base_mod
|
||||
use psb_util_mod
|
||||
use mld_prec_mod
|
||||
use psb_krylov_mod
|
||||
... ...
|
||||
!
|
||||
! sparse matrix
|
||||
type(psb_dspmat_type) :: A
|
||||
! sparse matrix descriptor
|
||||
type(psb_desc_type) :: DESC_A
|
||||
! preconditioner
|
||||
type(mld_dprec_type) :: PRE
|
||||
... ...
|
||||
!
|
||||
! initialize the parallel environment
|
||||
call psb_init(ictxt)
|
||||
call psb_info(ictxt,iam,np)
|
||||
... ...
|
||||
!
|
||||
! read and assemble the matrix A and the right-hand
|
||||
! side b using PSBLAS routines for sparse matrix /
|
||||
! vector management
|
||||
... ...
|
||||
!
|
||||
! initialize the default multi-level preconditioner
|
||||
! (two-level hybrid post-smoothed Schwarz)
|
||||
call mld_precinit(PRE,'ML',info)
|
||||
!
|
||||
! build the preconditioner
|
||||
call psb_precbld(A,PRE,DESC_A,info)
|
||||
!
|
||||
! set the solver parameters and the initial guess
|
||||
... ...
|
||||
!
|
||||
! solve Ax=b with preconditioned BiCGSTAB
|
||||
call psb_krylov('BICGSTAB',A,PRE,b,x,tol,DESC_A,info)
|
||||
... ...
|
||||
!
|
||||
! cleanup the preconditioner
|
||||
call mld_precfree(PRE,info)
|
||||
!
|
||||
! cleanup other data structures
|
||||
... ...
|
||||
!
|
||||
! exit the parallel environment
|
||||
call psb_exit(ictxt)
|
||||
stop
|
||||
\end{verbatim}
|
||||
|
||||
|
||||
\textbf{MODIFY THE WHOLE PART THAT FOLLOWS:\\
- only the statements that differ from the previous example (essentially the setting of the preconditioner, possibly with several calls to precset);\\
- keep the remark on the explicit specification of the number of levels;\\
- refer to the next section for an accurate description of all the parameters;\\
- keep the remark on the old PSBLAS users.}\\
|
||||
|
||||
In the following we describe the general procedure for setting and building one of the MLD2P4 preconditioners.
The user first has to prepare the preconditioner data structure by calling the routine \verb|mld_precinit|. The input parameters
of this routine include a string, which defines the preconditioner type, and an optional integer
specifying the number of levels in the case of a multi-level preconditioner.
If the optional parameter is absent and a multi-level preconditioner has been chosen,
a two-level preconditioner is set; the integer parameter is ignored if the preconditioner type is not multi-level.
Table~\ref{tab:precinit} lists the possible choices for the preconditioner type
and the related default preconditioners.
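
As a small illustration of the optional argument, the calls below contrast the cases just described. They are alternatives, not a sequence to be executed together; the keyword \verb|nlev| is the one used in the example later in this section, while \verb|PRE| and \verb|info| are placeholder variable names.

\begin{verbatim}
! multi-level type, number of levels not given:
! a two-level preconditioner is set
  call mld_precinit(PRE,'ML',info)
! multi-level type with three levels
  call mld_precinit(PRE,'ML',info,nlev=3)
! one-level type: a number of levels, if given,
! would be ignored
  call mld_precinit(PRE,'BJAC',info)
\end{verbatim}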
|
||||
|
||||
|
||||
The user of MLD2P4 may set many parameters of the one-level and multi-level Schwarz preconditioners, in order
to define a preconditioner different from the default one. The parameters
are set through the routine \verb|mld_precset|. The APIs of \verb|mld_precinit| and \verb|mld_precset|, as well as the complete
list of the parameters that can be set, with the corresponding allowed values, are reported in Section~\ref{sec:highlevel}. The code below sets up
a three-level hybrid post-smoothed Schwarz preconditioner, using RAS with overlap 1 as local preconditioner,
with ILU(0) on the local blocks, a distributed coarse matrix, and four block-Jacobi sweeps with the UMFPACK LU
factorization on the blocks as coarse-matrix solver. Note that, for the multi-level preconditioners, the levels are numbered in increasing
order starting from the finest one, i.e.\ level 1 is the finest level.
For more details, see the test program \verb|example2.f90| in xxxx (test directory).\\[0.5cm]
|
||||
|
||||
\begin{verbatim}
|
||||
use psb_base_mod
|
||||
use psb_util_mod
|
||||
use mld_prec_mod
|
||||
use psb_krylov_mod
|
||||
... ...
|
||||
!
|
||||
! sparse matrix
|
||||
type(psb_dspmat_type) :: A
|
||||
! sparse matrix descriptor
|
||||
type(psb_desc_type) :: DESC_A
|
||||
! preconditioner data
|
||||
type(mld_dprec_type) :: PRE
|
||||
... ...
|
||||
!
|
||||
! initialization of the parallel environment
|
||||
|
||||
call psb_init(ictxt)
|
||||
call psb_info(ictxt,iam,np)
|
||||
... ...
|
||||
! read and assemble the matrix A and the right-hand
|
||||
! side vector b using PSBLAS routines for sparse
|
||||
! matrix/vector management
|
||||
... ...
|
||||
! prepare the three-level hybrid post-smoothed Schwarz
|
||||
! using RAS with overlap 1 as local preconditioner
|
||||
!
|
||||
call mld_precinit(PRE,'ML',info,nlev=3)
|
||||
call mld_precset(PRE,mld_n_ovr_,1,info,ilev=1)
|
||||
call mld_precset(PRE,mld_sub_restr_,psb_halo_,info,ilev=1)
|
||||
! NOTE: "psb_halo_" is really ugly; constants should have the MLD prefix!
|
||||
!
|
||||
! build preconditioner
|
||||
call psb_precbld(A,PRE,DESC_A,info)
|
||||
!
|
||||
! set solver parameters and initial guess
|
||||
... ...
|
||||
! solve Ax=b with preconditioned BiCGSTAB
|
||||
|
||||
call psb_krylov('BICGSTAB',A,PRE,b,x,tol,DESC_A,info)
|
||||
... ...
|
||||
!
|
||||
! cleanup storage and exit
|
||||
!
|
||||
call mld_precfree(PRE,info)
|
||||
!
|
||||
call psb_gefree(b,DESC_A,info)
|
||||
call psb_gefree(x,DESC_A,info)
|
||||
call psb_spfree(A,DESC_A,info)
|
||||
call psb_cdfree(DESC_A,info)
|
||||
!
|
||||
call psb_exit(ictxt)
|
||||
stop
|
||||
|
||||
\end{verbatim}
|
||||
|
||||
{\bf Remark for users with PSBLAS-based legacy codes:} when MLD2P4 is installed, a user with a PSBLAS-based legacy code
calling the base preconditioners included in PSBLAS (NOPREC, DIAG and BJAC) can use the same preconditioners without changing the code, provided that
the program includes the file \verb|psb_prec_mod|.
|
||||
|
||||
%%% Local Variables:
|
||||
%%% mode: latex
|
||||
%%% TeX-master: "userguide"
|
||||
%%% End:
|
||||
|
@ -1,10 +1,10 @@
|
||||
\section{List of Routines}\label{sec:routines}
|
||||
|
||||
Alphabetical list of all the routines, with a reference (hyperlink and page number) to the description
of each one in an earlier section
(a sort of index pointing to the routines described previously in their respective sections).
|
||||
|
||||
%%% Local Variables:
|
||||
%%% mode: latex
|
||||
%%% TeX-master: "userguide"
|
||||
%%% End:
|
||||
\section{List of Routines}\label{sec:routines}
|
||||
|
||||
Alphabetical list of all the routines, with a reference (hyperlink and page number) to the description
of each one in an earlier section
(a sort of index pointing to the routines described previously in their respective sections).
|
||||
|
||||
%%% Local Variables:
|
||||
%%% mode: latex
|
||||
%%% TeX-master: "userguide"
|
||||
%%% End:
|
||||
|
@ -1,62 +1,62 @@
|
||||
\section{General Overview\label{sec:overview}}
|
||||
|
||||
The \emph{Multi-Level Domain Decomposition Parallel Preconditioners Package based on
|
||||
PSBLAS (MLD2P4}) provides various versions of multi-level Schwarz preconditioners~\cite{DD2},
|
||||
to be used in the iterative solution of sparse linear systems $Ax=b$, where
|
||||
$A$ is a square, real or complex, sparse matrix with a symmetric sparsity pattern.
|
||||
\textbf{But didn't we say that, if the sparsity pattern is not symmetric,
we work on $(A+A^T)/2$? Or does this hold only for the aggregation? We should do
something consistent for one-level Schwarz too.}
|
||||
Both additive and hybrid preconditioners, i.e.\ multiplicative among the levels
|
||||
and additive inside a level, are implemented; the basic additive Schwarz preconditioners
|
||||
are obtained by considering only one level. A purely algebraic approach is used to
|
||||
generate a sequence of coarse-level corrections to a basic preconditioner, without
|
||||
explicitly using any information on the geometry of the original problem (e.g.\ the
|
||||
discretization of a PDE). The smoothed aggregation technique is applied
|
||||
as algebraic coarsening strategy~\cite{}.
|
||||
|
||||
The package is written in Fortran~95, using object-oriented techniques,
|
||||
and is based on a distributed-memory parallel programming paradigm. \textbf{SALVATORE,
could you add a couple of lines on the choice of Fortran 95 and on the easy interfacing
with legacy codes, without repeating what is said below about the choice of PSBLAS?}
|
||||
Single and double precision implementations of MLD2P4 are available for both the
|
||||
real and the complex case, and can be used through a single interface.
\textbf{SALVATORE, does everything work?}
|
||||
|
||||
MLD2P4 has been designed to implement scalable and easy-to-use multilevel preconditioners
|
||||
in the context of the PSBLAS (Parallel Sparse BLAS) computational framework~\cite{}.
|
||||
PSBLAS is a library originally developed to address the parallel implementation of
|
||||
iterative solvers for sparse linear systems, by providing basic linear algebra
operators and data management facilities for distributed sparse matrices; it
also includes parallel Krylov solvers, built on top of the basic PSBLAS kernels.
The preconditioners available in MLD2P4 can be used with these Krylov solvers.
The choice of PSBLAS has been mainly motivated by the need for
a portable and efficient software infrastructure implementing ``de facto'' standard
parallel sparse linear algebra kernels, to pursue goals such as performance,
portability, modularity and extensibility in the development of the preconditioner
package. On the other hand, the implementation of MLD2P4 has led to some
revisions and extensions of the PSBLAS kernels, leading to the
recent PSBLAS 2.0 version~\cite{}. The inter-process communication required
by MLD2P4 is encapsulated into the PSBLAS routines, except for a few cases where
MPI~\cite{} is explicitly called. Therefore, MLD2P4 can be run on any parallel
machine where PSBLAS and MPI implementations are available.
|
||||
|
||||
MLD2P4 has a layered and modular software architecture where three main layers can be identified. The lower layer consists of the PSBLAS kernels, the middle one implements
|
||||
the construction and application phases of the preconditioners, and the upper one
|
||||
provides a uniform and easy-to-use interface to all the preconditioners.
|
||||
This architecture allows for different levels of use of the package:
|
||||
a few black-box routines at the upper level allow non-expert users to easily
|
||||
build any preconditioner available in MLD2P4 and to apply it within a PSBLAS Krylov solver.
|
||||
On the other hand, the routines of the middle and lower layer can be used and extended
|
||||
by expert users to build new versions of multi-level Schwarz preconditioners.\\
|
||||
|
||||
\textbf{Organization of the guide:\\
state that for the time being we do not
also provide the documentation of the middle layer, but we will do so later\\}

\textbf{Highlight the keywords that characterize our package}
|
||||
|
||||
%%% Local Variables:
|
||||
%%% mode: latex
|
||||
%%% TeX-master: "userguide"
|
||||
%%% End:
|
||||
\section{General Overview\label{sec:overview}}
|
||||
|
||||
The \emph{Multi-Level Domain Decomposition Parallel Preconditioners Package based on
|
||||
PSBLAS (MLD2P4}) provides various versions of multi-level Schwarz preconditioners~\cite{DD2},
|
||||
to be used in the iterative solution of sparse linear systems $Ax=b$, where
|
||||
$A$ is a square, real or complex, sparse matrix with a symmetric sparsity pattern.
|
||||
\textbf{But didn't we say that, if the sparsity pattern is not symmetric,
we work on $(A+A^T)/2$? Or does this hold only for the aggregation? We should do
something consistent for one-level Schwarz too.}
|
||||
Both additive and hybrid preconditioners, i.e.\ multiplicative among the levels
|
||||
and additive inside a level, are implemented; the basic additive Schwarz preconditioners
|
||||
are obtained by considering only one level. A purely algebraic approach is used to
|
||||
generate a sequence of coarse-level corrections to a basic preconditioner, without
|
||||
explicitly using any information on the geometry of the original problem (e.g.\ the
|
||||
discretization of a PDE). The smoothed aggregation technique is applied
|
||||
as algebraic coarsening strategy~\cite{}.
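
For illustration, the two-level hybrid scheme with post-smoothing can be written as follows. The notation is introduced here only for this sketch and is an assumption, not the package's own: $M_{1L}^{-1}$ denotes the one-level (basic) Schwarz preconditioner, $A_C$ the coarse-level matrix, and $R_C$ the restriction operator onto the coarse space. Applying the preconditioner to a vector $v$ amounts to a coarse-level correction followed by a smoothing step on the resulting residual:
\begin{eqnarray*}
w & = & R_C^T A_C^{-1} R_C \, v, \\
z & = & w + M_{1L}^{-1} (v - A w),
\end{eqnarray*}
so that $z = M^{-1} v$ with
\[
M^{-1} = M_{1L}^{-1} + \left( I - M_{1L}^{-1} A \right) R_C^T A_C^{-1} R_C .
\]
The composition is multiplicative between the two levels, while $M_{1L}^{-1}$ itself is additive over the subdomains of the fine level.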
|
||||
|
||||
The package is written in Fortran~95, using object-oriented techniques,
|
||||
and is based on a distributed-memory parallel programming paradigm. \textbf{SALVATORE,
could you add a couple of lines on the choice of Fortran 95 and on the easy interfacing
with legacy codes, without repeating what is said below about the choice of PSBLAS?}
|
||||
Single and double precision implementations of MLD2P4 are available for both the
|
||||
real and the complex case, and can be used through a single interface.
\textbf{SALVATORE, does everything work?}
|
||||
|
||||
MLD2P4 has been designed to implement scalable and easy-to-use multilevel preconditioners
|
||||
in the context of the PSBLAS (Parallel Sparse BLAS) computational framework~\cite{}.
|
||||
PSBLAS is a library originally developed to address the parallel implementation of
|
||||
iterative solvers for sparse linear systems, by providing basic linear algebra
operators and data management facilities for distributed sparse matrices; it
also includes parallel Krylov solvers, built on top of the basic PSBLAS kernels.
The preconditioners available in MLD2P4 can be used with these Krylov solvers.
The choice of PSBLAS has been mainly motivated by the need for
a portable and efficient software infrastructure implementing ``de facto'' standard
parallel sparse linear algebra kernels, to pursue goals such as performance,
portability, modularity and extensibility in the development of the preconditioner
package. On the other hand, the implementation of MLD2P4 has led to some
revisions and extensions of the PSBLAS kernels, leading to the
recent PSBLAS 2.0 version~\cite{}. The inter-process communication required
by MLD2P4 is encapsulated into the PSBLAS routines, except for a few cases where
MPI~\cite{} is explicitly called. Therefore, MLD2P4 can be run on any parallel
machine where PSBLAS and MPI implementations are available.
|
||||
|
||||
MLD2P4 has a layered and modular software architecture where three main layers can be identified. The lower layer consists of the PSBLAS kernels, the middle one implements
|
||||
the construction and application phases of the preconditioners, and the upper one
|
||||
provides a uniform and easy-to-use interface to all the preconditioners.
|
||||
This architecture allows for different levels of use of the package:
|
||||
a few black-box routines at the upper level allow non-expert users to easily
|
||||
build any preconditioner available in MLD2P4 and to apply it within a PSBLAS Krylov solver.
|
||||
On the other hand, the routines of the middle and lower layer can be used and extended
|
||||
by expert users to build new versions of multi-level Schwarz preconditioners.\\
|
||||
|
||||
\textbf{Organization of the guide:\\
state that for the time being we do not
also provide the documentation of the middle layer, but we will do so later\\}

\textbf{Highlight the keywords that characterize our package}
|
||||
|
||||
%%% Local Variables:
|
||||
%%% mode: latex
|
||||
%%% TeX-master: "userguide"
|
||||
%%% End:
|
||||
|
File diff suppressed because one or more lines are too long