docs/pdf/Makefile
 docs/pdf/abstract.tex
 docs/pdf/advanced.tex
 docs/pdf/background.tex
 docs/pdf/bibliography.tex
 docs/pdf/building.tex
 docs/pdf/conventions.tex
 docs/pdf/distribution.tex
 docs/pdf/errors.tex
 docs/pdf/gettingstarted.tex
 docs/pdf/highlevelview.tex
 docs/pdf/listofroutines.tex
 docs/pdf/overview.tex
 docs/pdf/userguide.tex
 docs/userguide.pdf

New documentation, partial fixes.
stopcriterion
Salvatore Filippone 17 years ago
parent 9eeef87a3a
commit 001f6693b8

@ -1,19 +1,19 @@
\begin{abstract}
\emph{MLD2P4 (Multi-Level Domain Decomposition Parallel Preconditioners Package based on
PSBLAS}) is a package of parallel algebraic multi-level preconditioners.
It implements various versions of one-level additive and of multi-level additive
and hybrid Schwarz algorithms. In the multi-level case, a purely algebraic approach
is applied to generate coarse-level corrections, so that no geometric background is needed
concerning the matrix to be preconditioned. The matrix is required to be square, real or complex, with a symmetric sparsity pattern \textbf{Don't we also consider the nonsymmetric case
with $(A+A^T)/2$?}.
MLD2P4 has been designed to provide scalable and easy-to-use preconditioners in the
context of the PSBLAS (Parallel Sparse Basic Linear Algebra Subprograms)
computational framework and can be used in conjunction with the Krylov solvers
available in this framework. MLD2P4 enables the user to easily specify different aspects
of a generic algebraic multilevel Schwarz preconditioner, thus allowing the user to search
for the ``best'' preconditioner for the problem at hand. The package has been designed
employing object-oriented techniques, using Fortran 95 and MPI, with interfaces to
additional external libraries such as UMFPACK, SuperLU and SuperLU\_Dist, which
can be exploited in building multi-level preconditioners.
\end{abstract}

@ -1,12 +1,12 @@
\section{Advanced Use}\label{sec:advanced}
- MLD2P4 software architecture \\
- preconditioner data structure (``detailed'' description) + possibility of setting the individual
levels separately (a possibility only hinted at in the earlier description of precset) \\
- description of the medium-level routines (with an introduction to the extension capabilities (?) offered
by this software layer) \\
%%% Local Variables:
%%% mode: latex
%%% TeX-master: "userguide"
%%% End:

@ -1,291 +1,291 @@
\section{Multi-level Domain Decomposition Background\label{sec:background}}
\emph{Domain Decomposition} (DD) preconditioners, coupled with Krylov iterative
solvers, are widely used in the parallel solution of large and sparse linear systems.
These preconditioners are based on the divide and conquer technique: the matrix
to be preconditioned is divided into submatrices, a ``local linear system''
involving each submatrix is (approximately) solved, and the local solutions are used
to build a preconditioner for the whole original matrix. This process
often corresponds to dividing a physical domain associated with the original matrix
into subdomains, e.g.\ in a PDE discretization, to (approximately) solving the
subproblems corresponding to the subdomains and to building an approximate
solution of the original problem from the local solutions
\cite{Cai_Widlund_92,dd1_94,dd2_96}.
\emph{Additive Schwarz} preconditioners are DD preconditioners using overlapping
submatrices, i.e.\ with some common rows, to couple the local information
related to the submatrices (see, e.g., \cite{dd2_96}).
The main motivations for choosing Additive Schwarz preconditioners are their
intrinsic parallelism and good \textbf{(saying ``good'' is a bit strong, since
right after this we say that convergence depends on the number of submatrices)}
convergence properties. A drawback of these
preconditioners is that the number of iterations of the preconditioned solvers
generally grows with the number of submatrices. This may be a serious limitation
on parallel computers, since the number of submatrices usually matches the number
of available processors. Optimal convergence rates, i.e.\ iteration numbers
independent of the number of submatrices, can be obtained by correcting the
preconditioner through a suitable approximation of the original linear system
in a coarse space, which globally couples the information related to the single
submatrices.
\emph{Two-level Schwarz} preconditioners are obtained
by combining basic (one-level) Schwarz preconditioners with coarse-level
corrections. In this context, the one-level preconditioner is often
called a smoother. Different two-level preconditioners are obtained by varying the
choice of the smoother, of the coarse-level correction and of the
way they are combined \cite{dd2_96}. The same reasoning can be applied starting
from the coarse-level system, i.e.\ a coarse-space correction can be built
from this system, thus obtaining \emph{multi-level} preconditioners.
It is worth noting that optimal preconditioners do not necessarily correspond
to minimum execution times. Indeed, to obtain effective multilevel preconditioners
a tradeoff between optimality of convergence and the cost of building and applying
the coarse-space corrections must be achieved. The choice of the number of levels,
i.e.\ of the coarse-space corrections, also affects the effectiveness of the
preconditioners. One more goal is to get convergence rates that are as insensitive
as possible to variations in the matrix coefficients.
Two main approaches can be used to build coarse-space corrections. The geometric approach
applies coarsening strategies based on the knowledge of some physical grid associated
with the matrix and requires the user to define grid transfer operators from the fine
to the coarse levels and vice versa. This may be difficult for complex geometries;
furthermore, suitable one-level preconditioners may be required to get efficient
interplay between fine and coarse levels, e.g.\ when matrices with highly varying coefficients
are considered. The algebraic approach builds coarse-space corrections using only matrix
information. It performs a fully automatic coarsening and enforces the interplay between
the fine and coarse levels by suitably choosing the coarse space and the coarse-to-fine
interpolation \cite{StubenGMD69_99}.
MLD2P4 uses a purely algebraic approach for building the sequence of coarse matrices
starting from the original matrix. The algebraic approach is based on the \emph{smoothed
aggregation} algorithm \cite{Brezina_Vanek_,Vanek_Mandel_Brezina_}. A decoupled version
of this algorithm is implemented, where the smoothed aggregation is applied locally
to each submatrix \cite{Tuminaro_Tong_00}. In the next two subsections we provide
a brief description of the multi-level Schwarz preconditioners and of the smoothed
aggregation technique as implemented in MLD2P4. For further details the user
is referred to \cite{para_04,apnum_07,aaecc_07,dd2_96}.
\subsection{Multi-level Schwarz Preconditioners\label{sec:multilevel}}
The multi-level preconditioners implemented in MLD2P4 are obtained by combining
Additive Schwarz preconditioners with coarse-space corrections; therefore
we first provide a sketch of the Additive Schwarz preconditioners.
Given a linear system
\[ Ax=b, \]
where $A=(a_{ij}) \in \Re^{n \times n}$ is a
nonsingular sparse matrix with a symmetric non-zero pattern,
let $G=(W,E)$ be the adjacency graph of $A$, where $W=\{1, 2, \ldots, n\}$
and $E=\{(i,j) : a_{ij} \neq 0\}$ are the vertex set and the edge set of $G$,
respectively. Two vertices are called adjacent if there is an edge connecting
them. For any integer $\delta > 0$, a $\delta$-overlap
partition of $W$ can be defined recursively as follows.
Given a 0-overlap (or non-overlapping) partition of $W$,
i.e.\ a set of $m$ disjoint nonempty sets $W_i^0 \subset W$ such that
$\cup_{i=1}^m W_i^0 = W$, a $\delta$-overlap
partition of $W$ is obtained by considering the sets
$W_i^\delta \supset W_i^{\delta-1}$, obtained by including the vertices that
are adjacent to any vertex in $W_i^{\delta-1}$.
Let $n_i^\delta$ be the size of $W_i^\delta$ and $R_i^{\delta} \in
\Re^{n_i^\delta \times n}$ the restriction operator that maps
a vector $v \in \Re^n$ onto the vector $v_i^{\delta} \in \Re^{n_i^\delta}$
containing the components of $v$ corresponding to the vertices in
$W_i^\delta$. The transpose of $R_i^{\delta}$ is a
prolongation operator from $\Re^{n_i^\delta}$ to $\Re^n$.
The matrix $A_i^\delta=R_i^\delta A (R_i^\delta)^T \in
\Re^{n_i^\delta \times n_i^\delta}$ can be considered
as a restriction of $A$ corresponding to the set $W_i^{\delta}$.
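As a small illustration of these definitions (the matrix and the partition below
are assumptions chosen only for this example), consider $n=4$ and the tridiagonal matrix
\[
A = \left( \begin{array}{rrrr}
 2 & -1 &  0 &  0 \\
-1 &  2 & -1 &  0 \\
 0 & -1 &  2 & -1 \\
 0 &  0 & -1 &  2
\end{array} \right),
\]
with the 0-overlap partition $W_1^0=\{1,2\}$, $W_2^0=\{3,4\}$. Since vertices
2 and 3 are adjacent, the 1-overlap partition is $W_1^1=\{1,2,3\}$,
$W_2^1=\{2,3,4\}$, so that $n_1^1=n_2^1=3$ and, for $i=1$,
\[
R_1^1 = \left( \begin{array}{rrrr}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0
\end{array} \right), \qquad
A_1^1 = R_1^1 A (R_1^1)^T = \left( \begin{array}{rrr}
 2 & -1 &  0 \\
-1 &  2 & -1 \\
 0 & -1 &  2
\end{array} \right).
\]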
The \emph{classical one-level AS} preconditioner is defined by
\[
M_{AS}^{-1}= \sum_{i=1}^m (R_i^{\delta})^T
(A_i^\delta)^{-1} R_i^{\delta},
\]
where $A_i^\delta$ is assumed to be nonsingular. Its application
to a vector $v \in \Re^n$ within a Krylov solver requires the following
three steps:
\begin{enumerate}
\item restriction of $v$ as $v_i = R_i^{\delta} v$, $i=1,\ldots,m$;
\item (approximate) solution of the linear systems $A_i^\delta w_i = v_i$,
$i=1,\ldots,m$;
\item prolongation and sum of the $w_i$'s, i.e.\ $w = \sum_{i=1}^m (R_i^{\delta})^T w_i$.
\end{enumerate}
A variant of the classical AS preconditioner that outperforms it
in terms of convergence rate as well as of computation and communication
time on parallel distributed-memory computers is the so-called \emph{Restricted AS
(RAS)} preconditioner~\cite{Cai_Sarkis,Efstathiou_Gander}. It
is obtained by zeroing the components of $w_i$ corresponding to the
overlapping vertices when applying the prolongation. Therefore,
RAS differs from classical AS by the prolongation operator $(R_i^{\delta})^T$,
which is substituted by $(\tilde{R}_i^0)^T \in \Re^{n \times n_i^\delta}$,
where $\tilde{R}_i^0$ is obtained by zeroing the rows of $R_i^\delta$
corresponding to the vertices in $W_i^\delta \backslash W_i^0$:
\[
M_{RAS}^{-1}= \sum_{i=1}^m (\tilde{R}_i^0)^T
(A_i^\delta)^{-1} R_i^{\delta}.
\]
Analogously, the AS variant called \emph{AS with Harmonic extension (ASH)}
is defined by
\[ M_{ASH}^{-1}= \sum_{i=1}^m (R_i^{\delta})^T
(A_i^\delta)^{-1} \tilde{R}_i^0.
\]
We note that for $\delta=0$ the three variants of the AS preconditioner are
all equal to the block-Jacobi preconditioner.
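This can be checked directly from the definitions; the short derivation below is
only a sketch and assumes, for ease of notation, that the vertices are numbered
consecutively within each subset $W_i^0$. For $\delta=0$ the set
$W_i^\delta \backslash W_i^0$ is empty, hence $\tilde{R}_i^0 = R_i^0$ and
\[
M_{AS}^{-1} = M_{RAS}^{-1} = M_{ASH}^{-1} =
\sum_{i=1}^m (R_i^0)^T (A_i^0)^{-1} R_i^0 =
\left( \begin{array}{ccc}
(A_1^0)^{-1} & & \\
 & \ddots & \\
 & & (A_m^0)^{-1}
\end{array} \right),
\]
i.e.\ the inverse of the block-diagonal part of $A$ induced by the partition.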
As already observed, the convergence rate of the one-level Schwarz
preconditioned iterative solvers deteriorates as the number $m$ of partitions
of $W$ increases \cite{dd1_94,dd2_96}. To reduce the dependency
of the number of iterations on the degree of parallelism we may
introduce a global coupling among the overlapping partitions by defining
a coarse-space approximation $A_C$ of the matrix $A$.
In a purely algebraic setting, $A_C$ is usually built with
a Galerkin approach. Given a set $W_C$ of \emph{coarse vertices},
with size $n_C$, and a suitable restriction operator
$R_C \in \Re^{n_C \times n}$, $A_C$ is defined as
\[
A_C=R_C A R_C^T
\]
and the coarse-level correction matrix to be combined with a generic
one-level AS preconditioner $M_{1L}$ is obtained as
\[
M_{C}^{-1}= R_C^T A_C^{-1} R_C,
\]
where $A_C$ is assumed to be nonsingular. The application of $M_{C}^{-1}$
to a vector $v$ corresponds to a restriction, a solution and
a prolongation step; the solution step, involving the matrix $A_C$,
may also be carried out approximately.
The combination of $M_{C}$ and $M_{1L}$ may be
performed in either an additive or a multiplicative framework.
In the former case, the \emph{two-level additive} Schwarz preconditioner
is obtained:
\[
M_{2LA}^{-1} = M_{C}^{-1} + M_{1L}^{-1}.
\]
Applying $M_{2LA}^{-1}$ to a vector $v$ within a Krylov solver
corresponds to applying $M_{C}^{-1}$
and $M_{1L}^{-1}$ to $v$ independently and then summing up
the results.
In the multiplicative case, the combination can be
performed by first applying the smoother $M_{1L}^{-1}$ and then
the coarse-level correction operator $M_{C}^{-1}$:
\[
\begin{array}{l}
w = M_{1L}^{-1} v, \\
z = w + M_{C}^{-1} (v-Aw);
\end{array}
\]
this corresponds to the following \emph{two-level hybrid pre-smoothed}
Schwarz preconditioner:
\[
M_{2LH-PRE}^{-1} = M_{C}^{-1} + \left( I - M_{C}^{-1}A \right) M_{1L}^{-1}.
\]
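The equivalence between the two-step formulation and the operator
$M_{2LH-PRE}^{-1}$ can be verified by substituting the first step into the second one:
\[
z = M_{1L}^{-1} v + M_{C}^{-1} \left( v - A M_{1L}^{-1} v \right)
  = \left[ M_{C}^{-1} + \left( I - M_{C}^{-1}A \right) M_{1L}^{-1} \right] v .
\]
As a further check (a standard identity, reported here only for illustration),
multiplying by $A$ and subtracting from the identity gives
\[
I - M_{2LH-PRE}^{-1} A = \left( I - M_{C}^{-1}A \right) \left( I - M_{1L}^{-1}A \right),
\]
which shows the multiplicative nature of the combination: the smoother acts first,
followed by the coarse-level correction.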
On the other hand, by applying the smoother after the coarse-level correction,
i.e.\ by computing
\[
\begin{array}{l}
w = M_{C}^{-1} v , \\
z = w + M_{1L}^{-1} (v-Aw) ,
\end{array}
\]
the \emph{two-level hybrid post-smoothed}
Schwarz preconditioner is obtained:
\[
M_{2LH-POST}^{-1} = M_{1L}^{-1} + \left( I - M_{1L}^{-1}A \right) M_{C}^{-1}.
\]
One more variant of the two-level hybrid preconditioner is obtained by applying
the smoother before and after the coarse-level correction. In this case, the
preconditioner is symmetric if $A$, $M_{1L}$ and $M_{C}$ are symmetric.
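A possible sketch of this variant, written in the same form as the previous ones,
is given below; the intermediate vector $y$ and the name $M_{2LH-SYM}$ are
introduced here only for illustration:
\[
\begin{array}{l}
w = M_{1L}^{-1} v, \\
y = w + M_{C}^{-1} (v-Aw), \\
z = y + M_{1L}^{-1} (v-Ay).
\end{array}
\]
Under these assumptions the corresponding operator satisfies
\[
I - M_{2LH-SYM}^{-1} A = \left( I - M_{1L}^{-1}A \right)
\left( I - M_{C}^{-1}A \right) \left( I - M_{1L}^{-1}A \right),
\]
from which the symmetry of $M_{2LH-SYM}^{-1}$ follows when $A$, $M_{1L}$ and
$M_{C}$ are symmetric.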
As previously noted, on parallel computers the number of submatrices usually matches
the number of available processors. When the size of the system to be preconditioned
is very large, the use of many processors, i.e.\ of many small submatrices, often
leads to a large coarse-level system, whose solution may be computationally expensive.
On the other hand, the use of few processors often leads to local submatrices that
are too expensive to be processed on single processors, because of memory and/or
computing requirements. Therefore, it seems natural to use a recursive approach,
in which the coarse-level correction is re-applied starting from the current
coarse-level system. The corresponding preconditioners are called \emph{multi-level}.
One more reason for the multi-level approach is that it may significantly
reduce the computational cost of preconditioning with respect to the two-level case
(see \cite[Chapter 3]{dd2_96}). Additive and hybrid multilevel preconditioners
are obtained as direct extensions of the two-level counterparts. Other combinations
of the smoothers and coarse-level corrections are possible, leading to variants
of the previous algorithms. For a detailed description of them, the reader is
referred to \cite[Chapter 3]{dd2_96}.
\textbf{In my opinion an algorithmic description of a multilevel preconditioner is needed here,
as an example, e.g.\ the hybrid one with pre-smoothing, along the lines
of the description in Figure 1 of the Trilinos ML 4.0 guide. WHAT DO YOU THINK?}
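One possible form of such a description is sketched below for the application of a
hybrid pre-smoothed multi-level preconditioner to a vector $v$. The notation is an
assumption made for this sketch only: $K$ is the number of levels, $A^1=A$, $R^k$ is
the restriction operator from level $k$ to level $k+1$, $A^{k+1}=R^k A^k (R^k)^T$,
and $M^k$ is the one-level preconditioner (smoother) built from $A^k$; the coarsest
system is assumed to be solved exactly.
\begin{tabbing}
\quad \= \quad \= \kill
$v^1 = v$; \\
\textbf{for} $k=1,\ldots,K-1$ \textbf{do} \\
\> $y^k = (M^k)^{-1} v^k$; \\
\> $v^{k+1} = R^k \left( v^k - A^k y^k \right)$; \\
\textbf{end for} \\
$y^{K} = (A^{K})^{-1} v^{K}$; \\
\textbf{for} $k=K-1,\ldots,1$ \textbf{do} \\
\> $y^k = y^k + (R^k)^T y^{k+1}$; \\
\textbf{end for} \\
$z = y^1$.
\end{tabbing}
For $K=2$, with $R^1=R_C$ and $M^1=M_{1L}$, this reduces to the two-step
application of $M_{2LH-PRE}^{-1}$ given above.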
\subsection{Smoothed Aggregation\label{sec:aggregation}}
To define the restriction operator $R_C$, which is used to compute
the coarse-level matrix $A_C$, MLD2P4 uses the \emph{smoothed aggregation}
algorithm described in \cite{Brezina_Vanek_,Vanek_Mandel_Brezina_}.
The basic idea of this algorithm is to build a coarse set of vertices
$W_C$ by suitably grouping the vertices of $W$ into disjoint subsets
(aggregates), and to define the coarse-to-fine space transfer operator $R_C^T$ by
applying a suitable smoother to a simple piecewise constant
prolongation operator, to improve the quality of the coarse-space correction.
Three main steps can be identified in the smoothed aggregation procedure:
\begin{itemize}
\item coarsening of the vertex set $W$, to obtain $W_C$;
\item construction of the prolongator $R_C^T$;
\item application of $R_C$ and $R_C^T$ to build $A_C$.
\end{itemize}
To perform the coarsening step, we have implemented the aggregation algorithm sketched
in \cite{apnum_07}. Following \cite{brezina_vanek}, a modified version of this algorithm
is actually used,
in which each aggregate $N_r$ is made of vertices of $W$ that are \emph{strongly coupled}
to a certain root vertex $r \in W$, i.e.\
\[ N_r = \left\{s \in W: |a_{rs}| \geq \theta \sqrt{|a_{rr}a_{ss}|} \right\} \]
for a given $\theta \in [0,1]$.
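For instance (an illustrative case, not taken from the references above), for a
tridiagonal matrix with $a_{rr}=2$ and $a_{r,r\pm 1}=-1$, every pair of adjacent
vertices $r \neq s$ gives
\[
|a_{rs}| = 1, \qquad \sqrt{|a_{rr}a_{ss}|} = 2,
\]
so that the strong coupling condition becomes $1 \geq 2\theta$. Hence, for
$0 < \theta \leq 1/2$ each interior root vertex $r$ yields $N_r=\{r-1,r,r+1\}$,
while for $\theta > 1/2$ every $N_r$ reduces to $\{r\}$, since the condition
always holds for $s=r$.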
Since the previous algorithm has a sequential nature, a \emph{decoupled} version of
it has been chosen, where each processor $i$ independently applies the algorithm to
the set of vertices $W_i^0$ assigned to it in the initial data distribution. This
version is embarrassingly parallel, since it does not require any data communication.
On the other hand, it may produce non-uniform aggregates near boundary vertices,
i.e.\ near vertices adjacent to vertices in other processors, and is strongly
dependent on the number of processors and on the initial partitioning of the matrix $A$.
Nevertheless, this algorithm has been chosen for the implementation in MLD2P4,
since it has been shown to produce good results in practice \cite{Tuminaro_Tong_00}.

The prolongator $P_C=R_C^T$ is built starting from a \emph{tentative prolongator}
$P \in \Re^{n \times n_C}$, defined as
\begin{equation}
P=(p_{ij}), \quad p_{ij}=
\left\{ \begin{array}{ll}
1 & \quad \mbox{if} \; i \in V^j_C \\
0 & \quad \mbox{otherwise}
\end{array} \right. .
\label{eq:tent_prol}
\end{equation}
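For instance, under the assumption (made here only for illustration) that $n=4$
and that the aggregates are $V^1_C=\{1,2\}$ and $V^2_C=\{3,4\}$, so that $n_C=2$,
definition (\ref{eq:tent_prol}) gives
\[ P = \left( \begin{array}{cc}
1 & 0 \\
1 & 0 \\
0 & 1 \\
0 & 1
\end{array} \right) , \]
i.e.\ each column of $P$ is the indicator vector of one aggregate.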
$P_C$ is obtained by
applying to $P$ a smoother $S \in \Re^{n \times n}$:
\begin{equation}
P_C = S P,
\label{eq:smoothed_prol}
\end{equation}
in order to remove oscillatory components from the range of the prolongator
and hence to improve the convergence properties of the multi-level
Schwarz method \cite{Brezina_Vanek_,StubenGMD69_99}.
A simple choice for $S$ is the damped Jacobi smoother:
\begin{equation}
S = I - \omega D^{-1} A ,
\label{eq:jac_smoother}
\end{equation}
where the value of $\omega$ can be chosen
using some estimate of the spectral radius of $D^{-1}A$ \cite{Brezina_Vanek}.
\textbf{Mention the filtering of $A$ in the smoothing, noting however that it has not been
implemented?}
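
As a small worked example of (\ref{eq:smoothed_prol}) and (\ref{eq:jac_smoother}),
assume (all data here are illustrative choices, not values prescribed by the package)
that $A$ is the $4 \times 4$ matrix $\mbox{tridiag}(-1,2,-1)$, that $D$ is the
diagonal of $A$, that $\omega = 2/3$, and that $P$ is the tentative prolongator
with columns $(1,1,0,0)^T$ and $(0,0,1,1)^T$. Then $\omega D^{-1} A = \frac{1}{3} A$ and
\[ S = I - \frac{1}{3} A = \frac{1}{3} \left( \begin{array}{cccc}
1 & 1 & 0 & 0 \\
1 & 1 & 1 & 0 \\
0 & 1 & 1 & 1 \\
0 & 0 & 1 & 1
\end{array} \right) ,
\qquad
P_C = S P = \frac{1}{3} \left( \begin{array}{cc}
2 & 0 \\
2 & 1 \\
1 & 2 \\
0 & 2
\end{array} \right) . \]
Unlike the columns of $P$, the columns of $P_C$ overlap at the interface between
the two aggregates, which is the effect the smoothing is meant to produce.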
%%% Local Variables:
%%% mode: latex
%%% TeX-master: "userguide"
%%% End:

@ -1,152 +1,152 @@
\begin{thebibliography}{99}
%
\bibitem{PARA04FOREST}
Bella, G., Filippone, S., De Maio, A., Testa, M.:
A Simulation Model for Forest Fires.
In: Dongarra, J., Madsen, K., Wasniewski, J. (eds.):
Proceedings of PARA~04 Workshop on State of the Art
in Scientific Computing. Lecture Notes in Computer Science, 3732. Berlin:
Springer, 2005.
%
\bibitem{aaecc_07} A. Buttari, D. di Serafino, P. D'Ambra, S. Filippone,\newblock
2LEV-D2P4: a package of high-performance preconditioners,\newblock
Applicable Algebra in Engineering, Communication and Computing,
Volume 18, Number 3, May 2007, pp.~223--239.
%Published online: 13 February 2007, {\tt http://dx.doi.org/10.1007/s00200-007-0035-z}
%
\bibitem{apnum_07} P. D'Ambra, S. Filippone, D. Di Serafino,\newblock
On the Development of PSBLAS-based Parallel Two-level Schwarz Preconditioners,
\newblock
Applied Numerical Mathematics, Elsevier Science,
Volume 57, Issues 11-12, November-December 2007, pp.~1181--1196.
%published online 3 February 2007, {\tt
% http://dx.doi.org/10.1016/j.apnum.2007.01.006}
%% \bibitem{DOUGLAS}
%% R.E.~Bank and C.C.~Douglas,
%% {\em SMMP: Sparse Matrix Multiplication Package},
%% Advances in Computational Mathematics, 1993, 1, 127-137.
%% (See also {\tt http://www.mgnet.org/~douglas/ccd-codes.html})
%
%
\bibitem{para_04}
A.~Buttari, P.~D'Ambra, D.~di Serafino and S.~Filippone,
{\em Extending PSBLAS to Build Parallel Schwarz Preconditioners},
in J.~Dongarra, K.~Madsen, J.~Wasniewski, editors,
Proceedings of PARA~04 Workshop on State of the Art
in Scientific Computing, pp.~593--602, Lecture Notes in Computer Science,
Springer, 2005.
%
%% \bibitem{CAI_SAAD}
%% X.~C.~Cai and Y.~Saad,
%% {\em Overlapping Domain Decomposition Algorithms for General Sparse Matrices},
%% Numerical Linear Algebra with Applications, 3(3), pp.~221--237, 1996.
%% %
%% \bibitem{CAI_SARKIS}
%% X.C.~Cai and M.~Sarkis,
%% {\em A Restricted Additive Schwarz Preconditioner for General Sparse Linear Systems},
%% SIAM Journal on Scientific Computing, 21(2), pp.~792--797, 1999.
%
\bibitem{Cai_Widlund_92}
X.C.~Cai and O.~B.~Widlund,
{\em Domain Decomposition Algorithms for Indefinite Elliptic Problems},
SIAM Journal on Scientific and Statistical Computing, 13(1), pp.~243--258, 1992.
%
\bibitem{dd1_94}
T.~Chan and T.~Mathew,
{\em Domain Decomposition Algorithms},
in A.~Iserles, editor, Acta Numerica 1994, pp.~61--143, 1994.
Cambridge University Press.
%% %
%% \bibitem{UMFPACK}
%% T.A.~Davis,
%% {\em Algorithm 832: UMFPACK - an Unsymmetric-pattern Multifrontal
%% Method with a Column Pre-ordering Strategy},
%% ACM Transactions on Mathematical Software, 30, pp.~196--199, 2004.
%% (See also {\tt http://www.cise.ufl.edu/~davis/})
%% %
%% \bibitem{SUPERLU}
%% J.W.~Demmel, S.C.~Eisenstat, J.R.~Gilbert, X.S.~Li and J.W.H.~Liu,
%% A supernodal approach to sparse partial pivoting,
%% SIAM Journal on Matrix Analysis and Applications, 20(3), pp.~720--755, 1999.
%
\bibitem{BLACS}
J.~J.~Dongarra and R.~C.~Whaley,
{\em A User's Guide to the BLACS v.~1.1},
Lapack Working Note 94, Tech.\ Rep.\ UT-CS-95-281, University of
Tennessee, March 1995 (updated May 1997).
%
\bibitem{sblas_97}
I.~Duff, M.~Marrone, G.~Radicati and C.~Vittoli,
{\em Level 3 Basic Linear Algebra Subprograms for Sparse Matrices:
a User Level Interface},
ACM Transactions on Mathematical Software, 23(3), pp.~379--401, 1997.
%
\bibitem{sblas_02}
I.~Duff, M.~Heroux and R.~Pozo,
{\em An Overview of the Sparse Basic Linear
Algebra Subprograms: the New Standard from the BLAS Technical Forum},
ACM Transactions on Mathematical Software, 28(2), pp.~239--267, 2002.
%
\bibitem{psblas_00}
S.~Filippone and M.~Colajanni,
{\em PSBLAS: A Library for Parallel Linear Algebra
Computation on Sparse Matrices},
\newblock
ACM Transactions on Mathematical Software, 26(4), pp.~527--550, 2000.
%
\bibitem{KIVA3PSBLAS}
S.~Filippone, P.~D'Ambra, M.~Colajanni,
{\em Using a Parallel Library of Sparse Linear Algebra in a Fluid Dynamics
Applications Code on Linux Clusters},
in G.~Joubert, A.~Murli, F.~Peters, M.~Vanneschi, editors,
Parallel Computing - Advances \& Current Issues,
pp.~441--448, Imperial College Press, 2002.
%
\bibitem{METIS}
Karypis, G. and Kumar, V.,
{\em {METIS}: Unstructured Graph Partitioning and Sparse Matrix
Ordering System}.
Minneapolis, MN 55455: University of Minnesota, Department of
Computer Science, 1995.
Internet Address: {\verb|http://www.cs.umn.edu/~karypis|}.
\bibitem{BLAS1}
Lawson, C., Hanson, R., Kincaid, D. and Krogh, F.,
Basic {L}inear {A}lgebra {S}ubprograms for {F}ortran usage,
{ACM Trans. Math. Softw.} vol.~{5}, 308--323, 1979.
\bibitem{machiels}
{Machiels, L. and Deville, M.}
{\em Fortran 90: An entry to object-oriented programming for the solution
of partial differential equations.}
{ACM Trans. Math. Softw.} vol.~{23}, 32--49, 1997.
\bibitem{metcalf}
{Metcalf, M., Reid, J. and Cohen, M.}
{\em Fortran 95/2003 explained.}
{Oxford University Press}, 2004.
\bibitem{dd2_96}
B.~Smith, P.~Bj{\o}rstad and W.~Gropp,
{\em Domain Decomposition: Parallel Multilevel Methods for Elliptic
Partial Differential Equations},
Cambridge University Press, 1996.
\bibitem{MPI1}
M.~Snir, S.~Otto, S.~Huss-Lederman, D.~Walker and J.~Dongarra,
{\em MPI: The Complete Reference. Volume 1 - The MPI Core}, second edition,
MIT Press, 1998.
%
\bibitem{BREZINA_VANEK}
M.~Brezina and P.~Van{\v e}k,
{\em A Black-Box Iterative Solver Based on a Two-Level Schwarz Method},
Computing, 1999, 63, 233--263.
%
%
\bibitem{VANEK_MANDEL_BREZINA}
P.~Van{\v e}k, J.~Mandel and M.~Brezina,
{\em Algebraic Multigrid by Smoothed Aggregation for Second and Fourth Order Elliptic Problems},
Computing, 1996, 56, 179--196.
%
\end{thebibliography}

@ -1,7 +1,7 @@
\section{Configuring and Building MLD2P4\label{sec:configuring}}
- use of GNU autoconf and automake \\
- required base software (MPI, BLACS, BLAS, PSBLAS - specify versions) \\
- optional software (UMFPACK, SuperLU, SuperLUdist - specify versions and configure options) \\
- operating systems and compilers on which MLD2P4 has been built successfully \\
- are configuration options for debugging or profiling provided? \\
- directory tree \\

@ -1,6 +1,6 @@
\section{Notational Conventions\label{sec:conventions}}
- typefaces used in the guide (see the recent ML guide and the Aztec guide) \\
- naming conventions for routines (difference between high-level and medium-level),
data structures,\\
modules, constants, etc. (see the PSBLAS guide) \\
- real and complex versions\\

@ -1,41 +1,42 @@
\section{Code Distribution\label{sec:distribution}}
MLD2P4 is freely distributable under the following copyright
terms: {\small
\begin{verbatim}
MLD2P4 version 1.0
MultiLevel Domain Decomposition Parallel Preconditioners Package
based on PSBLAS (Parallel Sparse BLAS version 2.3)
(C) Copyright 2008
Salvatore Filippone University of Rome Tor Vergata
Alfredo Buttari University of Rome Tor Vergata
Pasqua D'Ambra ICAR-CNR, Naples
Daniela di Serafino Second University of Naples
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
1. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions, and the following disclaimer in the
documentation and/or other materials provided with the distribution.
3. The name of the MLD2P4 group or the names of its contributors may
not be used to endorse or promote products derived from this
software without specific written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE MLD2P4 GROUP OR ITS CONTRIBUTORS
BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
POSSIBILITY OF SUCH DAMAGE.
\end{verbatim}
}

@ -1,9 +1,9 @@
\section{Error Handling}\label{sec:errors}
Error handling
- Brief description, with a pointer to the PSBLAS guide
%%% Local Variables:
%%% mode: latex
%%% TeX-master: "userguide"
%%% End:

@ -1,224 +1,231 @@
\section{Getting Started\label{sec:started}}

We describe the basics for building and applying MLD2P4 one-level and multi-level
Schwarz preconditioners with the Krylov solvers included in PSBLAS \cite{}.
The following steps are required:
\begin{enumerate}
\item \emph{Declare the preconditioner data structure}. It is a derived data type,
\verb|mld_|\emph{x}\verb|prec_type|, where \emph{x} may be \verb|s|, \verb|d|, \verb|c|
or \verb|z|, according to the basic data type of the sparse matrix
(\verb|s| = real single precision; \verb|d| = real double precision;
\verb|c| = complex single precision; \verb|z| = complex double precision).
This data structure is accessed by the user only through the MLD2P4 routines,
following an object-oriented approach.
\item \emph{Allocate and initialize the preconditioner data structure, according to
a preconditioner type chosen by the user}. This is performed by the routine
\verb|mld_precinit|, which also sets a default preconditioner for each preconditioner
type selected by the user. The default preconditioner associated with each preconditioner
type is listed in Table~\ref{tab:precinit}; the string used by \verb|mld_precinit|
to identify each preconditioner type is also given.
\item \emph{Choose a specific preconditioner within the selected preconditioner type, by setting
the preconditioner parameters.} This is performed by the routine \verb|mld_precset|.
A few examples concerning the use of \verb|mld_precset| are given in
Section~\ref{sec:examples}; a complete list of all the
preconditioner parameters and their allowed values is provided in
Section~\ref{sec:highlevel}.
\item \emph{Build the preconditioner for a given matrix.} This is performed by
the routine \verb|mld_precbld|.
\item \emph{Apply the preconditioner at each iteration of a Krylov solver.}
This is performed by the routine \verb|mld_precaply|. When using the PSBLAS Krylov solvers,
this step is completely transparent to the user, since \verb|mld_precaply| is called
by the PSBLAS routine implementing the Krylov solver (\verb|psb_krylov|).
\item \emph{Deallocate the preconditioner data structure}. This is performed by
the routine \verb|mld_precfree|. This step is complementary to step 2 and should
be performed when the preconditioner is no longer used.
\end{enumerate}
A detailed description of the above routines is given in Section~\ref{sec:highlevel}.

Note that the Fortran 95 module \verb|mld_prec_mod| must be used in the program
calling the MLD2P4 routines. Furthermore, to apply MLD2P4 with the Krylov solvers
from PSBLAS, the module \verb|psb_krylov_mod| must be used too.

Examples showing the basic use of MLD2P4 are reported in Section~\ref{sec:examples}.

\begin{table}[th]
{
\begin{center}
\begin{tabular}{|l|l|p{6.7cm}|}
\hline
Type & String & Default preconditioner \\ \hline
No preconditioner &\verb|'NOPREC'|& (Considered only to use the PSBLAS
Krylov solvers with no preconditioner.) \\
Diagonal & \verb|'DIAG'| & --- \\
Block Jacobi & \verb|'BJAC'| & Block Jacobi with ILU(0) on the local blocks.\\
Additive Schwarz & \verb|'AS'| & Restricted Additive Schwarz (RAS),
with overlap 1 and ILU(0) on the local blocks. \\
Multilevel &\verb|'ML'| & Multi-level hybrid preconditioner (additive on the
same level and multiplicative through the levels),
with post-smoothing only. Number of levels: 2;
post-smoother: block-Jacobi preconditioner with ILU(0)
on the local blocks; coarsest matrix: distributed among the
processors; coarse-level solver: 4 sweeps of the
block-Jacobi solver, with ILU(0) on the blocks. \\
\hline
\end{tabular}
\end{center}
}
\caption{Preconditioner types and default choices.\label{tab:precinit}}
\end{table}

\subsection{Examples\label{sec:examples}}

The code reported below shows how to set and apply the MLD2P4 default multi-level
preconditioner, i.e.\ the two-level hybrid post-smoothed Schwarz preconditioner,
using block-Jacobi with ILU(0) on the blocks as basic preconditioner,
a coarse matrix distributed among the processors, and four block-Jacobi
sweeps with ILU(0) on the blocks as approximate coarse-level solver.
The choice of this preconditioner is made
by simply specifying \verb|'ML'| as second argument of \verb|mld_precinit|
(a call to \verb|mld_precset| is not needed).
The preconditioner is applied within the BiCGSTAB solver provided by PSBLAS.

The part of the code concerning the
reading and assembling of the sparse matrix and the right-hand side vector, performed
through the PSBLAS routines for sparse matrix and vector management, is not reported
here for brevity. Other statements concerning the use of PSBLAS are omitted too.
The complete code can be found in the example program file \verb|example_2lev_default.f90|
in the directory \textbf{XXXXXX (TO BE SPECIFIED).} Note that the modules \verb|psb_base_mod|
and \verb|psb_util_mod| at the beginning of the code are required by PSBLAS.
For details on the use of the PSBLAS routines, see the PSBLAS User's Guide \cite{}.

\begin{verbatim}
use psb_base_mod
use psb_util_mod
use mld_prec_mod
use psb_krylov_mod
... ...
!
! sparse matrix
type(psb_dspmat_type) :: A
! sparse matrix descriptor
type(psb_desc_type) :: DESC_A
! preconditioner
type(mld_dprec_type) :: PRE
... ...
!
! initialize the parallel environment
call psb_init(ictxt)
call psb_info(ictxt,iam,np)
... ...
!
! read and assemble the matrix A and the right-hand
! side b using PSBLAS routines for sparse matrix /
! vector management
... ...
!
! initialize the default multi-level preconditioner
! (two-level hybrid post-smoothed Schwarz)
call mld_precinit(PRE,'ML',info)
!
! build the preconditioner
call mld_precbld(A,PRE,DESC_A,info)
!
! set the solver parameters and the initial guess
... ...
!
! solve Ax=b with preconditioned BiCGSTAB
call psb_krylov('BICGSTAB',A,PRE,b,x,tol,DESC_A,info)
... ...
!
! cleanup the preconditioner
call mld_precfree(PRE,info)
!
! cleanup other data structures
... ...
!
! exit the parallel environment
call psb_exit(ictxt)
stop
\end{verbatim}

\textbf{REVISE THE WHOLE PART THAT FOLLOWS:\\
- keep only the statements that differ from the previous example (essentially the setting of the preconditioner, possibly with several calls to precset);\\
- keep the remark on the explicit specification of the number of levels;\\
- refer to the next section for an accurate description of all the parameters;\\
- keep the remark about former PSBLAS users.}\\

In the following we describe the general procedure for setting and building one of the MLD2P4 preconditioners.
The user first has to prepare the preconditioner data structure by using the routine \verb|mld_precinit|. Input parameters
for this routine include a string parameter, needed to define the preconditioner type, and an optional integer parameter
specifying the number of levels in the case of a multi-level preconditioner.
Note that if the optional parameter is not present and a multi-level preconditioner has been chosen,
a two-level preconditioner is set. On the other hand, the integer parameter is ignored if the type of the preconditioner is not multilevel.
In Table \ref{tab:precinit} we report both the possible choices for the preconditioner type
and the related default preconditioners.

The user of MLD2P4 may set many parameters for the one-level and multi-level Schwarz preconditioners, in order
to define a preconditioner different from the default one. The parameters
can be set through the routine \verb|mld_precset|. The APIs of \verb|mld_precinit| and \verb|mld_precset| as well as the complete
list of the parameters that can be set with the corresponding allowed values are reported in Section \ref{sec:highlevel}. In the following a simple code
for a three-level hybrid post-smoothed Schwarz preconditioner, using RAS with overlap 1 as local preconditioner,
with ILU(0) on the local blocks, a distributed coarse matrix, four block-Jacobi sweeps with the UMFPACK LU
factorization on the blocks as coarse-matrix solver, is reported. Note that for the multi-level preconditioners, the levels are numbered in increasing
order starting from the finest one, i.e. level 1 is the finest level. The user of MLD2P4 may set a lot of parameters for one-level and multi-level Schwarz, in order
For more details, see the test program \verb|example2.f90| in xxxx(directory dei test).\\[0.5cm] to define a different preconditioner than that of default choices. The parameters
can be set through the routine \verb|mld_precset|. The APIs of \verb|mld_precinit| and \verb|mld_precset| as well as the complete
\begin{verbatim} list of the parameters that can be set with the corresponding allowed values are reported in Section \ref{sec:highlevel}. In the following a simple code
use psb_base_mod for a three-level hybrid post-smoothed Schwarz preconditioner, using RAS with overlap 1 as local preconditioner,
use psb_util_mod with ILU(0) on the local blocks, a distributed coarse matrix, four block-Jacobi sweeps with the UMFPACK LU
use mld_prec_mod factorization on the blocks as coarse-matrix solver, is reported. Note that for the multi-level preconditioners, the levels are numbered in increasing
use psb_krylov_mod order starting from the finest one, i.e. level 1 is the finest level.
... ... For more details, see the test program \verb|example2.f90| in xxxx(directory dei test).\\[0.5cm]
!
! sparse matrix \begin{verbatim}
type(psb_dspmat_type) :: A use psb_base_mod
! sparse matrix descriptor use psb_util_mod
type(psb_desc_type) :: DESC_A use mld_prec_mod
! preconditioner data use psb_krylov_mod
type(mld_dprec_type) :: PRE ... ...
... ... !
! ! sparse matrix
! initialization of the parallel environment type(psb_dspmat_type) :: A
! sparse matrix descriptor
call psb_init(ictxt) type(psb_desc_type) :: DESC_A
call psb_info(ictxt,iam,np) ! preconditioner data
... ... type(mld_dprec_type) :: PRE
! read and assemble the matrix A and the right-hand ... ...
! side vector b using PSBLAS routines for sparse !
! matrix/vector management ! initialization of the parallel environment
... ...
! prepare the three-level hybrid post-smoothed Schwarz call psb_init(ictxt)
! using RAS with overlap 1 as local preconditioner call psb_info(ictxt,iam,np)
! ... ...
call mld_precinit(PRE,'ML',info,nlev=3) ! read and assemble the matrix A and the right-hand
call mld_precset(PRE,mld_n_ovr_,1,info,ilev=1) ! side vector b using PSBLAS routines for sparse
call mld_precset(PRE,mld_sub_restr_,psb_halo_,info,ilev=1) ! matrix/vector management
! NOTE: "PSB_HALO_" IS REALLY UGLY, WE SHOULD HAVE CONSTANTS WITH THE MLD PREFIX! ... ...
! ! prepare the three-level hybrid post-smoothed Schwarz
! build preconditioner ! using RAS with overlap 1 as local preconditioner
call psb_precbld(A,PRE,DESC_A,info) !
! call mld_precinit(PRE,'ML',info,nlev=3)
! set solver parameters and initial guess call mld_precset(PRE,mld_n_ovr_,1,info,ilev=1)
... ... call mld_precset(PRE,mld_sub_restr_,psb_halo_,info,ilev=1)
! solve Ax=b with preconditioned BiCGSTAB ! NOTE: "PSB_HALO_" IS REALLY UGLY, WE SHOULD HAVE CONSTANTS WITH THE MLD PREFIX!
!
call psb_krylov('BICGSTAB',A,PRE,b,x,tol,DESC_A,info) ! build preconditioner
... ... call psb_precbld(A,PRE,DESC_A,info)
! !
! cleanup storage and exit ! set solver parameters and initial guess
! ... ...
call mld_precfree(PRE,info) ! solve Ax=b with preconditioned BiCGSTAB
!
call psb_gefree(b,DESC_A,info) call psb_krylov('BICGSTAB',A,PRE,b,x,tol,DESC_A,info)
call psb_gefree(x,DESC_A,info) ... ...
call psb_spfree(A,DESC_A,info) !
call psb_cdfree(DESC_A,info) ! cleanup storage and exit
! !
call psb_exit(ictxt) call mld_precfree(PRE,info)
stop !
call psb_gefree(b,DESC_A,info)
\end{verbatim} call psb_gefree(x,DESC_A,info)
call psb_spfree(A,DESC_A,info)
{\bf Remark for users with PSBLAS-based legacy codes:} when MLD2P4 is installed, a PSBLAS-based legacy code call psb_cdfree(DESC_A,info)
that calls the base preconditioners included in PSBLAS (NOPREC, DIAG and BJAC) can use the same preconditioners without changes, !
provided that the program includes the file \verb|psb_prec_mod|. call psb_exit(ictxt)
stop
%%% Local Variables:
%%% mode: latex \end{verbatim}
%%% TeX-master: "userguide"
%%% End: {\bf Remark for users with PSBLAS-based legacy codes:} when MLD2P4 is installed, a PSBLAS-based legacy code
that calls the base preconditioners included in PSBLAS (NOPREC, DIAG and BJAC) can use the same preconditioners without changes,
provided that the program includes the file \verb|psb_prec_mod|.
%%% Local Variables:
%%% mode: latex
%%% TeX-master: "userguide"
%%% End:

@ -1,10 +1,10 @@
\section{List of Routines}\label{sec:routines} \section{List of Routines}\label{sec:routines}
List (in alphabetical order) of all the routines, with a reference (hyperlink and page number) to the description List (in alphabetical order) of all the routines, with a reference (hyperlink and page number) to the description
of each one in a previous section of each one in a previous section
(a sort of index, pointing to the routines described earlier in their respective sections) (a sort of index, pointing to the routines described earlier in their respective sections)
%%% Local Variables: %%% Local Variables:
%%% mode: latex %%% mode: latex
%%% TeX-master: "userguide" %%% TeX-master: "userguide"
%%% End: %%% End:

@ -1,62 +1,62 @@
\section{General Overview\label{sec:overview}} \section{General Overview\label{sec:overview}}
The \emph{Multi-Level Domain Decomposition Parallel Preconditioners Package based on The \emph{Multi-Level Domain Decomposition Parallel Preconditioners Package based on
PSBLAS (MLD2P4}) provides various versions of multi-level Schwarz preconditioners~\cite{DD2}, PSBLAS (MLD2P4}) provides various versions of multi-level Schwarz preconditioners~\cite{DD2},
to be used in the iterative solution of sparse linear systems $Ax=b$, where to be used in the iterative solution of sparse linear systems $Ax=b$, where
$A$ is a square, real or complex, sparse matrix with a symmetric sparsity pattern. $A$ is a square, real or complex, sparse matrix with a symmetric sparsity pattern.
\textbf{But didn't we say that, if the sparsity pattern is not symmetric, \textbf{But didn't we say that, if the sparsity pattern is not symmetric,
we work on $(A+A^T)/2$? But does this hold only for the aggregation? We should do we work on $(A+A^T)/2$? But does this hold only for the aggregation? We should do
something consistent for 1-lev Schwarz too.} something consistent for 1-lev Schwarz too.}
Both additive and hybrid preconditioners, i.e.\ multiplicative among the levels Both additive and hybrid preconditioners, i.e.\ multiplicative among the levels
and additive inside a level, are implemented; the basic additive Schwarz preconditioners and additive inside a level, are implemented; the basic additive Schwarz preconditioners
are obtained by considering only one level. A purely algebraic approach is used to are obtained by considering only one level. A purely algebraic approach is used to
generate a sequence of coarse-level corrections to a basic preconditioner, without generate a sequence of coarse-level corrections to a basic preconditioner, without
explicitly using any information on the geometry of the original problem (e.g.\ the explicitly using any information on the geometry of the original problem (e.g.\ the
discretization of a PDE). The smoothed aggregation technique is applied discretization of a PDE). The smoothed aggregation technique is applied
as algebraic coarsening strategy~\cite{}. as algebraic coarsening strategy~\cite{}.
The package is written in Fortran~95, using object-oriented techniques, The package is written in Fortran~95, using object-oriented techniques,
and is based on a distributed-memory parallel programming paradigm. \textbf{SALVATORE, and is based on a distributed-memory parallel programming paradigm. \textbf{SALVATORE,
could you add a couple of lines on the choice of Fortran 95 and on the easy interfacing could you add a couple of lines on the choice of Fortran 95 and on the easy interfacing
with legacy codes, without repeating what is said below about the choice of PSBLAS?} with legacy codes, without repeating what is said below about the choice of PSBLAS?}
Single and double precision implementations of MLD2P4 are available for both the Single and double precision implementations of MLD2P4 are available for both the
real and the complex case, and can be used through a single interface. real and the complex case, and can be used through a single interface.
\textbf{SALVATORE, does everything work?} \textbf{SALVATORE, does everything work?}
MLD2P4 has been designed to implement scalable and easy-to-use multilevel preconditioners MLD2P4 has been designed to implement scalable and easy-to-use multilevel preconditioners
in the context of the PSBLAS (Parallel Sparse BLAS) computational framework~\cite{}. in the context of the PSBLAS (Parallel Sparse BLAS) computational framework~\cite{}.
PSBLAS is a library originally developed to address the parallel implementation of PSBLAS is a library originally developed to address the parallel implementation of
iterative solvers for sparse linear systems, by providing basic linear algebra iterative solvers for sparse linear systems, by providing basic linear algebra
operators and data management facilities for distributed sparse matrices; it operators and data management facilities for distributed sparse matrices; it
also includes parallel Krylov solvers, built on top of the basic PSBLAS kernels. also includes parallel Krylov solvers, built on top of the basic PSBLAS kernels.
The preconditioners available in MLD2P4 can be used with these Krylov solvers. The preconditioners available in MLD2P4 can be used with these Krylov solvers.
The choice of PSBLAS has been mainly motivated by the need for The choice of PSBLAS has been mainly motivated by the need for
a portable and efficient software infrastructure implementing ``de facto'' standard a portable and efficient software infrastructure implementing ``de facto'' standard
parallel sparse linear algebra kernels, to pursue goals such as performance, parallel sparse linear algebra kernels, to pursue goals such as performance,
portability, modularity and extensibility in the development of the preconditioner portability, modularity and extensibility in the development of the preconditioner
package. On the other hand, the implementation of MLD2P4 has led to some package. On the other hand, the implementation of MLD2P4 has led to some
revisions and extensions of the PSBLAS kernels, leading to the revisions and extensions of the PSBLAS kernels, leading to the
recent PSBLAS 2.0 version~\cite{}. The inter-process communication required recent PSBLAS 2.0 version~\cite{}. The inter-process communication required
by MLD2P4 is encapsulated into the PSBLAS routines, except in a few cases where by MLD2P4 is encapsulated into the PSBLAS routines, except in a few cases where
MPI~\cite{} is explicitly called. Therefore, MLD2P4 can be run on any parallel MPI~\cite{} is explicitly called. Therefore, MLD2P4 can be run on any parallel
machine where PSBLAS and MPI implementations are available. machine where PSBLAS and MPI implementations are available.
MLD2P4 has a layered and modular software architecture where three main layers can be identified. The lower layer consists of the PSBLAS kernels, the middle one implements MLD2P4 has a layered and modular software architecture where three main layers can be identified. The lower layer consists of the PSBLAS kernels, the middle one implements
the construction and application phases of the preconditioners, and the upper one the construction and application phases of the preconditioners, and the upper one
provides a uniform and easy-to-use interface to all the preconditioners. provides a uniform and easy-to-use interface to all the preconditioners.
This architecture allows for different levels of use of the package: This architecture allows for different levels of use of the package:
a few black-box routines at the upper level allow non-expert users to easily a few black-box routines at the upper level allow non-expert users to easily
build any preconditioner available in MLD2P4 and to apply it within a PSBLAS Krylov solver. build any preconditioner available in MLD2P4 and to apply it within a PSBLAS Krylov solver.
On the other hand, the routines of the middle and lower layer can be used and extended On the other hand, the routines of the middle and lower layer can be used and extended
by expert users to build new versions of multi-level Schwarz preconditioners.\\ by expert users to build new versions of multi-level Schwarz preconditioners.\\
\textbf{Organization of the guide:\\ \textbf{Organization of the guide:\\
say that for the time being we do not say that for the time being we do not
provide documentation for the middle layer as well, but we will do so later\\} provide documentation for the middle layer as well, but we will do so later\\}
\textbf{Highlight the keywords that characterize our package} \textbf{Highlight the keywords that characterize our package}
%%% Local Variables: %%% Local Variables:
%%% mode: latex %%% mode: latex
%%% TeX-master: "userguide" %%% TeX-master: "userguide"
%%% End: %%% End:

@ -5,37 +5,46 @@
\ifx\pdfoutput\undefined % We're not running pdftex \ifx\pdfoutput\undefined % We're not running pdftex
\else \else
\pdfbookmark{MLD2P4-1.0 User's Guide}{title} \pdfbookmark{MLD2P4 User's and Reference Guide}{title}
\fi \fi
\newlength{\centeroffset} %\newlength{\centeroffset}
\setlength{\centeroffset}{-0.5\oddsidemargin} %\setlength{\centeroffset}{-0.5\oddsidemargin}
\addtolength{\centeroffset}{0.5\evensidemargin} %\addtolength{\centeroffset}{0.5\evensidemargin}
%\addtolength{\textwidth}{-\centeroffset} %\addtolength{\textwidth}{-\centeroffset}
\thispagestyle{empty} \thispagestyle{empty}
\vspace*{\stretch{1}} \vspace*{\stretch{1}}
\noindent\hspace*{\centeroffset}\makebox[0pt][l]{\begin{minipage}{\textwidth} \noindent\hspace*{\centeroffset}\makebox[0pt][l]{\begin{minipage}{\textwidth}
\flushright \flushright
{\Huge\bfseries MLD2P4-1.0 User's guide {\Huge\bfseries MLD2P4\\[.8ex] User's and Reference Guide
} }
\noindent\rule[-1ex]{\textwidth}{5pt}\\[2.5ex] \noindent\rule[-1ex]{\textwidth}{5pt}\\[2.5ex]
\hfill\emph{\Large A reference guide for the MultiLevel Domain \hfill\emph{\Large A guide for the Multi-Level Domain Decomposition \\[.6ex]
Decomposition Parallel Preconditioners Package based on Parallel Sparse BLAS} Parallel Preconditioners Package
based on PSBLAS}
\end{minipage}}
\vspace{\stretch{1}}
\noindent\hspace*{\centeroffset}\makebox[0pt][l]{\begin{minipage}{\textwidth}
\flushright
{\large\bfseries Pasqua D'Ambra}\\
\large ICAR-CNR, Naples, Italy\\[3ex]
{\large\bfseries Daniela di Serafino}\\
\large Second University of Naples, Italy\\[3ex]
{\large\bfseries Salvatore Filippone} \\
\large University of Rome ``Tor Vergata'', Italy
%\\[10ex]
%\today
\end{minipage}} \end{minipage}}
\vspace{\stretch{1}} \vspace{\stretch{1}}
\noindent\hspace*{\centeroffset}\makebox[0pt][l]{\begin{minipage}{\textwidth} \noindent\hspace*{\centeroffset}\makebox[0pt][l]{\begin{minipage}{\textwidth}
\flushright \flushright
{\bfseries \large Software version: 1.0\\
by Salvatore Filippone\\
Alfredo Buttari} \\
University of Rome ``Tor Vergata'' \\[3ex]
{\bfseries Daniela di Serafino }\\
Second University of Naples\\[3ex]
{\bfseries Pasqua D'Ambra}\\
ICAR-CNR, Naples\\[3ex]
\today \today
\end{minipage}} \end{minipage}}
%\addtolength{\textwidth}{\centeroffset} %\addtolength{\textwidth}{\centeroffset}
\vspace{\stretch{2}} \vspace{\stretch{2}}
@ -48,4 +57,3 @@ ICAR-CNR, Naples\\[3ex]
% mode: latex % mode: latex
% mode: flyspell % mode: flyspell
% End: % End:

@ -35,6 +35,14 @@
% /URI (http://ce.uniroma2.it/psblas) % /URI (http://ce.uniroma2.it/psblas)
} }
\setlength\oddsidemargin{.7in}
\setlength\evensidemargin{.7in}
\newlength{\centeroffset}
\setlength{\centeroffset}{0.5\oddsidemargin}
\addtolength{\centeroffset}{0.5\evensidemargin}
\addtolength{\textwidth}{-\centeroffset}
\pagestyle{myheadings}
\newcounter{subroutine}[subsection] \newcounter{subroutine}[subsection]
\newcounter{example}[subroutine] \newcounter{example}[subroutine]
\makeatletter \makeatletter

File diff suppressed because one or more lines are too long