\section{High-Level User Interface\label{sec:highlevel}}

At the upper layer of MLD2P4, five black-box routines encapsulate all the functionality needed to build
and apply any of the multi-level preconditioners.
The details of these routines are given below. Four versions of each routine are available,
depending on the data type involved: real single precision, real double precision, complex single precision
and complex double precision.

\subsection{Preconditioner Setup and Building}\label{sec:setup}

An MLD2P4 preconditioner is set up by calling the \verb|mld_precinit| routine, which
allocates and initializes the preconditioner data structure.
The API of this routine and the description of its arguments are reported in Fig.~\ref{fig:prcinit}.
The allowed values for the \verb|ptype| argument are listed in Table~\ref{tab:precinit} (Section~\ref{sec:started}).
%
\begin{figure}[h]
\begin{center}
{\small
\begin{verbatim}
mld_precinit(p,ptype,info,nlev)

Arguments:
    p      -  type(mld_dprec_type), input/output.
              The preconditioner data structure.
    ptype  -  character, input.
              The type of preconditioner.
    info   -  integer, output.
              Error code.
    nlev   -  integer, optional, input.
              The number of levels of the multilevel preconditioner.
              If nlev is not present and ptype='ML'/'ml',
              then nlev=2 is assumed.
              For any other value of ptype, nlev is ignored.
\end{verbatim}
}
\end{center}
\caption{API of the routine for preconditioner allocation and initialization.\label{fig:prcinit}}
\end{figure}
%
\begin{figure}[h]
\begin{center}
{\small
\begin{verbatim}
mld_precfree(p,info)

Arguments:
    p     -  type(mld_dprec_type), input/output.
             The preconditioner data structure to be deallocated.
    info  -  integer, output.
             Error code.
\end{verbatim}
}
\end{center}
\caption{API of the routine for preconditioner deallocation.\label{fig:prcfree}}
\end{figure}

The companion routine \verb|mld_precfree|, whose API is reported in Fig.~\ref{fig:prcfree},
deallocates the preconditioner data structure.

As mentioned in Section~\ref{sec:multilevel}, a multi-level preconditioner is a combination
of coarse-level corrections and one-level preconditioners (or smoothers).
Different combinations of these components, different types of one-level preconditioners
and different algorithms for building and applying the coarse-level corrections allow the user to define different multi-level
preconditioners.
By setting parameters through the routine \verb|mld_precset| (see Section~\ref{sec:list}), the user of MLD2P4 may specify:
the type of multi-level framework (additive or multiplicative); details of the
aggregation algorithm; the type of one-level preconditioner and how it is applied
(as pre-smoother, post-smoother or both); the storage of the coarsest matrix
(distributed or replicated); and the solver to be employed at the coarsest level,
with related details.
The API of this routine is reported in Fig.~\ref{fig:prcset}.
%
\begin{figure}[h]
\begin{center}
{\small
\begin{verbatim}
mld_precset(p,what,val,info,ilev)

Arguments:
    p     -  type(mld_dprec_type), input/output.
             The preconditioner data structure.
    what  -  integer, input.
             The number identifying the parameter to be set.
             A mnemonic constant has been associated with each of these
             numbers.
    val   -  integer/character, input.
             The value of the parameter to be set.
    info  -  integer, output.
             Error code.
    ilev  -  integer, optional, input.
             For the multilevel preconditioner, the level at which the
             preconditioner parameter has to be set.
             If ilev is not present, the parameter identified by 'what'
             is set at all the appropriate levels.
\end{verbatim}
}
\end{center}
\caption{API of the routine for preconditioner setup.\label{fig:prcset}}
\end{figure}
%
Finally, to build a preconditioner according to the choices made through the routines \verb|mld_precinit| and \verb|mld_precset|,
the user of MLD2P4 has to call the \verb|mld_precbld| routine, whose API is reported in Fig.~\ref{fig:prcbld}.
%
\begin{figure}[h]
\begin{center}
{\small
\begin{verbatim}
mld_precbld(a,desc_a,prec,info)

Arguments:
    a       -  type(psb_dspmat_type), input.
               The sparse matrix structure containing the local part of the
               matrix to be preconditioned.
    desc_a  -  type(psb_desc_type), input.
               The communication descriptor of a.
    prec    -  type(mld_dprec_type), input/output.
               The preconditioner data structure containing the local part
               of the preconditioner to be built.
    info    -  integer, output.
               Error code.
\end{verbatim}
}
\end{center}
\caption{API of the routine for preconditioner building.\label{fig:prcbld}}
\end{figure}

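A typical calling sequence for the routines described above might look like the following sketch.
It is only an illustration: the module names \verb|psb_base_mod| and \verb|mld_prec_mod|,
the subroutine name \verb|build_ml_prec|, the chosen parameter values and the convention that a zero
value of \verb|info| means success are assumptions made for this example.
The matrix \verb|a| and its descriptor \verb|desc_a| are assumed to have been assembled already through PSBLAS.
{\small
\begin{verbatim}
subroutine build_ml_prec(a,desc_a,prec,info)
  use psb_base_mod    ! assumed name of the module with the PSBLAS types
  use mld_prec_mod    ! assumed name of the module with the MLD2P4 interface
  implicit none
  type(psb_dspmat_type), intent(in)    :: a
  type(psb_desc_type),   intent(in)    :: desc_a
  type(mld_dprec_type),  intent(inout) :: prec
  integer,               intent(out)   :: info

  ! allocate and initialize a multi-level preconditioner with nlev=3
  call mld_precinit(prec,'ML',info,3)
  if (info /= 0) return      ! assumption: info=0 means success

  ! multiplicative framework, one-level preconditioner used as post-smoother;
  ! ilev is omitted, so each parameter is set at all the appropriate levels
  call mld_precset(prec,mld_ml_type_,'MULT',info)
  if (info /= 0) return
  call mld_precset(prec,mld_smooth_pos_,'POST',info)
  if (info /= 0) return

  ! build the preconditioner for the matrix a
  call mld_precbld(a,desc_a,prec,info)
end subroutine build_ml_prec
\end{verbatim}
}
When the preconditioner is no longer needed, it can be released with \verb|call mld_precfree(prec,info)|.
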
\subsubsection{List of the preconditioner parameters\label{sec:list}}

The parameters that can be set through the \verb|mld_precset| routine
to choose the type of multi-level preconditioner are listed below, classified according to their scope.
For character values, both uppercase and lowercase strings are allowed.
Table~\ref{tab:prec_type} lists the parameters that define the type of preconditioner.
\begin{table}[h]
{\small
\begin{tabular}{ll}
Parameter (\verb|what|) & Allowed values (\verb|val|)\\
\hline
\verb|mld_ml_type_| & 'ADD', 'MULT'\\
 & Type of multi-level preconditioner (additive or multiplicative).\\
\verb|mld_prec_type_| & 'DIAG', 'BJAC', 'AS' \\
 & Smoother at a given level.\\
\verb|mld_smooth_pos_| & 'PRE', 'POST', 'BOTH'\\
 & How the smoother is applied (as pre-smoother, post-smoother or both).\\
\end{tabular}
\caption{Parameters for preconditioner type.\label{tab:prec_type}}
}
\end{table}

In order to build a coarse matrix from a fine one, this version of MLD2P4 implements the
smoothed aggregation algorithm described in Section~\ref{sec:aggregation}. However, since for nonsymmetric problems the
application of a correct smoothing procedure is still an open problem~\cite{lin}, the user
may also choose a nonsmoothed aggregation technique, in which the prolongator from
the coarse-space to the fine-space vertices is the simple piecewise constant interpolation
operator (the tentative prolongator) defined in Section~\ref{sec:aggregation}.
The coarsening scheme takes into account possible anisotropy in the problem by using
a threshold for dropping matrix coefficients during the aggregation process.
The parallel implementation of the coarsening algorithm is based on a decoupled approach, in which each process applies the coarsening scheme
to its own local data. For matrices with a nonsymmetric sparsity pattern, the decoupled scheme can be applied to the matrix $A+A^T$.
Table~\ref{tab:aggr_type} lists the parameters that the user can specify for the aggregation algorithm.
\begin{table}[h]
{\small
\begin{tabular}{ll}
Parameter & Allowed values \\
(\verb|what|) & (\verb|val|)\\
\hline
\verb|mld_aggr_alg_| & 'DEC', 'SYMDEC'\\
 & Aggregation scheme.\\
 & Currently, only decoupled aggregation is available\\
 & (if 'SYMDEC' is set, the symmetric part of the matrix is considered).\\
\verb|mld_aggr_kind_| & 'SMOOTH', 'RAW'\\
 & Type of aggregation technique (smoothed or nonsmoothed).\\
\verb|mld_aggr_thresh_| & Dropping threshold in aggregation.\\
 & Default 0.0.\\
\verb|mld_aggr_eig_| & (TODO: the string corresponding to mldmaxnorm is not defined.)\\
 & Algorithm used to evaluate the maximum eigenvalue\\
 & of $D^{-1}A$ for smoothed aggregation. Currently, only the A-norm of the\\
 & matrix is available.\\
\end{tabular}
\caption{Parameters for aggregation type.\label{tab:aggr_type}}
}
\end{table}

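For reference, smoothed aggregation is commonly formulated as follows. The symbols $P$, $\tilde{P}$, $\omega$ and $\rho$ are introduced here only for illustration and may differ from the notation of Section~\ref{sec:aggregation}; the value of $\omega$ shown is a usual choice in the literature, not necessarily the one adopted in the software. The tentative prolongator $\tilde{P}$ is smoothed by one step of damped Jacobi,
\[
  P = (I - \omega D^{-1} A)\,\tilde{P}, \qquad \omega = \frac{4}{3\rho},
\]
where $D$ is the diagonal of $A$ and $\rho$ is an estimate of the maximum eigenvalue of $D^{-1}A$. Since $\rho(D^{-1}A) \le \|D^{-1}A\|$ for any induced matrix norm, a matrix norm provides an inexpensive upper bound that can replace an eigenvalue computation. In the nonsmoothed case, $P = \tilde{P}$.
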
Some options are available for the system involving the coarsest matrix.
This matrix can be either replicated on all the processors or distributed among them.
In the former case, various versions of incomplete LU (ILU) factorization of the
coarsest matrix are available for solving the coarsest system.
In the current version of MLD2P4, the following factorizations are available~\cite{saad}:
\begin{description}
\item[ILU(k):] ILU factorization with fill-in level $k$;
\item[MILU(k):] modified ILU factorization with fill-in level $k$;
\item[ILU(k,t):] ILU factorization with threshold $t$ and $k$ additional entries in each row of the $L$ and $U$ factors with respect to the initial sparsity pattern.
\end{description}
Furthermore, interfaces to UMFPACK~\cite{UMFPACK}, version 4.4, and to SuperLU~\cite{SUPERLU}, version 3.0, are also available for dealing
with the coarsest system when the coarsest matrix is replicated among the processors.
On the other hand, to solve the coarsest-level system when the coarsest matrix is distributed,
a block-Jacobi routine has been developed. It applies the different versions of ILU or the LU
factorization to the diagonal blocks of the coarse matrix held by the processors. For a
distributed coarsest matrix, an interface to SuperLU\_DIST~\cite{SUPERLUDIST}, version 2.0, is also available for distributed
sparse factorization and solve.
See Table~\ref{tab:coarse_mat} for details.
\begin{table}[h]
{\small
\begin{tabular}{ll}
Parameter & Allowed values\\
(\verb|what|) & (\verb|val|)\\
\hline
\verb|mld_coarse_mat_| & 'DISTR', 'REPL' \\
 & Coarsest matrix: distributed or replicated.\\
\verb|mld_coarse_solve_| & 'ILU', 'MILU', 'ILUT', 'SLU', 'UMF', 'SLUDIST', 'BJAC' (TODO: to be confirmed)\\
 & Available coarsest-level solvers.\\
 & Only 'SLUDIST' and 'BJAC' can be used when the coarsest matrix is distributed.\\
\verb|mld_coarse_BJAC_sweeps_| & Number of block-Jacobi sweeps when 'BJAC' is used as coarsest solver.\\
 & (TODO: mldcoarsesweeps is not right.)\\
\verb|mld_coarse_fill_in_| & Level of fill-in in the MILU and ILU factorizations.\\
 & (TODO: what about the threshold for ILUT?)\\
\end{tabular}
\caption{Parameters for coarsest matrix solver.\label{tab:coarse_mat}}
}
\end{table}

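As an illustration of the parameters in Tables~\ref{tab:aggr_type} and~\ref{tab:coarse_mat}, the following sketch selects
nonsmoothed decoupled aggregation on the symmetrized matrix and a replicated coarsest matrix factorized with ILU(1).
The subroutine name \verb|set_coarse_options|, the module name \verb|mld_prec_mod| and the convention that
\verb|info| is zero on success are assumptions made for this example; the preconditioner is assumed to have been
initialized with \verb|mld_precinit|.
{\small
\begin{verbatim}
subroutine set_coarse_options(prec,info)
  use mld_prec_mod    ! assumed name of the module with the MLD2P4 interface
  implicit none
  type(mld_dprec_type), intent(inout) :: prec
  integer,              intent(out)   :: info

  ! decoupled aggregation on the symmetric part, without smoothing
  call mld_precset(prec,mld_aggr_alg_,'SYMDEC',info)
  if (info == 0) call mld_precset(prec,mld_aggr_kind_,'RAW',info)

  ! replicated coarsest matrix, solved with ILU with fill-in level 1
  if (info == 0) call mld_precset(prec,mld_coarse_mat_,'REPL',info)
  if (info == 0) call mld_precset(prec,mld_coarse_solve_,'ILU',info)
  if (info == 0) call mld_precset(prec,mld_coarse_fill_in_,1,info)
end subroutine set_coarse_options
\end{verbatim}
}
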
When a Schwarz algorithm is used as smoother at a certain level, or as one-level preconditioner, the user may set several parameters
to choose the additive Schwarz variant (AS, RAS, ASH), the number of overlap layers and the local solver.
All the parameters are reported in Table~\ref{tab:schwarz_type}.
\begin{table}[h]
{\small
\begin{tabular}{ll}
Parameter & Allowed values\\
(\verb|what|) & (\verb|val|)\\
\hline
\verb|mld_n_ovr_| & Number of overlap layers.\\
\verb|mld_sub_restr_| & 'HALO', 'NONE'\\
\verb|mld_sub_prol_| & 'SUM', 'NONE'\\
\verb|mld_sub_solve_| & 'ILU', 'MILU', 'ILUT', 'SLU', 'UMF'\\
\verb|mld_sub_ren_| & (TODO: the allowed strings are missing.)\\
\verb|mld_sub_fill_in_| & Level of fill-in in the local diagonal blocks,\\
 & when ILU-type factorizations are used.\\
\end{tabular}
\caption{Parameters for Schwarz smoother/preconditioner type.\label{tab:schwarz_type}}
}
\end{table}
It is worth noting that the classical AS method corresponds to the pair of values 'HALO' and 'SUM' of the argument \verb|val|
for the values \verb|mld_sub_restr_| and \verb|mld_sub_prol_| of the argument \verb|what|, respectively.
The RAS method, in which the overlap is neglected in the prolongation, corresponds to
the pair 'HALO' and 'NONE', while the ASH method, in which the overlap is neglected in the restriction, corresponds to the pair 'NONE' and 'SUM'.

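For example, a classical AS smoother with one layer of overlap and ILU(0) on the local blocks could be requested as in the sketch below.
The subroutine name \verb|set_as_smoother|, the module name \verb|mld_prec_mod| and the convention that
\verb|info| is zero on success are assumptions made for this example; since \verb|ilev| is omitted,
each parameter is set at all the appropriate levels.
{\small
\begin{verbatim}
subroutine set_as_smoother(prec,info)
  use mld_prec_mod    ! assumed name of the module with the MLD2P4 interface
  implicit none
  type(mld_dprec_type), intent(inout) :: prec
  integer,              intent(out)   :: info

  ! additive Schwarz smoother with one layer of overlap
  call mld_precset(prec,mld_prec_type_,'AS',info)
  if (info == 0) call mld_precset(prec,mld_n_ovr_,1,info)

  ! classical AS: restriction 'HALO', prolongation 'SUM'
  if (info == 0) call mld_precset(prec,mld_sub_restr_,'HALO',info)
  if (info == 0) call mld_precset(prec,mld_sub_prol_,'SUM',info)

  ! ILU factorization with fill-in level 0 on the local blocks
  if (info == 0) call mld_precset(prec,mld_sub_solve_,'ILU',info)
  if (info == 0) call mld_precset(prec,mld_sub_fill_in_,0,info)
end subroutine set_as_smoother
\end{verbatim}
}
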
\subsection{Preconditioner Application} \label{sec:application}

Once the preconditioner has been built, it may be applied at each iteration
of a Krylov solver by calling the routine \verb|mld_precaply| (TODO: change the routine name in the software, avoiding the underscore),
whose API is shown in Fig.~\ref{fig:prcaply}.
This routine computes $y = op(M^{-1})\, x$, where $M$ is the previously built
preconditioner, stored in the \verb|prec| data structure, and $op$
denotes the matrix itself or its transpose, according to the value of \verb|trans|.
Note that this routine is called within the Krylov solvers available in the PSBLAS library (see the PSBLAS User's Guide for details);
therefore, its use is generally transparent to the MLD2P4 user.
%
\begin{figure}[h]
\begin{center}
{\small
\begin{verbatim}
mld_precaply(prec,x,y,desc_data,info,trans,work)

Arguments:
    prec       -  type(mld_dprec_type), input.
                  The preconditioner data structure containing the local part
                  of the preconditioner to be applied.
    x          -  real(psb_dpk_), dimension(:), input.
                  The local part of the vector X in Y := op(M^(-1)) * X.
    y          -  real(psb_dpk_), dimension(:), output.
                  The local part of the vector Y in Y := op(M^(-1)) * X.
    desc_data  -  type(psb_desc_type), input.
                  The communication descriptor associated to the matrix to be
                  preconditioned.
    info       -  integer, output.
                  Error code.
    trans      -  character(len=1), optional.
                  If trans='N','n' then op(M^(-1)) = M^(-1);
                  if trans='T','t' then op(M^(-1)) = M^(-T) (transpose of M^(-1)).
    work       -  real(psb_dpk_), dimension(:), optional, target.
                  Workspace. Its size must be at
                  least 4*psb_cd_get_local_cols(desc_data).
\end{verbatim}
}
\end{center}
\caption{API of the routine for preconditioner application.\label{fig:prcaply}}
\end{figure}

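Although the routine is normally called by the Krylov solvers, a direct call might look like the following sketch,
which applies both $M^{-1}$ and $M^{-T}$ to the same vector.
The subroutine name \verb|apply_prec|, the module names \verb|psb_base_mod| and \verb|mld_prec_mod| and the convention that
\verb|info| is zero on success are assumptions made for this example; the optional workspace is omitted.
Because the vectors are assumed-shape arrays, the subroutine is meant to be placed in a module or otherwise given an explicit interface.
{\small
\begin{verbatim}
subroutine apply_prec(prec,x,y,z,desc_a,info)
  use psb_base_mod    ! assumed name of the module with the PSBLAS types
  use mld_prec_mod    ! assumed name of the module with the MLD2P4 interface
  implicit none
  type(mld_dprec_type), intent(in)    :: prec
  real(psb_dpk_),       intent(in)    :: x(:)
  real(psb_dpk_),       intent(inout) :: y(:), z(:)
  type(psb_desc_type),  intent(in)    :: desc_a
  integer,              intent(out)   :: info

  ! y := M^(-1) * x
  call mld_precaply(prec,x,y,desc_a,info,'N')
  if (info /= 0) return      ! assumption: info=0 means success

  ! z := M^(-T) * x
  call mld_precaply(prec,x,z,desc_a,info,'T')
end subroutine apply_prec
\end{verbatim}
}
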
%%% Local Variables:
%%% mode: latex
%%% TeX-master: "userguide"
%%% End: