\section{Data Structures} \label{sec:datastruct} %\ifthenelse{\boolean{mtc}}{\minitoc}{} In this chapter we illustrate the data structures used for definition of routines interfaces. They include data structures for sparse matrices, communication descriptors and preconditioners.%% These data structures %% are used for calling PSBLAS routines in Fortran~90 language and will %% be used to next chapters containing these callings. All the data types and the basic subroutine interfaces are defined in the module \verb|psb_base_mod|; this will have to be included by every user subroutine that makes use of the library. Real and complex data types are parametrized with a kind type defined in the library as follows: \begin{description} \item[psb\_spk\_] Kind parameter for short precision real and complex data; corresponds to a \verb|REAL| declaration and is normally 4 bytes. \item[psb\_dpk\_] Kind parameter for long precision real and complex data; corresponds to a \verb|DOUBLE PRECISION| declaration and is normally 8 bytes. \end{description} \subsection{Descriptor data structure} \label{sec:desc} All the general matrix informations and elements to be exchanged among processes are stored within a data structure of the type \hypertarget{descdata}{{\tt psb\_desc\_type}}. Every structure of this type is associated to a sparse matrix, it contains data about general matrix informations and elements to be exchanged among processes. It is not necessary for the user to know the internal structure of \verb|psb_desc_type|, it is set in a transparent mode by the tools routines of Sec.~\ref{sec:toolsrout}, and its fields may be accessed if necessary via the routines of sec.~\ref{sec:dataquery}; nevertheless we include a description for the curious reader: \begin{description} \item[{\bf matrix\_data}] includes general information about matrix and process grid, such as the communication context, the size of the global matrix, the size of the portion of matrix stored on the current process, and so on. %% More precisely: %% \begin{description} %% \item[matrix\_data[psb\_dec\_type\_\hbox{]}] Identifies the decomposition type %% (global); the actual values are internally defined, so they should %% never be accessed directly. %% \item[matrix\_data[psb\_ctxt\_\hbox{]}] Communication context %% associated with the processes comprised in the virtual parallel %% machine (global). %% \item[matrix\_data[psb\_m\_\hbox{]}] Total number of equations (global). %% \item[matrix\_data[psb\_n\_\hbox{]}] Total number of variables (global). %% \item[matrix\_data[psb\_n\_row\_\hbox{]}] Number of grid variables owned by the %% current process (local); equivalent to the number of local rows in the %% sparse coefficient matrix. %% \item[matrix\_data[psb\_n\_col\_\hbox{]}] Total number of grid variables read by the %% current process (local); equivalent to the number of local columns in %% the sparse coefficient matrix. They include the halo. %% \end{description} Specified as: an allocatable integer array of dimension \verb|psb_mdata_size_|. \item[{\bf halo\_index}] A list of the halo and boundary elements for the current process to be exchanged with other processes; for each processes with which it is necessary to communicate: \begin{enumerate} \item Process identifier; \item Number of points to be received; \item Indices of points to be received; \item Number of points to be sent; \item Indices of points to be sent; \end{enumerate} The list may contain an arbitrary number of groups; its end is marked by a -1.\\ Specified as: an allocatable integer array of rank one. \item[{\bf ext\_index}] A list of element indices to be exchanged to implement the mapping between a base descriptor and a descriptor with overlap. \item [{\bf ovrlap\_index}] A list of the overlap elements for the current process, organized in groups like the previous vector: \begin{enumerate} \item Process identifier; \item Number of points to be received; \item Indices of points to be received; \item Number of points to be sent; \item Indices of points to be sent; \end{enumerate} The list may contain an arbitrary number of groups; its end is marked by a -1.\\ Specified as: an allocatable integer array of rank one. \item [{\bf ovr\_mst\_idx}] A list to retrieve the value of each overlap element from the respective master process.\\ Specified as: an allocatable integer array of rank one. \item [{\bf ovrlap\_elem}] For all overlap points belonging to th ecurrent process: \begin{enumerate} \item Overlap point index; \item Number of processes sharing that overlap points; \item Index of a ``master'' process: \end{enumerate} Specified as: an allocatable integer array of rank two. \item[{\bf loc\_to\_glob}] each element $i$ of this array contains global identifier of the local variable $i$.\\ Specified as: an allocatable integer array of rank one. \item[{\bf glob\_to\_loc, glb\_lc, hashv}] Contain a mapping from global to local indices. \end{description} The Fortran~95 definition for \verb|psb_desc_type| structures is as follows: \begin{figure}[h!] \begin{Sbox} \begin{minipage}[tl]{0.9\textwidth} \begin{verbatim} type psb_desc_type integer, allocatable :: matrix_data(:), halo_index(:) integer, allocatable :: ext_index(:) integer, allocatable :: ovrlap_elem(:,:) integer, allocatable :: ovrlap_index(:) integer, allocatable :: ovr_mst_idx(:) integer, allocatable :: loc_to_glob(:), glob_to_loc(:) integer, allocatable :: hashv(:), glb_lc(:,:) end type psb_desc_type \end{verbatim} \end{minipage} \end{Sbox} \setlength{\fboxsep}{8pt} \begin{center} \fbox{\TheSbox} \end{center} \caption{\label{fig:desctype}The PSBLAS defined data type that contains the communication descriptor.} \end{figure} A communication descriptor associated with a sparse matrix has a state, which can take the following values: \begin{description} \item[Build:] State entered after the first allocation, and before the first assembly; in this state it is possible to add communication requirements among different processes. \item[Assembled:] State entered after the assembly; computations using the associated sparse matrix, such as matrix-vector products, are only possible in this state. \end{description} The global to local index mapping may be stored in two different formats: the first is simpler but more expensive, as it requires on each process an amount of memory proportional to the global size of the index space; the second is more complex, but only requires memory proportional to the local index space size. The choice is made at the time of the initialization according to a threshold; this threshold may be queried and set using the functions in sec.~\ref{sec:dataquery}. \subsubsection{Named Constants} \label{sec:cd_constants} \begin{description} \item[psb\_none\_] Generic no-op; \item[psb\_nohalo\_] Do not fetch halo elements; \item[psb\_halo\_] Fetch halo elements from neighbouring processes; \item[psb\_sum\_] Sum overlapped elements \item[psb\_avg\_] Average overlapped elements \item[psb\_comm\_halo\_] Exchange data based on the \verb|halo_index| list; \item[psb\_comm\_ext\_] Exchange data based on the \verb|ext_index| list; \item[psb\_comm\_ovr\_] Exchange data based on the \verb|ovrlap_index| list; \item[psb\_comm\_mov\_] Exchange data based on the \verb|ovr_mst_idx| list; %% \item[psb\_square\_root\_] Update with the square root of the average %% of overlapped elements; %% \item[psb\_dec\_type\_] Entry holding decomposition type (in \verb|desc_a%matrix_data|) %% \item[psb\_m\_] Entry holding total number of rows %% \item[psb\_n\_] Entry holding total number of columns %% \item[ psb\_n\_row\_] Entry holding the number of rows stored in the %% current process %% \item[psb\_n\_col\_] Entry holding the number of columns stored in the %% current process %% \item[psb\_ctxt\_] Entry holding a copy of the BLACS communication context %% \item[psb\_desc\_asb\_] State of the descriptor: assembled, %% i.e. suitable for computational tasks. %% \item[psb\_desc\_bld\_] State of the descriptor: build, must be %% assembled before computational use. \end{description} \subsection{Sparse Matrix data structure} \label{sec:spmat} The \hypertarget{spdata}{{\tt psb\_spmat\_type}} data structure contains all information about local portion of the sparse matrix and its storage mode. Most of these fields are set by the tools routines when inserting a new sparse matrix; the user needs only choose, if he/she so whishes, a specific matrix storage mode. \\ \begin{description} \item[{\bf aspk}] Contains values of the local distributed sparse matrix.\\ Specified as: an allocatable array of rank one of type corresponding to matrix entries type. \item[{\bf ia1}] Holds integer information on distributed sparse matrix. Actual information will depend on data format used.\\ Specified as: an allocatable integer array of rank one. \item[{\bf ia2}] Holds integer information on distributed sparse matrix. Actual information will depend on data format used.\\ Specified as: an allocatable integer array of rank one. \item[{\bf infoa}] On entry can hold auxiliary information on distributed sparse matrix. Actual information will depend on data format used.\\ Specified as: an integer array of length \verb|psb_ifasize_|. \item[{\bf fida}] Defines the format of the distributed sparse matrix.\\ Specified as: a string of length 5 \item[{\bf descra}] Describe the characteristic of the distributed sparse matrix.\\ Specified as: array of character of length 9. \item[{\bf pl}] Specifies the local row permutation of distributed sparse matrix. If pl(1) is equal to 0, then there isn't row permutation.\\ Specified as: an allocatable integer array of dimension equal to number of local row (matrix\_data[psb\_n\_row\_\hbox{]}) \item[{\bf pr}] Specifies the local column permutation of distributed sparse matrix. If PR(1) is equal to 0, then there isn't columnm permutation.\\ Specified as: an allocatable integer array of dimension equal to number of local row (matrix\_data[psb\_n\_col\_\hbox{]}) \item[{\bf m}] Number of rows; if row indices are stored explicitly, as in Coordinate Storage, should be greater than or equal to the maximum row index actually present in the sparse matrix. Specified as: integer variable. \item[{\bf k}] Number of columns; if column indices are stored explicitly, as in Coordinate Storage or Compressed Sparse Rows, should be greater than or equal to the maximum column index actually present in the sparse matrix. Specified as: integer variable. \end{description} The Fortran~95 interface for distributed sparse matrices containing double precision real entries is defined as shown in figure~\ref{fig:spmattype}. The definitions for single precision and complex data are identical except for the \verb|real| declaration and for the kind type parameter. \begin{figure}[h!] \begin{Sbox} \begin{minipage}[tl]{0.85\textwidth} \begin{verbatim} type psb_sspmat_type integer :: m, k character :: fida(5) character :: descra(10) integer :: infoa(psb_ifa_size_) real(psb_spk_), allocatable :: aspk(:) integer, allocatable :: ia1(:), ia2(:) integer, allocatable :: pr(:), pl(:) end type psb_sspmat_type type psb_dspmat_type integer :: m, k character :: fida(5) character :: descra(10) integer :: infoa(psb_ifa_size_) real(psb_dpk_), allocatable :: aspk(:) integer, allocatable :: ia1(:), ia2(:) integer, allocatable :: pr(:), pl(:) end type psb_dspmat_type type psb_cspmat_type integer :: m, k character :: fida(5) character :: descra(10) integer :: infoa(psb_ifa_size_) complex(psb_spk_), allocatable :: aspk(:) integer, allocatable :: ia1(:), ia2(:) integer, allocatable :: pr(:), pl(:) end type psb_cspmat_type type psb_zspmat_type integer :: m, k character :: fida(5) character :: descra(10) integer :: infoa(psb_ifa_size_) complex(psb_dpk_), allocatable :: aspk(:) integer, allocatable :: ia1(:), ia2(:) integer, allocatable :: pr(:), pl(:) end type psb_zspmat_type \end{verbatim} \end{minipage} \end{Sbox} \setlength{\fboxsep}{8pt} \begin{center} \fbox{\TheSbox} \end{center} \caption{\label{fig:spmattype} The PSBLAS defined data type that contains a sparse matrix.} \end{figure} The following two cases are among the most commonly used: \begin{description} \item[fida=``CSR''] Compressed storage by rows. In this case the following should hold: \begin{enumerate} \item \verb|ia2(i)| contains the index of the first element of row \verb|i|; the last element of the sparse matrix is thus stored at index $ia2(m+1)-1$. It should contain \verb|m+1| entries in nondecreasing order (strictly increasing, if there are no empty rows). \item \verb|ia1(j)| contains the column index and \verb|aspk(j)| contains the corresponding coefficient value, for all $ia2(1) \le j \le ia2(m+1)-1$. \end{enumerate} \item[fida=``COO''] Coordinate storage. In this case the following should hold: \begin{enumerate} \item \verb|infoa(1)| contains the number of nonzero elements in the matrix; \item For all $1 \le j \le infoa(1)$, the coefficient, row index and column index are stored into \verb|apsk(j)|, \verb|ia1(j)| and \verb|ia2(j)| respectively. \end{enumerate} \end{description} A sparse matrix has an associated state, which can take the following values: \begin{description} \item[Build:] State entered after the first allocation, and before the first assembly; in this state it is possible to add nonzero entries. \item[Assembled:] State entered after the assembly; computations using the sparse matrix, such as matrix-vector products, are only possible in this state; \item[Update:] State entered after a reinitalization; this is used to handle applications in which the same sparsity pattern is used multiple times with different coefficients. In this state it is only possible to enter coefficients for already existing nonzero entries. \end{description} \subsubsection{Named Constants} \label{sec:sp_constants} \begin{description} %% \item[psb\_nztotreq\_] Request to fetch the total number of nonzeroes %% stored in a sparse matrix %% \item[psb\_nzrowreq\_] Request to fetch the number of nonzeroes in a %% given row in a sparse matrix \item[psb\_dupl\_ovwrt\_] Duplicate coefficients should be overwritten (i.e. ignore duplications) \item[psb\_dupl\_add\_] Duplicate coefficients should be added; \item[psb\_dupl\_err\_] Duplicate coefficients should trigger an error conditino \item[psb\_upd\_dflt\_] Default update strategy for matrix coefficients; \item[psb\_upd\_srch\_] Update strategy based on search into the data structure; \item[psb\_upd\_perm\_] Update strategy based on additional permutation data (see tools routine description). \end{description} \subsection{Preconditioner data structure} \label{sec:prec} Our base library offers support for simple well known preconditioners like Diagonal Scaling or Block Jacobi with incomplete factorization ILU(0). A preconditioner is held in the \hypertarget{precdata}{{\tt psb\_prec\_type}} data structure reported in figure~\ref{fig:prectype}. The \verb|psb_prec_type| data type may contain a simple preconditioning matrix with the associated communication descriptor.%% which may be different than the %% system communication descriptor in the case of parallel %% preconditioners like the Additive Schwarz one. Then the %% \verb|psb_prec_type| may contain more than one preconditioning matrix %% like in the case of Two-Level (in general Multi-Level) preconditioners. %% The user can choose the type of preconditioner to be used by means of %% the \verb|psb_precset| subroutine; once the type of preconditioning %% method is specified, along with all the parameters that characterize %% it, the preconditioner data structure can be built using the %% \verb|psb_precbld| subroutine. %% This data structure wants to be flexible enough to easily allow the %% implementation of new kind of preconditioners. The values contained in the \verb|iprcparm| and \verb|rprcparm| define tha type of preconditioner along with all the parameters related to it; thus, \verb|iprcparm| and \verb|rprcparm| define how the other records have to be interpreted. This data structure is the basis of more complex preconditioning strategies, which are the subject of further research. \begin{figure}[h!] \small \begin{Sbox} \begin{minipage}[tl]{0.9\textwidth} \begin{verbatim} type psb_sprec_type type(psb_sspmat_type), allocatable :: av(:) real(psb_spk_), allocatable :: d(:) type(psb_desc_type) :: desc_data integer, allocatable :: iprcparm(:) real(psb_spk_), allocatable :: rprcparm(:) integer, allocatable :: perm(:), invperm(:) integer :: prec, base_prec end type psb_sprec_type type psb_dprec_type type(psb_dspmat_type), allocatable :: av(:) real(psb_dpk_), allocatable :: d(:) type(psb_desc_type) :: desc_data integer, allocatable :: iprcparm(:) real(psb_dpk_), allocatable :: rprcparm(:) integer, allocatable :: perm(:), invperm(:) integer :: prec, base_prec end type psb_dprec_type type psb_cprec_type type(psb_cspmat_type), allocatable :: av(:) complex(psb_spk_), allocatable :: d(:) type(psb_desc_type) :: desc_data integer, allocatable :: iprcparm(:) real(psb_spk_), allocatable :: rprcparm(:) integer, allocatable :: perm(:), invperm(:) integer :: prec, base_prec end type psb_cprec_type type psb_zprec_type type(psb_zspmat_type), allocatable :: av(:) complex(psb_dpk_), allocatable :: d(:) type(psb_desc_type) :: desc_data integer, allocatable :: iprcparm(:) real(psb_dpk_), allocatable :: rprcparm(:) integer, allocatable :: perm(:), invperm(:) integer :: prec, base_prec end type psb_zprec_type \end{verbatim} \end{minipage} \end{Sbox} \setlength{\fboxsep}{8pt} \begin{center} \fbox{\TheSbox} \end{center} \caption{\label{fig:prectype}The PSBLAS defined data type that contains a preconditioner.} \end{figure} %% \subsection{Named Constants} %% \label{sec:prec_constants} %% \begin{description} %% \item[f\_ilu\_n\_] Incomplete LU factorization with $n$ levels of %% fill-in; currently only $n=0$ is implemented; %% \item[f\_slu\_] Sparse factorization using SuperLU; %% \item[f\_umf\_] Sparse factorization using UMFPACK; %% \item[add\_ml\_prec\_] Additive multilevel correction; %% \item[mult\_ml\_prec\_] Multiplicative multilevel correction; %% \item[pre\_smooth\_] Pre-smoothing in applying multiplicative %% multilevel corrections; %% \item[post\_smooth\_] Post-smoothing in applying multiplicative %% multilevel corrections; %% \item[smooth\_both\_] Two-sided (i.e. symmetric) smoothing in applying multiplicative %% multilevel corrections; %% \item[mat\_distr\_] Coarse matrix distributed among processes %% \item[mat\_repl\_] Coarse matrix replicated among processes %% \end{description} \subsection{Data structure query routines} \label{sec:dataquery} \subsubroutine{psb\_cd\_get\_local\_rows}{Get number of local rows} \syntax{nr = psb\_cd\_get\_local\_rows}{desc} \begin{description} \item[Type:] Asynchronous. \item[\bf On Entry] \item[desc] the communication descriptor.\\ Scope: {\bf local}.\\ Type: {\bf required}.\\ Intent: {\bf in}.\\ Specified as: a structured data of type \descdata. \end{description} \begin{description} \item[\bf On Return] \item[Function value] The number of local rows, i.e. the number of rows owned by the current process; as explained in~\ref{sec:intro}, it is equal to $|{\cal I}_i| + |{\cal B}_i|$. The returned value is specific to the calling process. \end{description} \subsubroutine{psb\_cd\_get\_local\_cols}{Get number of local cols} \syntax{nc = psb\_cd\_get\_local\_cols}{desc} \begin{description} \item[\bf On Entry] \item[Type:] Asynchronous. \item[desc] the communication descriptor.\\ Scope: {\bf local}.\\ Type: {\bf required}.\\ Intent: {\bf in}.\\ Specified as: a structured data of type \descdata. \end{description} \begin{description} \item[\bf On Return] \item[Function value] The number of local cols, i.e. the number of indices used by the current process, including both local and halo indices; as explained in~\ref{sec:intro}, it is equal to $|{\cal I}_i| + |{\cal B}_i| +|{\cal H}_i|$. The returned value is specific to the calling process. \end{description} \subsubroutine{psb\_cd\_get\_global\_rows}{Get number of global rows} \syntax{nr = psb\_cd\_get\_global\_rows}{desc} \begin{description} \item[\bf On Entry] \item[Type:] Asynchronous. \item[desc] the communication descriptor.\\ Scope: {\bf local}.\\ Type: {\bf required}.\\ Intent: {\bf in}.\\ Specified as: a structured data of type \descdata. \end{description} \begin{description} \item[\bf On Return] \item[Function value] The number of global rows in the mesh \end{description} \subsubroutine{psb\_cd\_get\_global\_cols}{Get number of global cols} \syntax{nr = psb\_cd\_get\_global\_cols}{desc} \begin{description} \item[Type:] Asynchronous. \item[\bf On Entry] \item[desc] the communication descriptor.\\ Scope: {\bf local}.\\ Type: {\bf required}.\\ Intent: {\bf in}.\\ Specified as: a structured data of type \descdata. \end{description} \begin{description} \item[\bf On Return] \item[Function value] The number of global cols in the mesh \end{description} \subsubroutine{psb\_cd\_get\_context}{Get communication context} \syntax{ictxt = psb\_cd\_get\_context}{desc} \begin{description} \item[Type:] Asynchronous. \item[\bf On Entry] \item[desc] the communication descriptor.\\ Scope: {\bf local}.\\ Type: {\bf required}.\\ Intent: {\bf in}.\\ Specified as: a structured data of type \descdata. \end{description} \begin{description} \item[\bf On Return] \item[Function value] The communication context. \end{description} \subsubroutine{psb\_cd\_get\_large\_threshold}{Get threshold for index mapping switch} \syntax{ith = psb\_cd\_get\_large\_threshold}{} \begin{description} \item[Type:] Asynchronous. \item[\bf On Return] \item[Function value] The current value for the size threshold. \end{description} \subsubroutine{psb\_cd\_set\_large\_threshold}{Set threshold for index mapping switch} \syntax{call psb\_cd\_set\_large\_threshold}{ith} \begin{description} \item[Type:] Asynchronous. \item[\bf On Entry] \item[ith] the new threshold for communication descriptors.\\ Scope: {\bf global}.\\ Type: {\bf required}.\\ Intent: {\bf in}.\\ Specified as: an integer value greater than zero. \end{description} Note: the threshold value is only queried by the library at the time a call to \verb|psb_cdall| is executed, therefore changing the threshold has no effect on communication descriptors that have already been initialized. \subsubroutine{psb\_sp\_get\_nrows}{Get number of rows in a sparse matrix} \syntax{nr = psb\_sp\_get\_nrows}{a} \begin{description} \item[Type:] Asynchronous. \item[\bf On Entry] \item[a] the sparse matrix\\ Scope: {\bf local}\\ Type: {\bf required}\\ Intent: {\bf in}.\\ Specified as: a structured data of type \spdata. \end{description} \begin{description} \item[\bf On Return] \item[Function value] The number of rows of sparse matrix \verb|a|. \end{description} \subsubroutine{psb\_sp\_get\_ncols}{Get number of columns in a sparse matrix} \syntax{nr = psb\_sp\_get\_ncols}{a} \begin{description} \item[Type:] Asynchronous. \item[\bf On Entry] \item[a] the sparse matrix\\ Scope: {\bf local}\\ Type: {\bf required}\\ Intent: {\bf in}.\\ Specified as: a structured data of type \spdata. \end{description} \begin{description} \item[\bf On Return] \item[Function value] The number of columns of sparse matrix \verb|a|. \end{description} \subsubroutine{psb\_sp\_get\_nnzeros}{Get number of nonzero elements in a sparse matrix} \syntax{nr = psb\_sp\_get\_nnzeros}{a} \begin{description} \item[Type:] Asynchronous. \item[\bf On Entry] \item[a] the sparse matrix\\ Scope: {\bf local}\\ Type: {\bf required}\\ Intent: {\bf in}.\\ Specified as: a structured data of type \spdata. \end{description} \begin{description} \item[\bf On Return] \item[Function value] The number of nonzero elements stored in sparse matrix \verb|a|. \end{description} \subsubsection*{Notes} \begin{enumerate} \item The function value is specific to the storage format of matrix \verb|a|; some storage formats employ padding, thus the returned value for the same matrix may be different for different storage choices. \end{enumerate} %%% Local Variables: %%% mode: latex %%% TeX-master: "userguide" %%% End: