psblas3/test/nested
Stack-1 8bd49c43b1 [UPDATE] Convert the CG test to the builder API and drop the redundant builder test
Rewrite psb_d_nest_cg_test to build the operator through the psb_d_nest_matrix
utility (init/ins/asb + get_owned_rows) instead of the low-level path, so no
per-field descriptor or l2g idiom appears in user-facing test code; x_exact=1
is set with x%set(done) rather than an l2g loop.

With this change psb_d_nest_cg_test fully subsumes psb_d_nest_builder_test
(same operator via the same builder, NONE plus DIAG/BJAC), so the latter is
removed.  The test suite is now glob (square matvec), rect (rectangular
matvec) and cg (builder + preconditioned CG).  Build hooks and README updated.

Author: Simone Staccone (Stack-1)
2 weeks ago

README.md

Nested (block-structured / MATNEST) matrices in PSBLAS

Author: Simone Staccone (Stack-1)

This directory contains the tests for the nested matrix support added to PSBLAS: a block-structured distributed operator

      [ A11  A12  ... ]
  M = [ A21  A22  ... ]
      [ ...       ... ]

whose blocks are kept as separate sparse matrices (one per field) but which presents itself to Krylov solvers and preconditioners as a single ordinary distributed matrix. It is the PSBLAS analogue of PETSc's MATNEST.

The motivating case is the saddle-point system

  M = [ A    B^T ]
      [ B     0  ]

(symmetric indefinite, with the (2,2) block absent), but the implementation supports any square multi-field block operator with possibly rectangular sub-blocks.
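
Written out by field, one application of this operator is the pair of products below. This is only a worked restatement of the block form above, with x and y split into a first-field part (subscript 1) and a second-field part (subscript 2); the subscript notation is introduced here for illustration.

```latex
\begin{pmatrix} y_1 \\ y_2 \end{pmatrix}
=
\begin{pmatrix} A & B^T \\ B & 0 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}
=
\begin{pmatrix} A\,x_1 + B^T x_2 \\ B\,x_1 \end{pmatrix}
```

With n_1 and n_2 the two field sizes, A is n_1 x n_1 and B is n_2 x n_1, so only the off-diagonal blocks are rectangular, and the absent (2,2) block contributes no term to y_2.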

1. Concepts

  • Field — a contiguous index space (e.g. velocity V and pressure Q in a saddle-point problem). Each field has its own psb_desc_type distribution.
  • Block (i,j) — the sub-matrix coupling field i (rows) with field j (columns). It may be rectangular (different field sizes) and may be absent.
  • Global operator — the blocks are concatenated into a single square operator M of size sum(field_sizes), distributed over one composed global descriptor with a union halo (one halo exchange per matrix-vector product, covering all blocks of a given column field at once).
  • Rectangular blocks — PSBLAS does not support rectangular distributed matrices, but it does support rectangular local CSR/COO matrices. The rectangular product therefore happens only in the local block csmv; the only object carrying a descriptor (and hence communication) is the global operator, which is always square.
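
The block-by-block product described in the bullets above can be pictured with a small serial sketch. It is not PSBLAS code: it assumes dense arrays, made-up field sizes (n1 = 4, n2 = 2) and made-up block values, and it only illustrates that the rectangular products are purely local operations on slices of one square global vector, with an absent block adding nothing.

```fortran
program nest_block_matvec_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n1 = 4, n2 = 2       ! illustrative field sizes
  real(dp) :: a(n1,n1), b(n2,n1)             ! A is square, B is rectangular
  real(dp) :: x(n1+n2), y(n1+n2)             ! global vectors: field 1, then field 2
  integer  :: i

  ! Illustrative values: A = 2*I, row k of B couples columns 2k-1 and 2k
  a = 0.0_dp
  do i = 1, n1
     a(i,i) = 2.0_dp
  end do
  b = 0.0_dp
  do i = 1, n2
     b(i,2*i-1) =  1.0_dp
     b(i,2*i)   = -1.0_dp
  end do

  x = 1.0_dp

  ! Row field 1: blocks (1,1) and (1,2)
  y(1:n1) = matmul(a, x(1:n1)) + matmul(transpose(b), x(n1+1:n1+n2))
  ! Row field 2: block (2,1) only; the absent (2,2) block adds nothing
  y(n1+1:n1+n2) = matmul(b, x(1:n1))

  print '(6f6.1)', y
end program nest_block_matvec_sketch
```

For these values the result is y = (3, 1, 3, 1, 0, 0): the first four entries come from the square block plus the transposed rectangular block, the last two from the rectangular block alone.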

The global operator (a_glob) and global descriptor (desc_glob) can be passed unchanged to psb_spmm, psb_krylov, and the standard preconditioners.

2. Quick start: psb_d_nest_matrix

The easy way to build a nested matrix is the psb_d_nest_matrix type (module psb_d_nest_builder_mod, re-exported by the umbrella psb_d_nest_mod), which follows the usual PSBLAS init / ins / asb pattern and hides all the descriptor / halo / compose / setup boilerplate:

use psb_d_nest_mod

type(psb_d_nest_matrix) :: nested_matrix
integer(psb_lpk_)       :: n1, n2

! 1) declare the field structure: 2 fields of global size n1, n2
call nested_matrix%init(ctxt, [n1, n2], info)

! 2) insert the block values, owned rows only (PSBLAS convention).
call nested_matrix%ins(1, 1, nz_A,  iaA,  jaA,  valA,  info)   ! A   = block (1,1)
call nested_matrix%ins(1, 2, nz_Bt, iaBt, jaBt, valBt, info)   ! B^T = block (1,2)
call nested_matrix%ins(2, 1, nz_B,  iaB,  jaB,  valB,  info)   ! B   = block (2,1)
!    (the (2,2) block is simply not inserted)

! 3) assemble: builds nested_matrix%a_glob and nested_matrix%desc_glob
call nested_matrix%asb(info)

! 4) from here on it is an ordinary distributed matrix/descriptor
call psb_geall(x, nested_matrix%desc_glob, info)
...
call prec%init(ctxt, 'BJAC', info)
call prec%build(nested_matrix%a_glob, nested_matrix%desc_glob, info)
call psb_krylov('CG', nested_matrix%a_glob, prec, b, x, eps, &
     & nested_matrix%desc_glob, info, itmax=..., iter=..., err=...)

! 5) release
call nested_matrix%free(info)

3. User API reference

All of the public API is available through the umbrella module:

use psb_d_nest_mod

Members of psb_d_nest_matrix:

| Member | Meaning |
|---|---|
| a_glob | type(psb_dspmat_type) — the assembled global operator; pass it to psb_spmm, psb_krylov, prec%build |
| desc_glob | type(psb_desc_type) — the composed global descriptor; pass it wherever a descriptor is expected |
| field_desc(i) | type(psb_desc_type) — the descriptor of field i (advanced use; for the common queries see get_owned_rows below) |
| n_fields | number of fields |

To know which rows it must insert, a process asks the matrix directly — no descriptor jargon needed:

integer(psb_lpk_), allocatable :: my_rows(:)
integer(psb_lpk_)              :: global_row
integer                        :: k

my_rows = nested_matrix%get_owned_rows(1)     ! global rows of field 1 owned here
do k = 1, size(my_rows)
   global_row = my_rows(k)
   ...                                        ! build the entries of this row
end do

| Query | Result |
|---|---|
| nested_matrix%get_owned_rows(i_field) | integer(psb_lpk_), allocatable (:) — the GLOBAL indices (in the field index space, 1..field size) of the rows of field i_field owned by this process |
| nested_matrix%get_owned_row_count(i_field) | integer(psb_ipk_) — how many rows of field i_field this process owns |
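
A typical use of the owned-row list is to generate one batch of triplets and hand it to ins. The sketch below is self-contained and does not link against PSBLAS: the helper laplacian_rows, the 1D Laplacian stencil, the stand-in kind lpk and the hard-coded row list are all assumptions made for illustration. The comments show where the real get_owned_rows and ins calls from this README would go.

```fortran
module triplet_sketch
  implicit none
  integer, parameter :: dp  = kind(1.0d0)
  integer, parameter :: lpk = selected_int_kind(18)   ! stand-in for psb_lpk_
contains
  ! Build COO triplets of the stencil (-1, 2, -1) for the given owned rows
  ! of a field of global size n. Indices are global, in the field index space.
  subroutine laplacian_rows(n, my_rows, nz, ia, ja, val)
    integer(lpk), intent(in)               :: n, my_rows(:)
    integer, intent(out)                   :: nz
    integer(lpk), allocatable, intent(out) :: ia(:), ja(:)
    real(dp), allocatable, intent(out)     :: val(:)
    integer      :: k
    integer(lpk) :: g

    ! At most three entries per row
    allocate(ia(3*size(my_rows)), ja(3*size(my_rows)), val(3*size(my_rows)))
    nz = 0
    do k = 1, size(my_rows)
       g = my_rows(k)
       if (g > 1) call push(g, g-1, -1.0_dp)
       call push(g, g, 2.0_dp)
       if (g < n) call push(g, g+1, -1.0_dp)
    end do
  contains
    subroutine push(i, j, v)
      integer(lpk), intent(in) :: i, j
      real(dp), intent(in)     :: v
      nz = nz + 1
      ia(nz) = i
      ja(nz) = j
      val(nz) = v
    end subroutine push
  end subroutine laplacian_rows
end module triplet_sketch

program owned_rows_sketch
  use triplet_sketch
  implicit none
  integer(lpk), allocatable :: my_rows(:), ia(:), ja(:)
  real(dp), allocatable     :: val(:)
  integer                   :: nz

  ! Pretend this process owns rows 1..3 of a field of size 6. In real code:
  !   my_rows = nested_matrix%get_owned_rows(1)
  my_rows = [1_lpk, 2_lpk, 3_lpk]
  call laplacian_rows(6_lpk, my_rows, nz, ia, ja, val)
  print '(a,i0)', 'nz = ', nz
  ! In real code the batch would now be inserted into block (1,1):
  !   call nested_matrix%ins(1, 1, nz, ia, ja, val, info)
end program owned_rows_sketch
```

Row 1 has no left neighbour, so the three rows produce 2 + 3 + 3 = 8 triplets. Because only owned rows are generated, every process can run the same code without inserting rows that belong elsewhere.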

Methods (collective over the communicator unless noted):

call nested_matrix%init(ctxt, field_sizes, info)

Create the field structure. One descriptor per field is created with a block row distribution; the total size is independent of the number of processes.

| Argument | Type | Intent | Meaning |
|---|---|---|---|
| ctxt | type(psb_ctxt_type) | in | parallel context from psb_init |
| field_sizes(:) | integer(psb_lpk_) | in | global size of each field, e.g. [n1, n2] |
| info | integer(psb_ipk_) | out | return code, psb_success_ on success |
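
To make "block row distribution" concrete, here is a minimal sketch of one common convention: contiguous ranges, with any remainder rows given to the lowest ranks. This README does not state the exact split that init uses, so the rule below is an assumption; in real code the authoritative answer is get_owned_rows. The routine name block_range and the sizes are hypothetical.

```fortran
program block_rows_sketch
  implicit none
  integer, parameter :: n = 10, np = 4      ! illustrative field size and process count
  integer :: p, first, last, total

  total = 0
  do p = 0, np - 1
     call block_range(n, np, p, first, last)
     total = total + (last - first + 1)
     print '(a,i0,a,i0,a,i0)', 'rank ', p, ' owns rows ', first, '..', last
  end do
  ! Every row is owned exactly once, whatever np is
  if (total /= n) error stop 'partition does not cover the field'
contains
  ! Contiguous range of rank p (0-based) out of np ranks for nrows rows
  subroutine block_range(nrows, nprocs, p, first, last)
    integer, intent(in)  :: nrows, nprocs, p
    integer, intent(out) :: first, last
    integer :: base, rem
    base  = nrows / nprocs
    rem   = mod(nrows, nprocs)
    first = p*base + min(p, rem) + 1
    last  = first + base - 1
    if (p < rem) last = last + 1
  end subroutine block_range
end program block_rows_sketch
```

With 10 rows on 4 processes this prints the ranges 1..3, 4..6, 7..8 and 9..10. The global size stays 10 for any process count; only the ownership ranges change.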

call nested_matrix%ins(block_row, block_col, n_entries, entry_rows, entry_cols, entry_vals, info)

Insert a batch of coefficients into block (block_row, block_col). May be called any number of times per block, in any order, before asb. Each process inserts only the rows it owns (PSBLAS convention); cross-field columns are registered into the union halo automatically.

| Argument | Type | Intent | Meaning |
|---|---|---|---|
| block_row | integer(psb_ipk_) | in | row-field index of the block (1..n_fields) |
| block_col | integer(psb_ipk_) | in | column-field index of the block (1..n_fields) |
| n_entries | integer(psb_ipk_) | in | number of triplets in this batch |
| entry_rows(:) | integer(psb_lpk_) | in | GLOBAL row indices in field block_row (1..field size) |
| entry_cols(:) | integer(psb_lpk_) | in | GLOBAL column indices in field block_col (1..field size) |
| entry_vals(:) | real(psb_dpk_) | in | coefficient values |
| info | integer(psb_ipk_) | out | return code |

call nested_matrix%asb(info [, type] [, mold])

Assemble: builds the per-field halos, the (possibly rectangular) local blocks, the composed global descriptor desc_glob and the global operator a_glob. After asb no further ins is allowed, and the object must not be copied/moved (the operator holds internal pointers into it).

The optional arguments select the storage format of the blocks:

| Argument | Type | Meaning |
|---|---|---|
| type | character(len=*) | a base format name: 'CSR' (default), 'CSC', 'COO' |
| mold | class(psb_d_base_sparse_mat) | any format class, e.g. psb_d_ell_sparse_mat / psb_d_hll_sparse_mat from psb_ext |

The nested operator is format-agnostic: every operation delegates to the blocks' own methods, so each block runs its native kernels.

call nested_matrix%free(info)

Release every internal object (blocks, descriptors, global operator).

3.2 Solvers and preconditioners

a_glob / desc_glob work with the standard PSBLAS infrastructure:

  • Krylov methods: psb_krylov('CG' | 'BICGSTAB' | 'GMRES' | ..., nested_matrix%a_glob, prec, b, x, eps, nested_matrix%desc_glob, info, ...). Remember that CG requires an SPD operator; a genuine saddle-point operator is indefinite and needs MINRES/GMRES.
  • Preconditioners — all the stock PSBLAS one-level preconditioners can be built directly on the nested operator:
    • 'NONE' — identity;
    • 'DIAG' / 'JACOBI' — diagonal scaling (served by the nested get_diag, which concatenates the diagonals of the diagonal blocks; absent blocks contribute zeros);
    • 'BJAC' — block Jacobi with ILU factorization of the local rows (served by the nested csgetrow, which extracts the local rows of the global operator across all blocks).
call prec%init(ctxt, 'BJAC', info)
call prec%build(nested_matrix%a_glob, nested_matrix%desc_glob, info)
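
The diagonal that 'DIAG' receives can be pictured as follows. This is a serial, dense sketch under assumed sizes and values, not the library's get_diag: it only shows the concatenation rule stated above, with the diagonal block of field 1 copied in and the slot of an absent (2,2) block left at zero.

```fortran
program nest_get_diag_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n1 = 3, n2 = 2      ! illustrative field sizes
  real(dp) :: a11(n1,n1)                    ! diagonal block of field 1
  real(dp) :: d(n1+n2)                      ! diagonal of the global operator
  integer  :: i

  ! Illustrative block (1,1) with diagonal 2, 3, 4
  a11 = 0.0_dp
  do i = 1, n1
     a11(i,i) = real(i+1, dp)
  end do

  ! Start from zeros: an absent diagonal block leaves its slot untouched
  d = 0.0_dp
  ! Copy the diagonal of block (1,1) into the first n1 entries
  do i = 1, n1
     d(i) = a11(i,i)
  end do
  ! Block (2,2) is absent here, so d(n1+1:n1+n2) stays zero

  print '(5f5.1)', d
end program nest_get_diag_sketch
```

The output is the vector (2, 3, 4, 0, 0). Off-diagonal blocks such as B and B^T never contribute, because none of their entries lie on the diagonal of the global operator.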

3.3 Implemented base-class contract

The nested operator (psb_d_nest_base_mat) implements the standard psb_d_base_sparse_mat contract by delegation to the blocks, so it can be used wherever an assembled PSBLAS matrix is expected:

  • Products: csmv (also transposed, trans='T'), csmm (multi-RHS), vect_mv (encapsulated vectors: gathers/scatters through the vectors' own gth/sct and runs each block through its vect_mv, so device block formats execute their device kernels).
  • Access/conversions: get_diag, csgetrow (and csget/csgetblk through the base generics), cp_to_coo/mv_to_coo (and cscnv, csclip, tril/triu, ... through the base generics built on the COO route).
  • Reductions: rowsum/arwsum, colsum/aclsum, maxval, spnmi (infinity norm), spnm1 (1-norm).
  • Mutation/bookkeeping: scal (left/right) and scals (the operator is a view: scaling acts on the blocks), clone (shares the blocks, re-owns the private index maps), mold, sizeof, free, get_nzeros, get_fmt.

Intentionally not implemented (they fail with the standard "missing override" error): cp_from_coo/mv_from_coo (a nested operator cannot be built from a flat matrix without the field structure), csput (insertions go to the blocks before assembly), cssv/cssm (a triangular solve is undefined for a block operator).
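
For the reductions listed above, the two norms decompose over blocks as shown below. These are the standard identities that any block-wise evaluation can rely on; whether the code follows exactly this route is not stated in this README. Here r runs over the rows of field i, c over the columns of field j, and an absent block counts as zero.

```latex
\|M\|_\infty = \max_{i}\,\max_{r}\;\sum_{j}\sum_{c}\bigl|(A_{ij})_{rc}\bigr|,
\qquad
\|M\|_1 = \max_{j}\,\max_{c}\;\sum_{i}\sum_{r}\bigl|(A_{ij})_{rc}\bigr|
```

In words: the infinity norm sums absolute row sums across all blocks of a block row before taking the maximum, and the 1-norm does the same with column sums across a block column.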

3.4 Low-level API (advanced)

psb_d_nest_matrix is built on lower-level pieces, available directly:

  • psb_cd_nest_compose(grid_desc, desc_glob, info) — compose the per-field descriptors into the single global descriptor with the union halo.
  • psb_d_nest_base_setup(nest_op, block_storage, grid_desc, desc_glob, info) — set up the psb_d_nest_base_mat operator (implements the local csmv, get_diag, csgetrow).
  • psb_d_nest_rect_block(blk, nz, ia, ja, val, desc_row, desc_col, info) — build a single (possibly rectangular) local block from global triplets, with rows localized against desc_row and columns against desc_col.

A field-split interface (psb_d_nest_get_block, psb_d_nest_get_field_desc, psb_d_nest_restrict_field, psb_d_nest_prolong_field, psb_d_nest_apply_block) is exposed on psb_d_nest_base_mat as the hook for a future block (field-split / Schur) preconditioner.
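
Conceptually, a field split moves data between the global vector and one field's vector. The sketch below is a serial picture only: it assumes that the fields sit one after another in the global vector and that "restrict" and "prolong" mean copy-out and copy-back. The actual semantics and the local ordering used by psb_d_nest_restrict_field and psb_d_nest_prolong_field are not documented here and may differ.

```fortran
program field_split_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: nf = 2              ! number of fields (illustrative)
  integer :: sizes(nf), off(nf+1), i
  real(dp), allocatable :: x(:), xq(:)

  ! Field sizes and cumulative offsets: field i occupies off(i)+1 .. off(i+1)
  sizes  = [4, 2]
  off(1) = 0
  do i = 1, nf
     off(i+1) = off(i) + sizes(i)
  end do

  allocate(x(off(nf+1)))
  x = [(real(i, dp), i = 1, size(x))]       ! x = 1, 2, ..., 6

  ! "restrict": copy field 2 out of the global vector
  xq = x(off(2)+1:off(3))
  ! Work on the field vector alone (here: a plain scaling)
  xq = 10.0_dp*xq
  ! "prolong": write the field vector back into its slot
  x(off(2)+1:off(3)) = xq

  print '(6f6.1)', x
end program field_split_sketch
```

Only the entries of field 2 change, giving (1, 2, 3, 4, 50, 60). A field-split preconditioner would apply a per-field solver at the step where this sketch applies the scaling.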

4. Tests

| Test | What it checks |
|---|---|
| psb_d_nest_glob_test | Square 2×2 operator built with psb_d_nest_matrix; the nested psb_spmm is compared bit-for-bit against the same matrix assembled monolithically in CSR. |
| psb_d_nest_rect_test | Same, with fields of different size (nV = 2 nQ) and genuinely rectangular off-diagonal blocks. |
| psb_d_nest_cg_test | Standard PSBLAS CG on an SPD, ill-conditioned operator (1D Laplacian reordered red-black), solved under every stock preconditioner (NONE, DIAG, BJAC/ILU(0)); requires convergence to machine precision for all of them, and that DIAG reproduces the NONE iteration count exactly (a bit-precise check of the nested get_diag, since the diagonal is the constant 2I). |

All tests run both serially and in parallel, and the result is invariant with respect to the number of MPI processes.
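
The reason the DIAG and NONE iteration counts must agree can be reproduced with a small serial experiment. This sketch is not the test itself: it uses a hand-written preconditioned CG on the naturally ordered 1D Laplacian, with an assumed size (n = 64) and tolerance. With a constant diagonal 2I the preconditioner only multiplies the residual by 1/2, which is exact in binary floating point, so both runs should produce the same iterates and stop at the same step.

```fortran
program cg_diag_vs_none_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 64               ! illustrative problem size
  real(dp) :: b(n), ones(n)
  integer  :: it_none, it_diag

  ones = 1.0_dp
  b = lap(ones)                              ! right-hand side for x_exact = 1
  it_none = pcg(b, 1.0_dp)                   ! NONE: z = r
  it_diag = pcg(b, 0.5_dp)                   ! DIAG with diagonal 2I: z = r/2
  print '(a,i0,a,i0)', 'NONE: ', it_none, '  DIAG: ', it_diag
  if (it_none /= it_diag) error stop 'iteration counts differ'
contains
  ! y = tridiag(-1, 2, -1) * x
  function lap(x) result(y)
    real(dp), intent(in) :: x(:)
    real(dp) :: y(size(x))
    integer  :: m
    m = size(x)
    y = 2.0_dp*x
    y(2:m)   = y(2:m)   - x(1:m-1)
    y(1:m-1) = y(1:m-1) - x(2:m)
  end function lap

  ! Preconditioned CG with M^{-1} = dinv*I; returns the iteration count
  function pcg(rhs, dinv) result(it)
    real(dp), intent(in) :: rhs(:), dinv
    integer  :: it
    real(dp) :: x(size(rhs)), r(size(rhs)), z(size(rhs))
    real(dp) :: p(size(rhs)), q(size(rhs))
    real(dp) :: rho, rho_old, alpha, bnrm

    x = 0.0_dp
    r = rhs
    bnrm = norm2(rhs)
    rho_old = 1.0_dp
    it = 0
    do
       it = it + 1
       z = dinv*r                            ! apply the preconditioner
       rho = dot_product(r, z)
       if (it == 1) then
          p = z
       else
          p = z + (rho/rho_old)*p
       end if
       q = lap(p)
       alpha = rho/dot_product(p, q)
       x = x + alpha*p
       r = r - alpha*q
       rho_old = rho
       if (norm2(r) <= 1.0e-12_dp*bnrm .or. it >= 10*size(rhs)) exit
    end do
  end function pcg
end program cg_diag_vs_none_sketch
```

If the concatenated diagonal were wrong in even one entry, the preconditioner would no longer be a uniform scaling and the two counts would generally diverge, which is what makes the equality a sharp check.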

Build and run

The PSBLAS library must be built/installed first (from the repository root):

make            # or the CMake build

Then, from this directory:

make                                   # builds the executables into ./runs
./runs/psb_d_nest_glob_test            # serial
mpirun -np 4 ./runs/psb_d_nest_rect_test
mpirun -np 4 ./runs/psb_d_nest_cg_test

Each test prints a single [PASS] / [FAIL] line (printed by rank 0).

5. Source files

Library (under base/modules/):

  • desc/psb_desc_nest_mod.f90: psb_desc_nest_type (grid of per-field descriptors)
  • serial/psb_d_nest_mat_mod.f90: psb_d_nest_sparse_mat (block storage)
  • serial/psb_d_nest_base_mat_mod.F90: psb_d_nest_base_mat (the MATNEST operator: csmv, get_diag, csgetrow)
  • tools/psb_cd_nest_tools_mod.F90 — descriptor tools (psb_cd_nest_compose, ...)
  • tools/psb_d_nest_tools_mod.F90 — block tools (psb_d_nest_rect_block, ...)
  • tools/psb_d_nest_builder_mod.F90: psb_d_nest_matrix frontend (init/ins/asb)
  • psb_d_nest_mod.f90 — umbrella module (use psb_d_nest_mod)