Complete the integration of the nested (MATNEST) operator into the standard
PSBLAS infrastructure:
- Preconditioners: implement get_diag and csgetrow on psb_d_nest_base_mat so
the stock one-level preconditioners build directly on the nested operator
(DIAG through the concatenated block diagonals, BJAC through the
format-agnostic csget path used by the ILU factorizations).
- Configurable block storage: psb_d_nest_rect_block and psb_d_nest_matrix%asb
accept an optional type ('CSR' default, 'CSC', 'COO') or mold (any class
extending psb_d_base_sparse_mat, e.g. the psb_ext ELL/HLL formats); the
operator is format-agnostic since every operation delegates to the blocks.
- Device-capable matvec: override vect_mv to gather/scatter through the
vectors' own gth/sct with encapsulated index vectors (device kernels on
device vectors) and to run each block through its own vect_mv, so device
block formats execute their native kernels; on host the result is
bit-identical to csmv.
- Full psb_d_base_sparse_mat contract by delegation to the blocks: transposed
csmv (dedicated kernel, ghost contributions left to the transposed halo
exchange), multi-RHS csmm, cp_to_coo/mv_to_coo (unlocking cscnv, csclip,
tril/triu through the base generics), rowsum/arwsum/colsum/aclsum,
maxval/spnmi/spnm1, scal (left/right) and scals, clone (view semantics:
shared blocks, re-owned index maps), mold, sizeof. cp_from_coo/mv_from_coo,
csput and cssv/cssm are intentionally not overridden and fall through to the
base-class error (they are meaningless for a block-operator view); this is
documented in the type and in the README.
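
The delegation idea in the list above can be illustrated with a minimal
plain-Python sketch. It is not PSBLAS code: the class NestedOperator, its
methods and the dict-based block storage are hypothetical stand-ins chosen
for illustration. It assumes each field owns an index map into the global
numbering, and shows a gather / per-block product / scatter-add matvec, a
diagonal assembled from the diagonal blocks, and a global row assembled from
the block rows.

```python
# Illustrative sketch only: hypothetical names, not the PSBLAS API.
# Blocks are stored as {(row, col): value} in block-local indices.

class NestedOperator:
    def __init__(self, index_maps, blocks):
        # index_maps[f][k] = global index of local entry k of field f
        # blocks[(i, j)]   = block coupling field i (rows) to field j (cols)
        self.index_maps = index_maps
        self.blocks = blocks
        self.size = sum(len(m) for m in index_maps)

    def matvec(self, x):
        y = [0.0] * self.size
        for (i, j), blk in self.blocks.items():
            rows, cols = self.index_maps[i], self.index_maps[j]
            xj = [x[g] for g in cols]            # gather field j
            yi = [0.0] * len(rows)
            for (r, c), v in blk.items():        # block-local product
                yi[r] += v * xj[c]
            for r, g in enumerate(rows):         # scatter-add into field i
                y[g] += yi[r]
        return y

    def get_diag(self):
        # The global diagonal comes only from the diagonal blocks.
        d = [0.0] * self.size
        for f, rows in enumerate(self.index_maps):
            blk = self.blocks.get((f, f), {})
            for r, g in enumerate(rows):
                d[g] = blk.get((r, r), 0.0)
        return d

    def get_row(self, g_row):
        # One global row as sorted (global_col, value) pairs.
        out = []
        for (i, j), blk in self.blocks.items():
            rows, cols = self.index_maps[i], self.index_maps[j]
            if g_row in rows:
                r = rows.index(g_row)
                out += [(cols[c], v) for (rr, c), v in blk.items() if rr == r]
        return sorted(out)


# 1D Laplacian tridiag(-1, 2, -1) on 4 points, split red (0, 2) / black (1, 3).
index_maps = [[0, 2], [1, 3]]
blocks = {
    (0, 0): {(0, 0): 2.0, (1, 1): 2.0},
    (1, 1): {(0, 0): 2.0, (1, 1): 2.0},
    (0, 1): {(0, 0): -1.0, (1, 0): -1.0, (1, 1): -1.0},
    (1, 0): {(0, 0): -1.0, (0, 1): -1.0, (1, 1): -1.0},
}
op = NestedOperator(index_maps, blocks)
y = op.matvec([1.0, 2.0, 3.0, 4.0])
diag = op.get_diag()
row2 = op.get_row(2)
```

Because every operation only walks the blocks, the sketch never needs to know
how a block is stored, which is the property the format-agnostic operator
relies on.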
Tests: glob assembles the blocks in HLL (psb_ext) and rect in CSC, both still
bit-identical to the monolithic CSR oracle; the CG test solves under NONE,
DIAG and BJAC/ILU(0), requiring all three to converge to the exact solution
and DIAG to be bit-identical to NONE (exactness check of the nested get_diag).
README updated with the user API reference, the preconditioner section and
the implemented-contract section.
Author: Simone Staccone (Stack-1)
Propagate the latest development (via communication_v2) onto the nested branch.
This brings the GMRES refactor, the stopping-criterion change and the restored
work parameter on top of the nested (MATNEST) matrix support. Clean merge, no
conflicts.
Realign communication_v2 with the latest development (10 commits, including the
GMRES refactor and the stopping-criterion change), keeping the communication_v2
work intact.
Conflict resolution:
- base/modules/Makefile (veryclean): keep communication_v2's '/bin/rm -f *.h'.
- linsolve/impl/psb_{c,d,s,z}rgmres.f90: keep development's variable
declarations (itmax_, naux), consistent with development's refactored GMRES
body, which references itmax_.
Add a block-structured distributed operator that presents itself to Krylov
solvers and preconditioners as a single ordinary distributed matrix (the
PSBLAS analogue of PETSc MATNEST), targeting saddle-point systems
M = [[A, B^T], [B, 0]] with possibly rectangular sub-blocks.
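
As a rough illustration of what "presents itself as a single matrix" means
for the saddle-point case, the following plain-Python sketch applies
M = [[A, B^T], [B, 0]] block by block and compares the result with an
explicitly assembled monolithic matrix. The names SaddlePointOperator,
matvec, transpose and assemble are hypothetical and the blocks are small
dense lists; none of this is the PSBLAS interface.

```python
# Illustrative sketch only: hypothetical names, not the PSBLAS API.

def matvec(block, x):
    """Dense row-major block (list of rows) times a vector."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in block]

def transpose(block):
    return [list(col) for col in zip(*block)]

class SaddlePointOperator:
    """Applies M = [[A, B^T], [B, 0]] without assembling M."""
    def __init__(self, A, B):
        self.A, self.B, self.Bt = A, B, transpose(B)
        self.n, self.m = len(A), len(B)      # sizes of the two fields

    def apply(self, x):
        u, p = x[:self.n], x[self.n:]        # split into the two fields
        top = [a + b for a, b in zip(matvec(self.A, u), matvec(self.Bt, p))]
        bottom = matvec(self.B, u)           # the (2,2) block is zero
        return top + bottom

def assemble(A, B):
    """Monolithic reference matrix, used only for comparison."""
    Bt = transpose(B)
    m = len(B)
    top = [ra + rbt for ra, rbt in zip(A, Bt)]
    bottom = [rb + [0.0] * m for rb in B]
    return top + bottom

A = [[4.0, 1.0, 0.0], [1.0, 4.0, 1.0], [0.0, 1.0, 4.0]]   # 3x3, square
B = [[1.0, 0.0, -1.0], [0.0, 2.0, 1.0]]                   # 2x3, rectangular
x = [1.0, 2.0, 3.0, 4.0, 5.0]

y_nested = SaddlePointOperator(A, B).apply(x)
y_mono = matvec(assemble(A, B), x)
```

A solver that only ever calls apply cannot tell the two representations
apart, which is the property the nested operator is meant to provide.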
Library (base/modules):
- psb_desc_nest_mod, psb_d_nest_mat_mod: grid of per-field descriptors and
per-block sparse storage.
- psb_d_nest_base_mat_mod: psb_d_nest_base_mat, the operator extending
psb_d_base_sparse_mat (local csmv, free, field-split hooks for a future
block preconditioner).
- psb_cd_nest_tools_mod / psb_d_nest_tools_mod: composed global descriptor
with union halo (psb_cd_nest_compose) and rectangular local block builder
(psb_d_nest_rect_block), plus the per-block assembly wrappers.
- psb_d_nest_builder_mod: psb_d_nest_matrix, the user frontend with the
init/ins/asb/free pattern hiding all descriptor/halo/compose/setup
boilerplate.
- psb_d_nest_mod: umbrella module (use psb_d_nest_mod).
Remove the earlier bespoke per-block prototype (comm/psblas/vect modules and
the pde_nest_psblas test) superseded by the single MATNEST design.
Tests (test/nested): glob (square operator vs monolithic CSR oracle), rect
(genuinely rectangular blocks), cg (low-level path, ill-conditioned SPD
red-black Laplacian solved with standard CG), builder (same solve via the
utility), plus a README describing the design and usage. All pass serially
and in parallel, with results invariant to the process count.
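
The reason a standard CG can run on the nested operator unchanged is that
the textbook algorithm only needs the action of the operator on a vector.
The sketch below is a generic unpreconditioned CG in plain Python, not the
PSBLAS solver; cg and laplacian_1d are hypothetical names, and the operator
is a small matrix-free 1D Laplacian standing in for the test problem.

```python
# Illustrative sketch only: textbook CG, not the PSBLAS implementation.

def cg(apply_op, b, tol=1e-12, max_iter=100):
    """Solve A x = b for SPD A, given only the function v -> A v."""
    x = [0.0] * len(b)
    r = list(b)                      # residual for the zero initial guess
    p = list(r)                      # first search direction
    rs = sum(ri * ri for ri in r)
    b_norm2 = rs
    for _ in range(max_iter):
        if rs <= tol * tol * b_norm2:
            break                    # relative residual small enough
        Ap = apply_op(p)
        alpha = rs / sum(pi * ai for pi, ai in zip(p, Ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * ai for ri, ai in zip(r, Ap)]
        rs_new = sum(ri * ri for ri in r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x

def laplacian_1d(v):
    """Matrix-free tridiag(-1, 2, -1) with homogeneous boundaries."""
    n = len(v)
    return [
        2.0 * v[i]
        - (v[i - 1] if i > 0 else 0.0)
        - (v[i + 1] if i < n - 1 else 0.0)
        for i in range(n)
    ]

expected = [1.0, 2.0, 3.0, 4.0]
b = laplacian_1d(expected)           # right-hand side with known solution
x = cg(laplacian_1d, b)
max_err = max(abs(xi - ei) for xi, ei in zip(x, expected))
```

Any object exposing the same "apply to a vector" behaviour, block-structured
or monolithic, could be passed in place of laplacian_1d.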
Build hooks updated (autotools Makefiles + CMakeLists); the nested tests are
relocated out of test/pdegen into test/nested.
Author: Simone Staccone (Stack-1)