Extend the nested (MATNEST) matrix support to all arithmetics: the
psb_{s,c,z}_nest_{mat,base_mat,tools,builder}_mod modules and the
psb_{s,c,z}_nest_mod umbrella modules are generated from the template-psblas
X_nest_* templates; regenerating the d sources from the same templates yields
byte-identical files.
Preparatory changes to the d sources for clean templating: rowsum/arwsum and
colsum/aclsum no longer share a helper (for the complex arithmetics the
absolute sums are real-valued while the plain sums are complex-valued), the
transposed kernel forwards the actual 'T'/'C' character to the blocks
(conjugate transpose for the complex types), and the capacity helper takes a
type-neutral name.
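The reason for splitting the helpers can be seen in a small sketch, written in Python rather than Fortran for brevity. The names row_sum and abs_row_sum are hypothetical and are not the PSBLAS rowsum/arwsum kernels; the sketch only shows that, for complex entries, the plain sum stays complex while the absolute sum is real, so one shared helper cannot have a single result type.

```python
# Illustrative only: row_sum and abs_row_sum are hypothetical names,
# not the PSBLAS rowsum/arwsum kernels.

def row_sum(rows):
    """Plain row sums: the result has the same type as the entries."""
    return [sum(r) for r in rows]

def abs_row_sum(rows):
    """Absolute row sums: real-valued even when the entries are complex."""
    return [sum(abs(v) for v in r) for r in rows]

# A small dense stand-in for a complex matrix, stored row by row.
a = [[3 + 4j, -3 - 4j],
     [1j, 2 + 0j]]

plain = row_sum(a)         # list of complex values
absolute = abs_row_sum(a)  # list of real (float) values
```

In the first row the plain sum cancels while the absolute sum does not, which also shows that the two quantities are not interchangeable.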
Build hooks (autotools Makefile and CMakeLists) updated with the
per-arithmetic objects, compile rules and dependencies. All four d tests
still pass.
Author: Simone Staccone (Stack-1)
Add get_owned_rows(i_field) and get_owned_row_count(i_field) to
psb_d_nest_matrix: the list of GLOBAL row indices of a field owned by the
calling process (i.e. the rows it is expected to insert through ins) and
their count. They replace the descriptor-level idiom
field_desc(i)%get_local_rows() / field_desc(i)%l2g(...) in user code, which
leaked descriptor jargon into the build loop.
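A minimal Python sketch of what the two queries wrap, under stated assumptions: the classes below are hypothetical stand-ins and not the Fortran types, indexing is 0-based here, and the sketch assumes that a descriptor lists the owned rows before any halo entries in its local-to-global map.

```python
class FieldDescriptor:
    """Hypothetical stand-in for one field's descriptor."""

    def __init__(self, loc_to_glob, n_owned):
        # Assumption: owned rows come first, halo entries follow.
        self.loc_to_glob = loc_to_glob
        self.n_owned = n_owned

    def get_local_rows(self):
        return self.n_owned

    def l2g(self, i_loc):
        return self.loc_to_glob[i_loc]


class NestMatrix:
    """Hypothetical frontend exposing the two convenience queries."""

    def __init__(self, field_descs):
        self.field_desc = field_descs

    def get_owned_row_count(self, i_field):
        return self.field_desc[i_field].get_local_rows()

    def get_owned_rows(self, i_field):
        desc = self.field_desc[i_field]
        return [desc.l2g(i) for i in range(desc.get_local_rows())]


# Field 0 owns global rows 10..12 and sees two halo rows (4 and 20).
m = NestMatrix([FieldDescriptor([10, 11, 12, 4, 20], n_owned=3),
                FieldDescriptor([7, 8, 3], n_owned=2)])

# Descriptor-level idiom that user code had to spell out before.
d = m.field_desc[0]
old_style = [d.l2g(i) for i in range(d.get_local_rows())]

# Same information through the new queries.
new_style = m.get_owned_rows(0)
count = m.get_owned_row_count(0)
```

Both paths return the same list; the queries only keep the descriptor out of the user's insertion loop.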
The high-level tests (glob, rect, builder) are rewritten to use the new
queries; the low-level CG test intentionally keeps the descriptor path. README
updated with the new queries and an example.
Author: Simone Staccone (Stack-1)
Complete the integration of the nested (MATNEST) operator into the standard
PSBLAS infrastructure:
- Preconditioners: implement get_diag and csgetrow on psb_d_nest_base_mat so
the stock one-level preconditioners build directly on the nested operator
(DIAG through the concatenated block diagonals, BJAC through the
format-agnostic csget path used by the ILU factorizations).
- Configurable block storage: psb_d_nest_rect_block and psb_d_nest_matrix%asb
accept an optional type ('CSR' default, 'CSC', 'COO') or mold (any class
extending psb_d_base_sparse_mat, e.g. the psb_ext ELL/HLL formats); the
operator is format-agnostic since every operation delegates to the blocks.
- Device-capable matvec: override vect_mv to gather/scatter through the
vectors' own gth/sct with encapsulated index vectors (device kernels on
device vectors) and to run each block through its vect_mv, so device block
formats execute their native kernels; bit-equivalent to csmv on host.
- Full psb_d_base_sparse_mat contract by delegation to the blocks: transposed
csmv (dedicated kernel, ghost contributions left to the transposed halo
exchange), multi-RHS csmm, cp_to_coo/mv_to_coo (unlocking cscnv, csclip,
tril/triu through the base generics), rowsum/arwsum/colsum/aclsum,
maxval/spnmi/spnm1, scal (left/right) and scals, clone (view semantics:
shared blocks, re-owned index maps), mold, sizeof. cp_from_coo/mv_from_coo,
csput and cssv/cssm are intentionally not overridden and fall through to the
base-class error (they are meaningless for a block-operator view); this is
documented in the type and in the README.
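The delegation idea behind the matvec can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Fortran implementation: blocks are small dense lists, the index maps are plain lists, and every function name is hypothetical. Each block gathers its slice of x through a column index map, applies its own product, and scatter-adds the result into y through a row index map; the result is compared with a monolithic matrix assembled from the same blocks.

```python
def gather(x, idx):
    """Pick the entries of x addressed by idx."""
    return [x[k] for k in idx]

def scatter_add(y, idx, vals):
    """Accumulate vals into the entries of y addressed by idx."""
    for k, v in zip(idx, vals):
        y[k] += v

def dense_matvec(a, x):
    """Stand-in for a block's own matvec kernel."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]

def nested_matvec(blocks, row_idx, col_idx, x, n_rows):
    """y = M x computed block by block, never assembling M."""
    y = [0] * n_rows
    for (i, j), blk in blocks.items():
        xj = gather(x, col_idx[j])
        scatter_add(y, row_idx[i], dense_matvec(blk, xj))
    return y

def assemble(blocks, row_idx, col_idx, n_rows, n_cols):
    """Monolithic reference matrix built from the same blocks."""
    m = [[0] * n_cols for _ in range(n_rows)]
    for (i, j), blk in blocks.items():
        for r, gr in enumerate(row_idx[i]):
            for c, gc in enumerate(col_idx[j]):
                m[gr][gc] = blk[r][c]
    return m

# Two interleaved fields: field 0 owns entries 0 and 2, field 1 owns 1 and 3.
idx = [[0, 2], [1, 3]]
blocks = {
    (0, 0): [[4, 0], [0, 4]],
    (0, 1): [[-1, -1], [0, -1]],
    (1, 0): [[-1, 0], [-1, -1]],
    (1, 1): [[4, 0], [0, 4]],
}
x = [1, 2, 3, 4]

y_nested = nested_matvec(blocks, idx, idx, x, 4)
y_mono = dense_matvec(assemble(blocks, idx, idx, 4, 4), x)
```

Because the operator only calls gather, the block product and scatter-add, swapping the storage format of a block changes none of the surrounding logic.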
Tests: glob assembles the blocks in HLL (psb_ext) and rect in CSC, both still
bit-identical to the monolithic CSR oracle; the CG test solves under NONE,
DIAG and BJAC/ILU(0), requiring convergence to the exact solution for all of
them and DIAG bit-identical to NONE (exactness check of the nested get_diag).
README updated with the user API reference, the preconditioner section and
the implemented-contract section.
Author: Simone Staccone (Stack-1)
Propagate the latest development (via communication_v2) onto the nested branch:
brings the GMRES refactor, the stopping-criterion change and the restored work
parameter on top of the nested (MATNEST) matrix support. Clean merge, no
conflicts.
Realign communication_v2 with the latest development (10 commits, including the
GMRES refactor and the stopping-criterion change), keeping the communication_v2
work intact.
Conflict resolution:
- base/modules/Makefile (veryclean): keep communication_v2's '/bin/rm -f *.h'.
- linsolve/impl/psb_{c,d,s,z}rgmres.f90: keep development's variable
declaration (itmax_, naux), consistent with development's refactored GMRES
body which references itmax_.
Add a block-structured distributed operator that presents itself to Krylov
solvers and preconditioners as a single ordinary distributed matrix (the
PSBLAS analogue of PETSc MATNEST), targeting saddle-point systems
M = [[A, B^T], [B, 0]] with possibly rectangular sub-blocks.
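As a small worked illustration of this block structure (a Python sketch with hypothetical helper names, not library code), applying M to a vector split as (u, p) gives y1 = A u + B^T p and y2 = B u, with the zero block never stored. Here B is genuinely rectangular (1 x 2), and B^T is applied through a transposed product purely to keep the sketch short; the library may instead hold it as a separate block.

```python
def matvec(a, x):
    """y = a x for a dense list-of-rows matrix."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]

def matvec_t(a, y):
    """x = a^T y without forming the transpose."""
    n_rows, n_cols = len(a), len(a[0])
    return [sum(a[r][c] * y[r] for r in range(n_rows)) for c in range(n_cols)]

def saddle_apply(a, b, u, p):
    """Apply M = [[A, B^T], [B, 0]] to (u, p); the zero block is skipped."""
    y1 = [s + t for s, t in zip(matvec(a, u), matvec_t(b, p))]
    y2 = matvec(b, u)
    return y1, y2

a = [[2, 0], [0, 2]]   # square (1,1) block
b = [[1, 1]]           # rectangular (2,1) block: 1 row, 2 columns
u = [1, 2]
p = [3]

y1, y2 = saddle_apply(a, b, u, p)
```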
Library (base/modules):
- psb_desc_nest_mod, psb_d_nest_mat_mod: grid of per-field descriptors and
per-block sparse storage.
- psb_d_nest_base_mat_mod: psb_d_nest_base_mat, the operator extending
psb_d_base_sparse_mat (local csmv, free, field-split hooks for a future
block preconditioner).
- psb_cd_nest_tools_mod / psb_d_nest_tools_mod: composed global descriptor
with union halo (psb_cd_nest_compose) and rectangular local block builder
(psb_d_nest_rect_block), plus the per-block assembly wrappers.
- psb_d_nest_builder_mod: psb_d_nest_matrix, the user frontend with the
init/ins/asb/free pattern hiding all descriptor/halo/compose/setup
boilerplate.
- psb_d_nest_mod: umbrella module (use psb_d_nest_mod).
Remove the earlier bespoke per-block prototype (comm/psblas/vect modules and
the pde_nest_psblas test) superseded by the single MATNEST design.
Tests (test/nested): glob (square operator vs monolithic CSR oracle), rect
(genuinely rectangular blocks), cg (low-level path, ill-conditioned SPD
red-black Laplacian solved with standard CG), builder (same solve via the
utility), plus a README describing the design and usage. All pass serially
and in parallel, with results invariant to the process count.
Build hooks updated (autotools Makefiles + CMakeLists); the nested tests are
relocated out of test/pdegen into test/nested.
Author: Simone Staccone (Stack-1)
The nested layer was imported from an older base where the vector psb_spsm
still took a work buffer. communication_v2 removed work from the psb_x_vect_type
routines, so psb_dspsv_vect has no work argument and the call failed generic
resolution. Remove work from psb_d_nest_spsm (signature, declaration, call).