# Nested (block-structured / MATNEST) matrices in PSBLAS

Author: Simone Staccone (Stack-1)

This directory contains the tests for the **nested matrix** support added to PSBLAS: a block-structured distributed operator

```
    [ A11  A12  ... ]
M = [ A21  A22  ... ]
    [ ...  ...      ]
```

whose blocks are kept as separate sparse matrices (one per field pair) but which presents itself to Krylov solvers and preconditioners as a **single ordinary distributed matrix**. It is the PSBLAS analogue of PETSc's `MATNEST`.

The motivating case is the **saddle-point** system

```
M = [ A  B^T ]
    [ B  0   ]
```

(symmetric indefinite, with the (2,2) block absent), but the implementation supports any square multi-field block operator with possibly **rectangular** sub-blocks.

## 1. Concepts

* **Field**: a contiguous index space (e.g. velocity `V` and pressure `Q` in a saddle-point problem). Each field has its own `psb_desc_type` distribution.
* **Block (i,j)**: the sub-matrix coupling field `i` (rows) with field `j` (columns). It may be rectangular (`|field i| /= |field j|`) and may be absent.
* **Global operator**: the blocks are concatenated into a single **square** operator `M` of size `sum(field_sizes)`, distributed over one **composed global descriptor** with a **union halo** (one halo exchange per matrix-vector product, covering all blocks of a given column field at once).
* **Rectangular blocks**: PSBLAS does not support rectangular *distributed* matrices, but it does support rectangular *local* CSR/COO matrices. The rectangular product therefore happens only in the **local** block `csmv`; the only object carrying a descriptor (and hence communication) is the global operator, which is always square.

The global operator (`a_glob`) and global descriptor (`desc_glob`) can be passed unchanged to `psb_spmm`, `psb_krylov`, and the standard preconditioners.
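
To make the concatenation concrete, here is a minimal serial sketch in plain Fortran that does not use PSBLAS at all. It assumes that field `k` occupies a contiguous range of global rows starting at an offset equal to the sum of the preceding field sizes; that layout, the dense storage, and all names (`off`, `y_nest`, `y_mono`) are illustrative assumptions, not the library's internal data structures. It applies a 2×2 saddle-point operator block by block, skipping the absent (2,2) block, and compares the result with the same operator assembled monolithically.

```fortran
program nest_matvec_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n1 = 3, n2 = 2      ! sizes of fields V and Q
  integer, parameter :: n  = n1 + n2        ! size of the square global operator
  real(dp) :: a(n1,n1), b(n2,n1), m(n,n)
  real(dp) :: x(n), y_nest(n), y_mono(n)
  integer  :: off(2), i

  ! Assumed layout: field k owns global rows off(k)+1 .. off(k)+size_k
  off = [0, n1]

  ! A: 1D Laplacian stencil; B: a small rectangular coupling block (n2 x n1)
  a = 0.0_dp
  do i = 1, n1
    a(i,i) = 2.0_dp
    if (i > 1)  a(i,i-1) = -1.0_dp
    if (i < n1) a(i,i+1) = -1.0_dp
  end do
  b = reshape([1.0_dp, 0.0_dp, -1.0_dp, 1.0_dp, 0.0_dp, -1.0_dp], [n2, n1])

  x = [(real(i, dp), i = 1, n)]

  ! Nested product: one local product per present block, (2,2) is skipped
  y_nest = 0.0_dp
  y_nest(off(1)+1:off(1)+n1) = matmul(a, x(off(1)+1:off(1)+n1)) &
       &                     + matmul(transpose(b), x(off(2)+1:off(2)+n2))
  y_nest(off(2)+1:off(2)+n2) = matmul(b, x(off(1)+1:off(1)+n1))

  ! Monolithic reference: the same blocks copied into one square matrix
  m = 0.0_dp
  m(1:n1, 1:n1)   = a
  m(1:n1, n1+1:n) = transpose(b)
  m(n1+1:n, 1:n1) = b
  y_mono = matmul(m, x)

  print '(a,es10.3)', 'max |y_nest - y_mono| = ', maxval(abs(y_nest - y_mono))
end program nest_matvec_sketch
```

The rectangular products (`B` and `B^T`) appear only as local block operations, while the vectors `x` and `y` always have the size of the square global operator. This mirrors the split described above between local rectangular blocks and a square distributed operator.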

## 2. Recommended API: `psb_d_nest_matrix`

The simplest way to build a nested matrix is the `psb_d_nest_matrix` type (module `psb_d_nest_builder_mod`, re-exported by the umbrella `psb_d_nest_mod`), which follows the usual PSBLAS `init` / `ins` / `asb` pattern and hides all the descriptor / halo / compose / setup boilerplate:

```fortran
use psb_d_nest_mod

type(psb_d_nest_matrix) :: nested_matrix
integer(psb_lpk_)       :: n1, n2

! 1) declare the field structure: 2 fields of global size n1, n2
call nested_matrix%init(ctxt, [n1, n2], info)

! 2) insert the block values, owned rows only (PSBLAS convention).
!    ins(block_row, block_col, n_entries, entry_rows, entry_cols, entry_vals, info)
!    rows are GLOBAL indices in field block_row, columns in field block_col.
call nested_matrix%ins(1, 1, nz_A,  iaA,  jaA,  valA,  info)   ! A   = block (1,1)
call nested_matrix%ins(1, 2, nz_Bt, iaBt, jaBt, valBt, info)   ! B^T = block (1,2)
call nested_matrix%ins(2, 1, nz_B,  iaB,  jaB,  valB,  info)   ! B   = block (2,1)
! (the (2,2) block is simply not inserted)

! 3) assemble: builds nested_matrix%a_glob and nested_matrix%desc_glob
call nested_matrix%asb(info)

! 4) from here on it is an ordinary distributed matrix/descriptor
call psb_geall(x, nested_matrix%desc_glob, info)
! ... (allocate and fill b, set up prec)
call psb_krylov('CG', nested_matrix%a_glob, prec, b, x, eps, &
     & nested_matrix%desc_glob, info, itmax=..., iter=..., err=...)

! 5) release
call nested_matrix%free(info)
```

Notes:

* To find out which rows it owns in a field, a process can query the per-field descriptor exposed as `nested_matrix%field_desc(i)` (e.g. `nested_matrix%field_desc(1)%get_local_rows()` and `%l2g(...)`), exactly as it would with a plain `psb_cdall` descriptor.
* Off-diagonal blocks may be rectangular: the cross-field column indices are registered into the union halo automatically by `ins`.
* The `'CG'` call above is valid only for an SPD operator; a genuine saddle-point operator is indefinite and needs MINRES/GMRES (plus, eventually, a block preconditioner).
* **Do not copy/move** a `psb_d_nest_matrix` after `asb`: the wrapped operator holds internal pointers into the object.
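
The first note can be sketched as follows: each process lists the rows it owns in field 1 and inserts only those rows of the (1,1) block, here a 1D Laplacian. This is a sketch under stated assumptions, not a tested program. It assumes that `field_desc(1)` is an ordinary `psb_desc_type` that can already be queried after `init`, that the standard `get_global_indices()` descriptor method is usable on it, and that `ins` takes `psb_lpk_` index arrays. The sizes `n1`, `n2` and the program name are made up for illustration, and the program needs the PSBLAS library to build.

```fortran
program nest_owned_rows_sketch
  use psb_base_mod
  use psb_d_nest_mod
  implicit none

  type(psb_ctxt_type)     :: ctxt
  type(psb_d_nest_matrix) :: nested_matrix
  integer(psb_ipk_)       :: iam, np, info, nz, k
  integer(psb_lpk_)       :: n1, n2, ig
  integer(psb_lpk_), allocatable :: myidx(:), ia(:), ja(:)
  real(psb_dpk_),    allocatable :: val(:)

  call psb_init(ctxt)
  call psb_info(ctxt, iam, np)

  n1 = 100; n2 = 50                      ! illustrative global field sizes
  call nested_matrix%init(ctxt, [n1, n2], info)
  if (info /= psb_success_) call psb_abort(ctxt)

  ! Global indices of the rows of field 1 owned by this process
  ! (assumption: the per-field descriptor is queryable at this point)
  myidx = nested_matrix%field_desc(1)%get_global_indices()

  ! At most three entries per owned row of the tridiagonal block
  allocate(ia(3*size(myidx)), ja(3*size(myidx)), val(3*size(myidx)))
  nz = 0
  do k = 1, size(myidx)
    ig = myidx(k)
    if (ig > 1) then
      nz = nz + 1; ia(nz) = ig; ja(nz) = ig - 1; val(nz) = -1.0_psb_dpk_
    end if
    nz = nz + 1; ia(nz) = ig; ja(nz) = ig; val(nz) = 2.0_psb_dpk_
    if (ig < n1) then
      nz = nz + 1; ia(nz) = ig; ja(nz) = ig + 1; val(nz) = -1.0_psb_dpk_
    end if
  end do

  ! Rows and columns are global indices within field 1
  call nested_matrix%ins(1, 1, nz, ia, ja, val, info)
  if (info /= psb_success_) call psb_abort(ctxt)

  call nested_matrix%asb(info)
  if (info /= psb_success_) call psb_abort(ctxt)

  call nested_matrix%free(info)
  call psb_exit(ctxt)
end program nest_owned_rows_sketch
```

Only the (1,1) block is inserted here to keep the sketch short; the off-diagonal blocks would be filled the same way, with row indices taken from the row field and column indices from the column field.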

## 3. Low-level path (advanced)

`psb_d_nest_matrix` is built on three lower-level pieces, available directly for advanced use (see `psb_d_nest_cg_test.F90` for an end-to-end example):

* `psb_cd_nest_compose(grid_desc, desc_glob, info)`: compose the per-field descriptors into the single global descriptor with the union halo.
* `psb_d_nest_base_setup(nest_op, block_storage, grid_desc, desc_glob, info)`: set up the `psb_d_nest_base_mat` operator (implements the local `csmv`).
* `psb_d_nest_rect_block(blk, nz, ia, ja, val, desc_row, desc_col, info)`: build a single (possibly rectangular) local block from global triplets, with rows localized against `desc_row` and columns against `desc_col`.

A field-split interface (`psb_d_nest_get_block`, `psb_d_nest_get_field_desc`, `psb_d_nest_restrict_field`, `psb_d_nest_prolong_field`, `psb_d_nest_apply_block`) is exposed on `psb_d_nest_base_mat` as the hook for a future block (field-split / Schur) preconditioner.

## 4. Tests

| Test | What it checks |
|------|----------------|
| `psb_d_nest_glob_test` | Square 2×2 operator built with `psb_d_nest_matrix`; the nested `psb_spmm` is compared bit-for-bit against the same matrix assembled monolithically in CSR. |
| `psb_d_nest_rect_test` | Same check, with fields of different sizes (`size(V) = 2*size(Q)`) and genuinely **rectangular** off-diagonal blocks. |
| `psb_d_nest_cg_test` | Standard PSBLAS **CG** on an SPD, ill-conditioned operator (1D Laplacian reordered red-black), built on the **low-level path**; the solution is recovered to machine precision over hundreds of matvecs. |
| `psb_d_nest_builder_test` | Same CG solve as above but built through the `psb_d_nest_matrix` utility (high-level path). |

All tests run both serially and in parallel, and the result is invariant with respect to the number of MPI processes.

### Build and run

The PSBLAS library must be built/installed first (from the repository root):

```sh
make        # or the CMake build
```

Then, from this directory:

```sh
make                                   # builds the executables into ./runs
./runs/psb_d_nest_glob_test            # serial
mpirun -np 4 ./runs/psb_d_nest_rect_test
mpirun -np 4 ./runs/psb_d_nest_cg_test
mpirun -np 4 ./runs/psb_d_nest_builder_test
```

Each test prints a single `[PASS]` / `[FAIL]` line (printed by rank 0).

## 5. Source files

Library (under `base/modules/`):

* `desc/psb_desc_nest_mod.f90`: `psb_desc_nest_type` (grid of per-field descriptors)
* `serial/psb_d_nest_mat_mod.f90`: `psb_d_nest_sparse_mat` (block storage)
* `serial/psb_d_nest_base_mat_mod.F90`: `psb_d_nest_base_mat` (the MATNEST operator + `csmv`)
* `tools/psb_cd_nest_tools_mod.F90`: descriptor tools (`psb_cd_nest_compose`, ...)
* `tools/psb_d_nest_tools_mod.F90`: block tools (`psb_d_nest_rect_block`, ...)
* `tools/psb_d_nest_builder_mod.F90`: `psb_d_nest_matrix` frontend (init/ins/asb)
* `psb_d_nest_mod.f90`: umbrella module (`use psb_d_nest_mod`)