# Nested (block-structured / MATNEST) matrices in PSBLAS

Author: Simone Staccone (Stack-1)

This directory contains the tests for the **nested matrix** support added to PSBLAS: a block-structured distributed operator
```
    [ A11 A12 ... ]
M = [ A21 A22 ... ]
    [ ... ...     ]
```
whose blocks are kept as separate sparse matrices (one per field) but which presents itself to Krylov solvers and preconditioners as a **single ordinary distributed matrix**. It is the PSBLAS analogue of PETSc's `MATNEST`.
The motivating case is the **saddle-point** system
```
M = [ A  B^T ]
    [ B  0   ]
```
(symmetric indefinite, with the (2,2) block absent), but the implementation supports any square multi-field block operator with possibly **rectangular**
sub-blocks.
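
To make "symmetric indefinite" concrete, here is a minimal pure-Python sketch (not PSBLAS code). The 1x1 blocks `A = [2]` and `B = [1]` are values assumed only for illustration. The quadratic form `x^T M x` takes both signs, which is what distinguishes this operator from an SPD one:

```python
# Illustrative only: a tiny dense saddle-point matrix, unrelated to the PSBLAS API.
def matvec(M, x):
    """Dense matrix-vector product on nested lists."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]

def quadratic_form(M, x):
    """Return x^T M x."""
    return sum(x_i * y_i for x_i, y_i in zip(x, matvec(M, x)))

# M = [ A  B^T ; B  0 ] with A = [2] and B = [1] (assumed values).
M = [[2.0, 1.0],
     [1.0, 0.0]]

q_pos = quadratic_form(M, [1.0, 0.0])    # direction where the form is positive
q_neg = quadratic_form(M, [1.0, -2.0])   # direction where the form is negative
```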
## 1. Concepts
* **Field** — a contiguous index space (e.g. velocity `V` and pressure `Q` in a saddle-point problem). Each field has its own `psb_desc_type` distribution.
* **Block (i,j)** — the sub-matrix coupling field `i` (rows) with field `j` (columns). It may be rectangular (`|field i| /= |field j|`) and may be absent.
* **Global operator** — the blocks are concatenated into a single **square** operator `M` of size `sum(field_sizes)`, distributed over one **composed global descriptor** with a **union halo** (one halo exchange per matrix-vector product, covering all blocks of a given column field at once).
* **Rectangular blocks** — PSBLAS does not support rectangular *distributed* matrices, but it does support rectangular *local* CSR/COO matrices. The rectangular product therefore happens only in the **local** block `csmv`; the only object carrying a descriptor (and hence communication) is the global operator, which is always square.
The global operator (`a_glob`) and global descriptor (`desc_glob`) can be passed unchanged to `psb_spmm`, `psb_krylov`, and the standard preconditioners.
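
The index arithmetic behind these concepts can be sketched in a few lines of serial Python. This is an illustration under simplifying assumptions, not the PSBLAS implementation: dense nested lists stand in for the sparse blocks, there is no distribution or halo, and the helper names (`field_offsets`, `nested_matvec`, `assemble`) are hypothetical. It shows rectangular blocks being applied to per-field slices of `x` and accumulated into a square global result, which matches the monolithic product:

```python
# Illustrative only: serial, dense, 0-based; helper names are hypothetical.
from itertools import accumulate

def field_offsets(field_sizes):
    """Start of each field in the concatenated global index space."""
    return [0] + list(accumulate(field_sizes))

def dense_matvec(M, x):
    return [sum(a * b for a, b in zip(row, x)) for row in M]

def nested_matvec(blocks, field_sizes, x):
    """y = M x computed block by block; blocks[i][j] is None when absent."""
    off = field_offsets(field_sizes)
    y = [0.0] * off[-1]
    for i, block_row in enumerate(blocks):
        for j, blk in enumerate(block_row):
            if blk is None:
                continue  # an absent block contributes nothing
            x_j = x[off[j]:off[j + 1]]  # slice of x belonging to column field j
            for r, val in enumerate(dense_matvec(blk, x_j)):
                y[off[i] + r] += val
    return y

def assemble(blocks, field_sizes):
    """Monolithic dense copy of the same operator, for comparison."""
    off = field_offsets(field_sizes)
    n = off[-1]
    M = [[0.0] * n for _ in range(n)]
    for i, block_row in enumerate(blocks):
        for j, blk in enumerate(block_row):
            if blk is None:
                continue
            for r, row in enumerate(blk):
                for c, val in enumerate(row):
                    M[off[i] + r][off[j] + c] = val
    return M

# Two fields of sizes 2 and 1: A is 2x2, B^T is 2x1, B is 1x2, (2,2) is absent.
sizes = [2, 1]
A = [[4.0, 1.0], [1.0, 3.0]]
Bt = [[1.0], [2.0]]
B = [[1.0, 2.0]]
blocks = [[A, Bt], [B, None]]
x = [1.0, 2.0, 3.0]

y_nested = nested_matvec(blocks, sizes, x)
y_mono = dense_matvec(assemble(blocks, sizes), x)
```

Each block only ever sees the slice of `x` for its column field, so the individual products can be rectangular while the assembled operator stays square.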
## 2. Recommended API: `psb_d_nest_matrix`
The easy way to build a nested matrix is the `psb_d_nest_matrix` type (module `psb_d_nest_builder_mod`, re-exported by the umbrella `psb_d_nest_mod`), which follows the usual PSBLAS `init` / `ins` / `asb` pattern and hides all the descriptor / halo / compose / setup boilerplate:
```fortran
use psb_d_nest_mod
type(psb_d_nest_matrix) :: nested_matrix
integer(psb_lpk_) :: n1, n2
! 1) declare the field structure: 2 fields of global size n1, n2
call nested_matrix%init(ctxt, [n1, n2], info)
! 2) insert the block values, owned rows only (PSBLAS convention).
!    ins(block_row, block_col, n_entries, entry_rows, entry_cols, entry_vals, info)
!    rows are GLOBAL indices in field block_row, columns in field block_col.
call nested_matrix%ins(1, 1, nz_A, iaA, jaA, valA, info) ! A = block (1,1)
call nested_matrix%ins(1, 2, nz_Bt, iaBt, jaBt, valBt, info) ! B^T = block (1,2)
call nested_matrix%ins(2, 1, nz_B, iaB, jaB, valB, info) ! B = block (2,1)
! (the (2,2) block is simply not inserted)
! 3) assemble: builds nested_matrix%a_glob and nested_matrix%desc_glob
call nested_matrix%asb(info)
! 4) from here on it is an ordinary distributed matrix/descriptor
call psb_geall(x, nested_matrix%desc_glob, info)
...
! ('CG' is valid only if the assembled operator is SPD; see the notes below)
call psb_krylov('CG', nested_matrix%a_glob, prec, b, x, eps, &
     & nested_matrix%desc_glob, info, itmax=..., iter=..., err=...)
! 5) release
call nested_matrix%free(info)
```
Notes:
* To know which rows it owns in a field, a process can query the per-field descriptor exposed as `nested_matrix%field_desc(i)` (e.g. `nested_matrix%field_desc(1)%get_local_rows()` and `%l2g(...)`), exactly as it would with a plain `psb_cdall` descriptor.
* Off-diagonal blocks may be rectangular: the cross-field column indices are registered into the union halo automatically by `ins`.
* The CG solver requires an SPD operator; a genuine saddle-point operator is indefinite and needs MINRES/GMRES (plus, eventually, a block preconditioner).
* **Do not copy/move** a `psb_d_nest_matrix` after `asb`: the wrapped operator holds internal pointers into the object.
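
The triplet arrays passed to `ins` are ordinary coordinate (COO) data. As a rough serial illustration in Python (not PSBLAS code), the sketch below extracts triplets from a small dense block and derives the triplets of its transpose by swapping the index arrays. The matrix values and the helper names `dense_to_triplets` and `transpose_triplets` are assumptions made for the example; indices are 1-based, as in Fortran:

```python
# Illustrative only: plain Python lists, 1-based indices as in Fortran.
def dense_to_triplets(block):
    """Return (rows, cols, vals) for the nonzeros of a dense block."""
    rows, cols, vals = [], [], []
    for i, row in enumerate(block, start=1):
        for j, v in enumerate(row, start=1):
            if v != 0.0:
                rows.append(i)
                cols.append(j)
                vals.append(v)
    return rows, cols, vals

def transpose_triplets(rows, cols, vals):
    """Triplets of the transpose: swap the row and column index arrays."""
    return list(cols), list(rows), list(vals)

# B couples field 2 (rows, size 2) with field 1 (columns, size 3); values assumed.
B = [[1.0, 0.0, -1.0],
     [0.0, 2.0, 0.0]]

iaB, jaB, valB = dense_to_triplets(B)
iaBt, jaBt, valBt = transpose_triplets(iaB, jaB, valB)
nz_B, nz_Bt = len(valB), len(valBt)
```

In a parallel run the swap alone is not enough. Each process inserts only the rows it owns, and the rows of `B^T` live in field 1, so the swapped triplets would still have to be filtered or redistributed by ownership in that field.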
## 3. Low-level path (advanced)
`psb_d_nest_matrix` is built on three lower-level pieces, available directly for advanced use (see `psb_d_nest_cg_test.F90` for an end-to-end example):
* `psb_cd_nest_compose(grid_desc, desc_glob, info)` — compose the per-field descriptors into the single global descriptor with the union halo.
* `psb_d_nest_base_setup(nest_op, block_storage, grid_desc, desc_glob, info)` — set up the `psb_d_nest_base_mat` operator (implements the local `csmv`).
* `psb_d_nest_rect_block(blk, nz, ia, ja, val, desc_row, desc_col, info)` — build a single (possibly rectangular) local block from global triplets, with rows localized against `desc_row` and columns against `desc_col`.
A field-split interface (`psb_d_nest_get_block`, `psb_d_nest_get_field_desc`,
`psb_d_nest_restrict_field`, `psb_d_nest_prolong_field`,
`psb_d_nest_apply_block`) is exposed on `psb_d_nest_base_mat` as the hook for a future block (field-split / Schur) preconditioner.
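
The exact semantics of the restrict/prolong routines are not spelled out here. Assuming they behave as their names suggest (restriction extracts one field's slice of a global vector, prolongation injects a field vector back into a zero global vector), the idea can be sketched serially in Python. The function names and 0-based field indices below are hypothetical, not the library's interface:

```python
# Illustrative only: assumed semantics, serial, 0-based field indices.
from itertools import accumulate

def field_offsets(field_sizes):
    """Start of each field in the concatenated global index space."""
    return [0] + list(accumulate(field_sizes))

def restrict_field(x, field_sizes, i):
    """Extract the entries of global vector x that belong to field i."""
    off = field_offsets(field_sizes)
    return x[off[i]:off[i + 1]]

def prolong_field(x_i, field_sizes, i):
    """Inject a field vector into a global vector that is zero elsewhere."""
    off = field_offsets(field_sizes)
    y = [0.0] * off[-1]
    y[off[i]:off[i + 1]] = x_i
    return y

sizes = [3, 2]
x = [1.0, 2.0, 3.0, 4.0, 5.0]

parts = [restrict_field(x, sizes, i) for i in range(len(sizes))]
# Summing the prolonged pieces over all fields recovers the original vector.
rebuilt = [sum(vals) for vals in
           zip(*(prolong_field(p, sizes, i) for i, p in enumerate(parts)))]
```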
## 4. Tests
| Test | What it checks |
|------------------------------|----------------|
| `psb_d_nest_glob_test` | Square 2×2 operator built with `psb_d_nest_matrix`; the nested `psb_spmm` is compared bit-for-bit against the same matrix assembled monolithically in CSR. |
| `psb_d_nest_rect_test` | Same, with fields of different size (`|V| = 2|Q|`) and genuinely **rectangular** off-diagonal blocks. |
| `psb_d_nest_cg_test` | Standard PSBLAS **CG** on an SPD, ill-conditioned operator (1D Laplacian reordered red-black), built on the **low-level path**; the solution is recovered to machine precision over hundreds of matvecs. |
| `psb_d_nest_builder_test` | Same CG solve as above but built through the `psb_d_nest_matrix` utility (high-level path). |

All tests run both serially and in parallel, and the result is invariant with respect to the number of MPI processes.
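
To see why a red-black reordered 1D Laplacian is a natural two-field test operator, the following serial Python sketch builds a small dense example. The `(-1, 2, -1)` stencil, the size `n = 6`, and the "even points first" ordering are assumptions for illustration and may differ from what the test sources do. After the permutation, both diagonal blocks are diagonal and all coupling sits in the off-diagonal blocks:

```python
# Illustrative only: dense, serial, 0-based; stencil and ordering are assumed.
def laplacian_1d(n):
    """Dense tridiagonal (-1, 2, -1) matrix of order n."""
    return [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0
             for j in range(n)] for i in range(n)]

def red_black_permutation(n):
    """Even-numbered points first (red), then odd-numbered points (black)."""
    return list(range(0, n, 2)) + list(range(1, n, 2))

def permute(M, perm):
    """Symmetric permutation: entry (p, q) of the result is M[perm[p]][perm[q]]."""
    return [[M[p][q] for q in perm] for p in perm]

n = 6
perm = red_black_permutation(n)
M = permute(laplacian_1d(n), perm)
n_red = (n + 1) // 2

# Split into the 2x2 block structure over the fields (red, black).
A11 = [row[:n_red] for row in M[:n_red]]
A12 = [row[n_red:] for row in M[:n_red]]
A21 = [row[:n_red] for row in M[n_red:]]
A22 = [row[n_red:] for row in M[n_red:]]
```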
### Build and run
The PSBLAS library must be built/installed first (from the repository root):
```sh
make # or the CMake build
```
Then, from this directory:
```sh
make # builds the executables into ./runs
./runs/psb_d_nest_glob_test # serial
mpirun -np 4 ./runs/psb_d_nest_rect_test
mpirun -np 4 ./runs/psb_d_nest_cg_test
mpirun -np 4 ./runs/psb_d_nest_builder_test
```
Each test prints a single `[PASS]` or `[FAIL]` line from rank 0.
## 5. Source files
Library (under `base/modules/`):
* `desc/psb_desc_nest_mod.f90` — `psb_desc_nest_type` (grid of per-field descriptors)
* `serial/psb_d_nest_mat_mod.f90` — `psb_d_nest_sparse_mat` (block storage)
* `serial/psb_d_nest_base_mat_mod.F90` — `psb_d_nest_base_mat` (the MATNEST operator + `csmv`)
* `tools/psb_cd_nest_tools_mod.F90` — descriptor tools (`psb_cd_nest_compose`, ...)
* `tools/psb_d_nest_tools_mod.F90` — block tools (`psb_d_nest_rect_block`, ...)
* `tools/psb_d_nest_builder_mod.F90` — `psb_d_nest_matrix` frontend (init/ins/asb)
* `psb_d_nest_mod.f90` — umbrella module (`use psb_d_nest_mod`)