diff --git a/Nested_Preconditioner_Report.md b/Nested_Preconditioner_Report.md new file mode 100644 index 000000000..a7c033a91 --- /dev/null +++ b/Nested_Preconditioner_Report.md @@ -0,0 +1,958 @@ +# Nested Preconditioner Report + +## Executive Summary + +This report documents the double-precision nested preconditioner functionality added to PSBLAS. The implementation provides a PSBLAS-native field-split preconditioner for matrices represented through the nested matrix infrastructure. It is implemented primarily in: + +- `prec/psb_d_nestedprec.f90` +- `prec/impl/psb_d_prec_type_impl.f90` +- `base/modules/serial/psb_d_nest_base_mat_mod.F90` +- nested test drivers under `test/nested` + +The nested preconditioner is intended for block-structured systems, especially multiphysics and field-split problems, where the global matrix is naturally written as: + +```text + [ A_11 A_12 ... A_1m ] + A = [ A_21 A_22 ... A_2m ] + [ ... ... ... ... ] + [ A_m1 A_m2 ... A_mm ] +``` + +Each field owns a PSBLAS descriptor and each block is a normal sparse PSBLAS matrix. The preconditioner extracts the field diagonal blocks, builds ordinary PSBLAS sub-preconditioners on those blocks, and combines the field solves through additive, multiplicative, symmetric multiplicative, or two-field Schur-style compositions. + +The implementation now also includes independent per-field inner Krylov solve contexts. These are configured through the existing `prec%set` option mechanism and are applied inside field solves when requested. The current inner methods are local nested-preconditioner kernels for `CG` and `BICGSTAB`, rather than calls into the top-level `psb_krylov` driver. + +``` +Outer Krylov solver (psb_krylov) + | + +-- Preconditioner / Block solve + | + +-- Field 1 solve + | | + | +-- Inner CG/BICGSTAB (optional) + | + +-- Field 2 solve + | + +-- Inner CG/BICGSTAB (optional) +``` +## PSBLAS Background + +PSBLAS separates sparse linear algebra into a few recurring concepts: + +- a sparse matrix object, such as `psb_dspmat_type` +- a communication descriptor, `psb_desc_type`, that describes distributed row ownership and halo data +- dense vector storage, either raw arrays or PSBLAS vector objects +- a preconditioner wrapper, `psb_dprec_type`, that owns a polymorphic implementation derived from `psb_d_base_prec_type` +- Krylov solvers, such as `psb_krylov`, that operate on a matrix, a descriptor, and an optional preconditioner + +The usual preconditioner lifecycle is: + +```fortran +call prec%init('...', info) +call prec%set('...', value, info) +call prec%build(a, desc_a, info) +call psb_krylov(method, a, prec, b, x, eps, desc_a, info, ...) +call prec%free(info) +``` + +The nested preconditioner follows this same lifecycle. From the outside it is just another `psb_dprec_type` implementation selected by: + +```fortran +call prec%init('NEST', info) +``` + +The outer Krylov method does not need to know that the preconditioner is block structured. It applies `prec` through the standard PSBLAS preconditioner virtual interface. + +## Motivation + +Many application matrices are block systems, not just scalar sparse matrices. Examples include: + +- coupled PDE systems +- saddle point systems +- fluid-structure and multiphysics discretizations +- systems reordered by physical field or variable type +- matrices where different fields require different smoothers or approximate solvers + +A scalar preconditioner treats the matrix as a single undifferentiated sparse operator. That is often too weak or too inflexible for block systems. A field-split preconditioner can use the mathematical structure of the problem: + +- each field can use a different sub-preconditioner +- off-diagonal coupling can be handled additively, multiplicatively, or through Schur approximations +- expensive exact block solves can be replaced by approximate preconditioned field iterations +- the outer solver can remain unchanged + +The goal of this implementation is to provide that behavior inside PSBLAS without introducing a separate solver framework or requiring users to leave the existing `prec%init`, `prec%set`, `prec%build`, and `psb_krylov` flow. + +## Nested Matrix Infrastructure + +The nested preconditioner depends on the nested matrix support already added to the repository. A nested matrix stores a global operator as a block matrix over fields. The important pieces are: + +- `psb_d_nest_matrix`: user-facing builder/helper for nested matrices +- `psb_d_nest_base_mat`: sparse matrix backend that adapts the nested object to the standard PSBLAS sparse matrix interface +- `a_glob`: global sparse wrapper used by ordinary PSBLAS solver calls +- `desc_glob`: global descriptor for the composed vector layout +- per-field descriptors for field-local vectors and block operations +- helper routines to restrict/prolong between global and field vector layouts + +The nested preconditioner uses the following helper operations from `psb_d_nest_base_mat_mod`: + +```fortran +psb_d_nest_get_n_fields +psb_d_nest_get_field_owned +psb_d_nest_get_block +psb_d_nest_get_field_desc +psb_d_nest_restrict_field +psb_d_nest_restrict_field_local +psb_d_nest_prolong_field +psb_d_nest_apply_block +``` + +These routines are what keep the preconditioner PSBLAS-native. The preconditioner does not invent its own distributed layout. It asks the nested matrix infrastructure how fields map into the global vector and uses the existing descriptors for communication. + +## Design Goals + +The main design goals were: + +- expose nested preconditioning through the standard PSBLAS preconditioner API +- reuse existing PSBLAS preconditioner implementations for diagonal field blocks +- preserve parallel correctness by using PSBLAS descriptors and nested field helpers +- support practical field-split compositions first +- allow per-field configuration of block preconditioners +- allow per-field inner Krylov solves without creating a dependency cycle in the library +- keep the first implementation scoped to double precision + +The implementation is deliberately conservative. It does not attempt to become a separate nonlinear or block-solver framework. It provides preconditioner application paths that can be used by existing PSBLAS Krylov solvers. + +## Mathematical Model + +Assume a nested matrix with `m` fields: + +```text + [ A_11 A_12 ... A_1m ] + A = [ A_21 A_22 ... A_2m ] + [ ... ... ... ... ] + [ A_m1 A_m2 ... A_mm ] +``` + +For field `i`, let `B_i` denote the field preconditioner built from the diagonal block `A_ii`. A global vector `x` is restricted into field components: + +```text +x -> (x_1, x_2, ..., x_m) +``` + +The nested preconditioner applies field solves and then prolongs field corrections back into the global vector layout: + +```text +z_i ~= B_i r_i +z = prolong(z_1, z_2, ..., z_m) +``` + +The different compositions change how field residuals are formed and how off-diagonal blocks are used. + +## Additive And Diagonal Composition + +The additive composition is the block-Jacobi field split: + +```text +z_i = B_i r_i, i = 1, ..., m +``` + +All fields are solved independently from the original restricted residual. Off-diagonal blocks are ignored during the preconditioner application. This is the simplest and most parallel composition. + +For a three-field system: + +```text + [ A_11 A_12 A_13 ] + A = [ A_21 A_22 A_23 ] + [ A_31 A_32 A_33 ] +``` + +and residual: + +```text + [ r_1 ] +r = [ r_2 ] + [ r_3 ] +``` + +the additive preconditioner applies: + +```text +z_1 = B_1 r_1 +z_2 = B_2 r_2 +z_3 = B_3 r_3 +``` + +Graphically: + +```text +r_1 --> B_1 --> z_1 + +r_2 --> B_2 --> z_2 + +r_3 --> B_3 --> z_3 +``` + +There is no dependency between field solves. Field 1 does not wait for field 2 or field 3; field 2 does not use corrections from field 1 or field 3; field 3 is also independent. In matrix terms, additive composition approximates the inverse of the block diagonal part: + +```text + [ A_11 0 0 ]^{-1} +P^{-1} ~[ 0 A_22 0 ] + [ 0 0 A_33 ] +``` + +but with `B_i` in place of exact `A_ii^{-1}`: + +```text + [ B_1 0 0 ] +P^{-1} =[ 0 B_2 0 ] + [ 0 0 B_3 ] +``` + +This is why additive composition is robust as a first preconditioner: it is cheap, parallel, and easy to reason about. Its weakness is that it does not directly account for coupling terms like `A_12`, `A_21`, `A_23`, or `A_32`. The outer Krylov method must handle those couplings. + +`DIAGONAL` and `ADDITIVE` are treated as equivalent names in the implementation. + +## Multiplicative Composition + +The multiplicative composition is a field Gauss-Seidel sweep. Fields are visited in order. For field `i`, the residual is corrected using already computed field updates: + +```text +r_i^hat = r_i - sum_{j < i} A_ij z_j +z_i = B_i r_i^hat +``` + +The implementation uses nested block application for the off-diagonal products and prolongs intermediate field corrections into global work storage as needed. This lets later fields see the effect of earlier field corrections. + +For a three-field forward sweep, the fields are processed as `1, 2, 3`. + +Field 1 has nothing before it: + +```text +rhs_1 = r_1 +z_1 = B_1 rhs_1 +``` + +Field 2 can use the field-1 correction: + +```text +rhs_2 = r_2 - A_21 z_1 +z_2 = B_2 rhs_2 +``` + +Field 3 can use both earlier corrections: + +```text +rhs_3 = r_3 - A_31 z_1 - A_32 z_2 +z_3 = B_3 rhs_3 +``` + +Graphically: + +```text +r_1 ----------------------> B_1 ---> z_1 + | + v +r_2 - A_21 z_1 ----------> B_2 ---> z_2 + | + v +r_3 - A_31 z_1 - A_32 z_2 -> B_3 ---> z_3 +``` + +This resembles a lower triangular block solve. In exact block form, replacing `B_i` by `A_ii^{-1}` would correspond to applying an approximate inverse of: + +```text +[ A_11 0 0 ] +[ A_21 A_22 0 ] +[ A_31 A_32 A_33 ] +``` + +The actual preconditioner does not assemble this triangular matrix. It computes the needed off-diagonal products with the existing nested blocks and solves each field with `psb_d_nested_field_solve`. + +## Symmetric Multiplicative Composition + +The symmetric multiplicative composition performs a forward sweep followed by a backward sweep: + +```text +forward: i = 1, ..., m +backward: i = m, ..., 1 +``` + +The same sweep kernel is reused in both directions. This gives a symmetric-style block Gauss-Seidel preconditioner when the underlying block choices are compatible with the matrix properties. + +For a three-field backward sweep, the fields are processed as `3, 2, 1`. + +Field 3 has nothing above it in this ordering: + +```text +rhs_3 = r_3 +z_tilde_3 = B_3 rhs_3 +``` + +Field 2 can now use the already computed field-3 correction: + +```text +rhs_2 = r_2 - A_23 z_tilde_3 +z_tilde_2 = B_2 rhs_2 +``` + +Field 1 can use both later-field corrections: + +```text +rhs_1 = r_1 - A_12 z_tilde_2 - A_13 z_tilde_3 +z_tilde_1 = B_1 rhs_1 +``` + +Graphically: + +```text +r_3 ------------------------------> B_3 ---> z_tilde_3 + | + v +r_2 - A_23 z_tilde_3 ------------> B_2 ---> z_tilde_2 + | + v +r_1 - A_12 z_tilde_2 - A_13 z_tilde_3 -> B_1 ---> z_tilde_1 +``` + +This resembles an upper triangular block solve: + +```text +[ A_11 A_12 A_13 ] +[ 0 A_22 A_23 ] +[ 0 0 A_33 ] +``` + +The symmetric multiplicative composition combines the information flow from both directions. The forward sweep accounts for lower-block couplings such as `A_21`, `A_31`, and `A_32`; the backward sweep accounts for upper-block couplings such as `A_12`, `A_13`, and `A_23`. This is more expensive than additive composition because fields are no longer independent, but it can be much stronger when coupling terms dominate convergence. + +## Schur-Style Composition + +Schur-complement preconditioning starts from the two-field coupled system: + +```text + [ A_11 A_12 ] +A = [ A_21 A_22 ] +``` + +and: + +```text + [ x_1 ] [ r_1 ] +x = [ x_2 ], r = [ r_2 ] +``` + +The block system `A x = r` expands to: + +```text +A_11 x_1 + A_12 x_2 = r_1 +A_21 x_1 + A_22 x_2 = r_2 +``` + +To eliminate field 1, solve the first equation for `x_1`: + +```text +A_11 x_1 = r_1 - A_12 x_2 +x_1 = A_11^{-1} (r_1 - A_12 x_2) +``` + +Insert this expression into the second equation: + +```text +A_21 A_11^{-1} (r_1 - A_12 x_2) + A_22 x_2 = r_2 +``` + +Rearranging gives: + +```text +(A_22 - A_21 A_11^{-1} A_12) x_2 = r_2 - A_21 A_11^{-1} r_1 +``` + +The matrix: + +```text +S = A_22 - A_21 A_11^{-1} A_12 +``` + +is the Schur complement. + +This is important because the original coupled system has been reduced to three conceptual steps: + +1. solve a field-1 problem +2. solve a field-2 Schur problem +3. recover or correct field 1 + +That pattern is the basis of many modern multiphysics and saddle-point preconditioners. + +The exact block LU factorization is: + +```text + [ I 0 ] [ A_11 0 ] [ I A_11^{-1} A_12 ] +A = [ A_21 A_11^{-1} I ] [ 0 S ] [ 0 I ] +``` + +The diagonal blocks in this factorization are `A_11` and `S`; the remaining factors are triangular. This immediately suggests a preconditioner based on applying approximate triangular solves and approximate inverses for the diagonal factors. + +In exact arithmetic, the inverse action would involve: + +```text +P^{-1} = U^{-1} D^{-1} L^{-1} + +D^{-1} = [ A_11^{-1} 0 ] + [ 0 S^{-1} ] +``` + +In practice, nobody wants to explicitly compute the exact Schur complement for large sparse problems. The problem is `A_11^{-1}`. Even if `A_11`, `A_12`, and `A_21` are sparse, `A_11^{-1}` is generally dense, and therefore: + +```text +A_21 A_11^{-1} A_12 +``` + +can be dense or nearly dense. Assembling `S` would destroy sparsity and can make storage and computation infeasible for large systems. + +The nested preconditioner therefore uses approximations: + +```text +B_1 ~= A_11^{-1} +B_2 ~= A_22^{-1} or B_2 ~= S^{-1} +``` + +where `B_1` and `B_2` are field solves provided by `psb_d_nested_field_solve`. A field solve may be a direct block preconditioner application, or it may use the independent per-field inner Krylov context if `INNER_SOLVE` is enabled. + +The implementation provides three Schur-style compositions. + +### Lower Schur Variant + +The lower Schur variant computes: + +```text +z_1 = B_1 r_1 +z_2 = B_2 (r_2 - A_21 z_1) +``` + +Substituting the first equation into the second gives: + +```text +z_2 = B_2 (r_2 - A_21 B_1 r_1) +``` + +This is the approximate forward solve associated with the lower triangular block factor. + +```text +r_1 --> B_1 --> z_1 + | + v + A_21 z_1 + | + v +r_2 - A_21 z_1 --> B_2 --> z_2 +``` + +### Upper Schur Variant + +The upper Schur variant reverses the ordering: + +```text +z_2 = B_2 r_2 +z_1 = B_1 (r_1 - A_12 z_2) +``` + +This corresponds to applying the upper triangular block factor: + +```text +r_2 --> B_2 --> z_2 + | + v + A_12 z_2 + | + v +r_1 - A_12 z_2 --> B_1 --> z_1 +``` + +### Full Schur Variant + +The full Schur variant applies the lower-style solve and then corrects field 1 with the upper coupling: + +```text +z_1 = B_1 r_1 +z_2 = B_2 (r_2 - A_21 z_1) +z_1 = z_1 - B_1 A_12 z_2 +``` + +This approximates the full `U^{-1} D^{-1} L^{-1}` block-LU inverse action. It is generally stronger than purely additive block-Jacobi or a single directional block sweep because it accounts for both off-diagonal couplings. + +### Matrix-Free Schur Action + +The most important implementation detail is that the code can apply a Schur action without assembling `S`. + +Given a field-2 vector `x`, the matrix-free Schur action computes: + +```text +y = S x +``` + +through the sequence: + +```text +t = A_12 x +u = B_1 t +v = A_21 u +y = A_22 x - v +``` + +or equivalently: + +```text +y = A_22 x - A_21 B_1 A_12 x +``` + +Graphically: + +```text +x +| +v +A_12 +| +v +B_1 +| +v +A_21 +| +v +subtract from A_22 x +| +v +y +``` + +No Schur matrix is ever stored. The code only needs block products and field solves. + +### Richardson Schur Solve + +When `SCHUR_SOLVE = MATRIX_FREE`, the preconditioner needs an approximation to `S^{-1}` even though it only has a routine for `S x`. The implementation therefore uses a small preconditioned Richardson iteration: + +```text +z_{k+1} = z_k + M^{-1} (b - S z_k) +``` + +where `M^{-1}` is the field-2 solve, implemented through `psb_d_nested_field_solve` and usually acting like `B_2`. + +Each Richardson step uses: + +1. the matrix-free Schur product `S z_k` +2. a Schur residual `b - S z_k` +3. the field-2 preconditioner or field-2 inner solve to correct `z_k` + +This gives a cheap inner solve for the Schur complement without assembling or storing `S`. + +The implementation does not assemble an exact sparse Schur complement. Instead, it applies Schur actions using: + +- diagonal field solves through `psb_d_nested_field_solve` +- off-diagonal products through `psb_d_nest_apply_block` +- global/field work vectors managed by the nested matrix helpers + +Supported Schur compositions are: + +- `SCHUR_LOWER` +- `SCHUR_UPPER` +- `SCHUR_FULL` + +The Schur solve mode is controlled by `SCHUR_SOLVE`: + +- `A22`: use the field-2 block preconditioner as the Schur approximation +- `MATRIX_FREE`: use a small Richardson iteration around the matrix-free Schur action + +The `MATRIX_FREE` mode is controlled by `SCHUR_MAXIT` and `SCHUR_TOL`. + +## Implementation Structure + +The core implementation lives in `prec/psb_d_nestedprec.f90`, which defines `psb_d_nestedprec` and the concrete preconditioner type: + +```fortran +type :: psb_d_nested_block_prec + character(len=16) :: ptype + class(psb_d_base_prec_type), allocatable :: pc +end type psb_d_nested_block_prec + +type :: psb_d_nested_krylov_context + logical :: enabled + character(len=16) :: method + integer(psb_ipk_) :: itmax + integer(psb_ipk_) :: itrace + integer(psb_ipk_) :: istop + real(psb_dpk_) :: tol +end type psb_d_nested_krylov_context + +type, extends(psb_d_base_prec_type) :: psb_d_nested_prec_type + integer(psb_ipk_) :: composition + character(len=32) :: composition_name + character(len=16) :: default_block_ptype + integer(psb_ipk_) :: schur_solve + character(len=32) :: schur_solve_name + integer(psb_ipk_) :: schur_maxit + real(psb_dpk_) :: schur_tol + type(psb_d_nested_krylov_context) :: default_krylov + integer(psb_ipk_) :: nfields + type(psb_d_nest_base_mat), pointer :: nest_op + type(psb_d_nested_block_prec), allocatable :: blocks(:) + character(len=16), allocatable :: field_block_ptype(:) + type(psb_d_nested_krylov_context), allocatable :: field_krylov(:) + type(psb_d_nested_iopt), allocatable :: field_iopts(:) + type(psb_d_nested_ropt), allocatable :: field_ropts(:) + type(psb_d_nested_copt), allocatable :: field_copts(:) +contains + procedure, pass(prec) :: d_apply_v => psb_d_nested_apply_vect + procedure, pass(prec) :: d_apply => psb_d_nested_apply + procedure, pass(prec) :: precbld => psb_d_nested_precbld + procedure, pass(prec) :: precinit => psb_d_nested_precinit + procedure, pass(prec) :: precseti => psb_d_nested_precseti + procedure, pass(prec) :: precsetr => psb_d_nested_precsetr + procedure, pass(prec) :: precsetc => psb_d_nested_precsetc + procedure, pass(prec) :: precdescr => psb_d_nested_precdescr + procedure, pass(prec) :: dump => psb_d_nested_dump + procedure, pass(prec) :: clone => psb_d_nested_clone + procedure, pass(prec) :: free => psb_d_nested_precfree + procedure, pass(prec) :: sizeof => psb_d_nested_sizeof + procedure, pass(prec) :: get_nzeros => psb_d_nested_get_nzeros +end type psb_d_nested_prec_type +``` + +The defaults set by `psb_d_nested_precinit` are: + +- composition: `ADDITIVE` +- default block solve: `DIAG` +- Schur solve: `A22` +- Schur maximum iterations: `4` +- Schur tolerance: `0.0` +- default inner Krylov disabled +- default inner Krylov method: `CG` +- default inner Krylov maximum iterations: `20` +- default inner Krylov trace: `-1` +- default inner Krylov stopping rule: `2` +- default inner Krylov tolerance: `1.0e-6` + +The preconditioner stores: + +- the selected field-split composition +- the default block preconditioner type +- optional per-field block preconditioner types +- Schur solve configuration +- default and per-field inner Krylov configuration +- pending per-field sub-preconditioner options +- one built block preconditioner per field +- a pointer to the nested matrix backend + +The preconditioner does not own the nested matrix. It stores a pointer to the nested operator exposed by the global sparse matrix wrapper. + +## Build Phase + +The build phase is implemented by `psb_d_nested_precbld`. + +At build time, the preconditioner: + +1. releases previously built field preconditioners while keeping user configuration +2. verifies that the input sparse matrix is backed by a nested base matrix +3. stores a pointer to the nested operator +4. queries the number of fields +5. allocates one `psb_d_nested_block_prec` entry per field +6. obtains each diagonal block `A_ii` +7. obtains each field descriptor +8. allocates the selected PSBLAS block preconditioner type for that field +9. replays routed per-field options onto the block preconditioner +10. builds the field block preconditioner on `A_ii` + +The build path deliberately uses existing PSBLAS preconditioner implementations. The nested preconditioner is a composition layer over ordinary field preconditioners, not a replacement for ILU, diagonal, approximate inverse, or block-Jacobi preconditioners. + +## Field Block Preconditioners + +The nested block solve type is configured through `BLOCK_SOLVE` or `NEST_BLOCK_SOLVE`. + +Supported block preconditioner type names currently include: + +- `DIAG` +- `BJAC` +- `NONE` + +The implementation allocates the corresponding concrete PSBLAS preconditioner for each diagonal field block. `idx` can be used on `prec%set` calls to override a particular field: + +```fortran +call prec%set('BLOCK_SOLVE', 'BJAC', info, idx=1) +call prec%set('BLOCK_SOLVE', 'DIAG', info, idx=2) +``` + +Field-level sub-preconditioner tuning is also routed through `psb_d_prec_type_impl.f90`. For nested preconditioners, options such as `SUB_SOLVE`, `SUB_FILLIN`, `SUB_ILUTHRS`, `INV_FILLIN`, and `INV_THRESH` are stored on the nested preconditioner and replayed onto the selected field block preconditioners during build. + +This matters because the field block preconditioners do not exist until `precbld`. The routed option storage lets users configure fields before the nested object has been built. + +## Independent Per-Field Inner Krylov Contexts + +The implementation includes independent per-field inner Krylov solve contexts: + +```fortran +type(psb_d_nested_krylov_context) :: default_krylov +type(psb_d_nested_krylov_context), allocatable :: field_krylov(:) +``` + +The default context applies to all fields when `idx` is not supplied. Field-specific contexts are allocated and updated when `idx` is supplied. + +Supported inner methods are: + +- `CG` +- `BICGSTAB` +- `NONE`, which disables the inner Krylov path for that field + +The inner Krylov options are: + +- `INNER_SOLVE`, also accepted as `KRYLOV_SOLVE` or `FIELD_SOLVE` +- `INNER_MAXIT`, also accepted as `INNER_ITMAX`, `KRYLOV_MAXIT`, or `KRYLOV_ITMAX` +- `INNER_TOL`, also accepted as `KRYLOV_TOL` +- `INNER_ITRACE`, also accepted as `KRYLOV_ITRACE` +- `INNER_ISTOP`, also accepted as `KRYLOV_ISTOP` + +Example: + +```fortran +call prec%set('INNER_SOLVE', 'CG', info, idx=1) +call prec%set('INNER_TOL', 1.0d-8, info, idx=1) +call prec%set('INNER_MAXIT', 50, info, idx=1) + +call prec%set('INNER_SOLVE', 'BICGSTAB', info, idx=2) +call prec%set('INNER_TOL', 1.0d-6, info, idx=2) +call prec%set('INNER_MAXIT', 30, info, idx=2) +``` + +When a field has inner Krylov enabled, `psb_d_nested_field_solve` dispatches to: + +- `psb_d_nested_inner_cg` +- `psb_d_nested_inner_bicgstab` + +Those routines apply the field diagonal matrix through `psb_d_nested_field_matvec` and use the field block preconditioner as the inner preconditioner. When inner Krylov is disabled, `psb_d_nested_field_solve` directly applies the field block preconditioner. + +The implementation does not call the top-level `psb_krylov` routine for inner solves. That choice avoids a build dependency cycle: the Krylov solvers live above the preconditioner layer, while `prec` must compile without depending on `linsolve`. The result is a PSBLAS-local inner Krylov implementation that provides independent per-field solve contexts without conflicting with the existing framework. + +## Apply Phase + +The apply phase is implemented by: + +- `psb_d_nested_apply_vect` for PSBLAS vector objects +- `psb_d_nested_apply` for raw double-precision arrays + +The vector-object path copies into raw array buffers, calls the array path, and copies back. The array path handles: + +- optional transpose rejection +- alpha/beta scaling +- work-vector management +- dispatch to the selected composition + +The selected composition is then applied by one of: + +- `psb_d_nested_add_apply` +- `psb_d_nested_sweep` +- `psb_d_nested_apply_schur` + +All paths eventually solve field systems through `psb_d_nested_field_solve`, so per-field block-preconditioner settings and per-field inner Krylov settings are shared by additive, multiplicative, symmetric, and Schur-style compositions. + +## Additive Apply Path + +`psb_d_nested_add_apply` performs the following for each field: + +1. restrict the global residual to the field vector +2. solve the field problem through `psb_d_nested_field_solve` +3. prolong the field correction into the global output vector + +Because the additive path does not use off-diagonal blocks, all field solves are independent at the mathematical level. + +## Multiplicative Apply Path + +`psb_d_nested_sweep` implements both forward and backward field sweeps. For each field: + +1. restrict the original residual to the current field +2. subtract the effect of already computed field corrections through off-diagonal block products +3. solve the corrected field residual +4. prolong the correction into global work storage + +The same routine is used for: + +- `MULTIPLICATIVE`: one forward sweep +- `SYMMETRIC_MULTIPLICATIVE`: a forward sweep followed by a backward sweep + +The sweep implementation uses field descriptors and global work vectors so that off-diagonal products see the correct local and halo data. + +## Schur Apply Path + +`psb_d_nested_apply_schur` is defined for exactly two fields. It implements lower, upper, and full Schur-style block factorizations using field solves and block products. + +Supporting routines are: + +- `psb_d_nested_schur_action` +- `psb_d_nested_schur_solve` + +`psb_d_nested_schur_action` applies the matrix-free Schur operator: + +```text +S x_2 = A_22 x_2 - A_21 B_1 A_12 x_2 +``` + +where `B_1` is the field-1 block solve or inner field solve. `psb_d_nested_schur_solve` either applies the field-2 block solve directly (`A22`) or runs a short preconditioned Richardson iteration using the matrix-free Schur action (`MATRIX_FREE`). + +## Public Options + +The nested preconditioner exposes integer option keys in `psb_d_nestedprec`: + +```fortran +integer(psb_ipk_), parameter :: psb_d_nested_composition_ = 9101 +integer(psb_ipk_), parameter :: psb_d_nested_block_solve_ = 9102 +integer(psb_ipk_), parameter :: psb_d_nested_schur_solve_ = 9103 +integer(psb_ipk_), parameter :: psb_d_nested_schur_maxit_ = 9104 +integer(psb_ipk_), parameter :: psb_d_nested_schur_tol_ = 9105 +integer(psb_ipk_), parameter :: psb_d_nested_inner_solve_ = 9106 +integer(psb_ipk_), parameter :: psb_d_nested_inner_maxit_ = 9107 +integer(psb_ipk_), parameter :: psb_d_nested_inner_tol_ = 9108 +integer(psb_ipk_), parameter :: psb_d_nested_inner_itrace_ = 9109 +integer(psb_ipk_), parameter :: psb_d_nested_inner_istop_ = 9110 +``` + +The user-facing string options routed by `psb_d_prec_type_impl.f90` are: + +| Option | Type | Values | Scope | +| --- | --- | --- | --- | +| `COMPOSITION`, `NEST_COMPOSITION` | character | `ADDITIVE`, `DIAGONAL`, `MULTIPLICATIVE`, `SYMMETRIC_MULTIPLICATIVE`, `SCHUR_LOWER`, `SCHUR_UPPER`, `SCHUR_FULL` | global | +| `BLOCK_SOLVE`, `NEST_BLOCK_SOLVE` | character | `DIAG`, `BJAC`, `NONE` | global or `idx` field | +| `SCHUR_SOLVE`, `NEST_SCHUR_SOLVE` | character | `A22`, `MATRIX_FREE` | global | +| `SCHUR_MAXIT`, `NEST_SCHUR_MAXIT` | integer | nonnegative iteration count | global | +| `SCHUR_TOL`, `NEST_SCHUR_TOL` | real | nonnegative tolerance | global | +| `INNER_SOLVE`, `KRYLOV_SOLVE`, `FIELD_SOLVE` | character | `NONE`, `CG`, `BICGSTAB` | global or `idx` field | +| `INNER_MAXIT`, `INNER_ITMAX`, `KRYLOV_MAXIT`, `KRYLOV_ITMAX` | integer | positive iteration count | global or `idx` field | +| `INNER_TOL`, `KRYLOV_TOL` | real | nonnegative tolerance | global or `idx` field | +| `INNER_ITRACE`, `KRYLOV_ITRACE` | integer | trace level | global or `idx` field | +| `INNER_ISTOP`, `KRYLOV_ISTOP` | integer | stopping rule code | global or `idx` field | +| `SUB_SOLVE` | character | routed field sub-preconditioner selection | global or `idx` field | +| `SUB_FILLIN` | integer | routed ILU fill setting | global or `idx` field | +| `SUB_ILUTHRS` | real | routed ILUT threshold | global or `idx` field | +| `INV_FILLIN` | integer | routed inverse fill setting | global or `idx` field | +| `INV_THRESH` | real | routed inverse threshold | global or `idx` field | + +## Example Usage + +A typical nested preconditioner setup looks like: + +```fortran +call prec%init('NEST', info) +call prec%set('COMPOSITION', 'SCHUR_FULL', info) +call prec%set('SCHUR_SOLVE', 'MATRIX_FREE', info) +call prec%set('SCHUR_MAXIT', 6, info) +call prec%set('SCHUR_TOL', 1.0d-8, info) + +call prec%set('BLOCK_SOLVE', 'BJAC', info, idx=1) +call prec%set('BLOCK_SOLVE', 'DIAG', info, idx=2) + +call prec%set('INNER_SOLVE', 'CG', info, idx=1) +call prec%set('INNER_MAXIT', 40, info, idx=1) +call prec%set('INNER_TOL', 1.0d-8, info, idx=1) + +call prec%build(nested_matrix%a_glob, nested_matrix%desc_glob, info) +call psb_krylov('BICGSTAB', nested_matrix%a_glob, prec, b, x, eps, & + & nested_matrix%desc_glob, info) +``` + +This keeps the outer solve in the existing PSBLAS API while allowing field-specific behavior inside the nested preconditioner. + +## Factory And API Integration + +`psb_dprecinit` recognizes the nested preconditioner name: + +```fortran +call prec%init('NEST', info) +``` + +and allocates: + +```fortran +psb_d_nested_prec_type +``` + +The generic `prec%set` wrappers in `prec/impl/psb_d_prec_type_impl.f90` route nested-specific options only when the underlying concrete preconditioner is `psb_d_nested_prec_type`. For non-nested preconditioners, the existing behavior is preserved where appropriate. + +This means nested-specific settings do not change the behavior of ordinary PSBLAS preconditioners. + +## Parallel Layout Considerations + +Parallel correctness depends on using PSBLAS descriptors and nested helper routines consistently. + +Each field has: + +- its own descriptor +- local owned rows +- possible halo entries needed by block products +- mappings between field-local vectors and global nested vectors + +The preconditioner uses: + +- `psb_d_nest_restrict_field` to extract field data from global vectors +- `psb_d_nest_restrict_field_local` where only local field data are needed +- `psb_d_nest_prolong_field` to insert field corrections into global vectors +- `psb_d_nest_apply_block` for diagonal and off-diagonal block products + +Multiplicative and Schur compositions require more care than additive composition because they use off-diagonal products. Intermediate field corrections are prolonged into global work storage before being consumed by later block products. This ensures that distributed halo data are consistent with the nested matrix layout. + +## Validation + +The repository contains nested preconditioner tests under: + +```text +test/nested +``` + +The main nested preconditioner drivers are: + +- `test/nested/psb_d_nest_prec_cg_test.F90` +- `test/nested/psb_d_nest_mult_prec_cg_test.F90` +- `test/nested/psb_d_nest_schur_prec_cg_test.F90` + +These tests exercise: + +- nested matrix construction +- nested preconditioner initialization +- additive/diagonal composition +- multiplicative and symmetric multiplicative composition +- Schur lower, upper, and full composition +- matrix-free Schur solve mode +- per-field inner Krylov configuration and apply through the Schur preconditioner +- serial and MPI execution paths in the nested examples + +Validation of the nested preconditioner examples showed convergence for the additive, multiplicative, and Schur-style configurations on the nested Laplacian test problems. + +The Schur nested preconditioner test includes an additional `SCHUR_FULL` / `A22+INNER` case. That case configures independent per-field inner Krylov contexts, builds the nested preconditioner, applies it directly to the right-hand side, and verifies a successful finite preconditioner application. The existing Schur cases continue to check outer BiCGSTAB convergence. + +The serial and two-rank runs both passed after rebuilding the nested preconditioner object, the factory object, the preconditioner archive, and the test executable. + +## Current Limitations + +The current implementation has the following limitations: + +- double precision only +- nested preconditioner name currently exposed as `NEST` +- transposed preconditioner application is rejected +- Schur-style compositions are implemented for exactly two fields +- matrix-free Schur solve currently uses a small Richardson iteration +- inner Krylov methods currently include `CG` and `BICGSTAB` + +## Future Work + +Recommended next steps are: + +- add single, complex, and complex double precision variants + +## Summary + +The nested preconditioner adds a field-split preconditioning layer to PSBLAS while preserving the existing solver and preconditioner API. It reuses standard PSBLAS block preconditioners on diagonal field blocks and combines them through additive, multiplicative, symmetric multiplicative, and Schur-style compositions. + +The implementation integrates with: + +- nested matrix storage and field mapping +- the `psb_dprec_type` factory +- the generic `prec%set` configuration path +- the ordinary PSBLAS preconditioner build/apply lifecycle +- the existing outer `psb_krylov` solver interface + +The newer per-field inner Krylov support gives each field an independent local solve context, configured through the same PSBLAS preconditioner option API. It does not conflict with the existing framework because it is implemented inside the preconditioner layer and avoids adding a dependency from `prec` back to the higher-level Krylov solver layer. diff --git a/base/modules/serial/psb_d_nest_base_mat_mod.F90 b/base/modules/serial/psb_d_nest_base_mat_mod.F90 index ed97a2521..395fec0f6 100644 --- a/base/modules/serial/psb_d_nest_base_mat_mod.F90 +++ b/base/modules/serial/psb_d_nest_base_mat_mod.F90 @@ -121,7 +121,8 @@ module psb_d_nest_base_mat_mod ! field-split interface (for the block preconditioner) public :: psb_d_nest_get_n_fields, psb_d_nest_get_field_owned, & & psb_d_nest_get_block, psb_d_nest_get_field_desc, & - & psb_d_nest_restrict_field, psb_d_nest_prolong_field + & psb_d_nest_restrict_field, psb_d_nest_restrict_field_local, & + & psb_d_nest_prolong_field contains @@ -1191,6 +1192,24 @@ contains end do end subroutine psb_d_nest_restrict_field + ! Restrict: extract field k's full local sub-vector (owned + ghosts). + subroutine psb_d_nest_restrict_field_local(nest_op, field, x_global, x_field, info) + type(psb_d_nest_base_mat), intent(in) :: nest_op + integer(psb_ipk_), intent(in) :: field + real(psb_dpk_), intent(in) :: x_global(:) + real(psb_dpk_), intent(out) :: x_field(:) + integer(psb_ipk_), intent(out) :: info + integer(psb_ipk_) :: i_entry, n_local + info = psb_success_ + if (field < 1 .or. field > nest_op%n_fields) then + info = psb_err_invalid_input_; return + end if + n_local = size(nest_op%field_map(field)%global_local_pos) + do i_entry = 1, n_local + x_field(i_entry) = x_global(nest_op%field_map(field)%global_local_pos(i_entry)) + end do + end subroutine psb_d_nest_restrict_field_local + ! Prolong: insert field k's OWNED sub-vector into the global local vector. subroutine psb_d_nest_prolong_field(nest_op, field, x_field, x_global, info) type(psb_d_nest_base_mat), intent(in) :: nest_op diff --git a/prec/CMakeLists.txt b/prec/CMakeLists.txt index 83bf7c822..8573d23f5 100644 --- a/prec/CMakeLists.txt +++ b/prec/CMakeLists.txt @@ -77,6 +77,7 @@ set(PSB_prec_source_files psb_z_diagprec.f90 psb_z_bjacprec.f90 psb_d_diagprec.f90 + psb_d_nestedprec.f90 psb_c_prec_mod.f90 psb_d_nullprec.f90 psb_s_diagprec.f90 diff --git a/prec/Makefile b/prec/Makefile index 6121840d9..a849726d9 100644 --- a/prec/Makefile +++ b/prec/Makefile @@ -9,7 +9,7 @@ MODOBJS=psb_prec_const_mod.o\ psb_s_base_prec_mod.o psb_d_base_prec_mod.o psb_c_base_prec_mod.o psb_z_base_prec_mod.o \ psb_prec_type.o \ psb_prec_mod.o psb_s_prec_mod.o psb_d_prec_mod.o psb_c_prec_mod.o psb_z_prec_mod.o \ - psb_d_diagprec.o psb_d_nullprec.o psb_d_bjacprec.o psb_s_ilu_fact_mod.o \ + psb_d_diagprec.o psb_d_nullprec.o psb_d_bjacprec.o psb_d_nestedprec.o psb_s_ilu_fact_mod.o \ psb_s_diagprec.o psb_s_nullprec.o psb_s_bjacprec.o psb_d_ilu_fact_mod.o \ psb_c_diagprec.o psb_c_nullprec.o psb_c_bjacprec.o psb_c_ilu_fact_mod.o \ psb_z_diagprec.o psb_z_nullprec.o psb_z_bjacprec.o psb_z_ilu_fact_mod.o \ @@ -59,6 +59,7 @@ psb_prec_type.o: psb_s_prec_type.o psb_d_prec_type.o psb_c_prec_type.o psb_z_pre psb_prec_mod.o: psb_s_prec_mod.o psb_d_prec_mod.o psb_c_prec_mod.o psb_z_prec_mod.o psb_s_bjacprec.o psb_s_diagprec.o psb_s_nullprec.o: psb_prec_mod.o psb_s_base_prec_mod.o psb_d_bjacprec.o psb_d_diagprec.o psb_d_nullprec.o: psb_prec_mod.o psb_d_base_prec_mod.o +psb_d_nestedprec.o: psb_d_base_prec_mod.o psb_d_nullprec.o psb_d_diagprec.o psb_d_bjacprec.o psb_c_bjacprec.o psb_c_diagprec.o psb_c_nullprec.o: psb_prec_mod.o psb_c_base_prec_mod.o psb_z_bjacprec.o psb_z_diagprec.o psb_z_nullprec.o: psb_prec_mod.o psb_z_base_prec_mod.o psb_s_bjacprec.o: psb_s_ilu_fact_mod.o psb_s_ainv_fact_mod.o psb_s_invk_fact_mod.o psb_s_invt_fact_mod.o diff --git a/prec/impl/psb_d_prec_type_impl.f90 b/prec/impl/psb_d_prec_type_impl.f90 index 1227a0eda..c622c85ac 100644 --- a/prec/impl/psb_d_prec_type_impl.f90 +++ b/prec/impl/psb_d_prec_type_impl.f90 @@ -267,6 +267,10 @@ end subroutine psb_d_apply1v subroutine psb_dcprecseti(prec,what,val,info,ilev,ilmax,pos,idx) use psb_base_mod use psb_d_prec_type, psb_protect_name => psb_dcprecseti + use psb_d_nestedprec, only : psb_d_nested_prec_type, & + & psb_d_nested_schur_maxit_, psb_d_nested_inner_maxit_, & + & psb_d_nested_inner_itrace_, psb_d_nested_inner_istop_, & + & psb_d_nested_set_field_precseti, psb_d_nested_set_field_krylov_i implicit none class(psb_dprec_type), intent(inout) :: prec @@ -280,16 +284,70 @@ subroutine psb_dcprecseti(prec,what,val,info,ilev,ilmax,pos,idx) ! Local variables character(len=*), parameter :: name='psb_precseti' + integer(psb_ipk_) :: field info = psb_success_ ! We need to convert from the 'what' string to the corresponding integer ! value befor passing the call to the set of the inner method. + if (.not. allocated(prec%prec)) then + info = psb_err_invalid_preca_ + return + end if + select case (psb_toupper(trim(what))) + case ('SCHUR_MAXIT','NEST_SCHUR_MAXIT') + select type (p => prec%prec) + type is (psb_d_nested_prec_type) + call p%precset(psb_d_nested_schur_maxit_, val, info) + class default + info = psb_err_invalid_preca_ + end select + case ('INNER_MAXIT','INNER_ITMAX','KRYLOV_MAXIT','KRYLOV_ITMAX') + select type (p => prec%prec) + type is (psb_d_nested_prec_type) + field = 0 + if (present(idx)) field = idx + call psb_d_nested_set_field_krylov_i(p, field, psb_d_nested_inner_maxit_, val, info) + class default + info = psb_err_invalid_preca_ + end select + case ('INNER_ITRACE','KRYLOV_ITRACE') + select type (p => prec%prec) + type is (psb_d_nested_prec_type) + field = 0 + if (present(idx)) field = idx + call psb_d_nested_set_field_krylov_i(p, field, psb_d_nested_inner_itrace_, val, info) + class default + info = psb_err_invalid_preca_ + end select + case ('INNER_ISTOP','KRYLOV_ISTOP') + select type (p => prec%prec) + type is (psb_d_nested_prec_type) + field = 0 + if (present(idx)) field = idx + call psb_d_nested_set_field_krylov_i(p, field, psb_d_nested_inner_istop_, val, info) + class default + info = psb_err_invalid_preca_ + end select case ('SUB_FILLIN') - call prec%prec%precset(psb_ilu_fill_in_,val,info) + select type (p => prec%prec) + type is (psb_d_nested_prec_type) + field = 0 + if (present(idx)) field = idx + call psb_d_nested_set_field_precseti(p, field, psb_ilu_fill_in_, val, info) + class default + call prec%prec%precset(psb_ilu_fill_in_,val,info) + end select case ('INV_FILLIN') - call prec%prec%precset(psb_inv_fillin_,val,info) + select type (p => prec%prec) + type is (psb_d_nested_prec_type) + field = 0 + if (present(idx)) field = idx + call psb_d_nested_set_field_precseti(p, field, psb_inv_fillin_, val, info) + class default + call prec%prec%precset(psb_inv_fillin_,val,info) + end select case default info = psb_err_invalid_args_combination_ write(psb_err_unit,*) name,& @@ -303,6 +361,9 @@ end subroutine psb_dcprecseti subroutine psb_dcprecsetr(prec,what,val,info,ilev,ilmax,pos,idx) use psb_base_mod use psb_d_prec_type, psb_protect_name => psb_dcprecsetr + use psb_d_nestedprec, only : psb_d_nested_prec_type, & + & psb_d_nested_schur_tol_, psb_d_nested_inner_tol_, & + & psb_d_nested_set_field_precsetr, psb_d_nested_set_field_krylov_r implicit none class(psb_dprec_type), intent(inout) :: prec @@ -316,16 +377,52 @@ subroutine psb_dcprecsetr(prec,what,val,info,ilev,ilmax,pos,idx) ! Local variables character(len=*), parameter :: name='psb_precsetr' + integer(psb_ipk_) :: field info = psb_success_ ! We need to convert from the 'what' string to the corresponding integer ! value befor passing the call to the set of the inner method. + if (.not. allocated(prec%prec)) then + info = psb_err_invalid_preca_ + return + end if + select case (psb_toupper(trim(what))) + case('SCHUR_TOL','NEST_SCHUR_TOL') + select type (p => prec%prec) + type is (psb_d_nested_prec_type) + call p%precset(psb_d_nested_schur_tol_, val, info) + class default + info = psb_err_invalid_preca_ + end select + case('INNER_TOL','KRYLOV_TOL') + select type (p => prec%prec) + type is (psb_d_nested_prec_type) + field = 0 + if (present(idx)) field = idx + call psb_d_nested_set_field_krylov_r(p, field, psb_d_nested_inner_tol_, val, info) + class default + info = psb_err_invalid_preca_ + end select case('SUB_ILUTHRS') - call prec%prec%precset(psb_fact_eps_,val,info) + select type (p => prec%prec) + type is (psb_d_nested_prec_type) + field = 0 + if (present(idx)) field = idx + call psb_d_nested_set_field_precsetr(p, field, psb_fact_eps_, val, info) + class default + call prec%prec%precset(psb_fact_eps_,val,info) + end select case('INV_THRESH') - call prec%prec%precset(psb_inv_thresh_,val,info) + select type (p => prec%prec) + type is (psb_d_nested_prec_type) + field = 0 + if (present(idx)) field = idx + call psb_d_nested_set_field_precsetr(p, field, psb_inv_thresh_, val, info) + class default + call prec%prec%precset(psb_inv_thresh_,val,info) + end select case default info = psb_err_invalid_args_combination_ write(psb_err_unit,*) name,& @@ -339,6 +436,11 @@ end subroutine psb_dcprecsetr subroutine psb_dcprecsetc(prec,what,string,info,ilev,ilmax,pos,idx) use psb_base_mod use psb_d_prec_type, psb_protect_name => psb_dcprecsetc + use psb_d_nestedprec, only : psb_d_nested_prec_type, & + & psb_d_nested_composition_, psb_d_nested_block_solve_, & + & psb_d_nested_schur_solve_, psb_d_nested_set_block_solve_field, & + & psb_d_nested_inner_solve_, psb_d_nested_set_field_precseti, & + & psb_d_nested_set_field_krylov_c implicit none class(psb_dprec_type), intent(inout) :: prec @@ -352,68 +454,217 @@ subroutine psb_dcprecsetc(prec,what,string,info,ilev,ilmax,pos,idx) ! Local variables character(len=*), parameter :: name='psb_precsetc' + integer(psb_ipk_) :: field info = psb_success_ ! We need to convert from the 'what' string to the corresponding integer ! value befor passing the call to the set of the inner method. + if (.not. allocated(prec%prec)) then + info = psb_err_invalid_preca_ + return + end if + select case (psb_toupper(trim(what))) + case ('COMPOSITION','NEST_COMPOSITION') + select type (p => prec%prec) + type is (psb_d_nested_prec_type) + call p%precset(psb_d_nested_composition_, string, info) + class default + info = psb_err_invalid_preca_ + end select + case ('SCHUR_SOLVE','NEST_SCHUR_SOLVE') + select type (p => prec%prec) + type is (psb_d_nested_prec_type) + call p%precset(psb_d_nested_schur_solve_, string, info) + class default + info = psb_err_invalid_preca_ + end select + case ('BLOCK_SOLVE','NEST_BLOCK_SOLVE') + select type (p => prec%prec) + type is (psb_d_nested_prec_type) + if (present(idx)) then + call psb_d_nested_set_block_solve_field(p, idx, string, info) + else + call p%precset(psb_d_nested_block_solve_, string, info) + end if + class default + info = psb_err_invalid_preca_ + end select + case ('INNER_SOLVE','KRYLOV_SOLVE','FIELD_SOLVE') + select type (p => prec%prec) + type is (psb_d_nested_prec_type) + field = 0 + if (present(idx)) field = idx + call psb_d_nested_set_field_krylov_c(p, field, psb_d_nested_inner_solve_, string, info) + class default + info = psb_err_invalid_preca_ + end select case ('SUB_SOLVE') ! We select here the type of solver on the block + field = 0 + if (present(idx)) field = idx select case (psb_toupper(trim(string))) case("ILU") + select type (p => prec%prec) + type is (psb_d_nested_prec_type) + call psb_d_nested_set_field_precseti(p, field, psb_f_type_, psb_f_ilu_k_, info) + if (info == psb_success_) & + & call psb_d_nested_set_field_precseti(p, field, psb_ilu_ialg_, psb_ilu_n_, info) + class default call prec%prec%precset(psb_f_type_,psb_f_ilu_k_,info) - call prec%prec%precset(psb_ilu_ialg_,psb_ilu_n_,info) + if (info == psb_success_) call prec%prec%precset(psb_ilu_ialg_,psb_ilu_n_,info) + end select case("ILUT") + select type (p => prec%prec) + type is (psb_d_nested_prec_type) + call psb_d_nested_set_field_precseti(p, field, psb_f_type_, psb_f_ilu_t_, info) + if (info == psb_success_) & + & call psb_d_nested_set_field_precseti(p, field, psb_ilu_ialg_, psb_ilu_t_, info) + class default call prec%prec%precset(psb_f_type_,psb_f_ilu_t_,info) - call prec%prec%precset(psb_ilu_ialg_,psb_ilu_t_,info) + if (info == psb_success_) call prec%prec%precset(psb_ilu_ialg_,psb_ilu_t_,info) + end select case("AINV") + select type (p => prec%prec) + type is (psb_d_nested_prec_type) + call psb_d_nested_set_field_precseti(p, field, psb_f_type_, psb_f_ainv_, info) + class default call prec%prec%precset(psb_f_type_,psb_f_ainv_,info) + end select case("INVK") + select type (p => prec%prec) + type is (psb_d_nested_prec_type) + call psb_d_nested_set_field_precseti(p, field, psb_f_type_, psb_f_invk_, info) + class default call prec%prec%precset(psb_f_type_,psb_f_invk_,info) + end select case("INVT") + select type (p => prec%prec) + type is (psb_d_nested_prec_type) + call psb_d_nested_set_field_precseti(p, field, psb_f_type_, psb_f_invt_, info) + class default call prec%prec%precset(psb_f_type_,psb_f_invt_,info) + end select case default ! Default to ILU(0) factorization - call prec%prec%precset(psb_f_type_,psb_f_ilu_n_,info) - call prec%prec%precset(psb_ilu_ialg_,psb_ilu_n_,info) + select type (p => prec%prec) + type is (psb_d_nested_prec_type) + call psb_d_nested_set_field_precseti(p, field, psb_f_type_, psb_f_ilu_n_, info) + if (info == psb_success_) & + & call psb_d_nested_set_field_precseti(p, field, psb_ilu_ialg_, psb_ilu_n_, info) + class default + call prec%prec%precset(psb_f_type_,psb_f_ilu_n_,info) + if (info == psb_success_) call prec%prec%precset(psb_ilu_ialg_,psb_ilu_n_,info) + end select end select case ("ILU_ALG") + field = 0 + if (present(idx)) field = idx select case (psb_toupper(trim(string))) case ("MILU") - call prec%prec%precset(psb_ilu_ialg_,psb_milu_n_,info) + select type (p => prec%prec) + type is (psb_d_nested_prec_type) + call psb_d_nested_set_field_precseti(p, field, psb_ilu_ialg_, psb_milu_n_, info) + class default + call prec%prec%precset(psb_ilu_ialg_,psb_milu_n_,info) + end select case default ! Do nothing end select case ("ILUT_SCALE") + field = 0 + if (present(idx)) field = idx select case (psb_toupper(trim(string))) case ("MAXVAL") - call prec%prec%precset(psb_ilu_scale_,psb_ilu_scale_maxval_,info) + select type (p => prec%prec) + type is (psb_d_nested_prec_type) + call psb_d_nested_set_field_precseti(p, field, psb_ilu_scale_, psb_ilu_scale_maxval_, info) + class default + call prec%prec%precset(psb_ilu_scale_,psb_ilu_scale_maxval_,info) + end select case ("DIAG") - call prec%prec%precset(psb_ilu_scale_,psb_ilu_scale_diag_,info) + select type (p => prec%prec) + type is (psb_d_nested_prec_type) + call psb_d_nested_set_field_precseti(p, field, psb_ilu_scale_, psb_ilu_scale_diag_, info) + class default + call prec%prec%precset(psb_ilu_scale_,psb_ilu_scale_diag_,info) + end select case ("ARWSUM") - call prec%prec%precset(psb_ilu_scale_,psb_ilu_scale_arwsum_,info) + select type (p => prec%prec) + type is (psb_d_nested_prec_type) + call psb_d_nested_set_field_precseti(p, field, psb_ilu_scale_, psb_ilu_scale_arwsum_, info) + class default + call prec%prec%precset(psb_ilu_scale_,psb_ilu_scale_arwsum_,info) + end select case ("ARCSUM") - call prec%prec%precset(psb_ilu_scale_,psb_ilu_scale_arcsum_,info) + select type (p => prec%prec) + type is (psb_d_nested_prec_type) + call psb_d_nested_set_field_precseti(p, field, psb_ilu_scale_, psb_ilu_scale_arcsum_, info) + class default + call prec%prec%precset(psb_ilu_scale_,psb_ilu_scale_arcsum_,info) + end select case ("ACLSUM") - call prec%prec%precset(psb_ilu_scale_,psb_ilu_scale_aclsum_,info) + select type (p => prec%prec) + type is (psb_d_nested_prec_type) + call psb_d_nested_set_field_precseti(p, field, psb_ilu_scale_, psb_ilu_scale_aclsum_, info) + class default + call prec%prec%precset(psb_ilu_scale_,psb_ilu_scale_aclsum_,info) + end select case ("NONE") - call prec%prec%precset(psb_ilu_scale_,psb_ilu_scale_none_,info) + select type (p => prec%prec) + type is (psb_d_nested_prec_type) + call psb_d_nested_set_field_precseti(p, field, psb_ilu_scale_, psb_ilu_scale_none_, info) + class default + call prec%prec%precset(psb_ilu_scale_,psb_ilu_scale_none_,info) + end select case default - call prec%prec%precset(psb_ilu_scale_,psb_ilu_scale_none_,info) + select type (p => prec%prec) + type is (psb_d_nested_prec_type) + call psb_d_nested_set_field_precseti(p, field, psb_ilu_scale_, psb_ilu_scale_none_, info) + class default + call prec%prec%precset(psb_ilu_scale_,psb_ilu_scale_none_,info) + end select end select case ("AINV_ALG") + field = 0 + if (present(idx)) field = idx select case (psb_toupper(trim(string))) case("LLK") - call prec%prec%precset(psb_ainv_alg_,psb_ainv_llk_,info) + select type (p => prec%prec) + type is (psb_d_nested_prec_type) + call psb_d_nested_set_field_precseti(p, field, psb_ainv_alg_, psb_ainv_llk_, info) + class default + call prec%prec%precset(psb_ainv_alg_,psb_ainv_llk_,info) + end select case("SYM-LLK") - call prec%prec%precset(psb_ainv_alg_,psb_ainv_s_llk_,info) + select type (p => prec%prec) + type is (psb_d_nested_prec_type) + call psb_d_nested_set_field_precseti(p, field, psb_ainv_alg_, psb_ainv_s_llk_, info) + class default + call prec%prec%precset(psb_ainv_alg_,psb_ainv_s_llk_,info) + end select case("STAB-LLK") - call prec%prec%precset(psb_ainv_alg_,psb_ainv_s_ft_llk_,info) + select type (p => prec%prec) + type is (psb_d_nested_prec_type) + call psb_d_nested_set_field_precseti(p, field, psb_ainv_alg_, psb_ainv_s_ft_llk_, info) + class default + call prec%prec%precset(psb_ainv_alg_,psb_ainv_s_ft_llk_,info) + end select case("MLK","LMX") - call prec%prec%precset(psb_ainv_alg_,psb_ainv_mlk_,info) + select type (p => prec%prec) + type is (psb_d_nested_prec_type) + call psb_d_nested_set_field_precseti(p, field, psb_ainv_alg_, psb_ainv_mlk_, info) + class default + call prec%prec%precset(psb_ainv_alg_,psb_ainv_mlk_,info) + end select case default - call prec%prec%precset(psb_ainv_alg_,psb_ainv_llk_,info) + select type (p => prec%prec) + type is (psb_d_nested_prec_type) + call psb_d_nested_set_field_precseti(p, field, psb_ainv_alg_, psb_ainv_llk_, info) + class default + call prec%prec%precset(psb_ainv_alg_,psb_ainv_llk_,info) + end select end select case default diff --git a/prec/impl/psb_dprecinit.f90 b/prec/impl/psb_dprecinit.f90 index fe9f6e9d0..8baebe663 100644 --- a/prec/impl/psb_dprecinit.f90 +++ b/prec/impl/psb_dprecinit.f90 @@ -36,6 +36,7 @@ subroutine psb_dprecinit(ctxt,p,ptype,info) use psb_d_nullprec, only : psb_d_null_prec_type use psb_d_diagprec, only : psb_d_diag_prec_type use psb_d_bjacprec, only : psb_d_bjac_prec_type + use psb_d_nestedprec, only : psb_d_nested_prec_type implicit none type(psb_ctxt_type), intent(in) :: ctxt class(psb_dprec_type), intent(inout) :: p @@ -62,6 +63,9 @@ subroutine psb_dprecinit(ctxt,p,ptype,info) case ('BJAC') allocate(psb_d_bjac_prec_type :: p%prec, stat=info) + + case ('NEST','NESTED') + allocate(psb_d_nested_prec_type :: p%prec, stat=info) case default write(psb_err_unit,*) 'Unknown preconditioner type request "',ptype,'"' diff --git a/prec/psb_d_nestedprec.f90 b/prec/psb_d_nestedprec.f90 new file mode 100644 index 000000000..77684bb88 --- /dev/null +++ b/prec/psb_d_nestedprec.f90 @@ -0,0 +1,1718 @@ +! +! Parallel Sparse BLAS version 3.5 +! (C) Copyright 2006-2018 +! Salvatore Filippone +! Alfredo Buttari +! +! Redistribution and use in source and binary forms, with or without +! modification, are permitted provided that the following conditions +! are met: +! 1. Redistributions of source code must retain the above copyright +! notice, this list of conditions and the following disclaimer. +! 2. Redistributions in binary form must reproduce the above copyright +! notice, this list of conditions and the following disclaimer in the +! documentation and/or other materials provided with the distribution. +! 3. The name of the PSBLAS group or the names of its contributors may +! not be used to endorse or promote products derived from this +! software without specific prior written permission. +! +! THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +! ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED +! TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR +! PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE PSBLAS GROUP OR ITS CONTRIBUTORS +! BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +! CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +! SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +! INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +! CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +! ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE +! POSSIBILITY OF SUCH DAMAGE. +! +module psb_d_nestedprec + + use psb_base_mod, only : psb_ipk_, psb_epk_, psb_dpk_, psb_success_, & + & psb_err_invalid_input_, psb_err_invalid_preca_, psb_err_invalid_mat_state_, & + & psb_err_alloc_dealloc_, psb_err_transpose_not_n_unsupported_, & + & psb_root_, psb_toupper, psb_info, psb_errpush, psb_halo, psb_gedot, & + & done, dzero + use psb_d_base_prec_mod + use psb_d_nullprec, only : psb_d_null_prec_type + use psb_d_diagprec, only : psb_d_diag_prec_type + use psb_d_bjacprec, only : psb_d_bjac_prec_type + use psb_d_nest_base_mat_mod, only : psb_d_nest_base_mat, psb_d_nest_get_n_fields, & + & psb_d_nest_get_field_owned, psb_d_nest_get_block, psb_d_nest_get_field_desc, & + & psb_d_nest_restrict_field, psb_d_nest_restrict_field_local, & + & psb_d_nest_prolong_field, psb_d_nest_apply_block + implicit none + + integer(psb_ipk_), parameter :: psb_d_nested_composition_ = 9101 + integer(psb_ipk_), parameter :: psb_d_nested_block_solve_ = 9102 + integer(psb_ipk_), parameter :: psb_d_nested_schur_solve_ = 9103 + integer(psb_ipk_), parameter :: psb_d_nested_schur_maxit_ = 9104 + integer(psb_ipk_), parameter :: psb_d_nested_schur_tol_ = 9105 + integer(psb_ipk_), parameter :: psb_d_nested_inner_solve_ = 9106 + integer(psb_ipk_), parameter :: psb_d_nested_inner_maxit_ = 9107 + integer(psb_ipk_), parameter :: psb_d_nested_inner_tol_ = 9108 + integer(psb_ipk_), parameter :: psb_d_nested_inner_itrace_ = 9109 + integer(psb_ipk_), parameter :: psb_d_nested_inner_istop_ = 9110 + + integer(psb_ipk_), parameter, private :: psb_d_nested_diag_ = 1 + integer(psb_ipk_), parameter, private :: psb_d_nested_add_ = 2 + integer(psb_ipk_), parameter, private :: psb_d_nested_mult_ = 3 + integer(psb_ipk_), parameter, private :: psb_d_nested_symm_ = 4 + integer(psb_ipk_), parameter, private :: psb_d_nested_schur_lower_ = 5 + integer(psb_ipk_), parameter, private :: psb_d_nested_schur_upper_ = 6 + integer(psb_ipk_), parameter, private :: psb_d_nested_schur_full_ = 7 + integer(psb_ipk_), parameter, private :: psb_d_nested_schur_a22_ = 1 + integer(psb_ipk_), parameter, private :: psb_d_nested_schur_mf_ = 2 + + type :: psb_d_nested_iopt + integer(psb_ipk_) :: field + integer(psb_ipk_) :: what + integer(psb_ipk_) :: val + end type psb_d_nested_iopt + + type :: psb_d_nested_ropt + integer(psb_ipk_) :: field + integer(psb_ipk_) :: what + real(psb_dpk_) :: val + end type psb_d_nested_ropt + + type :: psb_d_nested_copt + integer(psb_ipk_) :: field + integer(psb_ipk_) :: what + character(len=64) :: val + end type psb_d_nested_copt + + type :: psb_d_nested_block_prec + character(len=16) :: ptype + class(psb_d_base_prec_type), allocatable :: pc + end type psb_d_nested_block_prec + + type :: psb_d_nested_krylov_context + logical :: enabled + character(len=16) :: method + integer(psb_ipk_) :: itmax + integer(psb_ipk_) :: itrace + integer(psb_ipk_) :: istop + real(psb_dpk_) :: tol + end type psb_d_nested_krylov_context + + type, extends(psb_d_base_prec_type) :: psb_d_nested_prec_type + integer(psb_ipk_) :: composition + character(len=32) :: composition_name + character(len=16) :: default_block_ptype + integer(psb_ipk_) :: schur_solve + character(len=32) :: schur_solve_name + integer(psb_ipk_) :: schur_maxit + real(psb_dpk_) :: schur_tol + type(psb_d_nested_krylov_context) :: default_krylov + integer(psb_ipk_) :: nfields + type(psb_d_nest_base_mat), pointer :: nest_op + type(psb_d_nested_block_prec), allocatable :: blocks(:) + character(len=16), allocatable :: field_block_ptype(:) + type(psb_d_nested_krylov_context), allocatable :: field_krylov(:) + type(psb_d_nested_iopt), allocatable :: field_iopts(:) + type(psb_d_nested_ropt), allocatable :: field_ropts(:) + type(psb_d_nested_copt), allocatable :: field_copts(:) + contains + procedure, pass(prec) :: d_apply_v => psb_d_nested_apply_vect + procedure, pass(prec) :: d_apply => psb_d_nested_apply + procedure, pass(prec) :: precbld => psb_d_nested_precbld + procedure, pass(prec) :: precinit => psb_d_nested_precinit + procedure, pass(prec) :: precseti => psb_d_nested_precseti + procedure, pass(prec) :: precsetr => psb_d_nested_precsetr + procedure, pass(prec) :: precsetc => psb_d_nested_precsetc + procedure, pass(prec) :: precdescr => psb_d_nested_precdescr + procedure, pass(prec) :: dump => psb_d_nested_dump + procedure, pass(prec) :: clone => psb_d_nested_clone + procedure, pass(prec) :: free => psb_d_nested_precfree + procedure, pass(prec) :: sizeof => psb_d_nested_sizeof + procedure, pass(prec) :: get_nzeros => psb_d_nested_get_nzeros + end type psb_d_nested_prec_type + + private :: psb_d_nested_apply_vect, psb_d_nested_apply, psb_d_nested_precbld, & + & psb_d_nested_precinit, psb_d_nested_precfree, psb_d_nested_precdescr, & + & psb_d_nested_dump, psb_d_nested_clone, psb_d_nested_sizeof, & + & psb_d_nested_get_nzeros, psb_d_nested_precseti, & + & psb_d_nested_precsetr, psb_d_nested_precsetc, & + & psb_d_nested_clear_built, psb_d_nested_valid_block_solve, & + & psb_d_nested_get_field_block_ptype, psb_d_nested_field_solve, & + & psb_d_nested_apply_schur, psb_d_nested_replay_field_options, & + & psb_d_nested_append_iopt, psb_d_nested_append_ropt, & + & psb_d_nested_append_copt, psb_d_nested_schur_action, & + & psb_d_nested_schur_solve, psb_d_nested_get_field_krylov, & + & psb_d_nested_inner_solve, psb_d_nested_inner_cg, psb_d_nested_inner_bicgstab, & + & psb_d_nested_field_matvec, psb_d_nested_ensure_krylov_field + +contains + + ! Reset nested preconditioner options and detach any previous field state. + subroutine psb_d_nested_precinit(prec, info) + class(psb_d_nested_prec_type), intent(inout) :: prec + integer(psb_ipk_), intent(out) :: info + + info = psb_success_ + prec%composition = psb_d_nested_add_ + prec%composition_name = 'ADDITIVE' + prec%default_block_ptype = 'DIAG' + prec%schur_solve = psb_d_nested_schur_a22_ + prec%schur_solve_name = 'A22' + prec%schur_maxit = 4 + prec%schur_tol = 0.0_psb_dpk_ + prec%default_krylov%enabled = .false. + prec%default_krylov%method = 'CG' + prec%default_krylov%itmax = 20 + prec%default_krylov%itrace = -1 + prec%default_krylov%istop = 2 + prec%default_krylov%tol = 1.0e-6_psb_dpk_ + prec%nfields = 0 + prec%nest_op => null() + if (allocated(prec%field_block_ptype)) deallocate(prec%field_block_ptype) + if (allocated(prec%field_krylov)) deallocate(prec%field_krylov) + if (allocated(prec%field_iopts)) deallocate(prec%field_iopts) + if (allocated(prec%field_ropts)) deallocate(prec%field_ropts) + if (allocated(prec%field_copts)) deallocate(prec%field_copts) + end subroutine psb_d_nested_precinit + + ! Release built field preconditioners while preserving user configuration. + subroutine psb_d_nested_clear_built(prec, info) + class(psb_d_nested_prec_type), intent(inout) :: prec + integer(psb_ipk_), intent(out) :: info + integer(psb_ipk_) :: i, local_info + + info = psb_success_ + if (allocated(prec%blocks)) then + do i = 1, size(prec%blocks) + if (allocated(prec%blocks(i)%pc)) then + call prec%blocks(i)%pc%free(local_info) + deallocate(prec%blocks(i)%pc, stat=local_info) + end if + end do + deallocate(prec%blocks, stat=local_info) + if (local_info /= 0 .and. info == psb_success_) info = local_info + end if + prec%nfields = 0 + prec%nest_op => null() + end subroutine psb_d_nested_clear_built + + ! Fully release nested preconditioner storage and pending configuration. + subroutine psb_d_nested_precfree(prec, info) + class(psb_d_nested_prec_type), intent(inout) :: prec + integer(psb_ipk_), intent(out) :: info + integer(psb_ipk_) :: local_info + + call psb_d_nested_clear_built(prec, info) + if (allocated(prec%field_block_ptype)) then + deallocate(prec%field_block_ptype, stat=local_info) + if (local_info /= 0 .and. info == psb_success_) info = local_info + end if + if (allocated(prec%field_krylov)) then + deallocate(prec%field_krylov, stat=local_info) + if (local_info /= 0 .and. info == psb_success_) info = local_info + end if + if (allocated(prec%field_iopts)) deallocate(prec%field_iopts, stat=local_info) + if (allocated(prec%field_ropts)) deallocate(prec%field_ropts, stat=local_info) + if (allocated(prec%field_copts)) deallocate(prec%field_copts, stat=local_info) + end subroutine psb_d_nested_precfree + + ! Set integer-valued nested options or record integer field-block options. + subroutine psb_d_nested_precseti(prec, what, val, info) + class(psb_d_nested_prec_type), intent(inout) :: prec + integer(psb_ipk_), intent(in) :: what, val + integer(psb_ipk_), intent(out) :: info + + info = psb_success_ + select case (what) + case (psb_d_nested_schur_maxit_) + prec%schur_maxit = max(0, val) + case (psb_d_nested_inner_maxit_, psb_d_nested_inner_itrace_, & + & psb_d_nested_inner_istop_) + call psb_d_nested_set_field_krylov_i(prec, 0, what, val, info) + case default + call psb_d_nested_append_iopt(prec, 0, what, val, info) + end select + end subroutine psb_d_nested_precseti + + ! Set real-valued nested options or record real field-block options. + subroutine psb_d_nested_precsetr(prec, what, val, info) + class(psb_d_nested_prec_type), intent(inout) :: prec + integer(psb_ipk_), intent(in) :: what + real(psb_dpk_), intent(in) :: val + integer(psb_ipk_), intent(out) :: info + + info = psb_success_ + select case (what) + case (psb_d_nested_schur_tol_) + prec%schur_tol = max(dzero, val) + case (psb_d_nested_inner_tol_) + call psb_d_nested_set_field_krylov_r(prec, 0, what, val, info) + case default + call psb_d_nested_append_ropt(prec, 0, what, val, info) + end select + end subroutine psb_d_nested_precsetr + + ! Set character-valued nested options or record character field-block options. + subroutine psb_d_nested_precsetc(prec, what, val, info) + class(psb_d_nested_prec_type), intent(inout) :: prec + integer(psb_ipk_), intent(in) :: what + character(len=*), intent(in) :: val + integer(psb_ipk_), intent(out) :: info + + character(len=64) :: opt + + info = psb_success_ + opt = psb_toupper(trim(val)) + select case (what) + case (psb_d_nested_composition_) + select case (trim(opt)) + case ('DIAGONAL','DIAG') + prec%composition = psb_d_nested_diag_ + prec%composition_name = 'DIAGONAL' + case ('ADDITIVE','ADD') + prec%composition = psb_d_nested_add_ + prec%composition_name = 'ADDITIVE' + case ('MULTIPLICATIVE','MULT') + prec%composition = psb_d_nested_mult_ + prec%composition_name = 'MULTIPLICATIVE' + case ('SYMMETRIC_MULTIPLICATIVE','SYMMETRIC','SYM_MULT','SYMM') + prec%composition = psb_d_nested_symm_ + prec%composition_name = 'SYMMETRIC_MULTIPLICATIVE' + case ('SCHUR_LOWER','LOWER_SCHUR') + prec%composition = psb_d_nested_schur_lower_ + prec%composition_name = 'SCHUR_LOWER' + case ('SCHUR_UPPER','UPPER_SCHUR') + prec%composition = psb_d_nested_schur_upper_ + prec%composition_name = 'SCHUR_UPPER' + case ('SCHUR','SCHUR_FULL','FULL_SCHUR') + prec%composition = psb_d_nested_schur_full_ + prec%composition_name = 'SCHUR_FULL' + case default + info = psb_err_invalid_input_ + end select + case (psb_d_nested_block_solve_) + if (psb_d_nested_valid_block_solve(opt)) then + prec%default_block_ptype = trim(opt) + else + info = psb_err_invalid_preca_ + end if + case (psb_d_nested_schur_solve_) + select case (trim(opt)) + case ('A22','A_22','FIELD','FIELD_BLOCK') + prec%schur_solve = psb_d_nested_schur_a22_ + prec%schur_solve_name = 'A22' + case ('MATRIX_FREE','MATFREE','MF','SELF') + prec%schur_solve = psb_d_nested_schur_mf_ + prec%schur_solve_name = 'MATRIX_FREE' + case default + info = psb_err_invalid_input_ + end select + case (psb_d_nested_inner_solve_) + call psb_d_nested_set_field_krylov_c(prec, 0, what, trim(opt), info) + case default + call psb_d_nested_append_copt(prec, 0, what, trim(val), info) + end select + end subroutine psb_d_nested_precsetc + + ! Build one field preconditioner for each diagonal block of a nested matrix. + subroutine psb_d_nested_precbld(a, desc_a, prec, info, amold, vmold, imold) + type(psb_dspmat_type), intent(in), target :: a + type(psb_desc_type), intent(inout), target :: desc_a + class(psb_d_nested_prec_type), intent(inout) :: prec + integer(psb_ipk_), intent(out) :: info + class(psb_d_base_sparse_mat), intent(in), optional :: amold + class(psb_d_base_vect_type), intent(in), optional :: vmold + class(psb_i_base_vect_type), intent(in), optional :: imold + + integer(psb_ipk_) :: i, nfields + type(psb_dspmat_type), pointer :: block_ptr + type(psb_desc_type), pointer :: field_desc + character(len=24) :: name + + info = psb_success_ + name = 'd_nested_precbld' + call prec%set_ctxt(desc_a%get_ctxt()) + + call psb_d_nested_clear_built(prec, info) + if (info /= psb_success_) return + call prec%set_ctxt(desc_a%get_ctxt()) + + if (.not. allocated(a%a)) then + info = psb_err_invalid_mat_state_ + call psb_errpush(info, name, a_err='matrix storage not allocated') + return + end if + + select type (ap => a%a) + type is (psb_d_nest_base_mat) + prec%nest_op => ap + class default + info = psb_err_invalid_input_ + call psb_errpush(info, name, a_err='NEST preconditioner requires NEST matrix') + return + end select + + nfields = psb_d_nest_get_n_fields(prec%nest_op) + if (nfields <= 0) then + info = psb_err_invalid_mat_state_ + call psb_errpush(info, name, a_err='nested matrix has no fields') + return + end if + + prec%nfields = nfields + allocate(prec%blocks(nfields), stat=info) + if (info /= 0) then + info = psb_err_alloc_dealloc_ + call psb_errpush(info, name, a_err='blocks') + return + end if + + do i = 1, nfields + field_desc => psb_d_nest_get_field_desc(prec%nest_op, i) + if (.not. associated(field_desc)) then + info = psb_err_invalid_mat_state_ + call psb_errpush(info, name, a_err='missing field descriptor') + return + end if + + block_ptr => psb_d_nest_get_block(prec%nest_op, i, i) + if (associated(block_ptr)) then + prec%blocks(i)%ptype = psb_d_nested_get_field_block_ptype(prec, i) + else + prec%blocks(i)%ptype = 'NONE' + end if + call psb_d_nested_alloc_block_pc(prec%blocks(i), info) + if (info /= psb_success_) return + call prec%blocks(i)%pc%precinit(info) + if (info /= psb_success_) return + call psb_d_nested_replay_field_options(prec, i, info) + if (info /= psb_success_) return + + if (associated(block_ptr)) then + call prec%blocks(i)%pc%precbld(block_ptr, field_desc, info, & + & amold=amold, vmold=vmold, imold=imold) + else + call prec%blocks(i)%pc%set_ctxt(field_desc%get_ctxt()) + end if + if (info /= psb_success_) return + end do + end subroutine psb_d_nested_precbld + + ! Allocate the concrete PSBLAS preconditioner requested for one field block. + subroutine psb_d_nested_alloc_block_pc(block, info) + type(psb_d_nested_block_prec), intent(inout) :: block + integer(psb_ipk_), intent(out) :: info + + info = psb_success_ + if (allocated(block%pc)) then + call block%pc%free(info) + if (info == psb_success_) deallocate(block%pc, stat=info) + if (info /= psb_success_) return + end if + select case (psb_toupper(trim(block%ptype))) + case ('NONE','NOPREC') + allocate(psb_d_null_prec_type :: block%pc, stat=info) + case ('DIAG') + allocate(psb_d_diag_prec_type :: block%pc, stat=info) + case ('BJAC') + allocate(psb_d_bjac_prec_type :: block%pc, stat=info) + case default + info = psb_err_invalid_preca_ + end select + end subroutine psb_d_nested_alloc_block_pc + + logical function psb_d_nested_valid_block_solve(ptype) result(valid) + character(len=*), intent(in) :: ptype + character(len=32) :: opt + + opt = psb_toupper(trim(ptype)) + select case (trim(opt)) + case ('NONE','NOPREC','DIAG','BJAC') + valid = .true. + case default + valid = .false. + end select + end function psb_d_nested_valid_block_solve + + function psb_d_nested_get_field_block_ptype(prec, field) result(ptype) + class(psb_d_nested_prec_type), intent(in) :: prec + integer(psb_ipk_), intent(in) :: field + character(len=16) :: ptype + + ptype = prec%default_block_ptype + if (allocated(prec%field_block_ptype)) then + if (field <= size(prec%field_block_ptype)) then + if (len_trim(prec%field_block_ptype(field)) > 0) & + & ptype = prec%field_block_ptype(field) + end if + end if + end function psb_d_nested_get_field_block_ptype + + ! Store a per-field block preconditioner type override. + subroutine psb_d_nested_set_block_solve_field(prec, field, ptype, info) + class(psb_d_nested_prec_type), intent(inout) :: prec + integer(psb_ipk_), intent(in) :: field + character(len=*), intent(in) :: ptype + integer(psb_ipk_), intent(out) :: info + + character(len=16), allocatable :: tmp(:) + character(len=32) :: opt + integer(psb_ipk_) :: old_size + + info = psb_success_ + opt = psb_toupper(trim(ptype)) + if (field <= 0) then + info = psb_err_invalid_input_ + call psb_errpush(info, 'd_nested_set_block_solve_field', a_err='field index') + return + end if + if (.not. psb_d_nested_valid_block_solve(opt)) then + info = psb_err_invalid_preca_ + call psb_errpush(info, 'd_nested_set_block_solve_field', a_err='block solve') + return + end if + + old_size = 0 + if (allocated(prec%field_block_ptype)) old_size = size(prec%field_block_ptype) + if (field > old_size) then + allocate(tmp(field), stat=info) + if (info /= 0) then + info = psb_err_alloc_dealloc_ + call psb_errpush(info, 'd_nested_set_block_solve_field', a_err='field block options') + return + end if + tmp(:) = '' + if (old_size > 0) tmp(1:old_size) = prec%field_block_ptype(1:old_size) + call move_alloc(tmp, prec%field_block_ptype) + end if + prec%field_block_ptype(field) = trim(opt) + end subroutine psb_d_nested_set_block_solve_field + + subroutine psb_d_nested_ensure_krylov_field(prec, field, info) + class(psb_d_nested_prec_type), intent(inout) :: prec + integer(psb_ipk_), intent(in) :: field + integer(psb_ipk_), intent(out) :: info + + type(psb_d_nested_krylov_context), allocatable :: tmp(:) + integer(psb_ipk_) :: old_size + + info = psb_success_ + if (field <= 0) return + + old_size = 0 + if (allocated(prec%field_krylov)) old_size = size(prec%field_krylov) + if (field > old_size) then + allocate(tmp(field), stat=info) + if (info /= 0) then + info = psb_err_alloc_dealloc_ + call psb_errpush(info, 'd_nested_ensure_krylov_field', a_err='field krylov options') + return + end if + tmp(:) = prec%default_krylov + if (old_size > 0) tmp(1:old_size) = prec%field_krylov(1:old_size) + call move_alloc(tmp, prec%field_krylov) + end if + end subroutine psb_d_nested_ensure_krylov_field + + subroutine psb_d_nested_set_field_krylov_c(prec, field, what, val, info) + class(psb_d_nested_prec_type), intent(inout) :: prec + integer(psb_ipk_), intent(in) :: field, what + character(len=*), intent(in) :: val + integer(psb_ipk_), intent(out) :: info + + character(len=32) :: opt + + info = psb_success_ + opt = psb_toupper(trim(val)) + if (field < 0) then + info = psb_err_invalid_input_ + call psb_errpush(info, 'd_nested_set_field_krylov_c', a_err='field index') + return + end if + if (what /= psb_d_nested_inner_solve_) then + info = psb_err_invalid_input_ + call psb_errpush(info, 'd_nested_set_field_krylov_c', a_err='inner option') + return + end if + + select case (trim(opt)) + case ('NONE','NO','OFF','FALSE') + if (field == 0) then + prec%default_krylov%enabled = .false. + else + call psb_d_nested_ensure_krylov_field(prec, field, info) + if (info == psb_success_) prec%field_krylov(field)%enabled = .false. + end if + case ('CG','BICGSTAB','BICGstab','BiCGSTAB') + if (field == 0) then + prec%default_krylov%enabled = .true. + prec%default_krylov%method = trim(opt) + else + call psb_d_nested_ensure_krylov_field(prec, field, info) + if (info == psb_success_) then + prec%field_krylov(field)%enabled = .true. + prec%field_krylov(field)%method = trim(opt) + end if + end if + case default + info = psb_err_invalid_input_ + call psb_errpush(info, 'd_nested_set_field_krylov_c', a_err='inner method') + end select + end subroutine psb_d_nested_set_field_krylov_c + + subroutine psb_d_nested_set_field_krylov_i(prec, field, what, val, info) + class(psb_d_nested_prec_type), intent(inout) :: prec + integer(psb_ipk_), intent(in) :: field, what, val + integer(psb_ipk_), intent(out) :: info + + info = psb_success_ + if (field < 0) then + info = psb_err_invalid_input_ + call psb_errpush(info, 'd_nested_set_field_krylov_i', a_err='field index') + return + end if + + if (field > 0) then + call psb_d_nested_ensure_krylov_field(prec, field, info) + if (info /= psb_success_) return + end if + + select case (what) + case (psb_d_nested_inner_maxit_) + if (field == 0) then + prec%default_krylov%itmax = max(1, val) + else + prec%field_krylov(field)%itmax = max(1, val) + end if + case (psb_d_nested_inner_itrace_) + if (field == 0) then + prec%default_krylov%itrace = val + else + prec%field_krylov(field)%itrace = val + end if + case (psb_d_nested_inner_istop_) + if (field == 0) then + prec%default_krylov%istop = val + else + prec%field_krylov(field)%istop = val + end if + case default + info = psb_err_invalid_input_ + call psb_errpush(info, 'd_nested_set_field_krylov_i', a_err='inner option') + end select + end subroutine psb_d_nested_set_field_krylov_i + + subroutine psb_d_nested_set_field_krylov_r(prec, field, what, val, info) + class(psb_d_nested_prec_type), intent(inout) :: prec + integer(psb_ipk_), intent(in) :: field, what + real(psb_dpk_), intent(in) :: val + integer(psb_ipk_), intent(out) :: info + + info = psb_success_ + if (field < 0) then + info = psb_err_invalid_input_ + call psb_errpush(info, 'd_nested_set_field_krylov_r', a_err='field index') + return + end if + if (what /= psb_d_nested_inner_tol_) then + info = psb_err_invalid_input_ + call psb_errpush(info, 'd_nested_set_field_krylov_r', a_err='inner option') + return + end if + + if (field == 0) then + prec%default_krylov%tol = max(dzero, val) + else + call psb_d_nested_ensure_krylov_field(prec, field, info) + if (info == psb_success_) prec%field_krylov(field)%tol = max(dzero, val) + end if + end subroutine psb_d_nested_set_field_krylov_r + + function psb_d_nested_get_field_krylov(prec, field) result(ctx) + class(psb_d_nested_prec_type), intent(in) :: prec + integer(psb_ipk_), intent(in) :: field + type(psb_d_nested_krylov_context) :: ctx + + ctx = prec%default_krylov + if (allocated(prec%field_krylov)) then + if (field <= size(prec%field_krylov)) ctx = prec%field_krylov(field) + end if + end function psb_d_nested_get_field_krylov + + ! Append a pending integer option for all fields or one selected field. + subroutine psb_d_nested_append_iopt(prec, field, what, val, info) + class(psb_d_nested_prec_type), intent(inout) :: prec + integer(psb_ipk_), intent(in) :: field, what, val + integer(psb_ipk_), intent(out) :: info + + type(psb_d_nested_iopt), allocatable :: tmp(:) + integer(psb_ipk_) :: n + + info = psb_success_ + n = 0 + if (allocated(prec%field_iopts)) n = size(prec%field_iopts) + allocate(tmp(n+1), stat=info) + if (info /= 0) then + info = psb_err_alloc_dealloc_ + call psb_errpush(info, 'd_nested_append_iopt', a_err='field integer options') + return + end if + if (n > 0) tmp(1:n) = prec%field_iopts(1:n) + tmp(n+1)%field = field + tmp(n+1)%what = what + tmp(n+1)%val = val + call move_alloc(tmp, prec%field_iopts) + end subroutine psb_d_nested_append_iopt + + ! Append a pending real option for all fields or one selected field. + subroutine psb_d_nested_append_ropt(prec, field, what, val, info) + class(psb_d_nested_prec_type), intent(inout) :: prec + integer(psb_ipk_), intent(in) :: field, what + real(psb_dpk_), intent(in) :: val + integer(psb_ipk_), intent(out) :: info + + type(psb_d_nested_ropt), allocatable :: tmp(:) + integer(psb_ipk_) :: n + + info = psb_success_ + n = 0 + if (allocated(prec%field_ropts)) n = size(prec%field_ropts) + allocate(tmp(n+1), stat=info) + if (info /= 0) then + info = psb_err_alloc_dealloc_ + call psb_errpush(info, 'd_nested_append_ropt', a_err='field real options') + return + end if + if (n > 0) tmp(1:n) = prec%field_ropts(1:n) + tmp(n+1)%field = field + tmp(n+1)%what = what + tmp(n+1)%val = val + call move_alloc(tmp, prec%field_ropts) + end subroutine psb_d_nested_append_ropt + + ! Append a pending character option for all fields or one selected field. + subroutine psb_d_nested_append_copt(prec, field, what, val, info) + class(psb_d_nested_prec_type), intent(inout) :: prec + integer(psb_ipk_), intent(in) :: field, what + character(len=*), intent(in) :: val + integer(psb_ipk_), intent(out) :: info + + type(psb_d_nested_copt), allocatable :: tmp(:) + integer(psb_ipk_) :: n + + info = psb_success_ + n = 0 + if (allocated(prec%field_copts)) n = size(prec%field_copts) + allocate(tmp(n+1), stat=info) + if (info /= 0) then + info = psb_err_alloc_dealloc_ + call psb_errpush(info, 'd_nested_append_copt', a_err='field character options') + return + end if + if (n > 0) tmp(1:n) = prec%field_copts(1:n) + tmp(n+1)%field = field + tmp(n+1)%what = what + tmp(n+1)%val = trim(val) + call move_alloc(tmp, prec%field_copts) + end subroutine psb_d_nested_append_copt + + ! Record an integer option to replay on a field block preconditioner. + subroutine psb_d_nested_set_field_precseti(prec, field, what, val, info) + class(psb_d_nested_prec_type), intent(inout) :: prec + integer(psb_ipk_), intent(in) :: field, what, val + integer(psb_ipk_), intent(out) :: info + + info = psb_success_ + if (field < 0) then + info = psb_err_invalid_input_ + call psb_errpush(info, 'd_nested_set_field_precseti', a_err='field index') + return + end if + call psb_d_nested_append_iopt(prec, field, what, val, info) + end subroutine psb_d_nested_set_field_precseti + + ! Record a real option to replay on a field block preconditioner. + subroutine psb_d_nested_set_field_precsetr(prec, field, what, val, info) + class(psb_d_nested_prec_type), intent(inout) :: prec + integer(psb_ipk_), intent(in) :: field, what + real(psb_dpk_), intent(in) :: val + integer(psb_ipk_), intent(out) :: info + + info = psb_success_ + if (field < 0) then + info = psb_err_invalid_input_ + call psb_errpush(info, 'd_nested_set_field_precsetr', a_err='field index') + return + end if + call psb_d_nested_append_ropt(prec, field, what, val, info) + end subroutine psb_d_nested_set_field_precsetr + + ! Record a character option to replay on a field block preconditioner. + subroutine psb_d_nested_set_field_precsetc(prec, field, what, val, info) + class(psb_d_nested_prec_type), intent(inout) :: prec + integer(psb_ipk_), intent(in) :: field, what + character(len=*), intent(in) :: val + integer(psb_ipk_), intent(out) :: info + + info = psb_success_ + if (field < 0) then + info = psb_err_invalid_input_ + call psb_errpush(info, 'd_nested_set_field_precsetc', a_err='field index') + return + end if + call psb_d_nested_append_copt(prec, field, what, val, info) + end subroutine psb_d_nested_set_field_precsetc + + ! Apply stored field-specific options to a built field block preconditioner. + subroutine psb_d_nested_replay_field_options(prec, field, info) + class(psb_d_nested_prec_type), intent(inout) :: prec + integer(psb_ipk_), intent(in) :: field + integer(psb_ipk_), intent(out) :: info + + integer(psb_ipk_) :: k + + info = psb_success_ + if (.not. allocated(prec%blocks(field)%pc)) return + + if (allocated(prec%field_iopts)) then + do k = 1, size(prec%field_iopts) + if ((prec%field_iopts(k)%field == 0) .or. (prec%field_iopts(k)%field == field)) then + call prec%blocks(field)%pc%precset(prec%field_iopts(k)%what, & + & prec%field_iopts(k)%val, info) + if (info /= psb_success_) return + end if + end do + end if + + if (allocated(prec%field_ropts)) then + do k = 1, size(prec%field_ropts) + if ((prec%field_ropts(k)%field == 0) .or. (prec%field_ropts(k)%field == field)) then + call prec%blocks(field)%pc%precset(prec%field_ropts(k)%what, & + & prec%field_ropts(k)%val, info) + if (info /= psb_success_) return + end if + end do + end if + + if (allocated(prec%field_copts)) then + do k = 1, size(prec%field_copts) + if ((prec%field_copts(k)%field == 0) .or. (prec%field_copts(k)%field == field)) then + call prec%blocks(field)%pc%precset(prec%field_copts(k)%what, & + & trim(prec%field_copts(k)%val), info) + if (info /= psb_success_) return + end if + end do + end if + end subroutine psb_d_nested_replay_field_options + + ! Apply the nested preconditioner to PSBLAS vector objects. + subroutine psb_d_nested_apply_vect(alpha, prec, x, beta, y, desc_data, info, trans) + type(psb_desc_type), intent(in) :: desc_data + class(psb_d_nested_prec_type), intent(inout) :: prec + type(psb_d_vect_type), intent(inout) :: x + real(psb_dpk_), intent(in) :: alpha, beta + type(psb_d_vect_type), intent(inout) :: y + integer(psb_ipk_), intent(out) :: info + character(len=1), optional :: trans + + real(psb_dpk_), allocatable :: xbuf(:), ybuf(:) + integer(psb_ipk_) :: ncol + + info = psb_success_ + ncol = desc_data%get_local_cols() + xbuf = x%get_vect(ncol) + allocate(ybuf(max(y%get_nrows(), ncol)), stat=info) + if (info /= 0) then + info = psb_err_alloc_dealloc_ + return + end if + ybuf(:) = dzero + call psb_d_nested_apply(alpha, prec, xbuf, beta, ybuf, desc_data, info, trans) + if (info == psb_success_) call y%set(ybuf(1:y%get_nrows())) + deallocate(ybuf) + end subroutine psb_d_nested_apply_vect + + ! Apply the selected nested composition to raw array vectors. + subroutine psb_d_nested_apply(alpha, prec, x, beta, y, desc_data, info, trans, work) + type(psb_desc_type), intent(in) :: desc_data + class(psb_d_nested_prec_type), intent(inout) :: prec + real(psb_dpk_), intent(inout) :: x(:) + real(psb_dpk_), intent(in) :: alpha, beta + real(psb_dpk_), intent(inout) :: y(:) + integer(psb_ipk_), intent(out) :: info + character(len=1), optional :: trans + real(psb_dpk_), intent(inout), optional, target :: work(:) + + real(psb_dpk_), allocatable :: z(:) + integer(psb_ipk_) :: nrow, ncol + character :: trans_ + character(len=24) :: name + + info = psb_success_ + name = 'd_nested_apply' + trans_ = 'N' + if (present(trans)) trans_ = psb_toupper(trans) + if (.not. associated(prec%nest_op) .or. .not. allocated(prec%blocks)) then + info = 1124 + call psb_errpush(info, name, a_err='nested preconditioner') + return + end if + + nrow = desc_data%get_local_rows() + ncol = desc_data%get_local_cols() + if (size(x) < nrow .or. size(y) < nrow) then + info = psb_err_invalid_input_ + call psb_errpush(info, name, a_err='vector too small') + return + end if + allocate(z(max(nrow,ncol)), stat=info) + if (info /= 0) then + info = psb_err_alloc_dealloc_ + call psb_errpush(info, name, a_err='work vector') + return + end if + z(:) = dzero + + if (trans_ /= 'N') then + info = psb_err_transpose_not_n_unsupported_ + call psb_errpush(info, name) + else + select case (prec%composition) + case (psb_d_nested_diag_, psb_d_nested_add_) + call psb_d_nested_add_apply(prec, x, z, desc_data, info) + case (psb_d_nested_mult_) + call psb_d_nested_sweep(prec, x, z, desc_data, 1, prec%nfields, 1, info) + case (psb_d_nested_symm_) + call psb_d_nested_sweep(prec, x, z, desc_data, 1, prec%nfields, 1, info) + if (info == psb_success_) & + & call psb_d_nested_sweep(prec, x, z, desc_data, prec%nfields, 1, -1, info) + case (psb_d_nested_schur_lower_, psb_d_nested_schur_upper_, psb_d_nested_schur_full_) + call psb_d_nested_apply_schur(prec, x, z, desc_data, info) + case default + info = psb_err_invalid_input_ + call psb_errpush(info, name, a_err='unknown composition') + end select + end if + if (info /= psb_success_) then + deallocate(z) + return + end if + + if (beta == dzero) then + y(1:nrow) = alpha * z(1:nrow) + else + y(1:nrow) = alpha * z(1:nrow) + beta * y(1:nrow) + end if + deallocate(z) + end subroutine psb_d_nested_apply + + ! Apply the additive or diagonal field-split composition. + subroutine psb_d_nested_add_apply(prec, x, z, desc_data, info) + class(psb_d_nested_prec_type), intent(inout) :: prec + real(psb_dpk_), intent(in) :: x(:) + real(psb_dpk_), intent(inout) :: z(:) + type(psb_desc_type), intent(in) :: desc_data + integer(psb_ipk_), intent(out) :: info + + integer(psb_ipk_) :: i, n_owned, n_col_field + real(psb_dpk_), allocatable :: rhs(:), sol(:), wrk(:) + type(psb_desc_type), pointer :: field_desc + + info = psb_success_ + z(:) = dzero + + do i = 1, prec%nfields + n_owned = psb_d_nest_get_field_owned(prec%nest_op, i) + field_desc => psb_d_nest_get_field_desc(prec%nest_op, i) + + if (.not. associated(field_desc)) then + info = psb_err_invalid_mat_state_ + call psb_errpush(info, 'd_nested_add_apply', a_err='missing field descriptor') + return + end if + + n_col_field = field_desc%get_local_cols() + + allocate(rhs(n_col_field), sol(n_col_field), wrk(n_col_field), stat=info) + if (info /= 0) then + info = psb_err_alloc_dealloc_ + call psb_errpush(info, 'd_nested_add_apply', a_err='field vectors') + return + end if + + rhs(:) = dzero + sol(:) = dzero + + call psb_d_nest_restrict_field(prec%nest_op, i, x, rhs, info) + + if (info == psb_success_) then + call prec%blocks(i)%pc%apply(done, rhs, dzero, sol, field_desc, info, trans='N', work=wrk) + end if + + if (info == psb_success_) then + call psb_d_nest_prolong_field(prec%nest_op, i, sol, z, info) + end if + + deallocate(rhs, sol, wrk) + + if (info /= psb_success_) return + end do + + end subroutine psb_d_nested_add_apply + + ! Apply one forward or backward multiplicative field sweep. + subroutine psb_d_nested_sweep(prec, x, z, desc_data, first, last, step, info) + class(psb_d_nested_prec_type), intent(inout) :: prec + real(psb_dpk_), intent(in) :: x(:) + real(psb_dpk_), intent(inout) :: z(:) + type(psb_desc_type), intent(in) :: desc_data + integer(psb_ipk_), intent(in) :: first, last, step + integer(psb_ipk_), intent(out) :: info + + integer(psb_ipk_) :: i, j, n_owned, n_col_i, n_col_j + real(psb_dpk_), allocatable :: rhs(:), sol(:), z_field(:), wrk(:) + type(psb_desc_type), pointer :: field_desc_i, field_desc_j + type(psb_dspmat_type), pointer :: block_ptr + + info = psb_success_ + + call psb_halo(z, desc_data, info) + if (info /= psb_success_) return + + i = first + do + n_owned = psb_d_nest_get_field_owned(prec%nest_op, i) + field_desc_i => psb_d_nest_get_field_desc(prec%nest_op, i) + + if (.not. associated(field_desc_i)) then + info = psb_err_invalid_mat_state_ + call psb_errpush(info, 'd_nested_sweep', a_err='missing row field descriptor') + return + end if + + n_col_i = field_desc_i%get_local_cols() + + allocate(rhs(n_col_i), sol(n_col_i), wrk(n_col_i), stat=info) + if (info /= 0) then + info = psb_err_alloc_dealloc_ + call psb_errpush(info, 'd_nested_sweep', a_err='field vectors') + return + end if + + rhs(:) = dzero + sol(:) = dzero + + call psb_d_nest_restrict_field(prec%nest_op, i, x, rhs, info) + if (info /= psb_success_) then + deallocate(rhs, sol, wrk) + return + end if + + do j = 1, prec%nfields + if (j == i) cycle + + block_ptr => psb_d_nest_get_block(prec%nest_op, i, j) + if (.not. associated(block_ptr)) cycle + + field_desc_j => psb_d_nest_get_field_desc(prec%nest_op, j) + if (.not. associated(field_desc_j)) then + info = psb_err_invalid_mat_state_ + call psb_errpush(info, 'd_nested_sweep', a_err='missing column field descriptor') + deallocate(rhs, sol, wrk) + return + end if + + n_col_j = field_desc_j%get_local_cols() + + allocate(z_field(n_col_j), stat=info) + if (info /= 0) then + info = psb_err_alloc_dealloc_ + call psb_errpush(info, 'd_nested_sweep', a_err='offdiag field vector') + deallocate(rhs, sol, wrk) + return + end if + + z_field(:) = dzero + + call psb_d_nest_restrict_field_local(prec%nest_op, j, z, z_field, info) + + if (info == psb_success_) then + call psb_d_nest_apply_block(prec%nest_op, i, j, -done, z_field, done, rhs, info) + end if + + deallocate(z_field) + + if (info /= psb_success_) then + deallocate(rhs, sol, wrk) + return + end if + end do + + call prec%blocks(i)%pc%apply(done, rhs, dzero, sol, field_desc_i, info, trans='N', work=wrk) + + if (info == psb_success_) then + call psb_d_nest_prolong_field(prec%nest_op, i, sol, z, info) + end if + + deallocate(rhs, sol, wrk) + + if (info /= psb_success_) return + + call psb_halo(z, desc_data, info) + if (info /= psb_success_) return + + if (i == last) exit + i = i + step + end do + end subroutine psb_d_nested_sweep + + ! Apply one field block preconditioner to a field-local right-hand side. + subroutine psb_d_nested_field_solve(prec, field, rhs, sol, info) + class(psb_d_nested_prec_type), intent(inout) :: prec + integer(psb_ipk_), intent(in) :: field + real(psb_dpk_), intent(inout) :: rhs(:) + real(psb_dpk_), intent(inout) :: sol(:) + integer(psb_ipk_), intent(out) :: info + + real(psb_dpk_), allocatable :: wrk(:) + type(psb_desc_type), pointer :: field_desc + type(psb_d_nested_krylov_context) :: kctx + integer(psb_ipk_) :: n_col_field + + info = psb_success_ + field_desc => psb_d_nest_get_field_desc(prec%nest_op, field) + if (.not. associated(field_desc)) then + info = psb_err_invalid_mat_state_ + call psb_errpush(info, 'd_nested_field_solve', a_err='missing field descriptor') + return + end if + + n_col_field = field_desc%get_local_cols() + kctx = prec%default_krylov + if (allocated(prec%field_krylov)) then + if (field <= size(prec%field_krylov)) kctx = prec%field_krylov(field) + end if + if (kctx%enabled) then + call psb_d_nested_inner_solve(prec, field, kctx, rhs, sol, info) + return + end if + + allocate(wrk(n_col_field), stat=info) + if (info /= 0) then + info = psb_err_alloc_dealloc_ + call psb_errpush(info, 'd_nested_field_solve', a_err='work vector') + return + end if + + sol(:) = dzero + call prec%blocks(field)%pc%apply(done, rhs, dzero, sol, field_desc, info, trans='N', work=wrk) + deallocate(wrk) + end subroutine psb_d_nested_field_solve + + subroutine psb_d_nested_field_matvec(prec, field, x, y, info) + class(psb_d_nested_prec_type), intent(inout) :: prec + integer(psb_ipk_), intent(in) :: field + real(psb_dpk_), intent(inout) :: x(:) + real(psb_dpk_), intent(inout) :: y(:) + integer(psb_ipk_), intent(out) :: info + + real(psb_dpk_), allocatable :: xh(:) + type(psb_desc_type), pointer :: field_desc + type(psb_dspmat_type), pointer :: block_ptr + integer(psb_ipk_) :: n_col_field + + info = psb_success_ + field_desc => psb_d_nest_get_field_desc(prec%nest_op, field) + block_ptr => psb_d_nest_get_block(prec%nest_op, field, field) + if (.not. associated(field_desc)) then + info = psb_err_invalid_mat_state_ + call psb_errpush(info, 'd_nested_field_matvec', a_err='missing field descriptor') + return + end if + if (.not. associated(block_ptr)) then + info = psb_err_invalid_mat_state_ + call psb_errpush(info, 'd_nested_field_matvec', a_err='missing diagonal block') + return + end if + + n_col_field = field_desc%get_local_cols() + allocate(xh(n_col_field), stat=info) + if (info /= 0) then + info = psb_err_alloc_dealloc_ + call psb_errpush(info, 'd_nested_field_matvec', a_err='work vector') + return + end if + xh(:) = dzero + xh(1:min(size(x), n_col_field)) = x(1:min(size(x), n_col_field)) + call psb_halo(xh, field_desc, info) + if (info == psb_success_) then + y(:) = dzero + call psb_d_nest_apply_block(prec%nest_op, field, field, done, xh, dzero, y, info) + end if + deallocate(xh) + end subroutine psb_d_nested_field_matvec + + subroutine psb_d_nested_inner_solve(prec, field, kctx, rhs, sol, info) + class(psb_d_nested_prec_type), intent(inout) :: prec + integer(psb_ipk_), intent(in) :: field + type(psb_d_nested_krylov_context), intent(in) :: kctx + real(psb_dpk_), intent(inout) :: rhs(:) + real(psb_dpk_), intent(inout) :: sol(:) + integer(psb_ipk_), intent(out) :: info + + select case (psb_toupper(trim(kctx%method))) + case ('CG') + call psb_d_nested_inner_cg(prec, field, kctx, rhs, sol, info) + case ('BICGSTAB') + call psb_d_nested_inner_bicgstab(prec, field, kctx, rhs, sol, info) + case default + info = psb_err_invalid_input_ + call psb_errpush(info, 'd_nested_inner_solve', a_err='inner Krylov method') + end select + end subroutine psb_d_nested_inner_solve + + subroutine psb_d_nested_inner_cg(prec, field, kctx, rhs, sol, info) + class(psb_d_nested_prec_type), intent(inout) :: prec + integer(psb_ipk_), intent(in) :: field + type(psb_d_nested_krylov_context), intent(in) :: kctx + real(psb_dpk_), intent(inout) :: rhs(:) + real(psb_dpk_), intent(inout) :: sol(:) + integer(psb_ipk_), intent(out) :: info + + real(psb_dpk_), allocatable :: r(:), z(:), p(:), q(:), wrk(:) + type(psb_desc_type), pointer :: field_desc + integer(psb_ipk_) :: n_col_field, k + real(psb_dpk_) :: bnorm, rnorm, rz, rz_new, denom, alpha, beta + + info = psb_success_ + field_desc => psb_d_nest_get_field_desc(prec%nest_op, field) + if (.not. associated(field_desc)) then + info = psb_err_invalid_mat_state_ + call psb_errpush(info, 'd_nested_inner_cg', a_err='missing field descriptor') + return + end if + n_col_field = field_desc%get_local_cols() + allocate(r(n_col_field), z(n_col_field), p(n_col_field), q(n_col_field), & + & wrk(n_col_field), stat=info) + if (info /= 0) then + info = psb_err_alloc_dealloc_ + call psb_errpush(info, 'd_nested_inner_cg', a_err='inner work vectors') + return + end if + + sol(:) = dzero + r(:) = rhs(:) + bnorm = sqrt(max(dzero, psb_gedot(rhs, rhs, field_desc, info))) + if (info /= psb_success_) goto 100 + if (bnorm == dzero) goto 100 + + z(:) = dzero + call prec%blocks(field)%pc%apply(done, r, dzero, z, field_desc, info, trans='N', work=wrk) + if (info /= psb_success_) goto 100 + p(:) = z(:) + rz = psb_gedot(r, z, field_desc, info) + if (info /= psb_success_) goto 100 + + do k = 1, max(1, kctx%itmax) + q(:) = dzero + call psb_d_nested_field_matvec(prec, field, p, q, info) + if (info /= psb_success_) exit + denom = psb_gedot(p, q, field_desc, info) + if (info /= psb_success_) exit + if (abs(denom) <= epsilon(done)) exit + alpha = rz / denom + sol(:) = sol(:) + alpha * p(:) + r(:) = r(:) - alpha * q(:) + rnorm = sqrt(max(dzero, psb_gedot(r, r, field_desc, info))) + if (info /= psb_success_) exit + if (rnorm <= kctx%tol * bnorm) exit + z(:) = dzero + call prec%blocks(field)%pc%apply(done, r, dzero, z, field_desc, info, trans='N', work=wrk) + if (info /= psb_success_) exit + rz_new = psb_gedot(r, z, field_desc, info) + if (info /= psb_success_) exit + if (abs(rz) <= epsilon(done)) exit + beta = rz_new / rz + p(:) = z(:) + beta * p(:) + rz = rz_new + end do + +100 continue + deallocate(r, z, p, q, wrk) + end subroutine psb_d_nested_inner_cg + + subroutine psb_d_nested_inner_bicgstab(prec, field, kctx, rhs, sol, info) + class(psb_d_nested_prec_type), intent(inout) :: prec + integer(psb_ipk_), intent(in) :: field + type(psb_d_nested_krylov_context), intent(in) :: kctx + real(psb_dpk_), intent(inout) :: rhs(:) + real(psb_dpk_), intent(inout) :: sol(:) + integer(psb_ipk_), intent(out) :: info + + real(psb_dpk_), allocatable :: r(:), r0(:), p(:), v(:), s(:), t(:), ph(:), sh(:), wrk(:) + type(psb_desc_type), pointer :: field_desc + integer(psb_ipk_) :: n_col_field, k + real(psb_dpk_) :: bnorm, rnorm, rho, rho_old, alpha, beta, omega, denom + + info = psb_success_ + field_desc => psb_d_nest_get_field_desc(prec%nest_op, field) + if (.not. associated(field_desc)) then + info = psb_err_invalid_mat_state_ + call psb_errpush(info, 'd_nested_inner_bicgstab', a_err='missing field descriptor') + return + end if + n_col_field = field_desc%get_local_cols() + allocate(r(n_col_field), r0(n_col_field), p(n_col_field), v(n_col_field), & + & s(n_col_field), t(n_col_field), ph(n_col_field), sh(n_col_field), & + & wrk(n_col_field), stat=info) + if (info /= 0) then + info = psb_err_alloc_dealloc_ + call psb_errpush(info, 'd_nested_inner_bicgstab', a_err='inner work vectors') + return + end if + + sol(:) = dzero + r(:) = rhs(:) + r0(:) = r(:) + p(:) = dzero + v(:) = dzero + rho_old = done + alpha = done + omega = done + bnorm = sqrt(max(dzero, psb_gedot(rhs, rhs, field_desc, info))) + if (info /= psb_success_) goto 100 + if (bnorm == dzero) goto 100 + + do k = 1, max(1, kctx%itmax) + rho = psb_gedot(r0, r, field_desc, info) + if (info /= psb_success_) exit + if (abs(rho) <= epsilon(done)) exit + if (k == 1) then + p(:) = r(:) + else + if (abs(omega) <= epsilon(done)) exit + beta = (rho / rho_old) * (alpha / omega) + p(:) = r(:) + beta * (p(:) - omega * v(:)) + end if + + ph(:) = dzero + call prec%blocks(field)%pc%apply(done, p, dzero, ph, field_desc, info, trans='N', work=wrk) + if (info /= psb_success_) exit + v(:) = dzero + call psb_d_nested_field_matvec(prec, field, ph, v, info) + if (info /= psb_success_) exit + denom = psb_gedot(r0, v, field_desc, info) + if (info /= psb_success_) exit + if (abs(denom) <= epsilon(done)) exit + alpha = rho / denom + s(:) = r(:) - alpha * v(:) + rnorm = sqrt(max(dzero, psb_gedot(s, s, field_desc, info))) + if (info /= psb_success_) exit + if (rnorm <= kctx%tol * bnorm) then + sol(:) = sol(:) + alpha * ph(:) + exit + end if + + sh(:) = dzero + call prec%blocks(field)%pc%apply(done, s, dzero, sh, field_desc, info, trans='N', work=wrk) + if (info /= psb_success_) exit + t(:) = dzero + call psb_d_nested_field_matvec(prec, field, sh, t, info) + if (info /= psb_success_) exit + denom = psb_gedot(t, t, field_desc, info) + if (info /= psb_success_) exit + if (abs(denom) <= epsilon(done)) exit + omega = psb_gedot(t, s, field_desc, info) / denom + if (info /= psb_success_) exit + sol(:) = sol(:) + alpha * ph(:) + omega * sh(:) + r(:) = s(:) - omega * t(:) + rnorm = sqrt(max(dzero, psb_gedot(r, r, field_desc, info))) + if (info /= psb_success_) exit + if (rnorm <= kctx%tol * bnorm) exit + rho_old = rho + end do + +100 continue + deallocate(r, r0, p, v, s, t, ph, sh, wrk) + end subroutine psb_d_nested_inner_bicgstab + + ! Apply the matrix-free Schur action Sx on the second field. + subroutine psb_d_nested_schur_action(prec, x2, y2, gwork, desc_data, info) + class(psb_d_nested_prec_type), intent(inout) :: prec + real(psb_dpk_), intent(inout) :: x2(:) + real(psb_dpk_), intent(inout) :: y2(:) + real(psb_dpk_), intent(inout) :: gwork(:) + type(psb_desc_type), intent(in) :: desc_data + integer(psb_ipk_), intent(out) :: info + + real(psb_dpk_), allocatable :: x2h(:), t1(:), w1(:), w1h(:), t2(:) + type(psb_desc_type), pointer :: desc1, desc2 + type(psb_dspmat_type), pointer :: block12, block21, block22 + integer(psb_ipk_) :: n_col_1, n_col_2 + + info = psb_success_ + desc1 => psb_d_nest_get_field_desc(prec%nest_op, 1) + desc2 => psb_d_nest_get_field_desc(prec%nest_op, 2) + if ((.not. associated(desc1)) .or. (.not. associated(desc2))) then + info = psb_err_invalid_mat_state_ + call psb_errpush(info, 'd_nested_schur_action', a_err='missing field descriptor') + return + end if + + n_col_1 = desc1%get_local_cols() + n_col_2 = desc2%get_local_cols() + allocate(x2h(n_col_2), t1(n_col_1), w1(n_col_1), w1h(n_col_1), & + & t2(n_col_2), stat=info) + if (info /= 0) then + info = psb_err_alloc_dealloc_ + call psb_errpush(info, 'd_nested_schur_action', a_err='field vectors') + return + end if + + x2h(:) = dzero + t1(:) = dzero + w1(:) = dzero + w1h(:) = dzero + t2(:) = dzero + y2(:) = dzero + + gwork(:) = dzero + call psb_d_nest_prolong_field(prec%nest_op, 2, x2, gwork, info) + if (info /= psb_success_) goto 100 + call psb_halo(gwork, desc_data, info) + if (info /= psb_success_) goto 100 + call psb_d_nest_restrict_field_local(prec%nest_op, 2, gwork, x2h, info) + if (info /= psb_success_) goto 100 + + block22 => psb_d_nest_get_block(prec%nest_op, 2, 2) + if (associated(block22)) then + call psb_d_nest_apply_block(prec%nest_op, 2, 2, done, x2h, dzero, y2, info) + if (info /= psb_success_) goto 100 + end if + + block12 => psb_d_nest_get_block(prec%nest_op, 1, 2) + block21 => psb_d_nest_get_block(prec%nest_op, 2, 1) + if (associated(block12) .and. associated(block21)) then + call psb_d_nest_apply_block(prec%nest_op, 1, 2, done, x2h, dzero, t1, info) + if (info /= psb_success_) goto 100 + call psb_d_nested_field_solve(prec, 1, t1, w1, info) + if (info /= psb_success_) goto 100 + + gwork(:) = dzero + call psb_d_nest_prolong_field(prec%nest_op, 1, w1, gwork, info) + if (info /= psb_success_) goto 100 + call psb_halo(gwork, desc_data, info) + if (info /= psb_success_) goto 100 + call psb_d_nest_restrict_field_local(prec%nest_op, 1, gwork, w1h, info) + if (info /= psb_success_) goto 100 + + call psb_d_nest_apply_block(prec%nest_op, 2, 1, done, w1h, dzero, t2, info) + if (info /= psb_success_) goto 100 + y2(:) = y2(:) - t2(:) + end if + +100 continue + deallocate(x2h, t1, w1, w1h, t2) + end subroutine psb_d_nested_schur_action + + ! Approximately solve the Schur block using A22 or the matrix-free Schur action. + subroutine psb_d_nested_schur_solve(prec, rhs, sol, gwork, desc_data, info) + class(psb_d_nested_prec_type), intent(inout) :: prec + real(psb_dpk_), intent(inout) :: rhs(:) + real(psb_dpk_), intent(inout) :: sol(:) + real(psb_dpk_), intent(inout) :: gwork(:) + type(psb_desc_type), intent(in) :: desc_data + integer(psb_ipk_), intent(out) :: info + + real(psb_dpk_), allocatable :: sx(:), res(:), dz(:) + integer(psb_ipk_) :: k, maxit, n_owned + real(psb_dpk_) :: rnrm + + info = psb_success_ + if (prec%schur_solve == psb_d_nested_schur_a22_) then + call psb_d_nested_field_solve(prec, 2, rhs, sol, info) + return + end if + + allocate(sx(size(rhs)), res(size(rhs)), dz(size(rhs)), stat=info) + if (info /= 0) then + info = psb_err_alloc_dealloc_ + call psb_errpush(info, 'd_nested_schur_solve', a_err='Schur work vectors') + return + end if + + sol(:) = dzero + maxit = max(1, prec%schur_maxit) + n_owned = psb_d_nest_get_field_owned(prec%nest_op, 2) + do k = 1, maxit + sx(:) = dzero + call psb_d_nested_schur_action(prec, sol, sx, gwork, desc_data, info) + if (info /= psb_success_) exit + res(:) = rhs(:) - sx(:) + if (prec%schur_tol > dzero) then + rnrm = sqrt(sum(res(1:n_owned) * res(1:n_owned))) + if (rnrm <= prec%schur_tol) exit + end if + call psb_d_nested_field_solve(prec, 2, res, dz, info) + if (info /= psb_success_) exit + sol(:) = sol(:) + dz(:) + end do + + deallocate(sx, res, dz) + end subroutine psb_d_nested_schur_solve + + ! Apply the two-field lower, upper, or full Schur-style composition. + subroutine psb_d_nested_apply_schur(prec, x, z, desc_data, info) + class(psb_d_nested_prec_type), intent(inout) :: prec + real(psb_dpk_), intent(in) :: x(:) + real(psb_dpk_), intent(inout) :: z(:) + type(psb_desc_type), intent(in) :: desc_data + integer(psb_ipk_), intent(out) :: info + + real(psb_dpk_), allocatable :: b1(:), b2(:), x1(:), x2(:), t1(:), t2(:), c1(:), gwork(:) + type(psb_desc_type), pointer :: desc1, desc2 + type(psb_dspmat_type), pointer :: block12, block21 + integer(psb_ipk_) :: n_col_1, n_col_2 + + info = psb_success_ + z(:) = dzero + + if (prec%nfields /= 2) then + info = psb_err_invalid_input_ + call psb_errpush(info, 'd_nested_apply_schur', a_err='Schur composition requires two fields') + return + end if + + desc1 => psb_d_nest_get_field_desc(prec%nest_op, 1) + desc2 => psb_d_nest_get_field_desc(prec%nest_op, 2) + if ((.not. associated(desc1)) .or. (.not. associated(desc2))) then + info = psb_err_invalid_mat_state_ + call psb_errpush(info, 'd_nested_apply_schur', a_err='missing field descriptor') + return + end if + + n_col_1 = desc1%get_local_cols() + n_col_2 = desc2%get_local_cols() + allocate(b1(n_col_1), b2(n_col_2), x1(n_col_1), x2(n_col_2), & + & t1(n_col_1), t2(n_col_2), c1(n_col_1), gwork(size(z)), stat=info) + if (info /= 0) then + info = psb_err_alloc_dealloc_ + call psb_errpush(info, 'd_nested_apply_schur', a_err='field vectors') + return + end if + + b1(:) = dzero + b2(:) = dzero + x1(:) = dzero + x2(:) = dzero + t1(:) = dzero + t2(:) = dzero + c1(:) = dzero + gwork(:) = dzero + + call psb_d_nest_restrict_field(prec%nest_op, 1, x, b1, info) + if (info /= psb_success_) goto 100 + call psb_d_nest_restrict_field(prec%nest_op, 2, x, b2, info) + if (info /= psb_success_) goto 100 + + block12 => psb_d_nest_get_block(prec%nest_op, 1, 2) + block21 => psb_d_nest_get_block(prec%nest_op, 2, 1) + + select case (prec%composition) + case (psb_d_nested_schur_lower_, psb_d_nested_schur_full_) + call psb_d_nested_field_solve(prec, 1, b1, x1, info) + if (info /= psb_success_) goto 100 + call psb_d_nest_prolong_field(prec%nest_op, 1, x1, z, info) + if (info /= psb_success_) goto 100 + call psb_halo(z, desc_data, info) + if (info /= psb_success_) goto 100 + call psb_d_nest_restrict_field_local(prec%nest_op, 1, z, x1, info) + if (info /= psb_success_) goto 100 + + t2(:) = b2(:) + if (associated(block21)) then + call psb_d_nest_apply_block(prec%nest_op, 2, 1, -done, x1, done, t2, info) + if (info /= psb_success_) goto 100 + end if + call psb_d_nested_schur_solve(prec, t2, x2, gwork, desc_data, info) + if (info /= psb_success_) goto 100 + call psb_d_nest_prolong_field(prec%nest_op, 2, x2, z, info) + if (info /= psb_success_) goto 100 + + if (prec%composition == psb_d_nested_schur_full_) then + call psb_halo(z, desc_data, info) + if (info /= psb_success_) goto 100 + call psb_d_nest_restrict_field_local(prec%nest_op, 2, z, x2, info) + if (info /= psb_success_) goto 100 + t1(:) = dzero + if (associated(block12)) then + call psb_d_nest_apply_block(prec%nest_op, 1, 2, done, x2, dzero, t1, info) + if (info /= psb_success_) goto 100 + end if + call psb_d_nested_field_solve(prec, 1, t1, c1, info) + if (info /= psb_success_) goto 100 + x1(:) = x1(:) - c1(:) + call psb_d_nest_prolong_field(prec%nest_op, 1, x1, z, info) + if (info /= psb_success_) goto 100 + end if + + case (psb_d_nested_schur_upper_) + call psb_d_nested_schur_solve(prec, b2, x2, gwork, desc_data, info) + if (info /= psb_success_) goto 100 + call psb_d_nest_prolong_field(prec%nest_op, 2, x2, z, info) + if (info /= psb_success_) goto 100 + call psb_halo(z, desc_data, info) + if (info /= psb_success_) goto 100 + call psb_d_nest_restrict_field_local(prec%nest_op, 2, z, x2, info) + if (info /= psb_success_) goto 100 + + t1(:) = b1(:) + if (associated(block12)) then + call psb_d_nest_apply_block(prec%nest_op, 1, 2, -done, x2, done, t1, info) + if (info /= psb_success_) goto 100 + end if + call psb_d_nested_field_solve(prec, 1, t1, x1, info) + if (info /= psb_success_) goto 100 + call psb_d_nest_prolong_field(prec%nest_op, 1, x1, z, info) + if (info /= psb_success_) goto 100 + end select + + call psb_halo(z, desc_data, info) + +100 continue + deallocate(b1, b2, x1, x2, t1, t2, c1, gwork) + end subroutine psb_d_nested_apply_schur + + ! Print a short description of the nested preconditioner configuration. + subroutine psb_d_nested_precdescr(prec, iout, root, verbosity, prefix) + class(psb_d_nested_prec_type), intent(in) :: prec + integer(psb_ipk_), intent(in), optional :: iout + integer(psb_ipk_), intent(in), optional :: root + integer(psb_ipk_), intent(in), optional :: verbosity + character(len=*), intent(in), optional :: prefix + + integer(psb_ipk_) :: iout_, root_, verbosity_, iam, np + character(len=1024) :: prefix_ + + iout_ = 6 + if (present(iout)) iout_ = iout + root_ = psb_root_ + if (present(root)) root_ = root + verbosity_ = 0 + if (present(verbosity)) verbosity_ = verbosity + prefix_ = '' + if (present(prefix)) prefix_ = prefix + if (verbosity_ < 0) return + + call psb_info(prec%ctxt, iam, np) + if (root_ == -1) root_ = iam + if (iam == root_) then + write(iout_,*) trim(prefix_), ' ', trim(prec%desc_prefix()), & + & ' Nested block preconditioner: ', trim(prec%composition_name), & + & ', block solve: ', trim(prec%default_block_ptype) + if (prec%default_krylov%enabled) & + & write(iout_,*) trim(prefix_), ' default inner Krylov: ', & + & trim(prec%default_krylov%method), ' maxit=', prec%default_krylov%itmax, & + & ' tol=', prec%default_krylov%tol + end if + end subroutine psb_d_nested_precdescr + + ! Provide a placeholder dump hook for the nested preconditioner. + subroutine psb_d_nested_dump(prec, info, prefix, head) + class(psb_d_nested_prec_type), intent(in) :: prec + integer(psb_ipk_), intent(out) :: info + character(len=*), intent(in), optional :: prefix, head + info = psb_success_ + end subroutine psb_d_nested_dump + + function psb_d_nested_sizeof(prec) result(val) + class(psb_d_nested_prec_type), intent(in) :: prec + integer(psb_epk_) :: val + integer(psb_ipk_) :: i + val = 0_psb_epk_ + if (allocated(prec%blocks)) then + do i = 1, size(prec%blocks) + if (allocated(prec%blocks(i)%pc)) val = val + prec%blocks(i)%pc%sizeof() + end do + end if + end function psb_d_nested_sizeof + + function psb_d_nested_get_nzeros(prec) result(val) + class(psb_d_nested_prec_type), intent(in) :: prec + integer(psb_epk_) :: val + integer(psb_ipk_) :: i + val = 0_psb_epk_ + if (allocated(prec%blocks)) then + do i = 1, size(prec%blocks) + if (allocated(prec%blocks(i)%pc)) val = val + prec%blocks(i)%pc%get_nzeros() + end do + end if + end function psb_d_nested_get_nzeros + + ! Clone nested preconditioner configuration and any built field blocks. + subroutine psb_d_nested_clone(prec, precout, info) + class(psb_d_nested_prec_type), intent(inout) :: prec + class(psb_d_base_prec_type), allocatable, intent(inout) :: precout + integer(psb_ipk_), intent(out) :: info + integer(psb_ipk_) :: i + + info = psb_success_ + if (allocated(precout)) then + call precout%free(info) + if (info == psb_success_) deallocate(precout, stat=info) + if (info /= psb_success_) return + end if + allocate(psb_d_nested_prec_type :: precout, stat=info) + if (info /= 0) return + select type (pout => precout) + type is (psb_d_nested_prec_type) + call pout%set_ctxt(prec%get_ctxt()) + pout%composition = prec%composition + pout%composition_name = prec%composition_name + pout%default_block_ptype = prec%default_block_ptype + pout%schur_solve = prec%schur_solve + pout%schur_solve_name = prec%schur_solve_name + pout%schur_maxit = prec%schur_maxit + pout%schur_tol = prec%schur_tol + pout%default_krylov = prec%default_krylov + pout%nfields = prec%nfields + pout%nest_op => prec%nest_op + if (allocated(prec%field_block_ptype)) then + allocate(pout%field_block_ptype(size(prec%field_block_ptype)), stat=info) + if (info /= 0) return + pout%field_block_ptype(:) = prec%field_block_ptype(:) + end if + if (allocated(prec%field_krylov)) then + allocate(pout%field_krylov(size(prec%field_krylov)), stat=info) + if (info /= 0) return + pout%field_krylov(:) = prec%field_krylov(:) + end if + if (allocated(prec%field_iopts)) then + allocate(pout%field_iopts(size(prec%field_iopts)), stat=info) + if (info /= 0) return + pout%field_iopts(:) = prec%field_iopts(:) + end if + if (allocated(prec%field_ropts)) then + allocate(pout%field_ropts(size(prec%field_ropts)), stat=info) + if (info /= 0) return + pout%field_ropts(:) = prec%field_ropts(:) + end if + if (allocated(prec%field_copts)) then + allocate(pout%field_copts(size(prec%field_copts)), stat=info) + if (info /= 0) return + pout%field_copts(:) = prec%field_copts(:) + end if + if (allocated(prec%blocks)) then + allocate(pout%blocks(size(prec%blocks)), stat=info) + if (info /= 0) return + do i = 1, size(prec%blocks) + pout%blocks(i)%ptype = prec%blocks(i)%ptype + if (allocated(prec%blocks(i)%pc)) & + & call prec%blocks(i)%pc%clone(pout%blocks(i)%pc, info) + if (info /= psb_success_) return + end do + end if + end select + end subroutine psb_d_nested_clone + +end module psb_d_nestedprec diff --git a/test/nested/CMakeLists.txt b/test/nested/CMakeLists.txt index 7e5c44e05..839820616 100644 --- a/test/nested/CMakeLists.txt +++ b/test/nested/CMakeLists.txt @@ -26,6 +26,9 @@ file(MAKE_DIRECTORY ${EXEDIR}) set(SOURCES_D_NEST_GLOB_TEST psb_d_nest_glob_test.F90) set(SOURCES_D_NEST_RECT_TEST psb_d_nest_rect_test.F90) set(SOURCES_D_NEST_CG_TEST psb_d_nest_cg_test.F90) +set(SOURCES_D_NEST_PREC_CG_TEST psb_d_nest_prec_cg_test.F90) +set(SOURCES_D_NEST_MULT_PREC_CG_TEST psb_d_nest_mult_prec_cg_test.F90) +set(SOURCES_D_NEST_SCHUR_PREC_CG_TEST psb_d_nest_schur_prec_cg_test.F90) add_executable(psb_d_nest_glob_test ${SOURCES_D_NEST_GLOB_TEST}) target_link_libraries(psb_d_nest_glob_test psblas::util psblas::linsolve psblas::prec psblas::ext psblas::base) @@ -36,8 +39,17 @@ target_link_libraries(psb_d_nest_rect_test psblas::util psblas::linsolve psblas: add_executable(psb_d_nest_cg_test ${SOURCES_D_NEST_CG_TEST}) target_link_libraries(psb_d_nest_cg_test psblas::util psblas::linsolve psblas::prec psblas::ext psblas::base) +add_executable(psb_d_nest_prec_cg_test ${SOURCES_D_NEST_PREC_CG_TEST}) +target_link_libraries(psb_d_nest_prec_cg_test psblas::util psblas::linsolve psblas::prec psblas::ext psblas::base) + +add_executable(psb_d_nest_mult_prec_cg_test ${SOURCES_D_NEST_MULT_PREC_CG_TEST}) +target_link_libraries(psb_d_nest_mult_prec_cg_test psblas::util psblas::linsolve psblas::prec psblas::ext psblas::base) + +add_executable(psb_d_nest_schur_prec_cg_test ${SOURCES_D_NEST_SCHUR_PREC_CG_TEST}) +target_link_libraries(psb_d_nest_schur_prec_cg_test psblas::util psblas::linsolve psblas::prec psblas::ext psblas::base) + # Set output directory for executables -foreach(target psb_d_nest_glob_test psb_d_nest_rect_test psb_d_nest_cg_test) +foreach(target psb_d_nest_glob_test psb_d_nest_rect_test psb_d_nest_cg_test psb_d_nest_prec_cg_test psb_d_nest_mult_prec_cg_test psb_d_nest_schur_prec_cg_test) set_target_properties(${target} PROPERTIES RUNTIME_OUTPUT_DIRECTORY ${EXEDIR} ) diff --git a/test/nested/Makefile b/test/nested/Makefile index d647aee38..9504bcd5d 100644 --- a/test/nested/Makefile +++ b/test/nested/Makefile @@ -16,7 +16,8 @@ FINCLUDES=$(FMFLAG)$(MODDIR) $(FMFLAG). EXEDIR=./runs -all: runsd psb_d_nest_glob_test psb_d_nest_rect_test psb_d_nest_cg_test +all: runsd psb_d_nest_glob_test psb_d_nest_rect_test psb_d_nest_cg_test psb_d_nest_prec_cg_test \ + psb_d_nest_mult_prec_cg_test psb_d_nest_schur_prec_cg_test runsd: (if test ! -d runs ; then mkdir runs; fi) @@ -33,9 +34,24 @@ psb_d_nest_cg_test: psb_d_nest_cg_test.o $(FLINK) psb_d_nest_cg_test.o -o psb_d_nest_cg_test $(PSBLAS_LIB) $(LDLIBS) /bin/mv psb_d_nest_cg_test $(EXEDIR) +psb_d_nest_prec_cg_test: psb_d_nest_prec_cg_test.o + $(FLINK) psb_d_nest_prec_cg_test.o -o psb_d_nest_prec_cg_test $(PSBLAS_LIB) $(LDLIBS) + /bin/mv psb_d_nest_prec_cg_test $(EXEDIR) + +psb_d_nest_mult_prec_cg_test: psb_d_nest_mult_prec_cg_test.o + $(FLINK) psb_d_nest_mult_prec_cg_test.o -o psb_d_nest_mult_prec_cg_test $(PSBLAS_LIB) $(LDLIBS) + /bin/mv psb_d_nest_mult_prec_cg_test $(EXEDIR) + +psb_d_nest_schur_prec_cg_test: psb_d_nest_schur_prec_cg_test.o + $(FLINK) psb_d_nest_schur_prec_cg_test.o -o psb_d_nest_schur_prec_cg_test $(PSBLAS_LIB) $(LDLIBS) + /bin/mv psb_d_nest_schur_prec_cg_test $(EXEDIR) + clean: - /bin/rm -f psb_d_nest_glob_test.o psb_d_nest_rect_test.o psb_d_nest_cg_test.o *$(.mod) \ - $(EXEDIR)/psb_d_nest_glob_test $(EXEDIR)/psb_d_nest_rect_test $(EXEDIR)/psb_d_nest_cg_test + /bin/rm -f psb_d_nest_glob_test.o psb_d_nest_rect_test.o psb_d_nest_cg_test.o \ + psb_d_nest_prec_cg_test.o psb_d_nest_mult_prec_cg_test.o psb_d_nest_schur_prec_cg_test.o *$(.mod) \ + $(EXEDIR)/psb_d_nest_glob_test $(EXEDIR)/psb_d_nest_rect_test $(EXEDIR)/psb_d_nest_cg_test \ + $(EXEDIR)/psb_d_nest_prec_cg_test $(EXEDIR)/psb_d_nest_mult_prec_cg_test \ + $(EXEDIR)/psb_d_nest_schur_prec_cg_test verycleanlib: (cd ../..; make veryclean) lib: diff --git a/test/nested/psb_d_nest_mult_prec_cg_test.F90 b/test/nested/psb_d_nest_mult_prec_cg_test.F90 new file mode 100644 index 000000000..0ee363b47 --- /dev/null +++ b/test/nested/psb_d_nest_mult_prec_cg_test.F90 @@ -0,0 +1,257 @@ +! +! Parallel Sparse BLAS version 3.5 +! (C) Copyright 2006-2018 +! Salvatore Filippone +! Alfredo Buttari +! +! Redistribution and use in source and binary forms, with or without +! modification, are permitted provided that the following conditions +! are met: +! 1. Redistributions of source code must retain the above copyright +! notice, this list of conditions and the following disclaimer. +! 2. Redistributions in binary form must reproduce the above copyright +! notice, this list of conditions, and the following disclaimer in the +! documentation and/or other materials provided with the distribution. +! 3. The name of the PSBLAS group or the names of its contributors may +! not be used to endorse or promote products derived from this +! software without specific prior written permission. +! +! THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +! ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED +! TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR +! PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE PSBLAS GROUP OR ITS CONTRIBUTORS +! BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +! CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +! SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +! INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +! CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +! ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE +! POSSIBILITY OF SUCH DAMAGE. +! +! +! File: psb_d_nest_mult_prec_cg_test.F90 +! +! Program: psb_d_nest_mult_prec_cg_test +! +! Same matrix/RHS check as psb_d_nest_cg_test, but the preconditioner is the +! specialized nested preconditioner with multiplicative field compositions. +! The forward multiplicative sweep is nonsymmetric, so it is tested with +! BiCGSTAB; the symmetric forward/backward sweep is tested with CG. +! +! call preconditioner%init(context, 'NEST', info) +! call preconditioner%set('COMPOSITION', 'MULTIPLICATIVE', info) +! call preconditioner%set('COMPOSITION', 'SYMMETRIC_MULTIPLICATIVE', info) +! call preconditioner%set('BLOCK_SOLVE', 'DIAG', info) +! +! The matrix is the red-black reordered 1D Laplacian: +! +! M = [ 2I C ] +! [ C^T 2I ] +! +! With diagonal block solves these tests exercise the field Gauss-Seidel sweep +! and its symmetric forward/backward variant. +! +program psb_d_nest_mult_prec_cg_test + use psb_base_mod + use psb_util_mod + use psb_prec_mod + use psb_linsolve_mod + use psb_d_nest_mod + implicit none + + type(psb_ctxt_type) :: context + integer(psb_ipk_) :: my_rank, num_procs, info, i_local_row, entry_idx + integer(psb_ipk_) :: field1_local_rows, field2_local_rows + integer(psb_lpk_) :: field1_global_row, field2_global_row, field_size + + type(psb_d_nest_matrix) :: nested_matrix + type(psb_dprec_type) :: preconditioner + type(psb_d_vect_type) :: x_solution, rhs, x_exact + + integer(psb_lpk_), allocatable :: entry_rows(:), entry_cols(:) + integer(psb_lpk_), allocatable :: field1_rows(:), field2_rows(:) + real(psb_dpk_), allocatable :: entry_vals(:) + + real(psb_dpk_) :: diag_value, stop_tol, final_residual + real(psb_dpk_) :: norm_x_exact, solution_error + integer(psb_ipk_) :: max_iter, trace_level, n_iter, n_iter_none, iprec + integer(psb_ipk_) :: stop_criterion + real(psb_dpk_), parameter :: solution_tol = 1.0e-5_psb_dpk_ + character(len=32), parameter :: composition_names(2) = & + & [ character(len=32) :: 'MULTIPLICATIVE', 'SYMMETRIC_MULTIPLICATIVE' ] + character(len=16), parameter :: method_names(2) = & + & [ character(len=16) :: 'BICGSTAB', 'CG' ] + logical :: all_passed + + call psb_init(context) + call psb_info(context, my_rank, num_procs) + + field_size = 512 + diag_value = 2.0_psb_dpk_ + stop_tol = 1.0e-9_psb_dpk_ + max_iter = 4000 + trace_level = 0 + stop_criterion = 2 + all_passed = .true. + + call nested_matrix%init(context, [field_size, field_size], info) + if (info /= psb_success_) then + if (my_rank == 0) write(*,*) 'FAIL: nested_matrix%init info=', info + goto 9999 + end if + field1_rows = nested_matrix%get_owned_rows(1) + field2_rows = nested_matrix%get_owned_rows(2) + field1_local_rows = size(field1_rows) + field2_local_rows = size(field2_rows) + + ! block (1,1) = diag*I + allocate(entry_rows(field1_local_rows), entry_cols(field1_local_rows), entry_vals(field1_local_rows)) + do i_local_row = 1, field1_local_rows + field1_global_row = field1_rows(i_local_row) + entry_rows(i_local_row) = field1_global_row + entry_cols(i_local_row) = field1_global_row + entry_vals(i_local_row) = diag_value + end do + call nested_matrix%ins(1, 1, field1_local_rows, entry_rows, entry_cols, entry_vals, info) + deallocate(entry_rows, entry_cols, entry_vals) + + ! block (2,2) = diag*I + allocate(entry_rows(field2_local_rows), entry_cols(field2_local_rows), entry_vals(field2_local_rows)) + do i_local_row = 1, field2_local_rows + field2_global_row = field2_rows(i_local_row) + entry_rows(i_local_row) = field2_global_row + entry_cols(i_local_row) = field2_global_row + entry_vals(i_local_row) = diag_value + end do + call nested_matrix%ins(2, 2, field2_local_rows, entry_rows, entry_cols, entry_vals, info) + deallocate(entry_rows, entry_cols, entry_vals) + + ! block (1,2) = C + allocate(entry_rows(2*field1_local_rows), entry_cols(2*field1_local_rows), entry_vals(2*field1_local_rows)) + entry_idx = 0 + do i_local_row = 1, field1_local_rows + field1_global_row = field1_rows(i_local_row) + entry_idx = entry_idx + 1 + entry_rows(entry_idx) = field1_global_row + entry_cols(entry_idx) = field1_global_row + entry_vals(entry_idx) = -1.0_psb_dpk_ + if (field1_global_row > 1) then + entry_idx = entry_idx + 1 + entry_rows(entry_idx) = field1_global_row + entry_cols(entry_idx) = field1_global_row - 1_psb_lpk_ + entry_vals(entry_idx) = -1.0_psb_dpk_ + end if + end do + call nested_matrix%ins(1, 2, entry_idx, entry_rows, entry_cols, entry_vals, info) + deallocate(entry_rows, entry_cols, entry_vals) + + ! block (2,1) = C^T + allocate(entry_rows(2*field2_local_rows), entry_cols(2*field2_local_rows), entry_vals(2*field2_local_rows)) + entry_idx = 0 + do i_local_row = 1, field2_local_rows + field2_global_row = field2_rows(i_local_row) + entry_idx = entry_idx + 1 + entry_rows(entry_idx) = field2_global_row + entry_cols(entry_idx) = field2_global_row + entry_vals(entry_idx) = -1.0_psb_dpk_ + if (field2_global_row < field_size) then + entry_idx = entry_idx + 1 + entry_rows(entry_idx) = field2_global_row + entry_cols(entry_idx) = field2_global_row + 1_psb_lpk_ + entry_vals(entry_idx) = -1.0_psb_dpk_ + end if + end do + call nested_matrix%ins(2, 1, entry_idx, entry_rows, entry_cols, entry_vals, info) + deallocate(entry_rows, entry_cols, entry_vals) + + call nested_matrix%asb(info) + if (info /= psb_success_) then + if (my_rank == 0) write(*,*) 'FAIL: nested_matrix%asb info=', info + goto 9999 + end if + + call psb_geall(x_exact, nested_matrix%desc_glob, info) + call psb_geasb(x_exact, nested_matrix%desc_glob, info) + call x_exact%set(done) + + call psb_geall(rhs, nested_matrix%desc_glob, info) + call psb_geasb(rhs, nested_matrix%desc_glob, info) + call psb_spmm(done, nested_matrix%a_glob, x_exact, dzero, rhs, nested_matrix%desc_glob, info) + if (info /= psb_success_) then + if (my_rank == 0) write(*,*) 'FAIL: psb_spmm (RHS) info=', info + goto 9999 + end if + norm_x_exact = psb_genrm2(x_exact, nested_matrix%desc_glob, info) + + if (my_rank == 0) write(*,'(a,i0,a,i0)') ' np=', num_procs, ' N(global)=', 2*field_size + + ! Baseline: no preconditioner. + call preconditioner%init(context, 'NONE', info) + call preconditioner%build(nested_matrix%a_glob, nested_matrix%desc_glob, info) + call psb_geall(x_solution, nested_matrix%desc_glob, info) + call psb_geasb(x_solution, nested_matrix%desc_glob, info) + call psb_krylov('CG', nested_matrix%a_glob, preconditioner, rhs, x_solution, stop_tol, & + & nested_matrix%desc_glob, info, & + & itmax=max_iter, iter=n_iter_none, err=final_residual, itrace=trace_level, istop=stop_criterion) + if (info /= psb_success_) then + if (my_rank == 0) write(*,*) 'FAIL: psb_krylov(CG,NONE) info=', info + all_passed = .false. + end if + if (my_rank == 0) write(*,'(a,i6,a,es12.4)') & + & ' prec=NONE CG iterations=', n_iter_none, ' residual=', final_residual + call psb_gefree(x_solution, nested_matrix%desc_glob, info) + call preconditioner%free(info) + + do iprec = 1, size(composition_names) + call preconditioner%init(context, 'NEST', info) + call preconditioner%set('COMPOSITION', trim(composition_names(iprec)), info) + call preconditioner%set('BLOCK_SOLVE', 'DIAG', info) + call preconditioner%build(nested_matrix%a_glob, nested_matrix%desc_glob, info) + if (info /= psb_success_) then + if (my_rank == 0) write(*,*) 'FAIL: nested prec%build info=', info, & + & ' composition=', trim(composition_names(iprec)) + all_passed = .false. + goto 9999 + end if + + call psb_geall(x_solution, nested_matrix%desc_glob, info) + call psb_geasb(x_solution, nested_matrix%desc_glob, info) + call psb_krylov(trim(method_names(iprec)), nested_matrix%a_glob, preconditioner, rhs, x_solution, stop_tol, & + & nested_matrix%desc_glob, info, & + & itmax=max_iter, iter=n_iter, err=final_residual, itrace=trace_level, istop=stop_criterion) + if (info /= psb_success_) then + if (my_rank == 0) write(*,*) 'FAIL: psb_krylov(', trim(method_names(iprec)), ',NEST) info=', info, & + & ' composition=', trim(composition_names(iprec)) + all_passed = .false. + end if + + call psb_geaxpby(-done, x_exact, done, x_solution, nested_matrix%desc_glob, info) + solution_error = psb_genrm2(x_solution, nested_matrix%desc_glob, info) / norm_x_exact + if (my_rank == 0) then + write(*,'(a,a32,a,a16,a,i6,a,es12.4,a,es12.4)') & + & ' prec=NEST/', trim(composition_names(iprec)), & + & ' method=', trim(method_names(iprec)), & + & ' iterations=', n_iter, ' residual=', final_residual, & + & ' ||x-x_ex||/||x_ex||=', solution_error + end if + + if ((n_iter >= max_iter) .or. (solution_error > solution_tol)) all_passed = .false. + + call psb_gefree(x_solution, nested_matrix%desc_glob, info) + call preconditioner%free(info) + end do + + if (my_rank == 0) then + if (all_passed) then + write(*,*) '[PASS] Krylov solvers converge with multiplicative NEST preconditioners' + else + write(*,*) '[FAIL] Krylov solvers with multiplicative NEST preconditioners' + end if + end if + + call nested_matrix%free(info) + +9999 continue + call psb_exit(context) + +end program psb_d_nest_mult_prec_cg_test diff --git a/test/nested/psb_d_nest_prec_cg_test.F90 b/test/nested/psb_d_nest_prec_cg_test.F90 new file mode 100644 index 000000000..539bb44b9 --- /dev/null +++ b/test/nested/psb_d_nest_prec_cg_test.F90 @@ -0,0 +1,246 @@ +! +! Parallel Sparse BLAS version 3.5 +! (C) Copyright 2006-2018 +! Salvatore Filippone +! Alfredo Buttari +! +! Redistribution and use in source and binary forms, with or without +! modification, are permitted provided that the following conditions +! are met: +! 1. Redistributions of source code must retain the above copyright +! notice, this list of conditions and the following disclaimer. +! 2. Redistributions in binary form must reproduce the above copyright +! notice, this list of conditions, and the following disclaimer in the +! documentation and/or other materials provided with the distribution. +! 3. The name of the PSBLAS group or the names of its contributors may +! not be used to endorse or promote products derived from this +! software without specific prior written permission. +! +! THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +! ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED +! TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR +! PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE PSBLAS GROUP OR ITS CONTRIBUTORS +! BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +! CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +! SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +! INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +! CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +! ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE +! POSSIBILITY OF SUCH DAMAGE. +! +! +! File: psb_d_nest_prec_cg_test.F90 +! +! Program: psb_d_nest_prec_cg_test +! +! Same matrix/RHS/CG check as psb_d_nest_cg_test, but the preconditioner is the +! specialized nested preconditioner: +! +! call preconditioner%init(context, 'NEST', info) +! call preconditioner%set('COMPOSITION', 'ADDITIVE', info) +! call preconditioner%set('BLOCK_SOLVE', 'DIAG', info) +! +! The matrix is the red-black reordered 1D Laplacian: +! +! M = [ 2I C ] +! [ C^T 2I ] +! +! With diagonal block solves this NEST/ADDITIVE preconditioner is a scalar +! rescaling of the identity on this problem, so CG should converge to the exact +! solution and reproduce the unpreconditioned iteration count. +! +program psb_d_nest_prec_cg_test + use psb_base_mod + use psb_util_mod + use psb_prec_mod + use psb_linsolve_mod + use psb_d_nest_mod + implicit none + + type(psb_ctxt_type) :: context + integer(psb_ipk_) :: my_rank, num_procs, info, i_local_row, entry_idx + integer(psb_ipk_) :: field1_local_rows, field2_local_rows + integer(psb_lpk_) :: field1_global_row, field2_global_row, field_size + + type(psb_d_nest_matrix) :: nested_matrix + type(psb_dprec_type) :: preconditioner + type(psb_d_vect_type) :: x_solution, rhs, x_exact + + integer(psb_lpk_), allocatable :: entry_rows(:), entry_cols(:) + integer(psb_lpk_), allocatable :: field1_rows(:), field2_rows(:) + real(psb_dpk_), allocatable :: entry_vals(:) + + real(psb_dpk_) :: diag_value, stop_tol, final_residual + real(psb_dpk_) :: norm_x_exact, solution_error + integer(psb_ipk_) :: max_iter, trace_level, n_iter, n_iter_none + integer(psb_ipk_) :: stop_criterion + real(psb_dpk_), parameter :: solution_tol = 1.0e-6_psb_dpk_ + logical :: all_passed + + call psb_init(context) + call psb_info(context, my_rank, num_procs) + + field_size = 512 + diag_value = 2.0_psb_dpk_ + stop_tol = 1.0e-9_psb_dpk_ + max_iter = 4000 + trace_level = 0 + stop_criterion = 2 + all_passed = .true. + + call nested_matrix%init(context, [field_size, field_size], info) + if (info /= psb_success_) then + if (my_rank == 0) write(*,*) 'FAIL: nested_matrix%init info=', info + goto 9999 + end if + field1_rows = nested_matrix%get_owned_rows(1) + field2_rows = nested_matrix%get_owned_rows(2) + field1_local_rows = size(field1_rows) + field2_local_rows = size(field2_rows) + + ! block (1,1) = diag*I + allocate(entry_rows(field1_local_rows), entry_cols(field1_local_rows), entry_vals(field1_local_rows)) + do i_local_row = 1, field1_local_rows + field1_global_row = field1_rows(i_local_row) + entry_rows(i_local_row) = field1_global_row + entry_cols(i_local_row) = field1_global_row + entry_vals(i_local_row) = diag_value + end do + call nested_matrix%ins(1, 1, field1_local_rows, entry_rows, entry_cols, entry_vals, info) + deallocate(entry_rows, entry_cols, entry_vals) + + ! block (2,2) = diag*I + allocate(entry_rows(field2_local_rows), entry_cols(field2_local_rows), entry_vals(field2_local_rows)) + do i_local_row = 1, field2_local_rows + field2_global_row = field2_rows(i_local_row) + entry_rows(i_local_row) = field2_global_row + entry_cols(i_local_row) = field2_global_row + entry_vals(i_local_row) = diag_value + end do + call nested_matrix%ins(2, 2, field2_local_rows, entry_rows, entry_cols, entry_vals, info) + deallocate(entry_rows, entry_cols, entry_vals) + + ! block (1,2) = C + allocate(entry_rows(2*field1_local_rows), entry_cols(2*field1_local_rows), entry_vals(2*field1_local_rows)) + entry_idx = 0 + do i_local_row = 1, field1_local_rows + field1_global_row = field1_rows(i_local_row) + entry_idx = entry_idx + 1 + entry_rows(entry_idx) = field1_global_row + entry_cols(entry_idx) = field1_global_row + entry_vals(entry_idx) = -1.0_psb_dpk_ + if (field1_global_row > 1) then + entry_idx = entry_idx + 1 + entry_rows(entry_idx) = field1_global_row + entry_cols(entry_idx) = field1_global_row - 1_psb_lpk_ + entry_vals(entry_idx) = -1.0_psb_dpk_ + end if + end do + call nested_matrix%ins(1, 2, entry_idx, entry_rows, entry_cols, entry_vals, info) + deallocate(entry_rows, entry_cols, entry_vals) + + ! block (2,1) = C^T + allocate(entry_rows(2*field2_local_rows), entry_cols(2*field2_local_rows), entry_vals(2*field2_local_rows)) + entry_idx = 0 + do i_local_row = 1, field2_local_rows + field2_global_row = field2_rows(i_local_row) + entry_idx = entry_idx + 1 + entry_rows(entry_idx) = field2_global_row + entry_cols(entry_idx) = field2_global_row + entry_vals(entry_idx) = -1.0_psb_dpk_ + if (field2_global_row < field_size) then + entry_idx = entry_idx + 1 + entry_rows(entry_idx) = field2_global_row + entry_cols(entry_idx) = field2_global_row + 1_psb_lpk_ + entry_vals(entry_idx) = -1.0_psb_dpk_ + end if + end do + call nested_matrix%ins(2, 1, entry_idx, entry_rows, entry_cols, entry_vals, info) + deallocate(entry_rows, entry_cols, entry_vals) + + call nested_matrix%asb(info) + if (info /= psb_success_) then + if (my_rank == 0) write(*,*) 'FAIL: nested_matrix%asb info=', info + goto 9999 + end if + + call psb_geall(x_exact, nested_matrix%desc_glob, info) + call psb_geasb(x_exact, nested_matrix%desc_glob, info) + call x_exact%set(done) + + call psb_geall(rhs, nested_matrix%desc_glob, info) + call psb_geasb(rhs, nested_matrix%desc_glob, info) + call psb_spmm(done, nested_matrix%a_glob, x_exact, dzero, rhs, nested_matrix%desc_glob, info) + if (info /= psb_success_) then + if (my_rank == 0) write(*,*) 'FAIL: psb_spmm (RHS) info=', info + goto 9999 + end if + norm_x_exact = psb_genrm2(x_exact, nested_matrix%desc_glob, info) + + if (my_rank == 0) write(*,'(a,i0,a,i0)') ' np=', num_procs, ' N(global)=', 2*field_size + + ! Baseline: no preconditioner. + call preconditioner%init(context, 'NONE', info) + call preconditioner%build(nested_matrix%a_glob, nested_matrix%desc_glob, info) + call psb_geall(x_solution, nested_matrix%desc_glob, info) + call psb_geasb(x_solution, nested_matrix%desc_glob, info) + call psb_krylov('CG', nested_matrix%a_glob, preconditioner, rhs, x_solution, stop_tol, & + & nested_matrix%desc_glob, info, & + & itmax=max_iter, iter=n_iter_none, err=final_residual, itrace=trace_level, istop=stop_criterion) + if (info /= psb_success_) then + if (my_rank == 0) write(*,*) 'FAIL: psb_krylov(CG,NONE) info=', info + all_passed = .false. + end if + if (my_rank == 0) write(*,'(a,i6,a,es12.4)') & + & ' prec=NONE CG iterations=', n_iter_none, ' residual=', final_residual + call psb_gefree(x_solution, nested_matrix%desc_glob, info) + call preconditioner%free(info) + + ! Specialized nested preconditioner: additive block-diagonal field split. + call preconditioner%init(context, 'NEST', info) + call preconditioner%set('COMPOSITION', 'ADDITIVE', info) + call preconditioner%set('BLOCK_SOLVE', 'DIAG', info) + call preconditioner%build(nested_matrix%a_glob, nested_matrix%desc_glob, info) + if (info /= psb_success_) then + if (my_rank == 0) write(*,*) 'FAIL: nested prec%build info=', info + all_passed = .false. + goto 9999 + end if + + call psb_geall(x_solution, nested_matrix%desc_glob, info) + call psb_geasb(x_solution, nested_matrix%desc_glob, info) + call psb_krylov('CG', nested_matrix%a_glob, preconditioner, rhs, x_solution, stop_tol, & + & nested_matrix%desc_glob, info, & + & itmax=max_iter, iter=n_iter, err=final_residual, itrace=trace_level, istop=stop_criterion) + if (info /= psb_success_) then + if (my_rank == 0) write(*,*) 'FAIL: psb_krylov(CG,NEST) info=', info + all_passed = .false. + end if + + call psb_geaxpby(-done, x_exact, done, x_solution, nested_matrix%desc_glob, info) + solution_error = psb_genrm2(x_solution, nested_matrix%desc_glob, info) / norm_x_exact + if (my_rank == 0) then + write(*,'(a,i6,a,es12.4,a,es12.4)') & + & ' prec=NEST CG iterations=', n_iter, ' residual=', final_residual, & + & ' ||x-x_ex||/||x_ex||=', solution_error + end if + + if ((n_iter >= max_iter) .or. (solution_error > solution_tol)) all_passed = .false. + if (n_iter /= n_iter_none) all_passed = .false. + + if (my_rank == 0) then + if (all_passed) then + write(*,*) '[PASS] CG converges with the specialized NEST preconditioner' + else + write(*,*) '[FAIL] CG with specialized NEST preconditioner' + end if + end if + + call psb_gefree(x_solution, nested_matrix%desc_glob, info) + call preconditioner%free(info) + call nested_matrix%free(info) + +9999 continue + call psb_exit(context) + +end program psb_d_nest_prec_cg_test diff --git a/test/nested/psb_d_nest_schur_prec_cg_test.F90 b/test/nested/psb_d_nest_schur_prec_cg_test.F90 new file mode 100644 index 000000000..b94bd970a --- /dev/null +++ b/test/nested/psb_d_nest_schur_prec_cg_test.F90 @@ -0,0 +1,354 @@ +! +! Parallel Sparse BLAS version 3.5 +! (C) Copyright 2006-2018 +! Salvatore Filippone +! Alfredo Buttari +! +! Redistribution and use in source and binary forms, with or without +! modification, are permitted provided that the following conditions +! are met: +! 1. Redistributions of source code must retain the above copyright +! notice, this list of conditions and the following disclaimer. +! 2. Redistributions in binary form must reproduce the above copyright +! notice, this list of conditions, and the following disclaimer in the +! documentation and/or other materials provided with the distribution. +! 3. The name of the PSBLAS group or the names of its contributors may +! not be used to endorse or promote products derived from this +! software without specific prior written permission. +! +! THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +! ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED +! TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR +! PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE PSBLAS GROUP OR ITS CONTRIBUTORS +! BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +! CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +! SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +! INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +! CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +! ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE +! POSSIBILITY OF SUCH DAMAGE. +! +! +! File: psb_d_nest_schur_prec_cg_test.F90 +! +! Program: psb_d_nest_schur_prec_cg_test +! +! Same matrix/RHS check as psb_d_nest_cg_test, but the preconditioner is the +! specialized nested preconditioner with Schur-complement-style field splits. +! These splits are generally nonsymmetric with approximate block solves, so +! they are tested with BiCGSTAB. +! +! call preconditioner%init(context, 'NEST', info) +! call preconditioner%set('COMPOSITION', 'SCHUR_LOWER', info) +! call preconditioner%set('COMPOSITION', 'SCHUR_UPPER', info) +! call preconditioner%set('COMPOSITION', 'SCHUR_FULL', info) +! call preconditioner%set('BLOCK_SOLVE', 'DIAG', info) +! call preconditioner%set('BLOCK_SOLVE', 'DIAG', info, idx=1) +! call preconditioner%set('BLOCK_SOLVE', 'DIAG', info, idx=2) +! call preconditioner%set('INNER_SOLVE', 'CG', info, idx=1) +! call preconditioner%set('INNER_SOLVE', 'CG', info, idx=2) +! +! The matrix is the red-black reordered 1D Laplacian: +! +! M = [ 2I C ] +! [ C^T 2I ] +! +! With diagonal block solves these tests exercise lower, upper, and full +! two-field block-factorization-style applications. +! +program psb_d_nest_schur_prec_cg_test + use psb_base_mod + use psb_util_mod + use psb_prec_mod + use psb_linsolve_mod + use psb_d_nest_mod + implicit none + + type(psb_ctxt_type) :: context + integer(psb_ipk_) :: my_rank, num_procs, info, i_local_row, entry_idx + integer(psb_ipk_) :: field1_local_rows, field2_local_rows + integer(psb_lpk_) :: field1_global_row, field2_global_row, field_size + + type(psb_d_nest_matrix) :: nested_matrix + type(psb_dprec_type) :: preconditioner + type(psb_d_vect_type) :: x_solution, rhs, x_exact + + integer(psb_lpk_), allocatable :: entry_rows(:), entry_cols(:) + integer(psb_lpk_), allocatable :: field1_rows(:), field2_rows(:) + real(psb_dpk_), allocatable :: entry_vals(:) + + real(psb_dpk_) :: diag_value, stop_tol, final_residual + real(psb_dpk_) :: norm_x_exact, solution_error + integer(psb_ipk_) :: max_iter, trace_level, n_iter, n_iter_none, iprec + integer(psb_ipk_) :: stop_criterion + real(psb_dpk_), parameter :: solution_tol = 1.0e-5_psb_dpk_ + character(len=32), parameter :: composition_names(4) = & + & [ character(len=32) :: 'SCHUR_LOWER', 'SCHUR_UPPER', 'SCHUR_FULL', 'SCHUR_FULL' ] + character(len=32), parameter :: schur_solve_names(4) = & + & [ character(len=32) :: 'A22', 'A22', 'A22', 'MATRIX_FREE' ] + logical :: all_passed + + call psb_init(context) + call psb_info(context, my_rank, num_procs) + + field_size = 512 + diag_value = 2.0_psb_dpk_ + stop_tol = 1.0e-9_psb_dpk_ + max_iter = 4000 + trace_level = 0 + stop_criterion = 2 + all_passed = .true. + + call nested_matrix%init(context, [field_size, field_size], info) + if (info /= psb_success_) then + if (my_rank == 0) write(*,*) 'FAIL: nested_matrix%init info=', info + goto 9999 + end if + field1_rows = nested_matrix%get_owned_rows(1) + field2_rows = nested_matrix%get_owned_rows(2) + field1_local_rows = size(field1_rows) + field2_local_rows = size(field2_rows) + + ! block (1,1) = diag*I + allocate(entry_rows(field1_local_rows), entry_cols(field1_local_rows), entry_vals(field1_local_rows)) + do i_local_row = 1, field1_local_rows + field1_global_row = field1_rows(i_local_row) + entry_rows(i_local_row) = field1_global_row + entry_cols(i_local_row) = field1_global_row + entry_vals(i_local_row) = diag_value + end do + call nested_matrix%ins(1, 1, field1_local_rows, entry_rows, entry_cols, entry_vals, info) + deallocate(entry_rows, entry_cols, entry_vals) + + ! block (2,2) = diag*I + allocate(entry_rows(field2_local_rows), entry_cols(field2_local_rows), entry_vals(field2_local_rows)) + do i_local_row = 1, field2_local_rows + field2_global_row = field2_rows(i_local_row) + entry_rows(i_local_row) = field2_global_row + entry_cols(i_local_row) = field2_global_row + entry_vals(i_local_row) = diag_value + end do + call nested_matrix%ins(2, 2, field2_local_rows, entry_rows, entry_cols, entry_vals, info) + deallocate(entry_rows, entry_cols, entry_vals) + + ! block (1,2) = C + allocate(entry_rows(2*field1_local_rows), entry_cols(2*field1_local_rows), entry_vals(2*field1_local_rows)) + entry_idx = 0 + do i_local_row = 1, field1_local_rows + field1_global_row = field1_rows(i_local_row) + entry_idx = entry_idx + 1 + entry_rows(entry_idx) = field1_global_row + entry_cols(entry_idx) = field1_global_row + entry_vals(entry_idx) = -1.0_psb_dpk_ + if (field1_global_row > 1) then + entry_idx = entry_idx + 1 + entry_rows(entry_idx) = field1_global_row + entry_cols(entry_idx) = field1_global_row - 1_psb_lpk_ + entry_vals(entry_idx) = -1.0_psb_dpk_ + end if + end do + call nested_matrix%ins(1, 2, entry_idx, entry_rows, entry_cols, entry_vals, info) + deallocate(entry_rows, entry_cols, entry_vals) + + ! block (2,1) = C^T + allocate(entry_rows(2*field2_local_rows), entry_cols(2*field2_local_rows), entry_vals(2*field2_local_rows)) + entry_idx = 0 + do i_local_row = 1, field2_local_rows + field2_global_row = field2_rows(i_local_row) + entry_idx = entry_idx + 1 + entry_rows(entry_idx) = field2_global_row + entry_cols(entry_idx) = field2_global_row + entry_vals(entry_idx) = -1.0_psb_dpk_ + if (field2_global_row < field_size) then + entry_idx = entry_idx + 1 + entry_rows(entry_idx) = field2_global_row + entry_cols(entry_idx) = field2_global_row + 1_psb_lpk_ + entry_vals(entry_idx) = -1.0_psb_dpk_ + end if + end do + call nested_matrix%ins(2, 1, entry_idx, entry_rows, entry_cols, entry_vals, info) + deallocate(entry_rows, entry_cols, entry_vals) + + call nested_matrix%asb(info) + if (info /= psb_success_) then + if (my_rank == 0) write(*,*) 'FAIL: nested_matrix%asb info=', info + goto 9999 + end if + + call psb_geall(x_exact, nested_matrix%desc_glob, info) + call psb_geasb(x_exact, nested_matrix%desc_glob, info) + call x_exact%set(done) + + call psb_geall(rhs, nested_matrix%desc_glob, info) + call psb_geasb(rhs, nested_matrix%desc_glob, info) + call psb_spmm(done, nested_matrix%a_glob, x_exact, dzero, rhs, nested_matrix%desc_glob, info) + if (info /= psb_success_) then + if (my_rank == 0) write(*,*) 'FAIL: psb_spmm (RHS) info=', info + goto 9999 + end if + norm_x_exact = psb_genrm2(x_exact, nested_matrix%desc_glob, info) + + if (my_rank == 0) write(*,'(a,i0,a,i0)') ' np=', num_procs, ' N(global)=', 2*field_size + + ! Baseline: no preconditioner. + call preconditioner%init(context, 'NONE', info) + call preconditioner%build(nested_matrix%a_glob, nested_matrix%desc_glob, info) + call psb_geall(x_solution, nested_matrix%desc_glob, info) + call psb_geasb(x_solution, nested_matrix%desc_glob, info) + call psb_krylov('CG', nested_matrix%a_glob, preconditioner, rhs, x_solution, stop_tol, & + & nested_matrix%desc_glob, info, & + & itmax=max_iter, iter=n_iter_none, err=final_residual, itrace=trace_level, istop=stop_criterion) + if (info /= psb_success_) then + if (my_rank == 0) write(*,*) 'FAIL: psb_krylov(CG,NONE) info=', info + all_passed = .false. + end if + if (my_rank == 0) write(*,'(a,i6,a,es12.4)') & + & ' prec=NONE CG iterations=', n_iter_none, ' residual=', final_residual + call psb_gefree(x_solution, nested_matrix%desc_glob, info) + call preconditioner%free(info) + + do iprec = 1, size(composition_names) + call preconditioner%init(context, 'NEST', info) + call preconditioner%set('COMPOSITION', trim(composition_names(iprec)), info) + call preconditioner%set('SCHUR_SOLVE', trim(schur_solve_names(iprec)), info) + if (trim(schur_solve_names(iprec)) == 'MATRIX_FREE') then + call preconditioner%set('SCHUR_MAXIT', 8, info) + call preconditioner%set('BLOCK_SOLVE', 'BJAC', info) + call preconditioner%set('BLOCK_SOLVE', 'BJAC', info, idx=1) + call preconditioner%set('BLOCK_SOLVE', 'BJAC', info, idx=2) + call preconditioner%set('SUB_SOLVE', 'ILU', info, idx=1) + call preconditioner%set('SUB_SOLVE', 'ILU', info, idx=2) + call preconditioner%set('SUB_FILLIN', 0, info, idx=1) + call preconditioner%set('SUB_FILLIN', 0, info, idx=2) + else + call preconditioner%set('BLOCK_SOLVE', 'DIAG', info) + call preconditioner%set('BLOCK_SOLVE', 'DIAG', info, idx=1) + call preconditioner%set('BLOCK_SOLVE', 'DIAG', info, idx=2) + end if + call preconditioner%build(nested_matrix%a_glob, nested_matrix%desc_glob, info) + if (info /= psb_success_) then + if (my_rank == 0) write(*,*) 'FAIL: nested prec%build info=', info, & + & ' composition=', trim(composition_names(iprec)) + all_passed = .false. + goto 9999 + end if + + call psb_geall(x_solution, nested_matrix%desc_glob, info) + call psb_geasb(x_solution, nested_matrix%desc_glob, info) + call psb_krylov('BICGSTAB', nested_matrix%a_glob, preconditioner, rhs, x_solution, stop_tol, & + & nested_matrix%desc_glob, info, & + & itmax=max_iter, iter=n_iter, err=final_residual, itrace=trace_level, istop=stop_criterion) + if (info /= psb_success_) then + if (my_rank == 0) write(*,*) 'FAIL: psb_krylov(BICGSTAB,NEST) info=', info, & + & ' composition=', trim(composition_names(iprec)) + all_passed = .false. + end if + + call psb_geaxpby(-done, x_exact, done, x_solution, nested_matrix%desc_glob, info) + solution_error = psb_genrm2(x_solution, nested_matrix%desc_glob, info) / norm_x_exact + if (my_rank == 0) then + write(*,'(a,a32,a,a16,a,a16,a,i6,a,es12.4,a,es12.4)') & + & ' prec=NEST/', trim(composition_names(iprec)), & + & ' schur=', trim(schur_solve_names(iprec)), & + & ' method=', 'BICGSTAB', & + & ' iterations=', n_iter, ' residual=', final_residual, & + & ' ||x-x_ex||/||x_ex||=', solution_error + end if + + if ((n_iter >= max_iter) .or. (solution_error > solution_tol)) all_passed = .false. + + call psb_gefree(x_solution, nested_matrix%desc_glob, info) + call preconditioner%free(info) + end do + + ! Exercise independent inner Krylov contexts inside the field solves. + call preconditioner%init(context, 'NEST', info) + call preconditioner%set('COMPOSITION', 'SCHUR_FULL', info) + call preconditioner%set('SCHUR_SOLVE', 'A22', info) + call preconditioner%set('BLOCK_SOLVE', 'DIAG', info) + call preconditioner%set('INNER_SOLVE', 'CG', info, idx=1) + call preconditioner%set('INNER_MAXIT', 10, info, idx=1) + call preconditioner%set('INNER_TOL', 1.0d-10, info, idx=1) + call preconditioner%set('INNER_SOLVE', 'CG', info, idx=2) + call preconditioner%set('INNER_MAXIT', 12, info, idx=2) + call preconditioner%set('INNER_TOL', 1.0d-11, info, idx=2) + call preconditioner%build(nested_matrix%a_glob, nested_matrix%desc_glob, info) + if (info /= psb_success_) then + if (my_rank == 0) write(*,*) 'FAIL: nested inner Krylov prec%build info=', info + all_passed = .false. + goto 9999 + end if + + call psb_geall(x_solution, nested_matrix%desc_glob, info) + call psb_geasb(x_solution, nested_matrix%desc_glob, info) + call preconditioner%apply(rhs, x_solution, nested_matrix%desc_glob, info) + if (info /= psb_success_) then + if (my_rank == 0) write(*,*) 'FAIL: NEST inner Krylov apply info=', info + all_passed = .false. + end if + + solution_error = psb_genrm2(x_solution, nested_matrix%desc_glob, info) + if (my_rank == 0) then + write(*,'(a,a32,a,a16,a,a16,a,es12.4)') & + & ' prec=NEST/', 'SCHUR_FULL', & + & ' schur=', 'A22+INNER', & + & ' method=', 'APPLY', & + & ' ||P rhs||=', solution_error + end if + + if ((info /= psb_success_) .or. (solution_error /= solution_error) .or. & + & (solution_error <= dzero)) all_passed = .false. + + call psb_gefree(x_solution, nested_matrix%desc_glob, info) + call preconditioner%free(info) + + ! Exercise transposed Schur application for the explicit A22 approximation. + call preconditioner%init(context, 'NEST', info) + call preconditioner%set('COMPOSITION', 'SCHUR_FULL', info) + call preconditioner%set('SCHUR_SOLVE', 'A22', info) + call preconditioner%set('BLOCK_SOLVE', 'DIAG', info) + call preconditioner%build(nested_matrix%a_glob, nested_matrix%desc_glob, info) + if (info /= psb_success_) then + if (my_rank == 0) write(*,*) 'FAIL: nested Schur transpose prec%build info=', info + all_passed = .false. + goto 9999 + end if + + call psb_geall(x_solution, nested_matrix%desc_glob, info) + call psb_geasb(x_solution, nested_matrix%desc_glob, info) + call preconditioner%apply(rhs, x_solution, nested_matrix%desc_glob, info, trans='T') + if (info /= psb_success_) then + if (my_rank == 0) write(*,*) 'FAIL: NEST Schur transpose apply info=', info + all_passed = .false. + end if + + solution_error = psb_genrm2(x_solution, nested_matrix%desc_glob, info) + if (my_rank == 0) then + write(*,'(a,a32,a,a16,a,a16,a,es12.4)') & + & ' prec=NEST/', 'SCHUR_FULL', & + & ' schur=', 'A22', & + & ' method=', 'APPLY^T', & + & ' ||P^T rhs||=', solution_error + end if + + if ((info /= psb_success_) .or. (solution_error /= solution_error) .or. & + & (solution_error <= dzero)) all_passed = .false. + + call psb_gefree(x_solution, nested_matrix%desc_glob, info) + call preconditioner%free(info) + + if (my_rank == 0) then + if (all_passed) then + write(*,*) '[PASS] Krylov solvers converge with Schur-style NEST preconditioners' + write(*,*) ' including per-field inner Krylov solves' + else + write(*,*) '[FAIL] Krylov solvers with Schur-style NEST preconditioners' + end if + end if + + call nested_matrix%free(info) + +9999 continue + call psb_exit(context) + +end program psb_d_nest_schur_prec_cg_test diff --git a/util/psb_i_mmio_impl.f90 b/util/psb_i_mmio_impl.f90 index 64d02bcd7..941559d91 100644 --- a/util/psb_i_mmio_impl.f90 +++ b/util/psb_i_mmio_impl.f90 @@ -1,6 +1,6 @@ ! -! Parallel Sparse BLAS version 3.5 -! (C) Copyright 2006-2018 +! Parallel Sparse BLAS version 3.1 +! (C) Copyright 2006, 2007, 2008, 2009, 2010, 2012, 2013 ! Salvatore Filippone ! Alfredo Buttari ! @@ -29,6 +29,11 @@ ! POSSIBILITY OF SUCH DAMAGE. ! ! +! +! Warning: MM does not define a format for an array with integer entries. +! Hence we hijack the REAL format, but this could lead to errors when +! used with non-integer files. +! subroutine mm_ivet_read(b, info, iunit, filename) use psb_base_mod implicit none @@ -39,10 +44,8 @@ subroutine mm_ivet_read(b, info, iunit, filename) integer(psb_ipk_) :: nrow, ncol, i,root, np, me, ircode, j, infile character :: mmheader*15, fmt*15, object*10, type*10, sym*15,& & line*1024 - logical :: opened info = psb_success_ - opened = .false. if (present(filename)) then if (filename == '-') then infile=5 @@ -53,7 +56,6 @@ subroutine mm_ivet_read(b, info, iunit, filename) infile=99 endif open(infile,file=filename, status='OLD', err=901, action='READ') - opened = .true. endif else if (present(iunit)) then @@ -86,7 +88,7 @@ subroutine mm_ivet_read(b, info, iunit, filename) end do end if ! read right hand sides - if (opened) close(infile) + if (infile /= 5) close(infile) return ! open failed @@ -114,10 +116,8 @@ subroutine mm_ivet2_read(b, info, iunit, filename) integer(psb_ipk_) :: nrow, ncol, i,root, np, me, ircode, j, infile character :: mmheader*15, fmt*15, object*10, type*10, sym*15,& & line*1024 - logical :: opened info = psb_success_ - opened = .false. if (present(filename)) then if (filename == '-') then infile=5 @@ -128,7 +128,6 @@ subroutine mm_ivet2_read(b, info, iunit, filename) infile=99 endif open(infile,file=filename, status='OLD', err=901, action='READ') - opened = .true. endif else if (present(iunit)) then @@ -152,14 +151,14 @@ subroutine mm_ivet2_read(b, info, iunit, filename) end do read(line,fmt=*)nrow,ncol - + if ((psb_tolower(type) == 'real').and.(psb_tolower(sym) == 'general')) then allocate(b(nrow,ncol),stat = ircode) if (ircode /= 0) goto 993 read(infile,fmt=*,end=902) ((b(i,j), i=1,nrow),j=1,ncol) end if ! read right hand sides - if (opened) close(infile) + if (infile /= 5) close(infile) return ! open failed @@ -188,10 +187,8 @@ subroutine mm_ivet2_write(b, header, info, iunit, filename) integer(psb_ipk_) :: nrow, ncol, i,root, np, me, ircode, j, outfile character(len=80) :: frmtv - logical :: opened info = psb_success_ - opened = .false. if (present(filename)) then if (filename == '-') then outfile=6 @@ -202,7 +199,6 @@ subroutine mm_ivet2_write(b, header, info, iunit, filename) outfile=99 endif open(outfile,file=filename, err=901, action='WRITE') - opened = .true. endif else if (present(iunit)) then @@ -211,7 +207,7 @@ subroutine mm_ivet2_write(b, header, info, iunit, filename) outfile=6 endif endif - + write(outfile,'(a)') '%%MatrixMarket matrix array real general' write(outfile,'(a)') '% '//trim(header) write(outfile,'(a)') '% ' @@ -219,11 +215,9 @@ subroutine mm_ivet2_write(b, header, info, iunit, filename) ncol = size(b,2) write(outfile,*) nrow, ncol - write(frmtv,'(a,i0,a)') '(',ncol,'(es26.18,1x))' - - write(outfile,fmt=frmtv) ((b(i,j), i=1,nrow),j=1,ncol) + write(outfile,fmt='(I14,1x)') ((b(i,j), i=1,nrow),j=1,ncol) - if (opened) close(outfile) + if (outfile /= 6) close(outfile) return ! open failed @@ -245,10 +239,8 @@ subroutine mm_ivet1_write(b, header, info, iunit, filename) integer(psb_ipk_) :: nrow, ncol, i,root, np, me, ircode, j, outfile character(len=80) :: frmtv - logical :: opened info = psb_success_ - opened = .false. if (present(filename)) then if (filename == '-') then outfile=6 @@ -259,7 +251,6 @@ subroutine mm_ivet1_write(b, header, info, iunit, filename) outfile=99 endif open(outfile,file=filename, err=901, action='WRITE') - opened = .true. endif else if (present(iunit)) then @@ -276,13 +267,13 @@ subroutine mm_ivet1_write(b, header, info, iunit, filename) ncol = 1 write(outfile,*) nrow,ncol - write(frmtv,'(a,i0,a)') '(',ncol,'(es26.18,1x))' + write(frmtv,'(a,i0,a)') '(',ncol,'(i14,1x))' do i=1,size(b,1) write(outfile,frmtv) b(i) end do - if (opened) close(outfile) + if (outfile /= 6) close(outfile) return ! open failed @@ -293,6 +284,259 @@ subroutine mm_ivet1_write(b, header, info, iunit, filename) end subroutine mm_ivet1_write +#if defined(PSB_IPK4) && defined(PSB_LPK8) +subroutine mm_lvet_read(b, info, iunit, filename) + use psb_base_mod + implicit none + integer(psb_lpk_), allocatable, intent(out) :: b(:) + integer(psb_ipk_), intent(out) :: info + integer(psb_ipk_), optional, intent(in) :: iunit + character(len=*), optional, intent(in) :: filename + integer(psb_ipk_) :: nrow, ncol, i,root, np, me, ircode, j, infile + character :: mmheader*15, fmt*15, object*10, type*10, sym*15,& + & line*1024 + + info = psb_success_ + if (present(filename)) then + if (filename == '-') then + infile=5 + else + if (present(iunit)) then + infile=iunit + else + infile=99 + endif + open(infile,file=filename, status='OLD', err=901, action='READ') + endif + else + if (present(iunit)) then + infile=iunit + else + infile=5 + endif + endif + + read(infile,fmt=*, end=902) mmheader, object, fmt, type, sym + + if ( (object /= 'matrix').or.(fmt /= 'array')) then + write(psb_err_unit,*) 'read_rhs: input file type not yet supported' + info = -3 + return + end if + + do + read(infile,fmt='(a)') line + if (line(1:1) /= '%') exit + end do + + read(line,fmt=*)nrow,ncol + + if ((psb_tolower(type) == 'real').and.(psb_tolower(sym) == 'general')) then + allocate(b(nrow),stat = ircode) + if (ircode /= 0) goto 993 + do i=1,nrow + read(infile,fmt=*,end=902) b(i) + end do + + end if ! read right hand sides + if (infile /= 5) close(infile) + + return + ! open failed +901 write(psb_err_unit,*) 'mm_vet_read: could not open file ',& + & infile,' for input' + info = -1 + return + +902 write(psb_err_unit,*) 'mmv_vet_read: unexpected end of file ',infile,& + & ' during input' + info = -2 + return +993 write(psb_err_unit,*) 'mm_vet_read: memory allocation failure' + info = -3 + return +end subroutine mm_lvet_read + +subroutine mm_lvet2_read(b, info, iunit, filename) + use psb_base_mod + implicit none + integer(psb_lpk_), allocatable, intent(out) :: b(:,:) + integer(psb_ipk_), intent(out) :: info + integer(psb_ipk_), optional, intent(in) :: iunit + character(len=*), optional, intent(in) :: filename + integer(psb_ipk_) :: nrow, ncol, i,root, np, me, ircode, j, infile + character :: mmheader*15, fmt*15, object*10, type*10, sym*15,& + & line*1024 + + info = psb_success_ + if (present(filename)) then + if (filename == '-') then + infile=5 + else + if (present(iunit)) then + infile=iunit + else + infile=99 + endif + open(infile,file=filename, status='OLD', err=901, action='READ') + endif + else + if (present(iunit)) then + infile=iunit + else + infile=5 + endif + endif + + read(infile,fmt=*, end=902) mmheader, object, fmt, type, sym + + if ( (object /= 'matrix').or.(fmt /= 'array')) then + write(psb_err_unit,*) 'read_rhs: input file type not yet supported' + info = -3 + return + end if + + do + read(infile,fmt='(a)') line + if (line(1:1) /= '%') exit + end do + + read(line,fmt=*)nrow,ncol + + if ((psb_tolower(type) == 'real').and.(psb_tolower(sym) == 'general')) then + allocate(b(nrow,ncol),stat = ircode) + if (ircode /= 0) goto 993 + read(infile,fmt=*,end=902) ((b(i,j), i=1,nrow),j=1,ncol) + + end if ! read right hand sides + if (infile /= 5) close(infile) + + return + ! open failed +901 write(psb_err_unit,*) 'mm_vet_read: could not open file ',& + & infile,' for input' + info = -1 + return + +902 write(psb_err_unit,*) 'mmv_vet_read: unexpected end of file ',infile,& + & ' during input' + info = -2 + return +993 write(psb_err_unit,*) 'mm_vet_read: memory allocation failure' + info = -3 + return +end subroutine mm_lvet2_read + +subroutine mm_lvet2_write(b, header, info, iunit, filename) + use psb_base_mod + implicit none + integer(psb_lpk_), intent(in) :: b(:,:) + character(len=*), intent(in) :: header + integer(psb_ipk_), intent(out) :: info + integer(psb_ipk_), optional, intent(in) :: iunit + character(len=*), optional, intent(in) :: filename + integer(psb_ipk_) :: nrow, ncol, i,root, np, me, ircode, j, outfile + + character(len=80) :: frmtv + + info = psb_success_ + if (present(filename)) then + if (filename == '-') then + outfile=6 + else + if (present(iunit)) then + outfile=iunit + else + outfile=99 + endif + open(outfile,file=filename, err=901, action='WRITE') + endif + else + if (present(iunit)) then + outfile=iunit + else + outfile=6 + endif + endif + + write(outfile,'(a)') '%%MatrixMarket matrix array real general' + write(outfile,'(a)') '% '//trim(header) + write(outfile,'(a)') '% ' + nrow = size(b,1) + ncol = size(b,2) + write(outfile,*) nrow, ncol + + write(outfile,fmt='(I24,1x)') ((b(i,j), i=1,nrow),j=1,ncol) + + if (outfile /= 6) close(outfile) + + return + ! open failed +901 write(psb_err_unit,*) 'mm_vet_write: could not open file ',& + & outfile,' for output' + info = -1 + return + +end subroutine mm_lvet2_write + +subroutine mm_lvet1_write(b, header, info, iunit, filename) + use psb_base_mod + implicit none + integer(psb_lpk_), intent(in) :: b(:) + character(len=*), intent(in) :: header + integer(psb_ipk_), intent(out) :: info + integer(psb_ipk_), optional, intent(in) :: iunit + character(len=*), optional, intent(in) :: filename + integer(psb_ipk_) :: nrow, ncol, i,root, np, me, ircode, j, outfile + + character(len=80) :: frmtv + + info = psb_success_ + if (present(filename)) then + if (filename == '-') then + outfile=6 + else + if (present(iunit)) then + outfile=iunit + else + outfile=99 + endif + open(outfile,file=filename, err=901, action='WRITE') + endif + else + if (present(iunit)) then + outfile=iunit + else + outfile=6 + endif + endif + + write(outfile,'(a)') '%%MatrixMarket matrix array real general' + write(outfile,'(a)') '% '//trim(header) + write(outfile,'(a)') '% ' + nrow = size(b,1) + ncol = 1 + write(outfile,*) nrow,ncol + + write(frmtv,'(a,i0,a)') '(',ncol,'(i24,1x))' + + do i=1,size(b,1) + write(outfile,frmtv) b(i) + end do + + if (outfile /= 6) close(outfile) + + return + ! open failed +901 write(psb_err_unit,*) 'mm_vet_write: could not open file ',& + & outfile,' for output' + info = -1 + return + +end subroutine mm_lvet1_write + +#endif + subroutine mm_ivect_read(b, info, iunit, filename) use psb_base_mod use psb_mmio_mod, psb_protect_name => mm_ivect_read @@ -305,7 +549,7 @@ subroutine mm_ivect_read(b, info, iunit, filename) integer(psb_ipk_), allocatable :: bv(:) call mm_array_read(bv, info, iunit, filename) - if (info == 0) call b%bld(bv) + call b%bld(bv) end subroutine mm_ivect_read @@ -326,3 +570,35 @@ subroutine mm_ivect_write(b, header, info, iunit, filename) end subroutine mm_ivect_write +subroutine mm_lvect_read(b, info, iunit, filename) + use psb_base_mod + use psb_mmio_mod, psb_protect_name => mm_lvect_read + implicit none + type(psb_l_vect_type), intent(inout) :: b + integer(psb_ipk_), intent(out) :: info + integer(psb_ipk_), optional, intent(in) :: iunit + character(len=*), optional, intent(in) :: filename + ! + integer(psb_lpk_), allocatable :: bv(:) + + call mm_array_read(bv, info, iunit, filename) + call b%bld(bv) + +end subroutine mm_lvect_read + +subroutine mm_lvect_write(b, header, info, iunit, filename) + use psb_base_mod + use psb_mmio_mod, psb_protect_name => mm_lvect_write + implicit none + type(psb_l_vect_type), intent(inout) :: b + character(len=*), intent(in) :: header + integer(psb_ipk_), intent(out) :: info + integer(psb_ipk_), optional, intent(in) :: iunit + character(len=*), optional, intent(in) :: filename + info = psb_success_ + if (.not.allocated(b%v)) return + call b%sync() + + call mm_array_write(b%v%v,header,info,iunit,filename) + +end subroutine mm_lvect_write