diff --git a/.gitignore b/.gitignore index 6ecf4e1f..d9e7db98 100644 --- a/.gitignore +++ b/.gitignore @@ -20,3 +20,5 @@ autom4te.cache # the executable from tests runs +# Documentation temporary files +docs/src/userguide.pdf diff --git a/Make_n.inc.in b/Make_n.inc.in index 8fdc622e..d699579d 100644 --- a/Make_n.inc.in +++ b/Make_n.inc.in @@ -49,7 +49,7 @@ PSBLAS_INCLUDES=@PSBLAS_INCLUDES@ PSBLAS_LIBS=@PSBLAS_LIBS@ PSBBASEMODNAME=psb_base_mod PSBPRECMODNAME=psb_prec_mod -PSBMETHDMODNAME=psb_krylov_mod +PSBMETHDMODNAME=psb_linsolve_mod PSBUTILMODNAME=psb_util_mod diff --git a/README.md b/README.md index 3be67f1f..37f5ce5f 100644 --- a/README.md +++ b/README.md @@ -1,54 +1,76 @@ - AMG4PSBLAS - Algebraic Multigrid Package based on PSBLAS (Parallel Sparse BLAS version 3.8) - -Salvatore Filippone (University of Rome Tor Vergata and IAC-CNR) -Pasqua D'Ambra (IAC-CNR, Naples, IT) -Fabio Durastante (IAC-CNR, Naples, IT) +# AMG4PSBLAS v1.2 +Algebraic Multigrid Package based on [PSBLAS](https://github.com/sfilippone/psblas3) (Parallel Sparse BLAS version 3.9) ---------------------------------------------------------------------- +AMG4PSBLAS is a package of parallel algebraic multilevel preconditioners included in the PSCToolkit (Parallel Sparse Computation Toolkit) software framework. -AMG4PSBLAS is a package of Algebraic MultiGrid (AMG) -preconditioners for the iterative solution of large and sparse linear systems. +It is a progress of a software development project started in 2007, named MLD2P4, which originally implemented a multilevel version of some domain decomposition preconditioners of additive-Schwarz type and was based on a parallel decoupled version of the well known smoothed aggregation method to generate the multilevel hierarchy of coarser matrices. -It is an evolution of MLD2P4 (see LICENSE.MLD2P4), but it has been -thoroughly reworked, and it is sufficiently different to warrant a new -project name. 
+In the last years the package was extended for including new algorithms and functionalities for the setup and application new AMG preconditioners with the final aims of improving efficiency and scalability when tens of thousands cores are used and of boosting reliability in dealing with general symmetric positive definite linear systems. +It is an evolution of MLD2P4 (see [LICENSE.MLD2P4](LICENSE.MLD2P4)), but due to the significant number of changes and the increase in scope, we decided to rename the package as AMG4PSBLAS. -MAIN REFERENCES: +AMG4PSBLAS has been designed to provide scalable and easy-to-use preconditioners in the context of the PSBLAS (Parallel Sparse Basic Linear Algebra Subprograms) computational framework and can be used in conjuction with the Krylov solvers available in this framework. Our package is based on a completely algebraic approach; therefore users level interfaces assume that the system matrix and preconditioners are represented as PSBLAS distributed sparse matrices. - +AMG4PSBLAS enables the user to easily specify different features of an algebraic multilevel preconditioner, thus allowing to experiment with different preconditioners for the problem and parallel computers at hand. -P. D'Ambra, D. di Serafino, S. Filippone, -MLD2P4: a Package of Parallel Algebraic Multilevel Domain Decomposition -Preconditioners in Fortran 95, -ACM Transactions on Mathematical Software, 37 (3), 2010, art. 30, -doi: 10.1145/1824801.1824808. +The package employs object-oriented design techniques in Fortran 2008, with interfaces to additional third party libraries such as MUMPS, UMFPACK, SuperLU, and SuperLU_Dist, which can be exploited in building multilevel preconditioners. The parallel implementation is based on a Single Program Multiple Data (SPMD) paradigm; the inter-process communication is based on MPI and is managed mainly through PSBLAS. 
+## Main Refrerences: -TO COMPILE +The main reference for this project is +> D'Ambra, P., Durastante, F., & Filippone, S. (2021). AMG preconditioners for linear solvers towards extreme scale. SIAM Journal on Scientific Computing, 43(5), S679-S703. + +AMG4PSBLAS is the suite of preconditioners for the Parallel Sparse Computation Toolkit ([PSCToolkit](https://psctoolkit.github.io/)) suite of libraries. See the paper: +> D’Ambra, P., Durastante, F., & Filippone, S. (2023). Parallel Sparse Computation Toolkit. Software Impacts, 15, 100463. + +The main reference for features inherited from MLD2P4 is +> P. D'Ambra, D. di Serafino, S. Filippone, +> MLD2P4: a Package of Parallel Algebraic Multilevel Domain Decomposition +> Preconditioners in Fortran 95, +> ACM Transactions on Mathematical Software, 37 (3), 2010, art. 30, +> doi: 10.1145/1824801.1824808. + +## Installing + +Installation requires having a working version of the [PSBLAS](https://github.com/sfilippone/psblas3) library installed. +AMG4PSBLAS has several interfaces to third-party libraries that can be used in the construction and application phases of preconditioners. +In particular, it is possible to link AMG4PSBLAS with the libraries: MUMPS, SuperLU, SuperLU_Dist, UMFPACK. This is _not mandatory_ and the library can run +in isolation and without these features. 0. Unpack the tar file in a directory of your choice (preferrably outside the main PSBLAS directory). -1. run configure --with-psblas= +1. run configure `--with-psblas=` adding the options for MUMPS, SuperLU, SuperLU_Dist, UMFPACK as desired. - See MLD2P4 User's and Reference Guide (Section 3) for details. -2. Tweak Make.inc if you are not satisfied. -3. make; + See [AMG4PSBLAS User's and Reference Guide](docs/amg4psblas_1.0-guide.pdf) (Section 3) for details. +2. Tweak `Make.inc` if you are not satisfied. +3. run `make`; 4. Go into the test subdirectory and build the examples of your choice. -5. (if desired): make install +5. 
(if desired): `make install` + +>[!CAUTION] +>The single precision version is supported only by MUMPS and SuperLU; +>thus, even if you specify at configure time to use UMFPACK or SuperLU_Dist, +>the corresponding preconditioner options will be available only from +>the double precision version. + +### CUDA, OpeMP, OpenACC + +CUDA, OpenMP and OpenACC features are transparently inherited by PSBLAS installation. If PSBLAS has been configured (and installed) with these supports then AMG4PSBLAS will transparently inherit them. It will then be possible to move the computation to GPU accelerator simply by selecting the appropriate variable types. If these have not been activated or installed for PSBLAS then they will not be available for AMG4PSBLAS either and the operation will be purely on CPU/MPI. + +### EoCoE - Software as service portal + +In the European project “Energy oriented Center of Excellence: toward exascale for energy” we made available a software as service portal: [https://eocoe.psnc.pl/](https://eocoe.psnc.pl/). This permits to test several cutting-edge computational methods for accelerating the transition to the production, storage and management of clean, decarbonized energy. Among them you have the possibility of running PSBLAS+AMG4PSBLAS on some test problems to become familiar with using the software. +## TODO and bugs + +- [X] Fix all reamining bugs. Bugs? We dont' have any ! 🤓 -NOTES +> [!NOTE] +> To report bugs 🐛 or issues ❓ please use the [GitHub issue system](https://github.com/sfilippone/amg4psblas/issues). -- The single precision version is supported only by MUMPS and SuperLU; - thus, even if you specify at configure time to use UMFPACK or SuperLU_Dist, - the corresponding preconditioner options will be available only from - the double precision version. +## The AMG4PSBLAS team. 
+- Pasqua D'Ambra (IAC-CNR, Naples, IT) +- Fabio Durastante (University of Pisa and IAC-CNR, IT) +- Salvatore Filippone (University of Rome Tor Vergata and IAC-CNR, IT) -The AMG4PSBLAS team. ---------------- -Salvatore Filippone -Pasqua D'Ambra -Fabio Durastante diff --git a/amgprec/amg_base_prec_type.F90 b/amgprec/amg_base_prec_type.F90 index 896bf13b..70e55801 100644 --- a/amgprec/amg_base_prec_type.F90 +++ b/amgprec/amg_base_prec_type.F90 @@ -288,17 +288,19 @@ module amg_base_prec_type ! ! Legal values for entry: amg_aggr_prol_ ! - integer(psb_ipk_), parameter :: amg_no_smooth_ = 0 - integer(psb_ipk_), parameter :: amg_smooth_prol_ = 1 - integer(psb_ipk_), parameter :: amg_min_energy_ = 2 + integer(psb_ipk_), parameter :: amg_no_smooth_ = 0 + integer(psb_ipk_), parameter :: amg_smooth_prol_ = 1 + integer(psb_ipk_), parameter :: amg_l1_smooth_prol_ = 2 + integer(psb_ipk_), parameter :: amg_min_energy_ = 3 ! Disabling min_energy for the time being. - integer(psb_ipk_), parameter :: amg_max_aggr_prol_=amg_smooth_prol_ + integer(psb_ipk_), parameter :: amg_max_aggr_prol_= amg_l1_smooth_prol_ ! ! Legal values for entry: amg_aggr_filter_ ! - integer(psb_ipk_), parameter :: amg_no_filter_mat_ = 0 - integer(psb_ipk_), parameter :: amg_filter_mat_ = 1 - integer(psb_ipk_), parameter :: amg_max_filter_mat_ = amg_filter_mat_ + integer(psb_ipk_), parameter :: amg_no_filter_mat_ = 0 + integer(psb_ipk_), parameter :: amg_filter_mat_ = 1 + integer(psb_ipk_), parameter :: amg_filter_prow_mat_ = 2 + integer(psb_ipk_), parameter :: amg_max_filter_mat_ = amg_filter_prow_mat_ ! ! Legal values for entry: amg_aggr_ord_ ! @@ -376,10 +378,11 @@ module amg_base_prec_type character(len=19), parameter, private :: & & eigen_estimates(0:0)=(/'infinity norm '/) character(len=15), parameter, private :: & - & aggr_prols(0:3)=(/'unsmoothed ','smoothed ',& - & 'min energy ','bizr. smoothed'/) + & aggr_prols(0:4)=(/'unsmoothed ','smoothed ',& + & 'l1-smoothed ','min energy ','bizr. 
smoothed'/) character(len=15), parameter, private :: & - & aggr_filters(0:1)=(/'no filtering ','filtering '/) + & aggr_filters(0:2)=(/'no filtering ','filtering ',& + & 'filtering rsum'/) character(len=15), parameter, private :: & & matrix_names(0:1)=(/'distributed ','replicated '/) character(len=18), parameter, private :: & @@ -548,6 +551,8 @@ contains val = amg_no_smooth_ case('SMOOTHED') val = amg_smooth_prol_ + case('L1-SMOOTHED','L1SMOOTHED') + val = amg_l1_smooth_prol_ case('MINENERGY') val = amg_min_energy_ case('NOPREC') @@ -588,6 +593,8 @@ contains val = amg_eig_est_ case('FILTER') val = amg_filter_mat_ + case('FILTERROWSUM') + val = amg_filter_prow_mat_ case('NOFILTER','NO_FILTER') val = amg_no_filter_mat_ case('OUTER_SWEEPS') diff --git a/amgprec/amg_c_ainv_solver.F90 b/amgprec/amg_c_ainv_solver.F90 index 2b20847f..fa9c7bf0 100644 --- a/amgprec/amg_c_ainv_solver.F90 +++ b/amgprec/amg_c_ainv_solver.F90 @@ -91,9 +91,9 @@ module amg_c_ainv_solver import :: psb_desc_type, psb_cspmat_type, psb_c_base_sparse_mat, & & amg_c_base_solver_type, psb_dpk_, amg_c_ainv_solver_type, psb_ipk_ Implicit None - class(amg_c_ainv_solver_type), intent(inout) :: sv - class(amg_c_base_solver_type), allocatable, intent(inout) :: svout - integer(psb_ipk_), intent(out) :: info + class(amg_c_ainv_solver_type), intent(inout) :: sv + class(amg_c_base_solver_type), intent(inout) :: svout + integer(psb_ipk_), intent(out) :: info end subroutine amg_c_ainv_solver_clone_settings end interface diff --git a/amgprec/amg_c_inner_mod.f90 b/amgprec/amg_c_inner_mod.f90 index ac260134..c97b8c3f 100644 --- a/amgprec/amg_c_inner_mod.f90 +++ b/amgprec/amg_c_inner_mod.f90 @@ -109,11 +109,12 @@ module amg_c_inner_mod end interface amg_map_to_tprol abstract interface - subroutine amg_caggrmat_var_bld(a,desc_a,ilaggr,nlaggr,parms,& + subroutine amg_caggrmat_var_bld(dol1smoothing,a,desc_a,ilaggr,nlaggr,parms,& & ac,desc_ac,op_prol,op_restr,t_prol,info) import :: psb_cspmat_type, psb_desc_type, psb_spk_, 
psb_ipk_, psb_lpk_, psb_lcspmat_type import :: amg_c_onelev_type, amg_sml_parms implicit none + integer(psb_ipk_), intent(in) :: dol1smoothing type(psb_cspmat_type), intent(in) :: a type(psb_desc_type), intent(inout) :: desc_a integer(psb_lpk_), intent(inout) :: ilaggr(:), nlaggr(:) diff --git a/amgprec/amg_c_invk_solver.f90 b/amgprec/amg_c_invk_solver.f90 index 40eb3613..ca10c991 100644 --- a/amgprec/amg_c_invk_solver.f90 +++ b/amgprec/amg_c_invk_solver.f90 @@ -79,9 +79,9 @@ module amg_c_invk_solver import :: psb_desc_type, psb_cspmat_type, psb_c_base_sparse_mat, & & amg_c_base_solver_type, psb_spk_, amg_c_invk_solver_type, psb_ipk_ Implicit None - class(amg_c_invk_solver_type), intent(inout) :: sv - class(amg_c_base_solver_type), allocatable, intent(inout) :: svout - integer(psb_ipk_), intent(out) :: info + class(amg_c_invk_solver_type), intent(inout) :: sv + class(amg_c_base_solver_type), intent(inout) :: svout + integer(psb_ipk_), intent(out) :: info end subroutine amg_c_invk_solver_clone_settings end interface diff --git a/amgprec/amg_c_invt_solver.f90 b/amgprec/amg_c_invt_solver.f90 index a73fd19e..eb3b6a85 100644 --- a/amgprec/amg_c_invt_solver.f90 +++ b/amgprec/amg_c_invt_solver.f90 @@ -79,9 +79,9 @@ module amg_c_invt_solver import :: psb_desc_type, psb_cspmat_type, psb_c_base_sparse_mat, & & amg_c_base_solver_type, psb_spk_, amg_c_invt_solver_type, psb_ipk_ Implicit None - class(amg_c_invt_solver_type), intent(inout) :: sv - class(amg_c_base_solver_type), allocatable, intent(inout) :: svout - integer(psb_ipk_), intent(out) :: info + class(amg_c_invt_solver_type), intent(inout) :: sv + class(amg_c_base_solver_type), intent(inout) :: svout + integer(psb_ipk_), intent(out) :: info end subroutine amg_c_invt_solver_clone_settings end interface diff --git a/amgprec/amg_c_jac_smoother.f90 b/amgprec/amg_c_jac_smoother.f90 index 70aba712..9e70fa13 100644 --- a/amgprec/amg_c_jac_smoother.f90 +++ b/amgprec/amg_c_jac_smoother.f90 @@ -203,8 +203,8 @@ module 
amg_c_jac_smoother subroutine amg_c_jac_smoother_clone_settings(sm,smout,info) import :: amg_c_jac_smoother_type, psb_spk_, & & amg_c_base_smoother_type, psb_ipk_ - class(amg_c_jac_smoother_type), intent(inout) :: sm - class(amg_c_base_smoother_type), allocatable, intent(inout) :: smout + class(amg_c_jac_smoother_type), intent(inout) :: sm + class(amg_c_base_smoother_type), intent(inout) :: smout integer(psb_ipk_), intent(out) :: info end subroutine amg_c_jac_smoother_clone_settings end interface diff --git a/amgprec/amg_d_ainv_solver.F90 b/amgprec/amg_d_ainv_solver.F90 index d091305b..7045082d 100644 --- a/amgprec/amg_d_ainv_solver.F90 +++ b/amgprec/amg_d_ainv_solver.F90 @@ -91,9 +91,9 @@ module amg_d_ainv_solver import :: psb_desc_type, psb_dspmat_type, psb_d_base_sparse_mat, & & amg_d_base_solver_type, psb_dpk_, amg_d_ainv_solver_type, psb_ipk_ Implicit None - class(amg_d_ainv_solver_type), intent(inout) :: sv - class(amg_d_base_solver_type), allocatable, intent(inout) :: svout - integer(psb_ipk_), intent(out) :: info + class(amg_d_ainv_solver_type), intent(inout) :: sv + class(amg_d_base_solver_type), intent(inout) :: svout + integer(psb_ipk_), intent(out) :: info end subroutine amg_d_ainv_solver_clone_settings end interface diff --git a/amgprec/amg_d_inner_mod.f90 b/amgprec/amg_d_inner_mod.f90 index 8fa96609..83176af7 100644 --- a/amgprec/amg_d_inner_mod.f90 +++ b/amgprec/amg_d_inner_mod.f90 @@ -109,11 +109,12 @@ module amg_d_inner_mod end interface amg_map_to_tprol abstract interface - subroutine amg_daggrmat_var_bld(a,desc_a,ilaggr,nlaggr,parms,& + subroutine amg_daggrmat_var_bld(dol1smoothing,a,desc_a,ilaggr,nlaggr,parms,& & ac,desc_ac,op_prol,op_restr,t_prol,info) import :: psb_dspmat_type, psb_desc_type, psb_dpk_, psb_ipk_, psb_lpk_, psb_ldspmat_type import :: amg_d_onelev_type, amg_dml_parms implicit none + integer(psb_ipk_), intent(in) :: dol1smoothing type(psb_dspmat_type), intent(in) :: a type(psb_desc_type), intent(inout) :: desc_a integer(psb_lpk_), 
intent(inout) :: ilaggr(:), nlaggr(:) diff --git a/amgprec/amg_d_invk_solver.f90 b/amgprec/amg_d_invk_solver.f90 index 23c04c6c..9b4356e8 100644 --- a/amgprec/amg_d_invk_solver.f90 +++ b/amgprec/amg_d_invk_solver.f90 @@ -79,9 +79,9 @@ module amg_d_invk_solver import :: psb_desc_type, psb_dspmat_type, psb_d_base_sparse_mat, & & amg_d_base_solver_type, psb_dpk_, amg_d_invk_solver_type, psb_ipk_ Implicit None - class(amg_d_invk_solver_type), intent(inout) :: sv - class(amg_d_base_solver_type), allocatable, intent(inout) :: svout - integer(psb_ipk_), intent(out) :: info + class(amg_d_invk_solver_type), intent(inout) :: sv + class(amg_d_base_solver_type), intent(inout) :: svout + integer(psb_ipk_), intent(out) :: info end subroutine amg_d_invk_solver_clone_settings end interface diff --git a/amgprec/amg_d_invt_solver.f90 b/amgprec/amg_d_invt_solver.f90 index 0ba91619..898e60eb 100644 --- a/amgprec/amg_d_invt_solver.f90 +++ b/amgprec/amg_d_invt_solver.f90 @@ -79,9 +79,9 @@ module amg_d_invt_solver import :: psb_desc_type, psb_dspmat_type, psb_d_base_sparse_mat, & & amg_d_base_solver_type, psb_dpk_, amg_d_invt_solver_type, psb_ipk_ Implicit None - class(amg_d_invt_solver_type), intent(inout) :: sv - class(amg_d_base_solver_type), allocatable, intent(inout) :: svout - integer(psb_ipk_), intent(out) :: info + class(amg_d_invt_solver_type), intent(inout) :: sv + class(amg_d_base_solver_type), intent(inout) :: svout + integer(psb_ipk_), intent(out) :: info end subroutine amg_d_invt_solver_clone_settings end interface diff --git a/amgprec/amg_d_jac_smoother.f90 b/amgprec/amg_d_jac_smoother.f90 index 8f3845a0..6ce889e8 100644 --- a/amgprec/amg_d_jac_smoother.f90 +++ b/amgprec/amg_d_jac_smoother.f90 @@ -203,8 +203,8 @@ module amg_d_jac_smoother subroutine amg_d_jac_smoother_clone_settings(sm,smout,info) import :: amg_d_jac_smoother_type, psb_dpk_, & & amg_d_base_smoother_type, psb_ipk_ - class(amg_d_jac_smoother_type), intent(inout) :: sm - class(amg_d_base_smoother_type), 
allocatable, intent(inout) :: smout + class(amg_d_jac_smoother_type), intent(inout) :: sm + class(amg_d_base_smoother_type), intent(inout) :: smout integer(psb_ipk_), intent(out) :: info end subroutine amg_d_jac_smoother_clone_settings end interface diff --git a/amgprec/amg_d_parmatch_aggregator_mod.F90 b/amgprec/amg_d_parmatch_aggregator_mod.F90 index 525bb0c3..def1fba1 100644 --- a/amgprec/amg_d_parmatch_aggregator_mod.F90 +++ b/amgprec/amg_d_parmatch_aggregator_mod.F90 @@ -244,11 +244,12 @@ module amg_d_parmatch_aggregator_mod end interface interface - subroutine amg_d_parmatch_unsmth_bld(ag,a,desc_a,ilaggr,nlaggr,parms,& + subroutine amg_d_parmatch_unsmth_bld(dol1smoothing,ag,a,desc_a,ilaggr,nlaggr,parms,& & ac,desc_ac,op_prol,op_restr,t_prol,info) import :: amg_d_parmatch_aggregator_type, psb_desc_type, psb_dspmat_type,& & psb_ldspmat_type, psb_dpk_, psb_ipk_, psb_lpk_, amg_dml_parms, amg_daggr_data implicit none + integer(psb_ipk_), intent(in) :: dol1smoothing class(amg_d_parmatch_aggregator_type), target, intent(inout) :: ag type(psb_dspmat_type), intent(in) :: a type(psb_desc_type), intent(inout) :: desc_a @@ -262,11 +263,12 @@ module amg_d_parmatch_aggregator_mod end interface interface - subroutine amg_d_parmatch_smth_bld(ag,a,desc_a,ilaggr,nlaggr,parms,& + subroutine amg_d_parmatch_smth_bld(dol1smoothing,ag,a,desc_a,ilaggr,nlaggr,parms,& & ac,desc_ac,op_prol,op_restr,t_prol,info) import :: amg_d_parmatch_aggregator_type, psb_desc_type, psb_dspmat_type,& & psb_ldspmat_type, psb_dpk_, psb_ipk_, psb_lpk_, amg_dml_parms, amg_daggr_data implicit none + integer(psb_ipk_), intent(in) :: dol1smoothing class(amg_d_parmatch_aggregator_type), target, intent(inout) :: ag type(psb_dspmat_type), intent(in) :: a type(psb_desc_type), intent(inout) :: desc_a diff --git a/amgprec/amg_d_poly_smoother.f90 b/amgprec/amg_d_poly_smoother.f90 index f8294303..4d428d22 100644 --- a/amgprec/amg_d_poly_smoother.f90 +++ b/amgprec/amg_d_poly_smoother.f90 @@ -192,8 +192,8 @@ module 
amg_d_poly_smoother subroutine amg_d_poly_smoother_clone_settings(sm,smout,info) import :: amg_d_poly_smoother_type, psb_dpk_, & & amg_d_base_smoother_type, psb_ipk_ - class(amg_d_poly_smoother_type), intent(inout) :: sm - class(amg_d_base_smoother_type), allocatable, intent(inout) :: smout + class(amg_d_poly_smoother_type), intent(inout) :: sm + class(amg_d_base_smoother_type), intent(inout) :: smout integer(psb_ipk_), intent(out) :: info end subroutine amg_d_poly_smoother_clone_settings end interface diff --git a/amgprec/amg_s_ainv_solver.F90 b/amgprec/amg_s_ainv_solver.F90 index 654ffcd1..665a868f 100644 --- a/amgprec/amg_s_ainv_solver.F90 +++ b/amgprec/amg_s_ainv_solver.F90 @@ -91,9 +91,9 @@ module amg_s_ainv_solver import :: psb_desc_type, psb_sspmat_type, psb_s_base_sparse_mat, & & amg_s_base_solver_type, psb_dpk_, amg_s_ainv_solver_type, psb_ipk_ Implicit None - class(amg_s_ainv_solver_type), intent(inout) :: sv - class(amg_s_base_solver_type), allocatable, intent(inout) :: svout - integer(psb_ipk_), intent(out) :: info + class(amg_s_ainv_solver_type), intent(inout) :: sv + class(amg_s_base_solver_type), intent(inout) :: svout + integer(psb_ipk_), intent(out) :: info end subroutine amg_s_ainv_solver_clone_settings end interface diff --git a/amgprec/amg_s_inner_mod.f90 b/amgprec/amg_s_inner_mod.f90 index 85ed1089..be883b21 100644 --- a/amgprec/amg_s_inner_mod.f90 +++ b/amgprec/amg_s_inner_mod.f90 @@ -109,11 +109,12 @@ module amg_s_inner_mod end interface amg_map_to_tprol abstract interface - subroutine amg_saggrmat_var_bld(a,desc_a,ilaggr,nlaggr,parms,& + subroutine amg_saggrmat_var_bld(dol1smoothing,a,desc_a,ilaggr,nlaggr,parms,& & ac,desc_ac,op_prol,op_restr,t_prol,info) import :: psb_sspmat_type, psb_desc_type, psb_spk_, psb_ipk_, psb_lpk_, psb_lsspmat_type import :: amg_s_onelev_type, amg_sml_parms implicit none + integer(psb_ipk_), intent(in) :: dol1smoothing type(psb_sspmat_type), intent(in) :: a type(psb_desc_type), intent(inout) :: desc_a 
integer(psb_lpk_), intent(inout) :: ilaggr(:), nlaggr(:) diff --git a/amgprec/amg_s_invk_solver.f90 b/amgprec/amg_s_invk_solver.f90 index bb663a5f..9de4d9cf 100644 --- a/amgprec/amg_s_invk_solver.f90 +++ b/amgprec/amg_s_invk_solver.f90 @@ -79,9 +79,9 @@ module amg_s_invk_solver import :: psb_desc_type, psb_sspmat_type, psb_s_base_sparse_mat, & & amg_s_base_solver_type, psb_spk_, amg_s_invk_solver_type, psb_ipk_ Implicit None - class(amg_s_invk_solver_type), intent(inout) :: sv - class(amg_s_base_solver_type), allocatable, intent(inout) :: svout - integer(psb_ipk_), intent(out) :: info + class(amg_s_invk_solver_type), intent(inout) :: sv + class(amg_s_base_solver_type), intent(inout) :: svout + integer(psb_ipk_), intent(out) :: info end subroutine amg_s_invk_solver_clone_settings end interface diff --git a/amgprec/amg_s_invt_solver.f90 b/amgprec/amg_s_invt_solver.f90 index d4ff72bc..efa181f4 100644 --- a/amgprec/amg_s_invt_solver.f90 +++ b/amgprec/amg_s_invt_solver.f90 @@ -79,9 +79,9 @@ module amg_s_invt_solver import :: psb_desc_type, psb_sspmat_type, psb_s_base_sparse_mat, & & amg_s_base_solver_type, psb_spk_, amg_s_invt_solver_type, psb_ipk_ Implicit None - class(amg_s_invt_solver_type), intent(inout) :: sv - class(amg_s_base_solver_type), allocatable, intent(inout) :: svout - integer(psb_ipk_), intent(out) :: info + class(amg_s_invt_solver_type), intent(inout) :: sv + class(amg_s_base_solver_type), intent(inout) :: svout + integer(psb_ipk_), intent(out) :: info end subroutine amg_s_invt_solver_clone_settings end interface diff --git a/amgprec/amg_s_jac_smoother.f90 b/amgprec/amg_s_jac_smoother.f90 index 6d4ded83..24771855 100644 --- a/amgprec/amg_s_jac_smoother.f90 +++ b/amgprec/amg_s_jac_smoother.f90 @@ -203,8 +203,8 @@ module amg_s_jac_smoother subroutine amg_s_jac_smoother_clone_settings(sm,smout,info) import :: amg_s_jac_smoother_type, psb_spk_, & & amg_s_base_smoother_type, psb_ipk_ - class(amg_s_jac_smoother_type), intent(inout) :: sm - 
class(amg_s_base_smoother_type), allocatable, intent(inout) :: smout + class(amg_s_jac_smoother_type), intent(inout) :: sm + class(amg_s_base_smoother_type), intent(inout) :: smout integer(psb_ipk_), intent(out) :: info end subroutine amg_s_jac_smoother_clone_settings end interface diff --git a/amgprec/amg_s_parmatch_aggregator_mod.F90 b/amgprec/amg_s_parmatch_aggregator_mod.F90 index d58bd750..059c5b73 100644 --- a/amgprec/amg_s_parmatch_aggregator_mod.F90 +++ b/amgprec/amg_s_parmatch_aggregator_mod.F90 @@ -244,11 +244,12 @@ module amg_s_parmatch_aggregator_mod end interface interface - subroutine amg_s_parmatch_unsmth_bld(ag,a,desc_a,ilaggr,nlaggr,parms,& + subroutine amg_s_parmatch_unsmth_bld(dol1smoothing,ag,a,desc_a,ilaggr,nlaggr,parms,& & ac,desc_ac,op_prol,op_restr,t_prol,info) import :: amg_s_parmatch_aggregator_type, psb_desc_type, psb_sspmat_type,& & psb_lsspmat_type, psb_dpk_, psb_ipk_, psb_lpk_, amg_sml_parms, amg_saggr_data implicit none + integer(psb_ipk_), intent(in) :: dol1smoothing class(amg_s_parmatch_aggregator_type), target, intent(inout) :: ag type(psb_sspmat_type), intent(in) :: a type(psb_desc_type), intent(inout) :: desc_a @@ -262,11 +263,12 @@ module amg_s_parmatch_aggregator_mod end interface interface - subroutine amg_s_parmatch_smth_bld(ag,a,desc_a,ilaggr,nlaggr,parms,& + subroutine amg_s_parmatch_smth_bld(dol1smoothing,ag,a,desc_a,ilaggr,nlaggr,parms,& & ac,desc_ac,op_prol,op_restr,t_prol,info) import :: amg_s_parmatch_aggregator_type, psb_desc_type, psb_sspmat_type,& & psb_lsspmat_type, psb_dpk_, psb_ipk_, psb_lpk_, amg_sml_parms, amg_saggr_data implicit none + integer(psb_ipk_), intent(in) :: dol1smoothing class(amg_s_parmatch_aggregator_type), target, intent(inout) :: ag type(psb_sspmat_type), intent(in) :: a type(psb_desc_type), intent(inout) :: desc_a diff --git a/amgprec/amg_s_poly_smoother.f90 b/amgprec/amg_s_poly_smoother.f90 index 95052ed2..27797993 100644 --- a/amgprec/amg_s_poly_smoother.f90 +++ 
b/amgprec/amg_s_poly_smoother.f90 @@ -192,8 +192,8 @@ module amg_s_poly_smoother subroutine amg_s_poly_smoother_clone_settings(sm,smout,info) import :: amg_s_poly_smoother_type, psb_spk_, & & amg_s_base_smoother_type, psb_ipk_ - class(amg_s_poly_smoother_type), intent(inout) :: sm - class(amg_s_base_smoother_type), allocatable, intent(inout) :: smout + class(amg_s_poly_smoother_type), intent(inout) :: sm + class(amg_s_base_smoother_type), intent(inout) :: smout integer(psb_ipk_), intent(out) :: info end subroutine amg_s_poly_smoother_clone_settings end interface diff --git a/amgprec/amg_z_ainv_solver.F90 b/amgprec/amg_z_ainv_solver.F90 index e88d83f3..b0634900 100644 --- a/amgprec/amg_z_ainv_solver.F90 +++ b/amgprec/amg_z_ainv_solver.F90 @@ -91,9 +91,9 @@ module amg_z_ainv_solver import :: psb_desc_type, psb_zspmat_type, psb_z_base_sparse_mat, & & amg_z_base_solver_type, psb_dpk_, amg_z_ainv_solver_type, psb_ipk_ Implicit None - class(amg_z_ainv_solver_type), intent(inout) :: sv - class(amg_z_base_solver_type), allocatable, intent(inout) :: svout - integer(psb_ipk_), intent(out) :: info + class(amg_z_ainv_solver_type), intent(inout) :: sv + class(amg_z_base_solver_type), intent(inout) :: svout + integer(psb_ipk_), intent(out) :: info end subroutine amg_z_ainv_solver_clone_settings end interface diff --git a/amgprec/amg_z_inner_mod.f90 b/amgprec/amg_z_inner_mod.f90 index bf997651..fb7139fd 100644 --- a/amgprec/amg_z_inner_mod.f90 +++ b/amgprec/amg_z_inner_mod.f90 @@ -109,11 +109,12 @@ module amg_z_inner_mod end interface amg_map_to_tprol abstract interface - subroutine amg_zaggrmat_var_bld(a,desc_a,ilaggr,nlaggr,parms,& + subroutine amg_zaggrmat_var_bld(dol1smoothing,a,desc_a,ilaggr,nlaggr,parms,& & ac,desc_ac,op_prol,op_restr,t_prol,info) import :: psb_zspmat_type, psb_desc_type, psb_dpk_, psb_ipk_, psb_lpk_, psb_lzspmat_type import :: amg_z_onelev_type, amg_dml_parms implicit none + integer(psb_ipk_), intent(in) :: dol1smoothing type(psb_zspmat_type), intent(in) 
:: a type(psb_desc_type), intent(inout) :: desc_a integer(psb_lpk_), intent(inout) :: ilaggr(:), nlaggr(:) diff --git a/amgprec/amg_z_invk_solver.f90 b/amgprec/amg_z_invk_solver.f90 index db5d84e2..b5e38595 100644 --- a/amgprec/amg_z_invk_solver.f90 +++ b/amgprec/amg_z_invk_solver.f90 @@ -79,9 +79,9 @@ module amg_z_invk_solver import :: psb_desc_type, psb_zspmat_type, psb_z_base_sparse_mat, & & amg_z_base_solver_type, psb_dpk_, amg_z_invk_solver_type, psb_ipk_ Implicit None - class(amg_z_invk_solver_type), intent(inout) :: sv - class(amg_z_base_solver_type), allocatable, intent(inout) :: svout - integer(psb_ipk_), intent(out) :: info + class(amg_z_invk_solver_type), intent(inout) :: sv + class(amg_z_base_solver_type), intent(inout) :: svout + integer(psb_ipk_), intent(out) :: info end subroutine amg_z_invk_solver_clone_settings end interface diff --git a/amgprec/amg_z_invt_solver.f90 b/amgprec/amg_z_invt_solver.f90 index 45d59f8e..96ddd6b0 100644 --- a/amgprec/amg_z_invt_solver.f90 +++ b/amgprec/amg_z_invt_solver.f90 @@ -79,9 +79,9 @@ module amg_z_invt_solver import :: psb_desc_type, psb_zspmat_type, psb_z_base_sparse_mat, & & amg_z_base_solver_type, psb_dpk_, amg_z_invt_solver_type, psb_ipk_ Implicit None - class(amg_z_invt_solver_type), intent(inout) :: sv - class(amg_z_base_solver_type), allocatable, intent(inout) :: svout - integer(psb_ipk_), intent(out) :: info + class(amg_z_invt_solver_type), intent(inout) :: sv + class(amg_z_base_solver_type), intent(inout) :: svout + integer(psb_ipk_), intent(out) :: info end subroutine amg_z_invt_solver_clone_settings end interface diff --git a/amgprec/amg_z_jac_smoother.f90 b/amgprec/amg_z_jac_smoother.f90 index bfe83949..afcf25eb 100644 --- a/amgprec/amg_z_jac_smoother.f90 +++ b/amgprec/amg_z_jac_smoother.f90 @@ -203,8 +203,8 @@ module amg_z_jac_smoother subroutine amg_z_jac_smoother_clone_settings(sm,smout,info) import :: amg_z_jac_smoother_type, psb_dpk_, & & amg_z_base_smoother_type, psb_ipk_ - 
class(amg_z_jac_smoother_type), intent(inout) :: sm - class(amg_z_base_smoother_type), allocatable, intent(inout) :: smout + class(amg_z_jac_smoother_type), intent(inout) :: sm + class(amg_z_base_smoother_type), intent(inout) :: smout integer(psb_ipk_), intent(out) :: info end subroutine amg_z_jac_smoother_clone_settings end interface diff --git a/amgprec/impl/aggregator/amg_c_dec_aggregator_mat_bld.f90 b/amgprec/impl/aggregator/amg_c_dec_aggregator_mat_bld.f90 index 2c9317d1..186e7a1e 100644 --- a/amgprec/impl/aggregator/amg_c_dec_aggregator_mat_bld.f90 +++ b/amgprec/impl/aggregator/amg_c_dec_aggregator_mat_bld.f90 @@ -177,23 +177,24 @@ subroutine amg_c_dec_aggregator_mat_bld(ag,parms,a,desc_a,ilaggr,nlaggr,& select case (parms%aggr_prol) case (amg_no_smooth_) - call amg_caggrmat_nosmth_bld(a,desc_a,ilaggr,nlaggr,& - & parms,ac,desc_ac,op_prol,op_restr,t_prol,info) + call amg_caggrmat_nosmth_bld(parms%aggr_prol,a,desc_a,ilaggr,& + nlaggr,parms,ac,desc_ac,op_prol,op_restr,t_prol,info) - case(amg_smooth_prol_) + case(amg_smooth_prol_,amg_l1_smooth_prol_) - call amg_caggrmat_smth_bld(a,desc_a,ilaggr,nlaggr, & - & parms,ac,desc_ac,op_prol,op_restr,t_prol,info) + call amg_caggrmat_smth_bld(parms%aggr_prol,a,desc_a,& + ilaggr,nlaggr,parms,ac,desc_ac,op_prol,& + op_restr,t_prol,info) !!$ case(amg_biz_prol_) !!$ !!$ call amg_caggrmat_biz_bld(a,desc_a,ilaggr,nlaggr, & !!$ & parms,ac,desc_ac,op_prol,op_restr,t_prol,info) - + case(amg_min_energy_) - call amg_caggrmat_minnrg_bld(a,desc_a,ilaggr,nlaggr, & - & parms,ac,desc_ac,op_prol,op_restr,t_prol,info) + call amg_caggrmat_minnrg_bld(parms%aggr_prol,a,desc_a,ilaggr,& + nlaggr,parms,ac,desc_ac,op_prol,op_restr,t_prol,info) case default info = psb_err_internal_error_ diff --git a/amgprec/impl/aggregator/amg_c_ptap_bld.f90 b/amgprec/impl/aggregator/amg_c_ptap_bld.f90 index 02dcb1f4..7883bde5 100644 --- a/amgprec/impl/aggregator/amg_c_ptap_bld.f90 +++ b/amgprec/impl/aggregator/amg_c_ptap_bld.f90 @@ -216,6 +216,7 @@ subroutine 
amg_c_ptap_bld(a_csr,desc_a,nlaggr,parms,ac,& call ac_csr%set_nrows(desc_ac%get_local_rows()) call ac_csr%set_ncols(desc_ac%get_local_cols()) + call ac_csr%clean_zeros(info) call ac%mv_from(ac_csr) call ac%set_asb() @@ -420,6 +421,7 @@ subroutine amg_c_lc_ptap_bld(a_csr,desc_a,nlaggr,parms,ac,& call ac_csr%set_nrows(desc_ac%get_local_rows()) call ac_csr%set_ncols(desc_ac%get_local_cols()) + call ac_csr%clean_zeros(info) call ac%mv_from(ac_csr) call ac%set_asb() @@ -629,6 +631,7 @@ subroutine amg_lc_ptap_bld(a_csr,desc_a,nlaggr,parms,ac,& call ac_csr%set_nrows(desc_ac%get_local_rows()) call ac_csr%set_ncols(desc_ac%get_local_cols()) + call ac_csr%clean_zeros(info) call ac%mv_from(ac_csr) call ac%set_asb() diff --git a/amgprec/impl/aggregator/amg_c_rap.f90 b/amgprec/impl/aggregator/amg_c_rap.f90 index 9d3549cf..287f3a32 100644 --- a/amgprec/impl/aggregator/amg_c_rap.f90 +++ b/amgprec/impl/aggregator/amg_c_rap.f90 @@ -142,6 +142,7 @@ subroutine amg_c_rap(a_csr,desc_a,nlaggr,parms,ac,& call ac_csr%set_nrows(desc_ac%get_local_rows()) call ac_csr%set_ncols(desc_ac%get_local_cols()) + call ac_csr%clean_zeros(info) call ac%mv_from(ac_csr) call ac%set_asb() diff --git a/amgprec/impl/aggregator/amg_c_soc1_map_bld.F90 b/amgprec/impl/aggregator/amg_c_soc1_map_bld.F90 index 24720675..36c4311c 100644 --- a/amgprec/impl/aggregator/amg_c_soc1_map_bld.F90 +++ b/amgprec/impl/aggregator/amg_c_soc1_map_bld.F90 @@ -250,7 +250,7 @@ subroutine amg_c_soc1_map_bld(iorder,theta,clean_zeros,a,desc_a,nlaggr,ilaggr,in ! we will not reset. 
if (j>nr) cycle step1 if (ilaggr(j) > 0) cycle step1 - if (abs(val(k)) > theta*sqrt(abs(diag(i)*diag(j)))) then + if ((abs(val(k)) > theta*sqrt(abs(diag(i)*diag(j)))).and.(diag(i).ne.czero)) then ip = ip + 1 icol(ip) = icol(k) end if @@ -357,7 +357,7 @@ subroutine amg_c_soc1_map_bld(iorder,theta,clean_zeros,a,desc_a,nlaggr,ilaggr,in do k=1, nz j = icol(k) if ((1<=j).and.(j<=nr)) then - if (abs(val(k)) > theta*sqrt(abs(diag(i)*diag(j)))) then + if ((abs(val(k)) > theta*sqrt(abs(diag(i)*diag(j)))).and.(diag(i).ne.czero)) then ip = ip + 1 icol(ip) = icol(k) end if @@ -545,4 +545,3 @@ subroutine amg_c_soc1_map_bld(iorder,theta,clean_zeros,a,desc_a,nlaggr,ilaggr,in return end subroutine amg_c_soc1_map_bld - diff --git a/amgprec/impl/aggregator/amg_caggrmat_minnrg_bld.f90 b/amgprec/impl/aggregator/amg_caggrmat_minnrg_bld.f90 index 8ac86dc6..3d223279 100644 --- a/amgprec/impl/aggregator/amg_caggrmat_minnrg_bld.f90 +++ b/amgprec/impl/aggregator/amg_caggrmat_minnrg_bld.f90 @@ -69,6 +69,7 @@ ! ! ! Arguments: +! dol1smoothing - fictitious integer argument, it is not used inside ! a - type(psb_cspmat_type), input. ! The sparse matrix structure containing the local part of ! the fine-level matrix. @@ -104,8 +105,8 @@ ! Error code. ! ! -subroutine amg_caggrmat_minnrg_bld(a,desc_a,ilaggr,nlaggr,parms,& - & ac,desc_ac,op_prol,op_restr,t_prol,info) +subroutine amg_caggrmat_minnrg_bld(dol1smoothing,a,desc_a,ilaggr,nlaggr,& + parms,ac,desc_ac,op_prol,op_restr,t_prol,info) use psb_base_mod use amg_base_prec_type use amg_c_inner_mod, amg_protect_name => amg_caggrmat_minnrg_bld @@ -113,6 +114,7 @@ subroutine amg_caggrmat_minnrg_bld(a,desc_a,ilaggr,nlaggr,parms,& implicit none ! 
Arguments + integer(psb_ipk_), intent(in) :: dol1smoothing type(psb_cspmat_type), intent(in) :: a type(psb_desc_type), intent(inout) :: desc_a integer(psb_lpk_), intent(inout) :: ilaggr(:), nlaggr(:) @@ -171,6 +173,13 @@ subroutine amg_caggrmat_minnrg_bld(a,desc_a,ilaggr,nlaggr,parms,& filter_mat = (parms%aggr_filter == amg_filter_mat_) + if (dol1smoothing.ne.amg_no_smooth_) then + info=psb_err_fatal_; + call psb_errpush(info,name,a_err='Are you trying to smooth an unsmoothed aggregation?') + goto 9999 + end if + + !NEEDS TO BE REWORKED !! ! naggr: number of local aggregates diff --git a/amgprec/impl/aggregator/amg_caggrmat_nosmth_bld.f90 b/amgprec/impl/aggregator/amg_caggrmat_nosmth_bld.f90 index 87c79dc6..9699545e 100644 --- a/amgprec/impl/aggregator/amg_caggrmat_nosmth_bld.f90 +++ b/amgprec/impl/aggregator/amg_caggrmat_nosmth_bld.f90 @@ -94,10 +94,11 @@ ! ! info - integer, output. ! Error code. +! dol1smoothing - optional, this is here just for interfacing reasons. It is not used by the +! code ! -! -subroutine amg_caggrmat_nosmth_bld(a,desc_a,ilaggr,nlaggr,parms,& - & ac,desc_ac,op_prol,op_restr,t_prol,info) +subroutine amg_caggrmat_nosmth_bld(dol1smoothing,a,desc_a,ilaggr,nlaggr,& + parms,ac,desc_ac,op_prol,op_restr,t_prol,info) use psb_base_mod use amg_base_prec_type use amg_c_inner_mod, amg_protect_name => amg_caggrmat_nosmth_bld @@ -105,6 +106,7 @@ subroutine amg_caggrmat_nosmth_bld(a,desc_a,ilaggr,nlaggr,parms,& implicit none ! 
Arguments + integer(psb_ipk_), intent(in) :: dol1smoothing type(psb_cspmat_type), intent(in) :: a type(psb_desc_type), intent(inout) :: desc_a integer(psb_lpk_), intent(inout) :: ilaggr(:), nlaggr(:) @@ -137,6 +139,12 @@ subroutine amg_caggrmat_nosmth_bld(a,desc_a,ilaggr,nlaggr,parms,& ctxt = desc_a%get_context() call psb_info(ctxt, me, np) + if (dol1smoothing.ne.amg_no_smooth_) then + info=psb_err_fatal_; + call psb_errpush(info,name,a_err='Are you trying to smooth an unsmoothed aggregation?') + goto 9999 + end if + nglob = desc_a%get_global_rows() nrow = desc_a%get_local_rows() ncol = desc_a%get_local_cols() diff --git a/amgprec/impl/aggregator/amg_caggrmat_smth_bld.f90 b/amgprec/impl/aggregator/amg_caggrmat_smth_bld.f90 index 67fce476..209e4570 100644 --- a/amgprec/impl/aggregator/amg_caggrmat_smth_bld.f90 +++ b/amgprec/impl/aggregator/amg_caggrmat_smth_bld.f90 @@ -69,6 +69,8 @@ ! ! ! Arguments: +! dol1smooth - Integer taking the type of smoother that has to be used +! on the tentative prolongator ! a - type(psb_cspmat_type), input. ! The sparse matrix structure containing the local part of ! the fine-level matrix. @@ -102,16 +104,18 @@ ! info - integer, output. ! Error code. ! -subroutine amg_caggrmat_smth_bld(a,desc_a,ilaggr,nlaggr,parms,& - & ac,desc_ac,op_prol,op_restr,t_prol,info) +subroutine amg_caggrmat_smth_bld(dol1smoothing,a,desc_a,ilaggr,nlaggr,& + parms,ac,desc_ac,op_prol,op_restr,t_prol,info) use psb_base_mod use amg_base_prec_type use amg_c_inner_mod, amg_protect_name => amg_caggrmat_smth_bld use amg_c_base_aggregator_mod +! use, intrinsic :: ieee_arithmetic implicit none ! 
Arguments + integer(psb_ipk_), intent(in) :: dol1smoothing type(psb_cspmat_type), intent(in) :: a type(psb_desc_type), intent(inout) :: desc_a integer(psb_lpk_), intent(inout) :: ilaggr(:), nlaggr(:) @@ -132,7 +136,7 @@ subroutine amg_caggrmat_smth_bld(a,desc_a,ilaggr,nlaggr,parms,& type(psb_c_coo_sparse_mat) :: coo_prol, coo_restr type(psb_c_csr_sparse_mat) :: acsr1, acsrf, csr_prol, acsr complex(psb_spk_), allocatable :: adiag(:) - real(psb_spk_), allocatable :: arwsum(:) + real(psb_spk_), allocatable :: arwsum(:),l1rwsum(:) integer(psb_ipk_) :: ierr(5) logical :: filter_mat integer(psb_ipk_) :: debug_level, debug_unit, err_act @@ -141,6 +145,7 @@ subroutine amg_caggrmat_smth_bld(a,desc_a,ilaggr,nlaggr,parms,& logical, parameter :: debug_new=.false. character(len=80) :: filename logical, parameter :: do_timings=.false. + logical :: do_l1correction=.false. integer(psb_ipk_), save :: idx_spspmm=-1, idx_phase1=-1, idx_gtrans=-1, idx_phase2=-1, idx_refine=-1 integer(psb_ipk_), save :: idx_phase3=-1, idx_cdasb=-1, idx_ptap=-1 @@ -173,6 +178,9 @@ subroutine amg_caggrmat_smth_bld(a,desc_a,ilaggr,nlaggr,parms,& if ((do_timings).and.(idx_ptap==-1)) & & idx_ptap = psb_get_timer_idx("DEC_SMTH_BLD: ptap_bld ") + ! check if we have to use Jacobi or l1-Jacobi to smooth the tentative prolongator + if (dol1smoothing.eq.amg_l1_smooth_prol_) do_l1correction=.true. + nglob = desc_a%get_global_rows() nrow = desc_a%get_local_rows() @@ -185,7 +193,7 @@ subroutine amg_caggrmat_smth_bld(a,desc_a,ilaggr,nlaggr,parms,& naggrm1 = sum(nlaggr(1:me)) naggrp1 = sum(nlaggr(1:me+1)) - filter_mat = (parms%aggr_filter == amg_filter_mat_) + filter_mat = (parms%aggr_filter == amg_filter_mat_).or.(parms%aggr_filter == amg_filter_prow_mat_) ! ! naggr: number of local aggregates @@ -200,6 +208,24 @@ subroutine amg_caggrmat_smth_bld(a,desc_a,ilaggr,nlaggr,parms,& if (info == psb_success_) & & call psb_halo(adiag,desc_a,info) if (info == psb_success_) call a%cp_to(acsr) + ! + ! 
Do the l1-correction on the diagonal if it is requested + ! + if (do_l1correction) then + allocate(l1rwsum(nrow)) + call acsr%arwsum(l1rwsum) + if (info == psb_success_) & + & call psb_realloc(ncol,l1rwsum,info) + if (info == psb_success_) & + & call psb_halo(l1rwsum,desc_a,info) + ! \tilde{D}_{i,i} = a_{i,i} + \sum_{j \ne i} |a_{i,j}| + !$OMP parallel do private(i) schedule(static) + do i=1,size(adiag) + adiag(i) = adiag(i) + l1rwsum(i) - abs(adiag(i)) + end do + !$OMP end parallel do + end if + if(info /= psb_success_) then call psb_errpush(psb_err_from_subroutine_,name,a_err='sp_getdiag') @@ -230,9 +256,15 @@ subroutine amg_caggrmat_smth_bld(a,desc_a,ilaggr,nlaggr,parms,& enddo if (jd == -1) then - write(0,*) name,': Warning: there is no diagonal element', i - else + ! if (.not.do_l1correction) + write(0,*) 'Wrong input: we need the diagonal!!!!', i + else if (parms%aggr_filter == amg_filter_mat_) then + ! We perform filtering in the standard way assuming that A is an M-matrix acsrf%val(jd)=acsrf%val(jd)-tmp + else if (parms%aggr_filter == amg_filter_prow_mat_) then + ! We are probably doing l1-correction, hence we want to preserve the + ! row sum of the matrix: note the change in sign + acsrf%val(jd)=acsrf%val(jd)+tmp end if enddo !$OMP end parallel do @@ -240,7 +272,6 @@ subroutine amg_caggrmat_smth_bld(a,desc_a,ilaggr,nlaggr,parms,& call acsrf%clean_zeros(info) end if - !$OMP parallel do private(i) schedule(static) do i=1,size(adiag) if (adiag(i) /= czero) then @@ -252,14 +283,17 @@ subroutine amg_caggrmat_smth_bld(a,desc_a,ilaggr,nlaggr,parms,& !$OMP end parallel do if (parms%aggr_omega_alg == amg_eig_est_) then - if (parms%aggr_eig == amg_max_norm_) then + if ( (parms%aggr_filter == amg_filter_prow_mat_).and.(do_l1correction) ) then + ! For l1-Jacobi this can be estimated with 1: + ! this makes sense only if we are preserving the row-sum!
+ parms%aggr_omega_val = done + else if (parms%aggr_eig == amg_max_norm_) then allocate(arwsum(nrow)) call acsr%arwsum(arwsum) anorm = maxval(abs(adiag(1:nrow)*arwsum(1:nrow))) call psb_amx(ctxt,anorm) omega = 4.d0/(3.d0*anorm) parms%aggr_omega_val = omega - else info = psb_err_internal_error_ call psb_errpush(info,name,a_err='invalid amg_aggr_eig_') @@ -322,6 +356,7 @@ subroutine amg_caggrmat_smth_bld(a,desc_a,ilaggr,nlaggr,parms,& if (debug_level >= psb_debug_outer_) & & write(debug_unit,*) me,' ',trim(name),& & 'Done smooth_aggregate ' + if (allocated(l1rwsum)) deallocate(l1rwsum) call psb_erractionrestore(err_act) return diff --git a/amgprec/impl/aggregator/amg_d_dec_aggregator_mat_bld.f90 b/amgprec/impl/aggregator/amg_d_dec_aggregator_mat_bld.f90 index 7b01a0b8..65bd08e8 100644 --- a/amgprec/impl/aggregator/amg_d_dec_aggregator_mat_bld.f90 +++ b/amgprec/impl/aggregator/amg_d_dec_aggregator_mat_bld.f90 @@ -177,23 +177,24 @@ subroutine amg_d_dec_aggregator_mat_bld(ag,parms,a,desc_a,ilaggr,nlaggr,& select case (parms%aggr_prol) case (amg_no_smooth_) - call amg_daggrmat_nosmth_bld(a,desc_a,ilaggr,nlaggr,& - & parms,ac,desc_ac,op_prol,op_restr,t_prol,info) + call amg_daggrmat_nosmth_bld(parms%aggr_prol,a,desc_a,ilaggr,& + nlaggr,parms,ac,desc_ac,op_prol,op_restr,t_prol,info) - case(amg_smooth_prol_) + case(amg_smooth_prol_,amg_l1_smooth_prol_) - call amg_daggrmat_smth_bld(a,desc_a,ilaggr,nlaggr, & - & parms,ac,desc_ac,op_prol,op_restr,t_prol,info) + call amg_daggrmat_smth_bld(parms%aggr_prol,a,desc_a,& + ilaggr,nlaggr,parms,ac,desc_ac,op_prol,& + op_restr,t_prol,info) !!$ case(amg_biz_prol_) !!$ !!$ call amg_daggrmat_biz_bld(a,desc_a,ilaggr,nlaggr, & !!$ & parms,ac,desc_ac,op_prol,op_restr,t_prol,info) - + case(amg_min_energy_) - call amg_daggrmat_minnrg_bld(a,desc_a,ilaggr,nlaggr, & - & parms,ac,desc_ac,op_prol,op_restr,t_prol,info) + call amg_daggrmat_minnrg_bld(parms%aggr_prol,a,desc_a,ilaggr,& + nlaggr,parms,ac,desc_ac,op_prol,op_restr,t_prol,info) case default 
info = psb_err_internal_error_ diff --git a/amgprec/impl/aggregator/amg_d_parmatch_aggregator_mat_bld.F90 b/amgprec/impl/aggregator/amg_d_parmatch_aggregator_mat_bld.F90 index 9b1171e0..1766d4a4 100644 --- a/amgprec/impl/aggregator/amg_d_parmatch_aggregator_mat_bld.F90 +++ b/amgprec/impl/aggregator/amg_d_parmatch_aggregator_mat_bld.F90 @@ -184,20 +184,23 @@ subroutine amg_d_parmatch_aggregator_mat_bld(ag,parms,a,desc_a,ilaggr,nlaggr,& ! select case (parms%aggr_prol) case (amg_no_smooth_) - call amg_d_parmatch_unsmth_bld(ag,a,desc_a,ilaggr,nlaggr,parms,& - & ac,desc_ac,op_prol,op_restr,t_prol,info) + call amg_d_parmatch_unsmth_bld(parms%aggr_prol,ag,a,desc_a,& + ilaggr,nlaggr,parms,ac,desc_ac,op_prol,op_restr,& + t_prol,info) - case(amg_smooth_prol_) - call amg_d_parmatch_smth_bld(ag,a,desc_a,ilaggr,nlaggr,parms,& - & ac,desc_ac,op_prol,op_restr,t_prol,info) + case(amg_smooth_prol_,amg_l1_smooth_prol_) + call amg_d_parmatch_smth_bld(parms%aggr_prol,ag,a,desc_a,& + ilaggr,nlaggr,parms,ac,desc_ac,op_prol,op_restr,& + t_prol,info) !!$ case(amg_biz_prol_) !!$ call amg_daggrmat_biz_bld(a,desc_a,ilaggr,nlaggr, & !!$ & parms,ac,desc_ac,op_prol,op_restr,info) case(amg_min_energy_) - call amg_daggrmat_minnrg_bld(a,desc_a,ilaggr,nlaggr, & - & parms,ac,desc_ac,op_prol,op_restr,t_prol,info) + call amg_daggrmat_minnrg_bld(parms%aggr_prol,a,desc_a,& + ilaggr,nlaggr,parms,ac,desc_ac,op_prol,op_restr,& + t_prol,info) case default info = psb_err_internal_error_ diff --git a/amgprec/impl/aggregator/amg_d_parmatch_smth_bld.F90 b/amgprec/impl/aggregator/amg_d_parmatch_smth_bld.F90 index b20b2fa8..23cd1459 100644 --- a/amgprec/impl/aggregator/amg_d_parmatch_smth_bld.F90 +++ b/amgprec/impl/aggregator/amg_d_parmatch_smth_bld.F90 @@ -69,6 +69,8 @@ ! ! ! Arguments: +! dol1smoothing - Select between l1-Jacobi and Jacobi as smoother for the +! tentative prolongator ! a - type(psb_dspmat_type), input. ! The sparse matrix structure containing the local part of ! the fine-level matrix. 
@@ -102,8 +104,8 @@ ! info - integer, output. ! Error code. ! -subroutine amg_d_parmatch_smth_bld(ag,a,desc_a,ilaggr,nlaggr,parms,& - & ac,desc_ac,op_prol,op_restr,t_prol,info) +subroutine amg_d_parmatch_smth_bld(dol1smoothing,ag,a,desc_a,ilaggr,nlaggr,& + parms,ac,desc_ac,op_prol,op_restr,t_prol,info) use psb_base_mod use amg_base_prec_type use amg_d_inner_mod @@ -116,6 +118,7 @@ subroutine amg_d_parmatch_smth_bld(ag,a,desc_a,ilaggr,nlaggr,parms,& implicit none ! Arguments + integer(psb_ipk_), intent(in) :: dol1smoothing class(amg_d_parmatch_aggregator_type), target, intent(inout) :: ag type(psb_dspmat_type), intent(in) :: a type(psb_desc_type), intent(inout) :: desc_a @@ -137,7 +140,7 @@ subroutine amg_d_parmatch_smth_bld(ag,a,desc_a,ilaggr,nlaggr,parms,& type(psb_d_coo_sparse_mat) :: coo_prol, coo_restr type(psb_d_csr_sparse_mat) :: acsrf, csr_prol, acsr, tcsr real(psb_dpk_), allocatable :: adiag(:) - real(psb_dpk_), allocatable :: arwsum(:) + real(psb_dpk_), allocatable :: arwsum(:),l1rwsum(:) logical :: filter_mat integer(psb_ipk_) :: debug_level, debug_unit, err_act integer(psb_ipk_), parameter :: ncmax=16 @@ -145,6 +148,7 @@ subroutine amg_d_parmatch_smth_bld(ag,a,desc_a,ilaggr,nlaggr,parms,& logical, parameter :: debug_new=.false., dump_r=.false., dump_p=.false., debug=.false. character(len=80) :: filename logical, parameter :: do_timings=.false. + logical :: do_l1correction=.false. integer(psb_ipk_), save :: idx_spspmm=-1, idx_phase1=-1, idx_gtrans=-1, idx_phase2=-1, idx_refine=-1, idx_phase3=-1 integer(psb_ipk_), save :: idx_cdasb=-1, idx_ptap=-1 @@ -166,6 +170,10 @@ subroutine amg_d_parmatch_smth_bld(ag,a,desc_a,ilaggr,nlaggr,parms,& ncol = desc_a%get_local_cols() theta = parms%aggr_thresh + ! Check if we have to perform l1-Jacobi or Jacobi as smoother + if(dol1smoothing.eq.amg_l1_smooth_prol_) do_l1correction=.true. 
+ + !write(0,*) me,' ',trim(name),' Start ',idx_spspmm if ((do_timings).and.(idx_spspmm==-1)) & & idx_spspmm = psb_get_timer_idx("PMC_SMTH_BLD: par_spspmm") @@ -217,6 +225,19 @@ subroutine amg_d_parmatch_smth_bld(ag,a,desc_a,ilaggr,nlaggr,parms,& if (info == psb_success_) & & call psb_halo(adiag,desc_a,info) if (info == psb_success_) call a%cp_to(acsr) + ! Get the l1-diagonal of D + if (do_l1correction) then + allocate(l1rwsum(nrow)) + call acsr%arwsum(l1rwsum) + if (info == psb_success_) & + & call psb_realloc(ncol,l1rwsum,info) + if (info == psb_success_) & + & call psb_halo(l1rwsum,desc_a,info) + ! \tilde{D}_{i,i} = a_{i,i} + \sum_{j \ne i} |a_{i,j}| + do i=1,size(adiag) + adiag(i) = adiag(i) + l1rwsum(i) - abs(adiag(i)) + end do + end if if(info /= psb_success_) then call psb_errpush(psb_err_from_subroutine_,name,a_err='sp_getdiag') @@ -267,7 +288,10 @@ subroutine amg_d_parmatch_smth_bld(ag,a,desc_a,ilaggr,nlaggr,parms,& if (parms%aggr_omega_alg == amg_eig_est_) then - if (parms%aggr_eig == amg_max_norm_) then + if (do_l1correction) then + ! For l1-Jacobi this can be estimated with 1 + parms%aggr_omega_val = done + else if (parms%aggr_eig == amg_max_norm_) then allocate(arwsum(nrow)) call acsr%arwsum(arwsum) anorm = maxval(abs(adiag(1:nrow)*arwsum(1:nrow))) @@ -373,6 +397,7 @@ subroutine amg_d_parmatch_smth_bld(ag,a,desc_a,ilaggr,nlaggr,parms,& end block end if + if (allocated(l1rwsum)) deallocate(l1rwsum) if (do_timings) call psb_toc(idx_phase2) if (debug_level >= psb_debug_outer_) & diff --git a/amgprec/impl/aggregator/amg_d_parmatch_unsmth_bld.F90 b/amgprec/impl/aggregator/amg_d_parmatch_unsmth_bld.F90 index dc6574a0..85f1ec28 100644 --- a/amgprec/impl/aggregator/amg_d_parmatch_unsmth_bld.F90 +++ b/amgprec/impl/aggregator/amg_d_parmatch_unsmth_bld.F90 @@ -68,6 +68,8 @@ ! ! ! Arguments: +! dol1smoothing - this is not actually used inside unsmoothed aggregation, it +! is used just to perform a check ! a - type(psb_dspmat_type), input. !
The sparse matrix structure containing the local part of ! the fine-level matrix. @@ -101,8 +103,8 @@ ! info - integer, output. ! Error code. ! -subroutine amg_d_parmatch_unsmth_bld(ag,a,desc_a,ilaggr,nlaggr,parms,& - & ac,desc_ac,op_prol,op_restr,t_prol,info) +subroutine amg_d_parmatch_unsmth_bld(dol1smoothing,ag,a,desc_a,ilaggr,nlaggr,& + parms,ac,desc_ac,op_prol,op_restr,t_prol,info) use psb_base_mod use amg_base_prec_type use amg_d_inner_mod @@ -115,6 +117,7 @@ subroutine amg_d_parmatch_unsmth_bld(ag,a,desc_a,ilaggr,nlaggr,parms,& implicit none ! Arguments + integer(psb_ipk_), intent(in) :: dol1smoothing class(amg_d_parmatch_aggregator_type), target, intent(inout) :: ag type(psb_dspmat_type), intent(in) :: a type(psb_desc_type), intent(inout) :: desc_a @@ -159,6 +162,11 @@ subroutine amg_d_parmatch_unsmth_bld(ag,a,desc_a,ilaggr,nlaggr,parms,& ictxt = desc_a%get_context() call psb_info(ictxt, me, np) + if (dol1smoothing.ne.amg_no_smooth_) then + info=psb_err_fatal_; + call psb_errpush(info,name,a_err='Are you trying to smooth an unsmoothed aggregation?') + goto 9999 + end if #if !defined(SERIAL_MPI) nglob = desc_a%get_global_rows() diff --git a/amgprec/impl/aggregator/amg_d_ptap_bld.f90 b/amgprec/impl/aggregator/amg_d_ptap_bld.f90 index 4006c04c..b3f4c3a7 100644 --- a/amgprec/impl/aggregator/amg_d_ptap_bld.f90 +++ b/amgprec/impl/aggregator/amg_d_ptap_bld.f90 @@ -216,6 +216,7 @@ subroutine amg_d_ptap_bld(a_csr,desc_a,nlaggr,parms,ac,& call ac_csr%set_nrows(desc_ac%get_local_rows()) call ac_csr%set_ncols(desc_ac%get_local_cols()) + call ac_csr%clean_zeros(info) call ac%mv_from(ac_csr) call ac%set_asb() @@ -420,6 +421,7 @@ subroutine amg_d_ld_ptap_bld(a_csr,desc_a,nlaggr,parms,ac,& call ac_csr%set_nrows(desc_ac%get_local_rows()) call ac_csr%set_ncols(desc_ac%get_local_cols()) + call ac_csr%clean_zeros(info) call ac%mv_from(ac_csr) call ac%set_asb() @@ -629,6 +631,7 @@ subroutine amg_ld_ptap_bld(a_csr,desc_a,nlaggr,parms,ac,& call 
ac_csr%set_nrows(desc_ac%get_local_rows()) call ac_csr%set_ncols(desc_ac%get_local_cols()) + call ac_csr%clean_zeros(info) call ac%mv_from(ac_csr) call ac%set_asb() diff --git a/amgprec/impl/aggregator/amg_d_rap.f90 b/amgprec/impl/aggregator/amg_d_rap.f90 index 73e0c034..e3af161f 100644 --- a/amgprec/impl/aggregator/amg_d_rap.f90 +++ b/amgprec/impl/aggregator/amg_d_rap.f90 @@ -142,6 +142,7 @@ subroutine amg_d_rap(a_csr,desc_a,nlaggr,parms,ac,& call ac_csr%set_nrows(desc_ac%get_local_rows()) call ac_csr%set_ncols(desc_ac%get_local_cols()) + call ac_csr%clean_zeros(info) call ac%mv_from(ac_csr) call ac%set_asb() diff --git a/amgprec/impl/aggregator/amg_d_soc1_map_bld.F90 b/amgprec/impl/aggregator/amg_d_soc1_map_bld.F90 index 200d630c..b38344ed 100644 --- a/amgprec/impl/aggregator/amg_d_soc1_map_bld.F90 +++ b/amgprec/impl/aggregator/amg_d_soc1_map_bld.F90 @@ -250,7 +250,7 @@ subroutine amg_d_soc1_map_bld(iorder,theta,clean_zeros,a,desc_a,nlaggr,ilaggr,in ! we will not reset. if (j>nr) cycle step1 if (ilaggr(j) > 0) cycle step1 - if (abs(val(k)) > theta*sqrt(abs(diag(i)*diag(j)))) then + if ((abs(val(k)) > theta*sqrt(abs(diag(i)*diag(j)))).and.(diag(i).ne.dzero)) then ip = ip + 1 icol(ip) = icol(k) end if @@ -357,7 +357,7 @@ subroutine amg_d_soc1_map_bld(iorder,theta,clean_zeros,a,desc_a,nlaggr,ilaggr,in do k=1, nz j = icol(k) if ((1<=j).and.(j<=nr)) then - if (abs(val(k)) > theta*sqrt(abs(diag(i)*diag(j)))) then + if ((abs(val(k)) > theta*sqrt(abs(diag(i)*diag(j)))).and.(diag(i).ne.dzero)) then ip = ip + 1 icol(ip) = icol(k) end if @@ -545,4 +545,3 @@ subroutine amg_d_soc1_map_bld(iorder,theta,clean_zeros,a,desc_a,nlaggr,ilaggr,in return end subroutine amg_d_soc1_map_bld - diff --git a/amgprec/impl/aggregator/amg_daggrmat_minnrg_bld.f90 b/amgprec/impl/aggregator/amg_daggrmat_minnrg_bld.f90 index 510339e7..c9f87f72 100644 --- a/amgprec/impl/aggregator/amg_daggrmat_minnrg_bld.f90 +++ b/amgprec/impl/aggregator/amg_daggrmat_minnrg_bld.f90 @@ -69,6 +69,7 @@ ! ! ! 
Arguments: +! dol1smoothing - fictitious integer argument, it is not used inside ! a - type(psb_dspmat_type), input. ! The sparse matrix structure containing the local part of ! the fine-level matrix. @@ -104,8 +105,8 @@ ! Error code. ! ! -subroutine amg_daggrmat_minnrg_bld(a,desc_a,ilaggr,nlaggr,parms,& - & ac,desc_ac,op_prol,op_restr,t_prol,info) +subroutine amg_daggrmat_minnrg_bld(dol1smoothing,a,desc_a,ilaggr,nlaggr,& + parms,ac,desc_ac,op_prol,op_restr,t_prol,info) use psb_base_mod use amg_base_prec_type use amg_d_inner_mod, amg_protect_name => amg_daggrmat_minnrg_bld @@ -113,6 +114,7 @@ subroutine amg_daggrmat_minnrg_bld(a,desc_a,ilaggr,nlaggr,parms,& implicit none ! Arguments + integer(psb_ipk_), intent(in) :: dol1smoothing type(psb_dspmat_type), intent(in) :: a type(psb_desc_type), intent(inout) :: desc_a integer(psb_lpk_), intent(inout) :: ilaggr(:), nlaggr(:) @@ -171,6 +173,13 @@ subroutine amg_daggrmat_minnrg_bld(a,desc_a,ilaggr,nlaggr,parms,& filter_mat = (parms%aggr_filter == amg_filter_mat_) + if (dol1smoothing.ne.amg_no_smooth_) then + info=psb_err_fatal_; + call psb_errpush(info,name,a_err='Are you trying to smooth an unsmoothed aggregation?') + goto 9999 + end if + + !NEEDS TO BE REWORKED !! ! naggr: number of local aggregates diff --git a/amgprec/impl/aggregator/amg_daggrmat_nosmth_bld.f90 b/amgprec/impl/aggregator/amg_daggrmat_nosmth_bld.f90 index 78e396cc..345316b1 100644 --- a/amgprec/impl/aggregator/amg_daggrmat_nosmth_bld.f90 +++ b/amgprec/impl/aggregator/amg_daggrmat_nosmth_bld.f90 @@ -94,10 +94,11 @@ ! ! info - integer, output. ! Error code. +! dol1smoothing - optional, this is here just for interfacing reasons. It is not used by the +! code ! -! 
-subroutine amg_daggrmat_nosmth_bld(a,desc_a,ilaggr,nlaggr,parms,& - & ac,desc_ac,op_prol,op_restr,t_prol,info) +subroutine amg_daggrmat_nosmth_bld(dol1smoothing,a,desc_a,ilaggr,nlaggr,& + parms,ac,desc_ac,op_prol,op_restr,t_prol,info) use psb_base_mod use amg_base_prec_type use amg_d_inner_mod, amg_protect_name => amg_daggrmat_nosmth_bld @@ -105,6 +106,7 @@ subroutine amg_daggrmat_nosmth_bld(a,desc_a,ilaggr,nlaggr,parms,& implicit none ! Arguments + integer(psb_ipk_), intent(in) :: dol1smoothing type(psb_dspmat_type), intent(in) :: a type(psb_desc_type), intent(inout) :: desc_a integer(psb_lpk_), intent(inout) :: ilaggr(:), nlaggr(:) @@ -137,6 +139,12 @@ subroutine amg_daggrmat_nosmth_bld(a,desc_a,ilaggr,nlaggr,parms,& ctxt = desc_a%get_context() call psb_info(ctxt, me, np) + if (dol1smoothing.ne.amg_no_smooth_) then + info=psb_err_fatal_; + call psb_errpush(info,name,a_err='Are you trying to smooth an unsmoothed aggregation?') + goto 9999 + end if + nglob = desc_a%get_global_rows() nrow = desc_a%get_local_rows() ncol = desc_a%get_local_cols() diff --git a/amgprec/impl/aggregator/amg_daggrmat_smth_bld.f90 b/amgprec/impl/aggregator/amg_daggrmat_smth_bld.f90 index d12f9c6b..742a96fb 100644 --- a/amgprec/impl/aggregator/amg_daggrmat_smth_bld.f90 +++ b/amgprec/impl/aggregator/amg_daggrmat_smth_bld.f90 @@ -69,6 +69,8 @@ ! ! ! Arguments: +! dol1smooth - Integer taking the type of smoother that has to be used +! on the tentative prolongator ! a - type(psb_dspmat_type), input. ! The sparse matrix structure containing the local part of ! the fine-level matrix. @@ -102,16 +104,18 @@ ! info - integer, output. ! Error code. ! 
-subroutine amg_daggrmat_smth_bld(a,desc_a,ilaggr,nlaggr,parms,& - & ac,desc_ac,op_prol,op_restr,t_prol,info) +subroutine amg_daggrmat_smth_bld(dol1smoothing,a,desc_a,ilaggr,nlaggr,& + parms,ac,desc_ac,op_prol,op_restr,t_prol,info) use psb_base_mod use amg_base_prec_type use amg_d_inner_mod, amg_protect_name => amg_daggrmat_smth_bld use amg_d_base_aggregator_mod +! use, intrinsic :: ieee_arithmetic implicit none ! Arguments + integer(psb_ipk_), intent(in) :: dol1smoothing type(psb_dspmat_type), intent(in) :: a type(psb_desc_type), intent(inout) :: desc_a integer(psb_lpk_), intent(inout) :: ilaggr(:), nlaggr(:) @@ -132,7 +136,7 @@ subroutine amg_daggrmat_smth_bld(a,desc_a,ilaggr,nlaggr,parms,& type(psb_d_coo_sparse_mat) :: coo_prol, coo_restr type(psb_d_csr_sparse_mat) :: acsr1, acsrf, csr_prol, acsr real(psb_dpk_), allocatable :: adiag(:) - real(psb_dpk_), allocatable :: arwsum(:) + real(psb_dpk_), allocatable :: arwsum(:),l1rwsum(:) integer(psb_ipk_) :: ierr(5) logical :: filter_mat integer(psb_ipk_) :: debug_level, debug_unit, err_act @@ -141,6 +145,7 @@ subroutine amg_daggrmat_smth_bld(a,desc_a,ilaggr,nlaggr,parms,& logical, parameter :: debug_new=.false. character(len=80) :: filename logical, parameter :: do_timings=.false. + logical :: do_l1correction=.false. integer(psb_ipk_), save :: idx_spspmm=-1, idx_phase1=-1, idx_gtrans=-1, idx_phase2=-1, idx_refine=-1 integer(psb_ipk_), save :: idx_phase3=-1, idx_cdasb=-1, idx_ptap=-1 @@ -173,6 +178,9 @@ subroutine amg_daggrmat_smth_bld(a,desc_a,ilaggr,nlaggr,parms,& if ((do_timings).and.(idx_ptap==-1)) & & idx_ptap = psb_get_timer_idx("DEC_SMTH_BLD: ptap_bld ") + ! check if we have to use Jacobi or l1-Jacobi to smooth the tentative prolongator + if (dol1smoothing.eq.amg_l1_smooth_prol_) do_l1correction=.true. 
+ nglob = desc_a%get_global_rows() nrow = desc_a%get_local_rows() @@ -185,7 +193,7 @@ subroutine amg_daggrmat_smth_bld(a,desc_a,ilaggr,nlaggr,parms,& naggrm1 = sum(nlaggr(1:me)) naggrp1 = sum(nlaggr(1:me+1)) - filter_mat = (parms%aggr_filter == amg_filter_mat_) + filter_mat = (parms%aggr_filter == amg_filter_mat_).or.(parms%aggr_filter == amg_filter_prow_mat_) ! ! naggr: number of local aggregates @@ -200,6 +208,24 @@ subroutine amg_daggrmat_smth_bld(a,desc_a,ilaggr,nlaggr,parms,& if (info == psb_success_) & & call psb_halo(adiag,desc_a,info) if (info == psb_success_) call a%cp_to(acsr) + ! + ! Do the l1-correction on the diagonal if it is requested + ! + if (do_l1correction) then + allocate(l1rwsum(nrow)) + call acsr%arwsum(l1rwsum) + if (info == psb_success_) & + & call psb_realloc(ncol,l1rwsum,info) + if (info == psb_success_) & + & call psb_halo(l1rwsum,desc_a,info) + ! \tilde{D}_{i,i} = a_{i,i} + \sum_{j \ne i} |a_{i,j}| + !$OMP parallel do private(i) schedule(static) + do i=1,size(adiag) + adiag(i) = adiag(i) + l1rwsum(i) - abs(adiag(i)) + end do + !$OMP end parallel do + end if + if(info /= psb_success_) then call psb_errpush(psb_err_from_subroutine_,name,a_err='sp_getdiag') @@ -230,9 +256,15 @@ subroutine amg_daggrmat_smth_bld(a,desc_a,ilaggr,nlaggr,parms,& enddo if (jd == -1) then - write(0,*) name,': Warning: there is no diagonal element', i - else + ! if (.not.do_l1correction) + write(0,*) 'Wrong input: we need the diagonal!!!!', i + else if (parms%aggr_filter == amg_filter_mat_) then + ! We perform filtering in the standard way assuming that A is an M-matrix acsrf%val(jd)=acsrf%val(jd)-tmp + else if (parms%aggr_filter == amg_filter_prow_mat_) then + ! We are probably doing l1-correction, hence we want to preserve the + !
row sum of the matrix: note the change in sign + acsrf%val(jd)=acsrf%val(jd)+tmp end if enddo !$OMP end parallel do @@ -240,7 +272,6 @@ subroutine amg_daggrmat_smth_bld(a,desc_a,ilaggr,nlaggr,parms,& call acsrf%clean_zeros(info) end if - !$OMP parallel do private(i) schedule(static) do i=1,size(adiag) if (adiag(i) /= dzero) then @@ -252,14 +283,17 @@ subroutine amg_daggrmat_smth_bld(a,desc_a,ilaggr,nlaggr,parms,& !$OMP end parallel do if (parms%aggr_omega_alg == amg_eig_est_) then - if (parms%aggr_eig == amg_max_norm_) then + if ( (parms%aggr_filter == amg_filter_prow_mat_).and.(do_l1correction) ) then + ! For l1-Jacobi this can be estimated with 1: + ! this makes sense only if we are preserving the row-sum! + parms%aggr_omega_val = done + else if (parms%aggr_eig == amg_max_norm_) then allocate(arwsum(nrow)) call acsr%arwsum(arwsum) anorm = maxval(abs(adiag(1:nrow)*arwsum(1:nrow))) call psb_amx(ctxt,anorm) omega = 4.d0/(3.d0*anorm) parms%aggr_omega_val = omega - else info = psb_err_internal_error_ call psb_errpush(info,name,a_err='invalid amg_aggr_eig_') @@ -322,6 +356,7 @@ subroutine amg_daggrmat_smth_bld(a,desc_a,ilaggr,nlaggr,parms,& if (debug_level >= psb_debug_outer_) & & write(debug_unit,*) me,' ',trim(name),& & 'Done smooth_aggregate ' + if (allocated(l1rwsum)) deallocate(l1rwsum) call psb_erractionrestore(err_act) return diff --git a/amgprec/impl/aggregator/amg_s_dec_aggregator_mat_bld.f90 b/amgprec/impl/aggregator/amg_s_dec_aggregator_mat_bld.f90 index 39e96ec7..750889de 100644 --- a/amgprec/impl/aggregator/amg_s_dec_aggregator_mat_bld.f90 +++ b/amgprec/impl/aggregator/amg_s_dec_aggregator_mat_bld.f90 @@ -177,23 +177,24 @@ subroutine amg_s_dec_aggregator_mat_bld(ag,parms,a,desc_a,ilaggr,nlaggr,& select case (parms%aggr_prol) case (amg_no_smooth_) - call amg_saggrmat_nosmth_bld(a,desc_a,ilaggr,nlaggr,& - & parms,ac,desc_ac,op_prol,op_restr,t_prol,info) + call amg_saggrmat_nosmth_bld(parms%aggr_prol,a,desc_a,ilaggr,& + 
nlaggr,parms,ac,desc_ac,op_prol,op_restr,t_prol,info) - case(amg_smooth_prol_) + case(amg_smooth_prol_,amg_l1_smooth_prol_) - call amg_saggrmat_smth_bld(a,desc_a,ilaggr,nlaggr, & - & parms,ac,desc_ac,op_prol,op_restr,t_prol,info) + call amg_saggrmat_smth_bld(parms%aggr_prol,a,desc_a,& + ilaggr,nlaggr,parms,ac,desc_ac,op_prol,& + op_restr,t_prol,info) !!$ case(amg_biz_prol_) !!$ !!$ call amg_saggrmat_biz_bld(a,desc_a,ilaggr,nlaggr, & !!$ & parms,ac,desc_ac,op_prol,op_restr,t_prol,info) - + case(amg_min_energy_) - call amg_saggrmat_minnrg_bld(a,desc_a,ilaggr,nlaggr, & - & parms,ac,desc_ac,op_prol,op_restr,t_prol,info) + call amg_saggrmat_minnrg_bld(parms%aggr_prol,a,desc_a,ilaggr,& + nlaggr,parms,ac,desc_ac,op_prol,op_restr,t_prol,info) case default info = psb_err_internal_error_ diff --git a/amgprec/impl/aggregator/amg_s_parmatch_aggregator_mat_bld.F90 b/amgprec/impl/aggregator/amg_s_parmatch_aggregator_mat_bld.F90 index 8c10d006..38dff13c 100644 --- a/amgprec/impl/aggregator/amg_s_parmatch_aggregator_mat_bld.F90 +++ b/amgprec/impl/aggregator/amg_s_parmatch_aggregator_mat_bld.F90 @@ -184,20 +184,23 @@ subroutine amg_s_parmatch_aggregator_mat_bld(ag,parms,a,desc_a,ilaggr,nlaggr,& ! 
select case (parms%aggr_prol) case (amg_no_smooth_) - call amg_s_parmatch_unsmth_bld(ag,a,desc_a,ilaggr,nlaggr,parms,& - & ac,desc_ac,op_prol,op_restr,t_prol,info) + call amg_s_parmatch_unsmth_bld(parms%aggr_prol,ag,a,desc_a,& + ilaggr,nlaggr,parms,ac,desc_ac,op_prol,op_restr,& + t_prol,info) - case(amg_smooth_prol_) - call amg_s_parmatch_smth_bld(ag,a,desc_a,ilaggr,nlaggr,parms,& - & ac,desc_ac,op_prol,op_restr,t_prol,info) + case(amg_smooth_prol_,amg_l1_smooth_prol_) + call amg_s_parmatch_smth_bld(parms%aggr_prol,ag,a,desc_a,& + ilaggr,nlaggr,parms,ac,desc_ac,op_prol,op_restr,& + t_prol,info) !!$ case(amg_biz_prol_) !!$ call amg_saggrmat_biz_bld(a,desc_a,ilaggr,nlaggr, & !!$ & parms,ac,desc_ac,op_prol,op_restr,info) case(amg_min_energy_) - call amg_saggrmat_minnrg_bld(a,desc_a,ilaggr,nlaggr, & - & parms,ac,desc_ac,op_prol,op_restr,t_prol,info) + call amg_saggrmat_minnrg_bld(parms%aggr_prol,a,desc_a,& + ilaggr,nlaggr,parms,ac,desc_ac,op_prol,op_restr,& + t_prol,info) case default info = psb_err_internal_error_ diff --git a/amgprec/impl/aggregator/amg_s_parmatch_smth_bld.F90 b/amgprec/impl/aggregator/amg_s_parmatch_smth_bld.F90 index 323853d1..7a53055f 100644 --- a/amgprec/impl/aggregator/amg_s_parmatch_smth_bld.F90 +++ b/amgprec/impl/aggregator/amg_s_parmatch_smth_bld.F90 @@ -69,6 +69,8 @@ ! ! ! Arguments: +! dol1smoothing - Select between l1-Jacobi and Jacobi as smoother for the +! tentative prolongator ! a - type(psb_sspmat_type), input. ! The sparse matrix structure containing the local part of ! the fine-level matrix. @@ -102,8 +104,8 @@ ! info - integer, output. ! Error code. ! 
-subroutine amg_s_parmatch_smth_bld(ag,a,desc_a,ilaggr,nlaggr,parms,& - & ac,desc_ac,op_prol,op_restr,t_prol,info) +subroutine amg_s_parmatch_smth_bld(dol1smoothing,ag,a,desc_a,ilaggr,nlaggr,& + parms,ac,desc_ac,op_prol,op_restr,t_prol,info) use psb_base_mod use amg_base_prec_type use amg_s_inner_mod @@ -116,6 +118,7 @@ subroutine amg_s_parmatch_smth_bld(ag,a,desc_a,ilaggr,nlaggr,parms,& implicit none ! Arguments + integer(psb_ipk_), intent(in) :: dol1smoothing class(amg_s_parmatch_aggregator_type), target, intent(inout) :: ag type(psb_sspmat_type), intent(in) :: a type(psb_desc_type), intent(inout) :: desc_a @@ -137,7 +140,7 @@ subroutine amg_s_parmatch_smth_bld(ag,a,desc_a,ilaggr,nlaggr,parms,& type(psb_s_coo_sparse_mat) :: coo_prol, coo_restr type(psb_s_csr_sparse_mat) :: acsrf, csr_prol, acsr, tcsr real(psb_spk_), allocatable :: adiag(:) - real(psb_spk_), allocatable :: arwsum(:) + real(psb_spk_), allocatable :: arwsum(:),l1rwsum(:) logical :: filter_mat integer(psb_ipk_) :: debug_level, debug_unit, err_act integer(psb_ipk_), parameter :: ncmax=16 @@ -145,6 +148,7 @@ subroutine amg_s_parmatch_smth_bld(ag,a,desc_a,ilaggr,nlaggr,parms,& logical, parameter :: debug_new=.false., dump_r=.false., dump_p=.false., debug=.false. character(len=80) :: filename logical, parameter :: do_timings=.false. + logical :: do_l1correction=.false. integer(psb_ipk_), save :: idx_spspmm=-1, idx_phase1=-1, idx_gtrans=-1, idx_phase2=-1, idx_refine=-1, idx_phase3=-1 integer(psb_ipk_), save :: idx_cdasb=-1, idx_ptap=-1 @@ -166,6 +170,10 @@ subroutine amg_s_parmatch_smth_bld(ag,a,desc_a,ilaggr,nlaggr,parms,& ncol = desc_a%get_local_cols() theta = parms%aggr_thresh + ! Check if we have to perform l1-Jacobi or Jacobi as smoother + if(dol1smoothing.eq.amg_l1_smooth_prol_) do_l1correction=.true. 
+ + !write(0,*) me,' ',trim(name),' Start ',idx_spspmm if ((do_timings).and.(idx_spspmm==-1)) & & idx_spspmm = psb_get_timer_idx("PMC_SMTH_BLD: par_spspmm") @@ -217,6 +225,19 @@ subroutine amg_s_parmatch_smth_bld(ag,a,desc_a,ilaggr,nlaggr,parms,& if (info == psb_success_) & & call psb_halo(adiag,desc_a,info) if (info == psb_success_) call a%cp_to(acsr) + ! Get the l1-diagonal of D + if (do_l1correction) then + allocate(l1rwsum(nrow)) + call acsr%arwsum(l1rwsum) + if (info == psb_success_) & + & call psb_realloc(ncol,l1rwsum,info) + if (info == psb_success_) & + & call psb_halo(l1rwsum,desc_a,info) + ! \tilde{D}_{i,i} = \sum_{j \ne i} |a_{i,j}| + do i=1,size(adiag) + adiag(i) = adiag(i) + l1rwsum(i) - abs(adiag(i)) + end do + end if if(info /= psb_success_) then call psb_errpush(psb_err_from_subroutine_,name,a_err='sp_getdiag') @@ -267,7 +288,10 @@ subroutine amg_s_parmatch_smth_bld(ag,a,desc_a,ilaggr,nlaggr,parms,& if (parms%aggr_omega_alg == amg_eig_est_) then - if (parms%aggr_eig == amg_max_norm_) then + if (do_l1correction) then + ! For l1-Jacobi this can be estimated with 1 + parms%aggr_omega_val = done + else if (parms%aggr_eig == amg_max_norm_) then allocate(arwsum(nrow)) call acsr%arwsum(arwsum) anorm = maxval(abs(adiag(1:nrow)*arwsum(1:nrow))) @@ -373,6 +397,7 @@ subroutine amg_s_parmatch_smth_bld(ag,a,desc_a,ilaggr,nlaggr,parms,& end block end if + if (allocated(l1rwsum)) deallocate(l1rwsum) if (do_timings) call psb_toc(idx_phase2) if (debug_level >= psb_debug_outer_) & diff --git a/amgprec/impl/aggregator/amg_s_parmatch_unsmth_bld.F90 b/amgprec/impl/aggregator/amg_s_parmatch_unsmth_bld.F90 index 6fcf65ac..4a96ab95 100644 --- a/amgprec/impl/aggregator/amg_s_parmatch_unsmth_bld.F90 +++ b/amgprec/impl/aggregator/amg_s_parmatch_unsmth_bld.F90 @@ -68,6 +68,8 @@ ! ! ! Arguments: +! dol1smoothing - this not actually used inside unsmoothed aggregation, it +! is used just to perform a check ! a - type(psb_sspmat_type), input. ! 
The sparse matrix structure containing the local part of ! the fine-level matrix. @@ -101,8 +103,8 @@ ! info - integer, output. ! Error code. ! -subroutine amg_s_parmatch_unsmth_bld(ag,a,desc_a,ilaggr,nlaggr,parms,& - & ac,desc_ac,op_prol,op_restr,t_prol,info) +subroutine amg_s_parmatch_unsmth_bld(dol1smoothing,ag,a,desc_a,ilaggr,nlaggr,& + parms,ac,desc_ac,op_prol,op_restr,t_prol,info) use psb_base_mod use amg_base_prec_type use amg_s_inner_mod @@ -115,6 +117,7 @@ subroutine amg_s_parmatch_unsmth_bld(ag,a,desc_a,ilaggr,nlaggr,parms,& implicit none ! Arguments + integer(psb_ipk_), intent(in) :: dol1smoothing class(amg_s_parmatch_aggregator_type), target, intent(inout) :: ag type(psb_sspmat_type), intent(in) :: a type(psb_desc_type), intent(inout) :: desc_a @@ -159,6 +162,11 @@ subroutine amg_s_parmatch_unsmth_bld(ag,a,desc_a,ilaggr,nlaggr,parms,& ictxt = desc_a%get_context() call psb_info(ictxt, me, np) + if (dol1smoothing.ne.amg_no_smooth_) then + info=psb_err_fatal_; + call psb_errpush(info,name,a_err='Are you trying to smooth an unsmoothed aggregation?') + goto 9999 + end if #if !defined(SERIAL_MPI) nglob = desc_a%get_global_rows() diff --git a/amgprec/impl/aggregator/amg_s_ptap_bld.f90 b/amgprec/impl/aggregator/amg_s_ptap_bld.f90 index e1a6c867..b46ac118 100644 --- a/amgprec/impl/aggregator/amg_s_ptap_bld.f90 +++ b/amgprec/impl/aggregator/amg_s_ptap_bld.f90 @@ -216,6 +216,7 @@ subroutine amg_s_ptap_bld(a_csr,desc_a,nlaggr,parms,ac,& call ac_csr%set_nrows(desc_ac%get_local_rows()) call ac_csr%set_ncols(desc_ac%get_local_cols()) + call ac_csr%clean_zeros(info) call ac%mv_from(ac_csr) call ac%set_asb() @@ -420,6 +421,7 @@ subroutine amg_s_ls_ptap_bld(a_csr,desc_a,nlaggr,parms,ac,& call ac_csr%set_nrows(desc_ac%get_local_rows()) call ac_csr%set_ncols(desc_ac%get_local_cols()) + call ac_csr%clean_zeros(info) call ac%mv_from(ac_csr) call ac%set_asb() @@ -629,6 +631,7 @@ subroutine amg_ls_ptap_bld(a_csr,desc_a,nlaggr,parms,ac,& call 
ac_csr%set_nrows(desc_ac%get_local_rows()) call ac_csr%set_ncols(desc_ac%get_local_cols()) + call ac_csr%clean_zeros(info) call ac%mv_from(ac_csr) call ac%set_asb() diff --git a/amgprec/impl/aggregator/amg_s_rap.f90 b/amgprec/impl/aggregator/amg_s_rap.f90 index 2ba1dbbf..720c760e 100644 --- a/amgprec/impl/aggregator/amg_s_rap.f90 +++ b/amgprec/impl/aggregator/amg_s_rap.f90 @@ -142,6 +142,7 @@ subroutine amg_s_rap(a_csr,desc_a,nlaggr,parms,ac,& call ac_csr%set_nrows(desc_ac%get_local_rows()) call ac_csr%set_ncols(desc_ac%get_local_cols()) + call ac_csr%clean_zeros(info) call ac%mv_from(ac_csr) call ac%set_asb() diff --git a/amgprec/impl/aggregator/amg_s_soc1_map_bld.F90 b/amgprec/impl/aggregator/amg_s_soc1_map_bld.F90 index 0f8bb7dd..d5331992 100644 --- a/amgprec/impl/aggregator/amg_s_soc1_map_bld.F90 +++ b/amgprec/impl/aggregator/amg_s_soc1_map_bld.F90 @@ -250,7 +250,7 @@ subroutine amg_s_soc1_map_bld(iorder,theta,clean_zeros,a,desc_a,nlaggr,ilaggr,in ! we will not reset. if (j>nr) cycle step1 if (ilaggr(j) > 0) cycle step1 - if (abs(val(k)) > theta*sqrt(abs(diag(i)*diag(j)))) then + if ((abs(val(k)) > theta*sqrt(abs(diag(i)*diag(j)))).and.(diag(i).ne.szero)) then ip = ip + 1 icol(ip) = icol(k) end if @@ -357,7 +357,7 @@ subroutine amg_s_soc1_map_bld(iorder,theta,clean_zeros,a,desc_a,nlaggr,ilaggr,in do k=1, nz j = icol(k) if ((1<=j).and.(j<=nr)) then - if (abs(val(k)) > theta*sqrt(abs(diag(i)*diag(j)))) then + if ((abs(val(k)) > theta*sqrt(abs(diag(i)*diag(j)))).and.(diag(i).ne.szero)) then ip = ip + 1 icol(ip) = icol(k) end if @@ -545,4 +545,3 @@ subroutine amg_s_soc1_map_bld(iorder,theta,clean_zeros,a,desc_a,nlaggr,ilaggr,in return end subroutine amg_s_soc1_map_bld - diff --git a/amgprec/impl/aggregator/amg_saggrmat_minnrg_bld.f90 b/amgprec/impl/aggregator/amg_saggrmat_minnrg_bld.f90 index a609e382..80a35344 100644 --- a/amgprec/impl/aggregator/amg_saggrmat_minnrg_bld.f90 +++ b/amgprec/impl/aggregator/amg_saggrmat_minnrg_bld.f90 @@ -69,6 +69,7 @@ ! ! ! 
Arguments: +! dol1smoothing - fictitious integer argument, it is not used inside ! a - type(psb_sspmat_type), input. ! The sparse matrix structure containing the local part of ! the fine-level matrix. @@ -104,8 +105,8 @@ ! Error code. ! ! -subroutine amg_saggrmat_minnrg_bld(a,desc_a,ilaggr,nlaggr,parms,& - & ac,desc_ac,op_prol,op_restr,t_prol,info) +subroutine amg_saggrmat_minnrg_bld(dol1smoothing,a,desc_a,ilaggr,nlaggr,& + parms,ac,desc_ac,op_prol,op_restr,t_prol,info) use psb_base_mod use amg_base_prec_type use amg_s_inner_mod, amg_protect_name => amg_saggrmat_minnrg_bld @@ -113,6 +114,7 @@ subroutine amg_saggrmat_minnrg_bld(a,desc_a,ilaggr,nlaggr,parms,& implicit none ! Arguments + integer(psb_ipk_), intent(in) :: dol1smoothing type(psb_sspmat_type), intent(in) :: a type(psb_desc_type), intent(inout) :: desc_a integer(psb_lpk_), intent(inout) :: ilaggr(:), nlaggr(:) @@ -171,6 +173,13 @@ subroutine amg_saggrmat_minnrg_bld(a,desc_a,ilaggr,nlaggr,parms,& filter_mat = (parms%aggr_filter == amg_filter_mat_) + if (dol1smoothing.ne.amg_no_smooth_) then + info=psb_err_fatal_; + call psb_errpush(info,name,a_err='Are you trying to smooth an unsmoothed aggregation?') + goto 9999 + end if + + !NEEDS TO BE REWORKED !! ! naggr: number of local aggregates diff --git a/amgprec/impl/aggregator/amg_saggrmat_nosmth_bld.f90 b/amgprec/impl/aggregator/amg_saggrmat_nosmth_bld.f90 index ceeca998..fe684d0c 100644 --- a/amgprec/impl/aggregator/amg_saggrmat_nosmth_bld.f90 +++ b/amgprec/impl/aggregator/amg_saggrmat_nosmth_bld.f90 @@ -94,10 +94,11 @@ ! ! info - integer, output. ! Error code. +! dol1smoothing - optional, this is here just for interfacing reasons. It is not used by the +! code ! -! 
-subroutine amg_saggrmat_nosmth_bld(a,desc_a,ilaggr,nlaggr,parms,& - & ac,desc_ac,op_prol,op_restr,t_prol,info) +subroutine amg_saggrmat_nosmth_bld(dol1smoothing,a,desc_a,ilaggr,nlaggr,& + parms,ac,desc_ac,op_prol,op_restr,t_prol,info) use psb_base_mod use amg_base_prec_type use amg_s_inner_mod, amg_protect_name => amg_saggrmat_nosmth_bld @@ -105,6 +106,7 @@ subroutine amg_saggrmat_nosmth_bld(a,desc_a,ilaggr,nlaggr,parms,& implicit none ! Arguments + integer(psb_ipk_), intent(in) :: dol1smoothing type(psb_sspmat_type), intent(in) :: a type(psb_desc_type), intent(inout) :: desc_a integer(psb_lpk_), intent(inout) :: ilaggr(:), nlaggr(:) @@ -137,6 +139,12 @@ subroutine amg_saggrmat_nosmth_bld(a,desc_a,ilaggr,nlaggr,parms,& ctxt = desc_a%get_context() call psb_info(ctxt, me, np) + if (dol1smoothing.ne.amg_no_smooth_) then + info=psb_err_fatal_; + call psb_errpush(info,name,a_err='Are you trying to smooth an unsmoothed aggregation?') + goto 9999 + end if + nglob = desc_a%get_global_rows() nrow = desc_a%get_local_rows() ncol = desc_a%get_local_cols() diff --git a/amgprec/impl/aggregator/amg_saggrmat_smth_bld.f90 b/amgprec/impl/aggregator/amg_saggrmat_smth_bld.f90 index b70490ba..6caf61e5 100644 --- a/amgprec/impl/aggregator/amg_saggrmat_smth_bld.f90 +++ b/amgprec/impl/aggregator/amg_saggrmat_smth_bld.f90 @@ -69,6 +69,8 @@ ! ! ! Arguments: +! dol1smooth - Integer taking the type of smoother that has to be used +! on the tentative prolongator ! a - type(psb_sspmat_type), input. ! The sparse matrix structure containing the local part of ! the fine-level matrix. @@ -102,16 +104,18 @@ ! info - integer, output. ! Error code. ! 
-subroutine amg_saggrmat_smth_bld(a,desc_a,ilaggr,nlaggr,parms,& - & ac,desc_ac,op_prol,op_restr,t_prol,info) +subroutine amg_saggrmat_smth_bld(dol1smoothing,a,desc_a,ilaggr,nlaggr,& + parms,ac,desc_ac,op_prol,op_restr,t_prol,info) use psb_base_mod use amg_base_prec_type use amg_s_inner_mod, amg_protect_name => amg_saggrmat_smth_bld use amg_s_base_aggregator_mod +! use, intrinsic :: ieee_arithmetic implicit none ! Arguments + integer(psb_ipk_), intent(in) :: dol1smoothing type(psb_sspmat_type), intent(in) :: a type(psb_desc_type), intent(inout) :: desc_a integer(psb_lpk_), intent(inout) :: ilaggr(:), nlaggr(:) @@ -132,7 +136,7 @@ subroutine amg_saggrmat_smth_bld(a,desc_a,ilaggr,nlaggr,parms,& type(psb_s_coo_sparse_mat) :: coo_prol, coo_restr type(psb_s_csr_sparse_mat) :: acsr1, acsrf, csr_prol, acsr real(psb_spk_), allocatable :: adiag(:) - real(psb_spk_), allocatable :: arwsum(:) + real(psb_spk_), allocatable :: arwsum(:),l1rwsum(:) integer(psb_ipk_) :: ierr(5) logical :: filter_mat integer(psb_ipk_) :: debug_level, debug_unit, err_act @@ -141,6 +145,7 @@ subroutine amg_saggrmat_smth_bld(a,desc_a,ilaggr,nlaggr,parms,& logical, parameter :: debug_new=.false. character(len=80) :: filename logical, parameter :: do_timings=.false. + logical :: do_l1correction=.false. integer(psb_ipk_), save :: idx_spspmm=-1, idx_phase1=-1, idx_gtrans=-1, idx_phase2=-1, idx_refine=-1 integer(psb_ipk_), save :: idx_phase3=-1, idx_cdasb=-1, idx_ptap=-1 @@ -173,6 +178,9 @@ subroutine amg_saggrmat_smth_bld(a,desc_a,ilaggr,nlaggr,parms,& if ((do_timings).and.(idx_ptap==-1)) & & idx_ptap = psb_get_timer_idx("DEC_SMTH_BLD: ptap_bld ") + ! check if we have to use Jacobi or l1-Jacobi to smooth the tentative prolongator + if (dol1smoothing.eq.amg_l1_smooth_prol_) do_l1correction=.true. 
+ nglob = desc_a%get_global_rows() nrow = desc_a%get_local_rows() @@ -185,7 +193,7 @@ subroutine amg_saggrmat_smth_bld(a,desc_a,ilaggr,nlaggr,parms,& naggrm1 = sum(nlaggr(1:me)) naggrp1 = sum(nlaggr(1:me+1)) - filter_mat = (parms%aggr_filter == amg_filter_mat_) + filter_mat = (parms%aggr_filter == amg_filter_mat_).or.(parms%aggr_filter == amg_filter_prow_mat_) ! ! naggr: number of local aggregates @@ -200,6 +208,24 @@ subroutine amg_saggrmat_smth_bld(a,desc_a,ilaggr,nlaggr,parms,& if (info == psb_success_) & & call psb_halo(adiag,desc_a,info) if (info == psb_success_) call a%cp_to(acsr) + ! + ! Do the l1-correction on the diagonal if it is requested + ! + if (do_l1correction) then + allocate(l1rwsum(nrow)) + call acsr%arwsum(l1rwsum) + if (info == psb_success_) & + & call psb_realloc(ncol,l1rwsum,info) + if (info == psb_success_) & + & call psb_halo(l1rwsum,desc_a,info) + ! \tilde{D}_{i,i} = \sum_{j \ne i} |a_{i,j}| + !$OMP parallel do private(i) schedule(static) + do i=1,size(adiag) + adiag(i) = adiag(i) + l1rwsum(i) - abs(adiag(i)) + end do + !$OMP end parallel do + end if + if(info /= psb_success_) then call psb_errpush(psb_err_from_subroutine_,name,a_err='sp_getdiag') @@ -230,9 +256,15 @@ subroutine amg_saggrmat_smth_bld(a,desc_a,ilaggr,nlaggr,parms,& enddo if (jd == -1) then - write(0,*) name,': Warning: there is no diagonal element', i - else + ! if (.not.do_l1correction) + write(0,*) 'Wrong input: we need the diagonal!!!!', i + else if (parms%aggr_filter == amg_filter_mat_) then + ! We perform filtering in the standard way assuming that A is an M-matrix acsrf%val(jd)=acsrf%val(jd)-tmp + else if (parms%aggr_filter == amg_filter_prow_mat_) then + ! We are probably doing l1-correction, hence we want to preserve the + ! 
row sum of the matrix: note the change in sign + acsrf%val(jd)=acsrf%val(jd)+tmp end if enddo !$OMP end parallel do @@ -240,7 +272,6 @@ subroutine amg_saggrmat_smth_bld(a,desc_a,ilaggr,nlaggr,parms,& call acsrf%clean_zeros(info) end if - !$OMP parallel do private(i) schedule(static) do i=1,size(adiag) if (adiag(i) /= szero) then @@ -252,14 +283,17 @@ subroutine amg_saggrmat_smth_bld(a,desc_a,ilaggr,nlaggr,parms,& !$OMP end parallel do if (parms%aggr_omega_alg == amg_eig_est_) then - if (parms%aggr_eig == amg_max_norm_) then + if ( (parms%aggr_filter == amg_filter_prow_mat_).and.(do_l1correction) ) then + ! For l1-Jacobi this can be estimated with 1: + ! this makes sense only if we are preserving the row-sum! + parms%aggr_omega_val = done + else if (parms%aggr_eig == amg_max_norm_) then allocate(arwsum(nrow)) call acsr%arwsum(arwsum) anorm = maxval(abs(adiag(1:nrow)*arwsum(1:nrow))) call psb_amx(ctxt,anorm) omega = 4.d0/(3.d0*anorm) parms%aggr_omega_val = omega - else info = psb_err_internal_error_ call psb_errpush(info,name,a_err='invalid amg_aggr_eig_') @@ -322,6 +356,7 @@ subroutine amg_saggrmat_smth_bld(a,desc_a,ilaggr,nlaggr,parms,& if (debug_level >= psb_debug_outer_) & & write(debug_unit,*) me,' ',trim(name),& & 'Done smooth_aggregate ' + if (allocated(l1rwsum)) deallocate(l1rwsum) call psb_erractionrestore(err_act) return diff --git a/amgprec/impl/aggregator/amg_z_dec_aggregator_mat_bld.f90 b/amgprec/impl/aggregator/amg_z_dec_aggregator_mat_bld.f90 index 7135bfc3..e3b9f6af 100644 --- a/amgprec/impl/aggregator/amg_z_dec_aggregator_mat_bld.f90 +++ b/amgprec/impl/aggregator/amg_z_dec_aggregator_mat_bld.f90 @@ -177,23 +177,24 @@ subroutine amg_z_dec_aggregator_mat_bld(ag,parms,a,desc_a,ilaggr,nlaggr,& select case (parms%aggr_prol) case (amg_no_smooth_) - call amg_zaggrmat_nosmth_bld(a,desc_a,ilaggr,nlaggr,& - & parms,ac,desc_ac,op_prol,op_restr,t_prol,info) + call amg_zaggrmat_nosmth_bld(parms%aggr_prol,a,desc_a,ilaggr,& + 
nlaggr,parms,ac,desc_ac,op_prol,op_restr,t_prol,info) - case(amg_smooth_prol_) + case(amg_smooth_prol_,amg_l1_smooth_prol_) - call amg_zaggrmat_smth_bld(a,desc_a,ilaggr,nlaggr, & - & parms,ac,desc_ac,op_prol,op_restr,t_prol,info) + call amg_zaggrmat_smth_bld(parms%aggr_prol,a,desc_a,& + ilaggr,nlaggr,parms,ac,desc_ac,op_prol,& + op_restr,t_prol,info) !!$ case(amg_biz_prol_) !!$ !!$ call amg_zaggrmat_biz_bld(a,desc_a,ilaggr,nlaggr, & !!$ & parms,ac,desc_ac,op_prol,op_restr,t_prol,info) - + case(amg_min_energy_) - call amg_zaggrmat_minnrg_bld(a,desc_a,ilaggr,nlaggr, & - & parms,ac,desc_ac,op_prol,op_restr,t_prol,info) + call amg_zaggrmat_minnrg_bld(parms%aggr_prol,a,desc_a,ilaggr,& + nlaggr,parms,ac,desc_ac,op_prol,op_restr,t_prol,info) case default info = psb_err_internal_error_ diff --git a/amgprec/impl/aggregator/amg_z_ptap_bld.f90 b/amgprec/impl/aggregator/amg_z_ptap_bld.f90 index e322a303..bfd098b0 100644 --- a/amgprec/impl/aggregator/amg_z_ptap_bld.f90 +++ b/amgprec/impl/aggregator/amg_z_ptap_bld.f90 @@ -216,6 +216,7 @@ subroutine amg_z_ptap_bld(a_csr,desc_a,nlaggr,parms,ac,& call ac_csr%set_nrows(desc_ac%get_local_rows()) call ac_csr%set_ncols(desc_ac%get_local_cols()) + call ac_csr%clean_zeros(info) call ac%mv_from(ac_csr) call ac%set_asb() @@ -420,6 +421,7 @@ subroutine amg_z_lz_ptap_bld(a_csr,desc_a,nlaggr,parms,ac,& call ac_csr%set_nrows(desc_ac%get_local_rows()) call ac_csr%set_ncols(desc_ac%get_local_cols()) + call ac_csr%clean_zeros(info) call ac%mv_from(ac_csr) call ac%set_asb() @@ -629,6 +631,7 @@ subroutine amg_lz_ptap_bld(a_csr,desc_a,nlaggr,parms,ac,& call ac_csr%set_nrows(desc_ac%get_local_rows()) call ac_csr%set_ncols(desc_ac%get_local_cols()) + call ac_csr%clean_zeros(info) call ac%mv_from(ac_csr) call ac%set_asb() diff --git a/amgprec/impl/aggregator/amg_z_rap.f90 b/amgprec/impl/aggregator/amg_z_rap.f90 index 814a859c..e591e1f8 100644 --- a/amgprec/impl/aggregator/amg_z_rap.f90 +++ b/amgprec/impl/aggregator/amg_z_rap.f90 @@ -142,6 +142,7 @@ 
subroutine amg_z_rap(a_csr,desc_a,nlaggr,parms,ac,& call ac_csr%set_nrows(desc_ac%get_local_rows()) call ac_csr%set_ncols(desc_ac%get_local_cols()) + call ac_csr%clean_zeros(info) call ac%mv_from(ac_csr) call ac%set_asb() diff --git a/amgprec/impl/aggregator/amg_z_soc1_map_bld.F90 b/amgprec/impl/aggregator/amg_z_soc1_map_bld.F90 index 7961921a..de5eb91c 100644 --- a/amgprec/impl/aggregator/amg_z_soc1_map_bld.F90 +++ b/amgprec/impl/aggregator/amg_z_soc1_map_bld.F90 @@ -250,7 +250,7 @@ subroutine amg_z_soc1_map_bld(iorder,theta,clean_zeros,a,desc_a,nlaggr,ilaggr,in ! we will not reset. if (j>nr) cycle step1 if (ilaggr(j) > 0) cycle step1 - if (abs(val(k)) > theta*sqrt(abs(diag(i)*diag(j)))) then + if ((abs(val(k)) > theta*sqrt(abs(diag(i)*diag(j)))).and.(diag(i).ne.zzero)) then ip = ip + 1 icol(ip) = icol(k) end if @@ -357,7 +357,7 @@ subroutine amg_z_soc1_map_bld(iorder,theta,clean_zeros,a,desc_a,nlaggr,ilaggr,in do k=1, nz j = icol(k) if ((1<=j).and.(j<=nr)) then - if (abs(val(k)) > theta*sqrt(abs(diag(i)*diag(j)))) then + if ((abs(val(k)) > theta*sqrt(abs(diag(i)*diag(j)))).and.(diag(i).ne.zzero)) then ip = ip + 1 icol(ip) = icol(k) end if @@ -545,4 +545,3 @@ subroutine amg_z_soc1_map_bld(iorder,theta,clean_zeros,a,desc_a,nlaggr,ilaggr,in return end subroutine amg_z_soc1_map_bld - diff --git a/amgprec/impl/aggregator/amg_zaggrmat_minnrg_bld.f90 b/amgprec/impl/aggregator/amg_zaggrmat_minnrg_bld.f90 index 7bafbc18..eaa7f273 100644 --- a/amgprec/impl/aggregator/amg_zaggrmat_minnrg_bld.f90 +++ b/amgprec/impl/aggregator/amg_zaggrmat_minnrg_bld.f90 @@ -69,6 +69,7 @@ ! ! ! Arguments: +! dol1smoothing - fictitious integer argument, it is not used inside ! a - type(psb_zspmat_type), input. ! The sparse matrix structure containing the local part of ! the fine-level matrix. @@ -104,8 +105,8 @@ ! Error code. ! ! 
-subroutine amg_zaggrmat_minnrg_bld(a,desc_a,ilaggr,nlaggr,parms,& - & ac,desc_ac,op_prol,op_restr,t_prol,info) +subroutine amg_zaggrmat_minnrg_bld(dol1smoothing,a,desc_a,ilaggr,nlaggr,& + parms,ac,desc_ac,op_prol,op_restr,t_prol,info) use psb_base_mod use amg_base_prec_type use amg_z_inner_mod, amg_protect_name => amg_zaggrmat_minnrg_bld @@ -113,6 +114,7 @@ subroutine amg_zaggrmat_minnrg_bld(a,desc_a,ilaggr,nlaggr,parms,& implicit none ! Arguments + integer(psb_ipk_), intent(in) :: dol1smoothing type(psb_zspmat_type), intent(in) :: a type(psb_desc_type), intent(inout) :: desc_a integer(psb_lpk_), intent(inout) :: ilaggr(:), nlaggr(:) @@ -171,6 +173,13 @@ subroutine amg_zaggrmat_minnrg_bld(a,desc_a,ilaggr,nlaggr,parms,& filter_mat = (parms%aggr_filter == amg_filter_mat_) + if (dol1smoothing.ne.amg_no_smooth_) then + info=psb_err_fatal_; + call psb_errpush(info,name,a_err='Are you trying to smooth an unsmoothed aggregation?') + goto 9999 + end if + + !NEEDS TO BE REWORKED !! ! naggr: number of local aggregates diff --git a/amgprec/impl/aggregator/amg_zaggrmat_nosmth_bld.f90 b/amgprec/impl/aggregator/amg_zaggrmat_nosmth_bld.f90 index 6fd53861..2a0b631c 100644 --- a/amgprec/impl/aggregator/amg_zaggrmat_nosmth_bld.f90 +++ b/amgprec/impl/aggregator/amg_zaggrmat_nosmth_bld.f90 @@ -94,10 +94,11 @@ ! ! info - integer, output. ! Error code. +! dol1smoothing - optional, this is here just for interfacing reasons. It is not used by the +! code ! -! -subroutine amg_zaggrmat_nosmth_bld(a,desc_a,ilaggr,nlaggr,parms,& - & ac,desc_ac,op_prol,op_restr,t_prol,info) +subroutine amg_zaggrmat_nosmth_bld(dol1smoothing,a,desc_a,ilaggr,nlaggr,& + parms,ac,desc_ac,op_prol,op_restr,t_prol,info) use psb_base_mod use amg_base_prec_type use amg_z_inner_mod, amg_protect_name => amg_zaggrmat_nosmth_bld @@ -105,6 +106,7 @@ subroutine amg_zaggrmat_nosmth_bld(a,desc_a,ilaggr,nlaggr,parms,& implicit none ! 
Arguments + integer(psb_ipk_), intent(in) :: dol1smoothing type(psb_zspmat_type), intent(in) :: a type(psb_desc_type), intent(inout) :: desc_a integer(psb_lpk_), intent(inout) :: ilaggr(:), nlaggr(:) @@ -137,6 +139,12 @@ subroutine amg_zaggrmat_nosmth_bld(a,desc_a,ilaggr,nlaggr,parms,& ctxt = desc_a%get_context() call psb_info(ctxt, me, np) + if (dol1smoothing.ne.amg_no_smooth_) then + info=psb_err_fatal_; + call psb_errpush(info,name,a_err='Are you trying to smooth an unsmoothed aggregation?') + goto 9999 + end if + nglob = desc_a%get_global_rows() nrow = desc_a%get_local_rows() ncol = desc_a%get_local_cols() diff --git a/amgprec/impl/aggregator/amg_zaggrmat_smth_bld.f90 b/amgprec/impl/aggregator/amg_zaggrmat_smth_bld.f90 index 11f30589..7d2cbe5f 100644 --- a/amgprec/impl/aggregator/amg_zaggrmat_smth_bld.f90 +++ b/amgprec/impl/aggregator/amg_zaggrmat_smth_bld.f90 @@ -69,6 +69,8 @@ ! ! ! Arguments: +! dol1smooth - Integer taking the type of smoother that has to be used +! on the tentative prolongator ! a - type(psb_zspmat_type), input. ! The sparse matrix structure containing the local part of ! the fine-level matrix. @@ -102,16 +104,18 @@ ! info - integer, output. ! Error code. ! -subroutine amg_zaggrmat_smth_bld(a,desc_a,ilaggr,nlaggr,parms,& - & ac,desc_ac,op_prol,op_restr,t_prol,info) +subroutine amg_zaggrmat_smth_bld(dol1smoothing,a,desc_a,ilaggr,nlaggr,& + parms,ac,desc_ac,op_prol,op_restr,t_prol,info) use psb_base_mod use amg_base_prec_type use amg_z_inner_mod, amg_protect_name => amg_zaggrmat_smth_bld use amg_z_base_aggregator_mod +! use, intrinsic :: ieee_arithmetic implicit none ! 
Arguments + integer(psb_ipk_), intent(in) :: dol1smoothing type(psb_zspmat_type), intent(in) :: a type(psb_desc_type), intent(inout) :: desc_a integer(psb_lpk_), intent(inout) :: ilaggr(:), nlaggr(:) @@ -132,7 +136,7 @@ subroutine amg_zaggrmat_smth_bld(a,desc_a,ilaggr,nlaggr,parms,& type(psb_z_coo_sparse_mat) :: coo_prol, coo_restr type(psb_z_csr_sparse_mat) :: acsr1, acsrf, csr_prol, acsr complex(psb_dpk_), allocatable :: adiag(:) - real(psb_dpk_), allocatable :: arwsum(:) + real(psb_dpk_), allocatable :: arwsum(:),l1rwsum(:) integer(psb_ipk_) :: ierr(5) logical :: filter_mat integer(psb_ipk_) :: debug_level, debug_unit, err_act @@ -141,6 +145,7 @@ subroutine amg_zaggrmat_smth_bld(a,desc_a,ilaggr,nlaggr,parms,& logical, parameter :: debug_new=.false. character(len=80) :: filename logical, parameter :: do_timings=.false. + logical :: do_l1correction=.false. integer(psb_ipk_), save :: idx_spspmm=-1, idx_phase1=-1, idx_gtrans=-1, idx_phase2=-1, idx_refine=-1 integer(psb_ipk_), save :: idx_phase3=-1, idx_cdasb=-1, idx_ptap=-1 @@ -173,6 +178,9 @@ subroutine amg_zaggrmat_smth_bld(a,desc_a,ilaggr,nlaggr,parms,& if ((do_timings).and.(idx_ptap==-1)) & & idx_ptap = psb_get_timer_idx("DEC_SMTH_BLD: ptap_bld ") + ! check if we have to use Jacobi or l1-Jacobi to smooth the tentative prolongator + if (dol1smoothing.eq.amg_l1_smooth_prol_) do_l1correction=.true. + nglob = desc_a%get_global_rows() nrow = desc_a%get_local_rows() @@ -185,7 +193,7 @@ subroutine amg_zaggrmat_smth_bld(a,desc_a,ilaggr,nlaggr,parms,& naggrm1 = sum(nlaggr(1:me)) naggrp1 = sum(nlaggr(1:me+1)) - filter_mat = (parms%aggr_filter == amg_filter_mat_) + filter_mat = (parms%aggr_filter == amg_filter_mat_).or.(parms%aggr_filter == amg_filter_prow_mat_) ! ! naggr: number of local aggregates @@ -200,6 +208,24 @@ subroutine amg_zaggrmat_smth_bld(a,desc_a,ilaggr,nlaggr,parms,& if (info == psb_success_) & & call psb_halo(adiag,desc_a,info) if (info == psb_success_) call a%cp_to(acsr) + ! + ! 
Do the l1-correction on the diagonal if it is requested + ! + if (do_l1correction) then + allocate(l1rwsum(nrow)) + call acsr%arwsum(l1rwsum) + if (info == psb_success_) & + & call psb_realloc(ncol,l1rwsum,info) + if (info == psb_success_) & + & call psb_halo(l1rwsum,desc_a,info) + ! \tilde{D}_{i,i} = \sum_{j \ne i} |a_{i,j}| + !$OMP parallel do private(i) schedule(static) + do i=1,size(adiag) + adiag(i) = adiag(i) + l1rwsum(i) - abs(adiag(i)) + end do + !$OMP end parallel do + end if + if(info /= psb_success_) then call psb_errpush(psb_err_from_subroutine_,name,a_err='sp_getdiag') @@ -230,9 +256,15 @@ subroutine amg_zaggrmat_smth_bld(a,desc_a,ilaggr,nlaggr,parms,& enddo if (jd == -1) then - write(0,*) name,': Warning: there is no diagonal element', i - else + ! if (.not.do_l1correction) + write(0,*) 'Wrong input: we need the diagonal!!!!', i + else if (parms%aggr_filter == amg_filter_mat_) then + ! We perform filtering in the standard way assuming that A is an M-matrix acsrf%val(jd)=acsrf%val(jd)-tmp + else if (parms%aggr_filter == amg_filter_prow_mat_) then + ! We are probably doing l1-correction, hence we want to preserve the + ! row sum of the matrix: note the change in sign + acsrf%val(jd)=acsrf%val(jd)+tmp end if enddo !$OMP end parallel do @@ -240,7 +272,6 @@ subroutine amg_zaggrmat_smth_bld(a,desc_a,ilaggr,nlaggr,parms,& call acsrf%clean_zeros(info) end if - !$OMP parallel do private(i) schedule(static) do i=1,size(adiag) if (adiag(i) /= zzero) then @@ -252,14 +283,17 @@ subroutine amg_zaggrmat_smth_bld(a,desc_a,ilaggr,nlaggr,parms,& !$OMP end parallel do if (parms%aggr_omega_alg == amg_eig_est_) then - if (parms%aggr_eig == amg_max_norm_) then + if ( (parms%aggr_filter == amg_filter_prow_mat_).and.(do_l1correction) ) then + ! For l1-Jacobi this can be estimated with 1: + ! this makes sense only if we are preserving the row-sum! 
+ parms%aggr_omega_val = done + else if (parms%aggr_eig == amg_max_norm_) then allocate(arwsum(nrow)) call acsr%arwsum(arwsum) anorm = maxval(abs(adiag(1:nrow)*arwsum(1:nrow))) call psb_amx(ctxt,anorm) omega = 4.d0/(3.d0*anorm) parms%aggr_omega_val = omega - else info = psb_err_internal_error_ call psb_errpush(info,name,a_err='invalid amg_aggr_eig_') @@ -322,6 +356,7 @@ subroutine amg_zaggrmat_smth_bld(a,desc_a,ilaggr,nlaggr,parms,& if (debug_level >= psb_debug_outer_) & & write(debug_unit,*) me,' ',trim(name),& & 'Done smooth_aggregate ' + if (allocated(l1rwsum)) deallocate(l1rwsum) call psb_erractionrestore(err_act) return diff --git a/amgprec/impl/level/amg_c_base_onelev_descr.f90 b/amgprec/impl/level/amg_c_base_onelev_descr.f90 index 8c3b1e5b..cf743b59 100644 --- a/amgprec/impl/level/amg_c_base_onelev_descr.f90 +++ b/amgprec/impl/level/amg_c_base_onelev_descr.f90 @@ -129,7 +129,7 @@ subroutine amg_c_base_onelev_descr(lv,il,nl,ilmin,info,iout,verbosity,prefix) & ' avg:', & & lv%linmap%nagavg end if - write(iout_,'(a,1xa,1x,f14.2)') trim(prefix_),& + write(iout_,'(a,1x,a,1x,f14.2)') trim(prefix_),& & ' Aggregation ratio: ', & & lv%szratio end if diff --git a/amgprec/impl/level/amg_c_base_onelev_dump.f90 b/amgprec/impl/level/amg_c_base_onelev_dump.f90 index 14b4c9b6..60e43280 100644 --- a/amgprec/impl/level/amg_c_base_onelev_dump.f90 +++ b/amgprec/impl/level/amg_c_base_onelev_dump.f90 @@ -127,8 +127,7 @@ subroutine amg_c_base_onelev_dump(lv,level,info,prefix,head,ac,rp,& ivr = lv%linmap%p_desc_U%get_global_indices(owned=.false.) write(fname(lname+1:),'(a,i3.3,a)')'_l',level,'_tprol.mtx' ! - ! This is not implemented yet. - !call lv%tprol%print(fname,head=head,ivr=ivr) + call lv%tprol%print(fname,head=head,ivr=ivr) end if end if else @@ -151,8 +150,7 @@ subroutine amg_c_base_onelev_dump(lv,level,info,prefix,head,ac,rp,& if (tprol_) then write(fname(lname+1:),'(a,i3.3,a)')'_l',level,'_tprol.mtx' ! - ! This is not implemented yet. 
- !call lv%tprol%print(fname,head=head) + call lv%tprol%print(fname,head=head) end if end if end if diff --git a/amgprec/impl/level/amg_d_base_onelev_descr.f90 b/amgprec/impl/level/amg_d_base_onelev_descr.f90 index cefa6ece..621b8444 100644 --- a/amgprec/impl/level/amg_d_base_onelev_descr.f90 +++ b/amgprec/impl/level/amg_d_base_onelev_descr.f90 @@ -129,7 +129,7 @@ subroutine amg_d_base_onelev_descr(lv,il,nl,ilmin,info,iout,verbosity,prefix) & ' avg:', & & lv%linmap%nagavg end if - write(iout_,'(a,1xa,1x,f14.2)') trim(prefix_),& + write(iout_,'(a,1x,a,1x,f14.2)') trim(prefix_),& & ' Aggregation ratio: ', & & lv%szratio end if diff --git a/amgprec/impl/level/amg_d_base_onelev_dump.f90 b/amgprec/impl/level/amg_d_base_onelev_dump.f90 index c1013d41..0b3e15f6 100644 --- a/amgprec/impl/level/amg_d_base_onelev_dump.f90 +++ b/amgprec/impl/level/amg_d_base_onelev_dump.f90 @@ -127,8 +127,7 @@ subroutine amg_d_base_onelev_dump(lv,level,info,prefix,head,ac,rp,& ivr = lv%linmap%p_desc_U%get_global_indices(owned=.false.) write(fname(lname+1:),'(a,i3.3,a)')'_l',level,'_tprol.mtx' ! - ! This is not implemented yet. - !call lv%tprol%print(fname,head=head,ivr=ivr) + call lv%tprol%print(fname,head=head,ivr=ivr) end if end if else @@ -151,8 +150,7 @@ subroutine amg_d_base_onelev_dump(lv,level,info,prefix,head,ac,rp,& if (tprol_) then write(fname(lname+1:),'(a,i3.3,a)')'_l',level,'_tprol.mtx' ! - ! This is not implemented yet. 
- !call lv%tprol%print(fname,head=head) + call lv%tprol%print(fname,head=head) end if end if end if diff --git a/amgprec/impl/level/amg_s_base_onelev_descr.f90 b/amgprec/impl/level/amg_s_base_onelev_descr.f90 index 9de05c6e..0bc10503 100644 --- a/amgprec/impl/level/amg_s_base_onelev_descr.f90 +++ b/amgprec/impl/level/amg_s_base_onelev_descr.f90 @@ -129,7 +129,7 @@ subroutine amg_s_base_onelev_descr(lv,il,nl,ilmin,info,iout,verbosity,prefix) & ' avg:', & & lv%linmap%nagavg end if - write(iout_,'(a,1xa,1x,f14.2)') trim(prefix_),& + write(iout_,'(a,1x,a,1x,f14.2)') trim(prefix_),& & ' Aggregation ratio: ', & & lv%szratio end if diff --git a/amgprec/impl/level/amg_s_base_onelev_dump.f90 b/amgprec/impl/level/amg_s_base_onelev_dump.f90 index d30c0bf7..6fb3454e 100644 --- a/amgprec/impl/level/amg_s_base_onelev_dump.f90 +++ b/amgprec/impl/level/amg_s_base_onelev_dump.f90 @@ -127,8 +127,7 @@ subroutine amg_s_base_onelev_dump(lv,level,info,prefix,head,ac,rp,& ivr = lv%linmap%p_desc_U%get_global_indices(owned=.false.) write(fname(lname+1:),'(a,i3.3,a)')'_l',level,'_tprol.mtx' ! - ! This is not implemented yet. - !call lv%tprol%print(fname,head=head,ivr=ivr) + call lv%tprol%print(fname,head=head,ivr=ivr) end if end if else @@ -151,8 +150,7 @@ subroutine amg_s_base_onelev_dump(lv,level,info,prefix,head,ac,rp,& if (tprol_) then write(fname(lname+1:),'(a,i3.3,a)')'_l',level,'_tprol.mtx' ! - ! This is not implemented yet. 
- !call lv%tprol%print(fname,head=head) + call lv%tprol%print(fname,head=head) end if end if end if diff --git a/amgprec/impl/level/amg_z_base_onelev_descr.f90 b/amgprec/impl/level/amg_z_base_onelev_descr.f90 index 99a1e9d7..57b4612a 100644 --- a/amgprec/impl/level/amg_z_base_onelev_descr.f90 +++ b/amgprec/impl/level/amg_z_base_onelev_descr.f90 @@ -129,7 +129,7 @@ subroutine amg_z_base_onelev_descr(lv,il,nl,ilmin,info,iout,verbosity,prefix) & ' avg:', & & lv%linmap%nagavg end if - write(iout_,'(a,1xa,1x,f14.2)') trim(prefix_),& + write(iout_,'(a,1x,a,1x,f14.2)') trim(prefix_),& & ' Aggregation ratio: ', & & lv%szratio end if diff --git a/amgprec/impl/level/amg_z_base_onelev_dump.f90 b/amgprec/impl/level/amg_z_base_onelev_dump.f90 index 5d0b8f27..49e8eb98 100644 --- a/amgprec/impl/level/amg_z_base_onelev_dump.f90 +++ b/amgprec/impl/level/amg_z_base_onelev_dump.f90 @@ -127,8 +127,7 @@ subroutine amg_z_base_onelev_dump(lv,level,info,prefix,head,ac,rp,& ivr = lv%linmap%p_desc_U%get_global_indices(owned=.false.) write(fname(lname+1:),'(a,i3.3,a)')'_l',level,'_tprol.mtx' ! - ! This is not implemented yet. - !call lv%tprol%print(fname,head=head,ivr=ivr) + call lv%tprol%print(fname,head=head,ivr=ivr) end if end if else @@ -151,8 +150,7 @@ subroutine amg_z_base_onelev_dump(lv,level,info,prefix,head,ac,rp,& if (tprol_) then write(fname(lname+1:),'(a,i3.3,a)')'_l',level,'_tprol.mtx' ! - ! This is not implemented yet. 
- !call lv%tprol%print(fname,head=head) + call lv%tprol%print(fname,head=head) end if end if end if diff --git a/amgprec/impl/smoother/amg_c_jac_smoother_apply_vect.f90 b/amgprec/impl/smoother/amg_c_jac_smoother_apply_vect.f90 index a4238980..6d32e2e2 100644 --- a/amgprec/impl/smoother/amg_c_jac_smoother_apply_vect.f90 +++ b/amgprec/impl/smoother/amg_c_jac_smoother_apply_vect.f90 @@ -40,7 +40,7 @@ subroutine amg_c_jac_smoother_apply_vect(alpha,sm,x,beta,y,desc_data,trans,& use psb_base_mod use amg_c_diag_solver - use psb_base_krylov_conv_mod, only : log_conv + use psb_base_linsolve_conv_mod, only : log_conv use amg_c_jac_smoother, amg_protect_name => amg_c_jac_smoother_apply_vect implicit none type(psb_desc_type), intent(in) :: desc_data diff --git a/amgprec/impl/smoother/amg_c_jac_smoother_clone_settings.f90 b/amgprec/impl/smoother/amg_c_jac_smoother_clone_settings.f90 index d44680e0..4dd2ee0d 100644 --- a/amgprec/impl/smoother/amg_c_jac_smoother_clone_settings.f90 +++ b/amgprec/impl/smoother/amg_c_jac_smoother_clone_settings.f90 @@ -41,7 +41,7 @@ subroutine amg_c_jac_smoother_clone_settings(sm,smout,info) use amg_c_jac_smoother, amg_protect_name => amg_c_jac_smoother_clone_settings Implicit None ! 
Arguments - class(amg_c_jac_smoother_type), intent(inout) :: sm + class(amg_c_jac_smoother_type), intent(inout) :: sm class(amg_c_base_smoother_type), intent(inout) :: smout integer(psb_ipk_), intent(out) :: info integer(psb_ipk_) :: err_act diff --git a/amgprec/impl/smoother/amg_d_jac_smoother_apply_vect.f90 b/amgprec/impl/smoother/amg_d_jac_smoother_apply_vect.f90 index cafb6c8c..b1148119 100644 --- a/amgprec/impl/smoother/amg_d_jac_smoother_apply_vect.f90 +++ b/amgprec/impl/smoother/amg_d_jac_smoother_apply_vect.f90 @@ -40,7 +40,7 @@ subroutine amg_d_jac_smoother_apply_vect(alpha,sm,x,beta,y,desc_data,trans,& use psb_base_mod use amg_d_diag_solver - use psb_base_krylov_conv_mod, only : log_conv + use psb_base_linsolve_conv_mod, only : log_conv use amg_d_jac_smoother, amg_protect_name => amg_d_jac_smoother_apply_vect implicit none type(psb_desc_type), intent(in) :: desc_data diff --git a/amgprec/impl/smoother/amg_d_jac_smoother_clone_settings.f90 b/amgprec/impl/smoother/amg_d_jac_smoother_clone_settings.f90 index fd1bff2c..00192b2d 100644 --- a/amgprec/impl/smoother/amg_d_jac_smoother_clone_settings.f90 +++ b/amgprec/impl/smoother/amg_d_jac_smoother_clone_settings.f90 @@ -41,7 +41,7 @@ subroutine amg_d_jac_smoother_clone_settings(sm,smout,info) use amg_d_jac_smoother, amg_protect_name => amg_d_jac_smoother_clone_settings Implicit None ! 
Arguments - class(amg_d_jac_smoother_type), intent(inout) :: sm + class(amg_d_jac_smoother_type), intent(inout) :: sm class(amg_d_base_smoother_type), intent(inout) :: smout integer(psb_ipk_), intent(out) :: info integer(psb_ipk_) :: err_act diff --git a/amgprec/impl/smoother/amg_d_poly_smoother_apply_vect.f90 b/amgprec/impl/smoother/amg_d_poly_smoother_apply_vect.f90 index 7cfa7c3d..6ea8444d 100644 --- a/amgprec/impl/smoother/amg_d_poly_smoother_apply_vect.f90 +++ b/amgprec/impl/smoother/amg_d_poly_smoother_apply_vect.f90 @@ -40,7 +40,7 @@ subroutine amg_d_poly_smoother_apply_vect(alpha,sm,x,beta,y,desc_data,trans,& use psb_base_mod use amg_d_diag_solver - use psb_base_krylov_conv_mod, only : log_conv + use psb_base_linsolve_conv_mod, only : log_conv use amg_d_poly_smoother, amg_protect_name => amg_d_poly_smoother_apply_vect implicit none type(psb_desc_type), intent(in) :: desc_data diff --git a/amgprec/impl/smoother/amg_d_poly_smoother_clone_settings.f90 b/amgprec/impl/smoother/amg_d_poly_smoother_clone_settings.f90 index d72cce67..d3dcaa0b 100644 --- a/amgprec/impl/smoother/amg_d_poly_smoother_clone_settings.f90 +++ b/amgprec/impl/smoother/amg_d_poly_smoother_clone_settings.f90 @@ -41,7 +41,7 @@ subroutine amg_d_poly_smoother_clone_settings(sm,smout,info) use amg_d_poly_smoother, amg_protect_name => amg_d_poly_smoother_clone_settings Implicit None ! 
Arguments - class(amg_d_poly_smoother_type), intent(inout) :: sm + class(amg_d_poly_smoother_type), intent(inout) :: sm class(amg_d_base_smoother_type), intent(inout) :: smout integer(psb_ipk_), intent(out) :: info integer(psb_ipk_) :: err_act diff --git a/amgprec/impl/smoother/amg_d_poly_smoother_dmp.f90 b/amgprec/impl/smoother/amg_d_poly_smoother_dmp.f90 index 19144f07..296ce7e6 100644 --- a/amgprec/impl/smoother/amg_d_poly_smoother_dmp.f90 +++ b/amgprec/impl/smoother/amg_d_poly_smoother_dmp.f90 @@ -77,11 +77,9 @@ subroutine amg_d_poly_smoother_dmp(sm,desc,level,info,prefix,head,smoother,solve end if lname = len_trim(prefix_) fname = trim(prefix_) - write(fname(lname+1:lname+5),'(a,i3.3)') '_poly',iam + write(fname(lname+1:lname+8),'(a,i3.3)') '_poly',iam lname = lname + 8 ! to be completed - - ! At base level do nothing for the smoother if (allocated(sm%sv)) & diff --git a/amgprec/impl/smoother/amg_s_jac_smoother_apply_vect.f90 b/amgprec/impl/smoother/amg_s_jac_smoother_apply_vect.f90 index fff7ac1e..1bbb390f 100644 --- a/amgprec/impl/smoother/amg_s_jac_smoother_apply_vect.f90 +++ b/amgprec/impl/smoother/amg_s_jac_smoother_apply_vect.f90 @@ -40,7 +40,7 @@ subroutine amg_s_jac_smoother_apply_vect(alpha,sm,x,beta,y,desc_data,trans,& use psb_base_mod use amg_s_diag_solver - use psb_base_krylov_conv_mod, only : log_conv + use psb_base_linsolve_conv_mod, only : log_conv use amg_s_jac_smoother, amg_protect_name => amg_s_jac_smoother_apply_vect implicit none type(psb_desc_type), intent(in) :: desc_data diff --git a/amgprec/impl/smoother/amg_s_jac_smoother_clone_settings.f90 b/amgprec/impl/smoother/amg_s_jac_smoother_clone_settings.f90 index 5e0481ab..8089a628 100644 --- a/amgprec/impl/smoother/amg_s_jac_smoother_clone_settings.f90 +++ b/amgprec/impl/smoother/amg_s_jac_smoother_clone_settings.f90 @@ -41,7 +41,7 @@ subroutine amg_s_jac_smoother_clone_settings(sm,smout,info) use amg_s_jac_smoother, amg_protect_name => amg_s_jac_smoother_clone_settings Implicit None ! 
Arguments - class(amg_s_jac_smoother_type), intent(inout) :: sm + class(amg_s_jac_smoother_type), intent(inout) :: sm class(amg_s_base_smoother_type), intent(inout) :: smout integer(psb_ipk_), intent(out) :: info integer(psb_ipk_) :: err_act diff --git a/amgprec/impl/smoother/amg_s_poly_smoother_apply_vect.f90 b/amgprec/impl/smoother/amg_s_poly_smoother_apply_vect.f90 index b5807873..cdf93b7d 100644 --- a/amgprec/impl/smoother/amg_s_poly_smoother_apply_vect.f90 +++ b/amgprec/impl/smoother/amg_s_poly_smoother_apply_vect.f90 @@ -40,7 +40,7 @@ subroutine amg_s_poly_smoother_apply_vect(alpha,sm,x,beta,y,desc_data,trans,& use psb_base_mod use amg_s_diag_solver - use psb_base_krylov_conv_mod, only : log_conv + use psb_base_linsolve_conv_mod, only : log_conv use amg_s_poly_smoother, amg_protect_name => amg_s_poly_smoother_apply_vect implicit none type(psb_desc_type), intent(in) :: desc_data diff --git a/amgprec/impl/smoother/amg_s_poly_smoother_clone_settings.f90 b/amgprec/impl/smoother/amg_s_poly_smoother_clone_settings.f90 index ddbad88f..d45e6b88 100644 --- a/amgprec/impl/smoother/amg_s_poly_smoother_clone_settings.f90 +++ b/amgprec/impl/smoother/amg_s_poly_smoother_clone_settings.f90 @@ -41,7 +41,7 @@ subroutine amg_s_poly_smoother_clone_settings(sm,smout,info) use amg_s_poly_smoother, amg_protect_name => amg_s_poly_smoother_clone_settings Implicit None ! 
Arguments - class(amg_s_poly_smoother_type), intent(inout) :: sm + class(amg_s_poly_smoother_type), intent(inout) :: sm class(amg_s_base_smoother_type), intent(inout) :: smout integer(psb_ipk_), intent(out) :: info integer(psb_ipk_) :: err_act diff --git a/amgprec/impl/smoother/amg_s_poly_smoother_dmp.f90 b/amgprec/impl/smoother/amg_s_poly_smoother_dmp.f90 index f6fa2f8a..da16b187 100644 --- a/amgprec/impl/smoother/amg_s_poly_smoother_dmp.f90 +++ b/amgprec/impl/smoother/amg_s_poly_smoother_dmp.f90 @@ -77,11 +77,9 @@ subroutine amg_s_poly_smoother_dmp(sm,desc,level,info,prefix,head,smoother,solve end if lname = len_trim(prefix_) fname = trim(prefix_) - write(fname(lname+1:lname+5),'(a,i3.3)') '_poly',iam + write(fname(lname+1:lname+8),'(a,i3.3)') '_poly',iam lname = lname + 8 ! to be completed - - ! At base level do nothing for the smoother if (allocated(sm%sv)) & diff --git a/amgprec/impl/smoother/amg_z_jac_smoother_apply_vect.f90 b/amgprec/impl/smoother/amg_z_jac_smoother_apply_vect.f90 index 16d2a484..03bc6095 100644 --- a/amgprec/impl/smoother/amg_z_jac_smoother_apply_vect.f90 +++ b/amgprec/impl/smoother/amg_z_jac_smoother_apply_vect.f90 @@ -40,7 +40,7 @@ subroutine amg_z_jac_smoother_apply_vect(alpha,sm,x,beta,y,desc_data,trans,& use psb_base_mod use amg_z_diag_solver - use psb_base_krylov_conv_mod, only : log_conv + use psb_base_linsolve_conv_mod, only : log_conv use amg_z_jac_smoother, amg_protect_name => amg_z_jac_smoother_apply_vect implicit none type(psb_desc_type), intent(in) :: desc_data diff --git a/amgprec/impl/smoother/amg_z_jac_smoother_clone_settings.f90 b/amgprec/impl/smoother/amg_z_jac_smoother_clone_settings.f90 index 99c5146c..489575b4 100644 --- a/amgprec/impl/smoother/amg_z_jac_smoother_clone_settings.f90 +++ b/amgprec/impl/smoother/amg_z_jac_smoother_clone_settings.f90 @@ -41,7 +41,7 @@ subroutine amg_z_jac_smoother_clone_settings(sm,smout,info) use amg_z_jac_smoother, amg_protect_name => amg_z_jac_smoother_clone_settings Implicit None ! 
Arguments - class(amg_z_jac_smoother_type), intent(inout) :: sm + class(amg_z_jac_smoother_type), intent(inout) :: sm class(amg_z_base_smoother_type), intent(inout) :: smout integer(psb_ipk_), intent(out) :: info integer(psb_ipk_) :: err_act diff --git a/amgprec/impl/solver/amg_c_jac_solver_apply.f90 b/amgprec/impl/solver/amg_c_jac_solver_apply.f90 index 6eec4e5e..d47a9f6b 100644 --- a/amgprec/impl/solver/amg_c_jac_solver_apply.f90 +++ b/amgprec/impl/solver/amg_c_jac_solver_apply.f90 @@ -40,7 +40,7 @@ subroutine amg_c_jac_solver_apply(alpha,sv,x,beta,y,desc_data,trans,& use psb_base_mod use amg_c_diag_solver - use psb_base_krylov_conv_mod, only : log_conv + use psb_base_linsolve_conv_mod, only : log_conv use amg_c_jac_solver, amg_protect_name => amg_c_jac_solver_apply implicit none type(psb_desc_type), intent(in) :: desc_data diff --git a/amgprec/impl/solver/amg_c_jac_solver_apply_vect.f90 b/amgprec/impl/solver/amg_c_jac_solver_apply_vect.f90 index 7d48442e..1fccaf63 100644 --- a/amgprec/impl/solver/amg_c_jac_solver_apply_vect.f90 +++ b/amgprec/impl/solver/amg_c_jac_solver_apply_vect.f90 @@ -40,7 +40,7 @@ subroutine amg_c_jac_solver_apply_vect(alpha,sv,x,beta,y,desc_data,trans,& use psb_base_mod use amg_c_diag_solver - use psb_base_krylov_conv_mod, only : log_conv + use psb_base_linsolve_conv_mod, only : log_conv use amg_c_jac_solver, amg_protect_name => amg_c_jac_solver_apply_vect implicit none type(psb_desc_type), intent(in) :: desc_data diff --git a/amgprec/impl/solver/amg_c_krm_solver_impl.f90 b/amgprec/impl/solver/amg_c_krm_solver_impl.f90 index 3c93488f..fe3cb53a 100644 --- a/amgprec/impl/solver/amg_c_krm_solver_impl.f90 +++ b/amgprec/impl/solver/amg_c_krm_solver_impl.f90 @@ -169,7 +169,7 @@ subroutine amg_c_krm_solver_apply_vect(alpha,sv,x,beta,y,desc_data,& & trans,work,wv,info,init,initu) use psb_base_mod - use psb_krylov_mod + use psb_linsolve_mod use amg_c_krm_solver, amg_protect_name => amg_c_krm_solver_apply_vect Implicit None diff --git 
a/amgprec/impl/solver/amg_d_jac_solver_apply.f90 b/amgprec/impl/solver/amg_d_jac_solver_apply.f90 index 4e5b9421..9a37a162 100644 --- a/amgprec/impl/solver/amg_d_jac_solver_apply.f90 +++ b/amgprec/impl/solver/amg_d_jac_solver_apply.f90 @@ -40,7 +40,7 @@ subroutine amg_d_jac_solver_apply(alpha,sv,x,beta,y,desc_data,trans,& use psb_base_mod use amg_d_diag_solver - use psb_base_krylov_conv_mod, only : log_conv + use psb_base_linsolve_conv_mod, only : log_conv use amg_d_jac_solver, amg_protect_name => amg_d_jac_solver_apply implicit none type(psb_desc_type), intent(in) :: desc_data diff --git a/amgprec/impl/solver/amg_d_jac_solver_apply_vect.f90 b/amgprec/impl/solver/amg_d_jac_solver_apply_vect.f90 index bc35f7ea..ab6b943c 100644 --- a/amgprec/impl/solver/amg_d_jac_solver_apply_vect.f90 +++ b/amgprec/impl/solver/amg_d_jac_solver_apply_vect.f90 @@ -40,7 +40,7 @@ subroutine amg_d_jac_solver_apply_vect(alpha,sv,x,beta,y,desc_data,trans,& use psb_base_mod use amg_d_diag_solver - use psb_base_krylov_conv_mod, only : log_conv + use psb_base_linsolve_conv_mod, only : log_conv use amg_d_jac_solver, amg_protect_name => amg_d_jac_solver_apply_vect implicit none type(psb_desc_type), intent(in) :: desc_data diff --git a/amgprec/impl/solver/amg_d_krm_solver_impl.f90 b/amgprec/impl/solver/amg_d_krm_solver_impl.f90 index dd308157..b955a8d6 100644 --- a/amgprec/impl/solver/amg_d_krm_solver_impl.f90 +++ b/amgprec/impl/solver/amg_d_krm_solver_impl.f90 @@ -169,7 +169,7 @@ subroutine amg_d_krm_solver_apply_vect(alpha,sv,x,beta,y,desc_data,& & trans,work,wv,info,init,initu) use psb_base_mod - use psb_krylov_mod + use psb_linsolve_mod use amg_d_krm_solver, amg_protect_name => amg_d_krm_solver_apply_vect Implicit None diff --git a/amgprec/impl/solver/amg_s_jac_solver_apply.f90 b/amgprec/impl/solver/amg_s_jac_solver_apply.f90 index 500391c8..217f8027 100644 --- a/amgprec/impl/solver/amg_s_jac_solver_apply.f90 +++ b/amgprec/impl/solver/amg_s_jac_solver_apply.f90 @@ -40,7 +40,7 @@ subroutine 
amg_s_jac_solver_apply(alpha,sv,x,beta,y,desc_data,trans,& use psb_base_mod use amg_s_diag_solver - use psb_base_krylov_conv_mod, only : log_conv + use psb_base_linsolve_conv_mod, only : log_conv use amg_s_jac_solver, amg_protect_name => amg_s_jac_solver_apply implicit none type(psb_desc_type), intent(in) :: desc_data diff --git a/amgprec/impl/solver/amg_s_jac_solver_apply_vect.f90 b/amgprec/impl/solver/amg_s_jac_solver_apply_vect.f90 index da84ffea..7dc5b9ad 100644 --- a/amgprec/impl/solver/amg_s_jac_solver_apply_vect.f90 +++ b/amgprec/impl/solver/amg_s_jac_solver_apply_vect.f90 @@ -40,7 +40,7 @@ subroutine amg_s_jac_solver_apply_vect(alpha,sv,x,beta,y,desc_data,trans,& use psb_base_mod use amg_s_diag_solver - use psb_base_krylov_conv_mod, only : log_conv + use psb_base_linsolve_conv_mod, only : log_conv use amg_s_jac_solver, amg_protect_name => amg_s_jac_solver_apply_vect implicit none type(psb_desc_type), intent(in) :: desc_data diff --git a/amgprec/impl/solver/amg_s_krm_solver_impl.f90 b/amgprec/impl/solver/amg_s_krm_solver_impl.f90 index 1b0efd1b..b2d3d0e5 100644 --- a/amgprec/impl/solver/amg_s_krm_solver_impl.f90 +++ b/amgprec/impl/solver/amg_s_krm_solver_impl.f90 @@ -169,7 +169,7 @@ subroutine amg_s_krm_solver_apply_vect(alpha,sv,x,beta,y,desc_data,& & trans,work,wv,info,init,initu) use psb_base_mod - use psb_krylov_mod + use psb_linsolve_mod use amg_s_krm_solver, amg_protect_name => amg_s_krm_solver_apply_vect Implicit None diff --git a/amgprec/impl/solver/amg_z_jac_solver_apply.f90 b/amgprec/impl/solver/amg_z_jac_solver_apply.f90 index 12288551..f55745bb 100644 --- a/amgprec/impl/solver/amg_z_jac_solver_apply.f90 +++ b/amgprec/impl/solver/amg_z_jac_solver_apply.f90 @@ -40,7 +40,7 @@ subroutine amg_z_jac_solver_apply(alpha,sv,x,beta,y,desc_data,trans,& use psb_base_mod use amg_z_diag_solver - use psb_base_krylov_conv_mod, only : log_conv + use psb_base_linsolve_conv_mod, only : log_conv use amg_z_jac_solver, amg_protect_name => amg_z_jac_solver_apply 
implicit none type(psb_desc_type), intent(in) :: desc_data diff --git a/amgprec/impl/solver/amg_z_jac_solver_apply_vect.f90 b/amgprec/impl/solver/amg_z_jac_solver_apply_vect.f90 index 241797e8..a97a21a0 100644 --- a/amgprec/impl/solver/amg_z_jac_solver_apply_vect.f90 +++ b/amgprec/impl/solver/amg_z_jac_solver_apply_vect.f90 @@ -40,7 +40,7 @@ subroutine amg_z_jac_solver_apply_vect(alpha,sv,x,beta,y,desc_data,trans,& use psb_base_mod use amg_z_diag_solver - use psb_base_krylov_conv_mod, only : log_conv + use psb_base_linsolve_conv_mod, only : log_conv use amg_z_jac_solver, amg_protect_name => amg_z_jac_solver_apply_vect implicit none type(psb_desc_type), intent(in) :: desc_data diff --git a/amgprec/impl/solver/amg_z_krm_solver_impl.f90 b/amgprec/impl/solver/amg_z_krm_solver_impl.f90 index 33972c4b..ca5d7125 100644 --- a/amgprec/impl/solver/amg_z_krm_solver_impl.f90 +++ b/amgprec/impl/solver/amg_z_krm_solver_impl.f90 @@ -169,7 +169,7 @@ subroutine amg_z_krm_solver_apply_vect(alpha,sv,x,beta,y,desc_data,& & trans,work,wv,info,init,initu) use psb_base_mod - use psb_krylov_mod + use psb_linsolve_mod use amg_z_krm_solver, amg_protect_name => amg_z_krm_solver_apply_vect Implicit None diff --git a/cbind/amgprec/amg_dprec_cbind_mod.F90 b/cbind/amgprec/amg_dprec_cbind_mod.F90 index 29c514a9..f6db13c9 100644 --- a/cbind/amgprec/amg_dprec_cbind_mod.F90 +++ b/cbind/amgprec/amg_dprec_cbind_mod.F90 @@ -264,7 +264,7 @@ contains & ah,ph,bh,xh,cdh,options) bind(c) result(res) use psb_base_mod use psb_prec_mod - use psb_krylov_mod + use psb_linsolve_mod use psb_prec_cbind_mod use psb_dkrylov_cbind_mod implicit none @@ -285,7 +285,7 @@ contains & ah,ph,bh,xh,eps,cdh,itmax,iter,err,itrace,irst,istop) bind(c) result(res) use psb_base_mod use psb_prec_mod - use psb_krylov_mod + use psb_linsolve_mod use psb_objhandle_mod use psb_prec_cbind_mod use psb_base_string_cbind_mod diff --git a/cbind/amgprec/amg_zprec_cbind_mod.F90 b/cbind/amgprec/amg_zprec_cbind_mod.F90 index 167509d9..8ae1c964 
100644 --- a/cbind/amgprec/amg_zprec_cbind_mod.F90 +++ b/cbind/amgprec/amg_zprec_cbind_mod.F90 @@ -264,7 +264,7 @@ contains & ah,ph,bh,xh,cdh,options) bind(c) result(res) use psb_base_mod use psb_prec_mod - use psb_krylov_mod + use psb_linsolve_mod use psb_prec_cbind_mod use psb_zkrylov_cbind_mod implicit none @@ -285,7 +285,7 @@ contains & ah,ph,bh,xh,eps,cdh,itmax,iter,err,itrace,irst,istop) bind(c) result(res) use psb_base_mod use psb_prec_mod - use psb_krylov_mod + use psb_linsolve_mod use psb_objhandle_mod use psb_prec_cbind_mod use psb_base_string_cbind_mod diff --git a/cbind/test/pargen/Makefile b/cbind/test/pargen/Makefile index 3a2921e8..f5bf13f3 100644 --- a/cbind/test/pargen/Makefile +++ b/cbind/test/pargen/Makefile @@ -9,8 +9,8 @@ HERE=. FINCLUDES=$(FMFLAG). $(FMFLAG)$(LIBDIR) $(FMFLAG)$(PSBLAS_INCDIR) #PSBLAS_LIBS= -L$(PSBLAS_LIBDIR) -L$(LIBDIR) $(CPSBLAS_LIB) $(PSBLAS_LIB) # -lpsb_krylov_cbind -lpsb_prec_cbind -lpsb_base_cbind -PSBC_LIBS= -L$(PSBLAS_LIBDIR) -lpsb_cbind -lpsb_krylov -lpsb_prec -MLDC_LIBS=-L$(LIBDIR) -lmld_cbind -lmld_prec +PSBC_LIBS= -L$(PSBLAS_LIBDIR) -lpsb_cbind -lpsb_linsolve -lpsb_prec +AMGC_LIBS=-L$(LIBDIR) -lamg_cbind -lamg_prec # # Compilers and such # @@ -23,15 +23,15 @@ EXEDIR=./runs #UMFLIBS=-lumfpack -lamd -lcholmod -lcolamd -lcamd -lccolamd -L/usr/include/suitesparse #UMFFLAGS=-DHave_UMF_ -I/usr/include/suitesparse - all: mldec + all: amgec -mldec: mldec.o - $(MPFC) mldec.o -o mldec $(MLDC_LIBS) $(PSBC_LIBS) $(PSBCLDLIBS) $(PSBLAS_LIBS) \ +amgec: amgec.o + $(MPFC) amgec.o -o amgec $(AMGC_LIBS) $(PSBC_LIBS) $(PSBCLDLIBS) $(PSBLAS_LIBS) \ $(UMFLIBS) $(PSBLDLIBS) $(LDLIBS) -lm -lgfortran # \ # -lifcore -lifcoremt -lguide -limf -lirc -lintlc -lcxaguard -L/opt/intel/fc/10.0.023/lib/ -lm - /bin/mv mldec $(EXEDIR) + /bin/mv amgec $(EXEDIR) .f90.o: $(MPFC) $(F90COPT) $(FINCLUDES) $(FDEFINES) -c $< @@ -40,13 +40,13 @@ mldec: mldec.o clean: - /bin/rm -f mldec.o $(EXEDIR)/mldec + /bin/rm -f amgec.o $(EXEDIR)/amgec verycleanlib: (cd 
../..; make veryclean) lib: (cd ../../; make library) tests: all - cd runs ; ./mldec < mlde.inp + cd runs ; ./amgec < amge.inp diff --git a/cbind/test/pargen/amgec.c b/cbind/test/pargen/amgec.c index 9a2c42c0..329b0649 100644 --- a/cbind/test/pargen/amgec.c +++ b/cbind/test/pargen/amgec.c @@ -77,7 +77,7 @@ #include #include "psb_base_cbind.h" -#include "mld_cbind.h" +#include "amg_cbind.h" double a1(double x, double y, double z) @@ -123,7 +123,7 @@ double g(double x, double y, double z) #define NBMAX 20 -psb_i_t matgen(psb_i_t ictxt, psb_i_t nl, psb_i_t idim, psb_l_t vl[], +psb_i_t matgen(psb_c_ctxt cctxt, psb_i_t nl, psb_i_t idim, psb_l_t vl[], psb_c_dspmat *ah,psb_c_descriptor *cdh, psb_c_dvector *xh, psb_c_dvector *bh, psb_c_dvector *rh) { @@ -135,7 +135,7 @@ psb_i_t matgen(psb_i_t ictxt, psb_i_t nl, psb_i_t idim, psb_l_t vl[], psb_l_t irow[10*NBMAX], icol[10*NBMAX]; info = 0; - psb_c_info(ictxt,&iam,&np); + psb_c_info(cctxt,&iam,&np); deltah = (double) 1.0/(idim+1); sqdeltah = deltah*deltah; deltah2 = 2.0* deltah; @@ -253,11 +253,12 @@ void get_hparm(FILE *fp, char *val) int main(int argc, char *argv[]) { - psb_i_t ictxt, iam, np; + psb_c_ctxt *cctxt; + psb_i_t iam, np; char methd[40], ptype[40], afmt[8]; psb_i_t nparms; psb_i_t idim,info,istop,itmax,itrace,irst,iter,ret; - mld_c_dprec *ph; + amg_c_dprec *ph; psb_c_dspmat *ah; psb_c_dvector *bh, *xh, *rh; psb_i_t nb,nlr, nl; @@ -269,12 +270,13 @@ int main(int argc, char *argv[]) psb_c_descriptor *cdh; FILE *vectfile; - ictxt = psb_c_init(); - psb_c_info(ictxt,&iam,&np); + cctxt = psb_c_new_ctxt(); + psb_c_init(cctxt); + psb_c_info(*cctxt,&iam,&np); fprintf(stdout,"Initialization: am %d of %d\n",iam,np); fflush(stdout); - psb_c_barrier(ictxt); + psb_c_barrier(*cctxt); if (iam == 0) { get_iparm(stdin,&nparms); get_hparm(stdin,methd); @@ -287,17 +289,17 @@ int main(int argc, char *argv[]) get_iparm(stdin,&irst); } /* Now broadcast the values, and check they're OK */ - psb_c_ibcast(ictxt,1,&nparms,0); - 
psb_c_hbcast(ictxt,methd,0); - psb_c_hbcast(ictxt,ptype,0); - psb_c_hbcast(ictxt,afmt,0); - psb_c_ibcast(ictxt,1,&idim,0); - psb_c_ibcast(ictxt,1,&istop,0); - psb_c_ibcast(ictxt,1,&itmax,0); - psb_c_ibcast(ictxt,1,&itrace,0); - psb_c_ibcast(ictxt,1,&irst,0); + psb_c_ibcast(*cctxt,1,&nparms,0); + psb_c_hbcast(*cctxt,methd,0); + psb_c_hbcast(*cctxt,ptype,0); + psb_c_hbcast(*cctxt,afmt,0); + psb_c_ibcast(*cctxt,1,&idim,0); + psb_c_ibcast(*cctxt,1,&istop,0); + psb_c_ibcast(*cctxt,1,&itmax,0); + psb_c_ibcast(*cctxt,1,&itrace,0); + psb_c_ibcast(*cctxt,1,&irst,0); - psb_c_barrier(ictxt); + psb_c_barrier(*cctxt); cdh=psb_c_new_descriptor(); psb_c_set_index_base(0); @@ -310,15 +312,15 @@ int main(int argc, char *argv[]) fprintf(stderr,"%d: Input data %d %ld %d %d\n",iam,idim,ng,nb, nl); if ((vl=malloc(nb*sizeof(psb_l_t)))==NULL) { fprintf(stderr,"On %d: malloc failure\n",iam); - psb_c_abort(ictxt); + psb_c_abort(*cctxt); } i = ((psb_l_t)iam) * nb; for (k=0; kdescriptor); @@ -440,6 +442,7 @@ int main(int argc, char *argv[]) if (iam == 0) fprintf(stderr,"program completed successfully\n"); - psb_c_barrier(ictxt); - psb_c_exit(ictxt); + psb_c_barrier(*cctxt); + psb_c_exit(*cctxt); + free(cctxt); } diff --git a/cbind/test/pargen/runs/mlde.inp b/cbind/test/pargen/runs/amge.inp similarity index 100% rename from cbind/test/pargen/runs/mlde.inp rename to cbind/test/pargen/runs/amge.inp diff --git a/config/pac.m4 b/config/pac.m4 index c0ad6f45..099c5a41 100644 --- a/config/pac.m4 +++ b/config/pac.m4 @@ -409,7 +409,7 @@ save_LDFLAGS=$LDFLAGS; ## dnl AC_MSG_NOTICE([psblas dir $pac_cv_psblas_dir]) ## PSBLAS_LIBS="-L$pac_cv_psblas_dir/lib" ## fi -PSBLAS_LIBS="-lpsb_krylov -lpsb_prec -lpsb_util -lpsb_base -L$PSBLAS_LIBDIR" +PSBLAS_LIBS="-lpsb_linsolve -lpsb_prec -lpsb_util -lpsb_base -L$PSBLAS_LIBDIR" LDFLAGS=" $PSBLAS_LIBS $save_LDFLAGS" dnl ac_compile='${MPIFC-$FC} -c -o conftest${ac_objext} $FMFLAG$PSBLAS_DIR/include $FMFLAG$PSBLAS_DIR/lib conftest.$ac_ext 1>&5' @@ -484,7 +484,7 @@ 
dnl AC_MSG_NOTICE([psblas dir $pac_cv_psblas_dir]) PSBLAS_INCLUDES="$FMFLAG$pac_cv_psblas_dir/modules $PSBLAS_INCLUDES" fi FCFLAGS=" $PSBLAS_INCLUDES $save_FCFLAGS" -PSBLAS_LIBS="-lpsb_krylov -lpsb_prec -lpsb_util -lpsb_base $PSBLAS_LIBS" +PSBLAS_LIBS="-lpsb_linsolve -lpsb_prec -lpsb_util -lpsb_base $PSBLAS_LIBS" LDFLAGS=" $PSBLAS_LIBS $save_LDFLAGS" dnl ac_compile='${MPIFC-$FC} -c -o conftest${ac_objext} $FMFLAG$PSBLAS_DIR/include $FMFLAG$PSBLAS_DIR/lib conftest.$ac_ext 1>&5' diff --git a/configure b/configure index 8f1485a7..5ce85c96 100755 --- a/configure +++ b/configure @@ -7417,7 +7417,7 @@ then FIFLAG="-I" BASEMODNAME=PSB_BASE_MOD PRECMODNAME=PSB_PREC_MOD - METHDMODNAME=PSB_KRYLOV_MOD + METHDMODNAME=PSB_LINSOLVE_MOD UTILMODNAME=PSB_UTIL_MOD else @@ -7550,7 +7550,7 @@ printf "%s\n" "$ax_cv_f90_modflag" >&6; } FIFLAG=-I BASEMODNAME=psb_base_mod PRECMODNAME=psb_prec_mod - METHDMODNAME=psb_krylov_mod + METHDMODNAME=psb_linsolve_mod UTILMODNAME=psb_util_mod fi @@ -7683,7 +7683,7 @@ save_LDFLAGS=$LDFLAGS; ## dnl AC_MSG_NOTICE([psblas dir $pac_cv_psblas_dir]) ## PSBLAS_LIBS="-L$pac_cv_psblas_dir/lib" ## fi -PSBLAS_LIBS="-lpsb_krylov -lpsb_prec -lpsb_util -lpsb_base -L$PSBLAS_LIBDIR" +PSBLAS_LIBS="-lpsb_linsolve -lpsb_prec -lpsb_util -lpsb_base -L$PSBLAS_LIBDIR" LDFLAGS=" $PSBLAS_LIBS $save_LDFLAGS" ac_link='${MPIFC-$FC} -o conftest${ac_exeext} $FCFLAGS conftest.$ac_ext $LDFLAGS $LIBS 1>&5' @@ -7787,7 +7787,7 @@ elif test "x$pac_cv_psblas_dir" != "x"; then PSBLAS_INCLUDES="$FMFLAG$pac_cv_psblas_dir/modules $PSBLAS_INCLUDES" fi FCFLAGS=" $PSBLAS_INCLUDES $save_FCFLAGS" -PSBLAS_LIBS="-lpsb_krylov -lpsb_prec -lpsb_util -lpsb_base $PSBLAS_LIBS" +PSBLAS_LIBS="-lpsb_linsolve -lpsb_prec -lpsb_util -lpsb_base $PSBLAS_LIBS" LDFLAGS=" $PSBLAS_LIBS $save_LDFLAGS" diff --git a/configure.ac b/configure.ac index 7f13fa95..39380cce 100755 --- a/configure.ac +++ b/configure.ac @@ -525,7 +525,7 @@ then FIFLAG="-I" BASEMODNAME=PSB_BASE_MOD PRECMODNAME=PSB_PREC_MOD - 
METHDMODNAME=PSB_KRYLOV_MOD + METHDMODNAME=PSB_LINSOLVE_MOD UTILMODNAME=PSB_UTIL_MOD else @@ -536,7 +536,7 @@ else FIFLAG=-I BASEMODNAME=psb_base_mod PRECMODNAME=psb_prec_mod - METHDMODNAME=psb_krylov_mod + METHDMODNAME=psb_linsolve_mod UTILMODNAME=psb_util_mod fi diff --git a/docs/Makefile b/docs/Makefile new file mode 100644 index 00000000..6ac24e64 --- /dev/null +++ b/docs/Makefile @@ -0,0 +1,7 @@ +all: guide + +guide: + cd src && $(MAKE) clean all + +doxy: + doxygen doxypsb diff --git a/docs/amg4psblas_1.0-guide.pdf b/docs/amg4psblas_1.2-guide.pdf similarity index 85% rename from docs/amg4psblas_1.0-guide.pdf rename to docs/amg4psblas_1.2-guide.pdf index b3c4f559..faf76c48 100644 Binary files a/docs/amg4psblas_1.0-guide.pdf and b/docs/amg4psblas_1.2-guide.pdf differ diff --git a/docs/html/index.html b/docs/html/index.html index 876b4291..8a71fff6 100644 --- a/docs/html/index.html +++ b/docs/html/index.html @@ -29,9 +29,9 @@ class="cmbx-12">Salvatore Filippone
University of Rome Tor-Vergata and IAC-CNR
Software version: 1.0
Software version: 1.2
May 11th, 2021 +class="cmr-12">December 31st, 2024 @@ -45,140 +45,62 @@ class="newline" /> @@ -189,6 +111,9 @@ class="cmr-12">References + + + diff --git a/docs/html/userhtml.css b/docs/html/userhtml.css index a5ede259..f0aa87fa 100644 --- a/docs/html/userhtml.css +++ b/docs/html/userhtml.css @@ -22,6 +22,9 @@ .cmmi-8{font-size:72%;font-style: italic;} .cmsy-10x-x-120{font-size:109%;} .cmsy-8{font-size:72%;} +.cmtt-10{font-size:90%;font-family: monospace,monospace;} +.cmtt-10{font-family: monospace,monospace;} +.cmtt-10{font-family: monospace,monospace;} .tctt-1200{font-size:109%;font-family: monospace,monospace;} .cmmi-10x-x-109{font-style: italic;} .cmsy-10x-x-109{} @@ -29,9 +32,6 @@ .cmtt-10x-x-109{font-family: monospace,monospace;} .cmtt-10x-x-109{font-family: monospace,monospace;} .cmcsc-10x-x-109{} -.cmtt-10{font-size:90%;font-family: monospace,monospace;} -.cmtt-10{font-family: monospace,monospace;} -.cmtt-10{font-family: monospace,monospace;} .cmbx-10x-x-109{ font-weight: bold;} .cmbx-10x-x-109{ font-weight: bold;} .cmbx-10x-x-109{ font-weight: bold;} @@ -42,21 +42,26 @@ p.indent{text-indent:0;} p + p{margin-top:1em;} p + div, p + pre {margin-top:1em;} div + p, pre + p {margin-top:1em;} +a { overflow-wrap: break-word; word-wrap: break-word; word-break: break-word; hyphens: auto; } @media print {div.crosslinks {visibility:hidden;}} +table.tabular{border-collapse: collapse; border-spacing: 0;} a img { border-top: 0; border-left: 0; border-right: 0; } center { margin-top:1em; margin-bottom:1em; } td center { margin-top:0em; margin-bottom:0em; } .Canvas { position:relative; } img.math{vertical-align:middle;} +div.par-math-display, div.math-display{text-align:center;} li p.indent { text-indent: 0em } li p:first-child{ margin-top:0em; } li p:last-child, li div:last-child { margin-bottom:0.5em; } +li p:first-child{ margin-bottom:0; } li p~ul:last-child, li p~ol:last-child{ margin-bottom:0.5em; } .enumerate1 {list-style-type:decimal;} .enumerate2 
{list-style-type:lower-alpha;} .enumerate3 {list-style-type:lower-roman;} .enumerate4 {list-style-type:upper-alpha;} div.newtheorem { margin-bottom: 2em; margin-top: 2em;} +div.newtheorem .head{font-weight: bold;} .obeylines-h,.obeylines-v {white-space: nowrap; } div.obeylines-v p { margin-top:0; margin-bottom:0; } .overline{ text-decoration:overline; } @@ -100,6 +105,9 @@ table[rules] {border-left:solid black 0.4pt; border-right:solid black 0.4pt; } .hline hr, .cline hr{ height : 0px; margin:0px; } .hline td, .cline td{ padding: 0; } .hline hr, .cline hr{border:none;border-top:1px solid black;} +.hline {border-top: 1px solid black;} +.hline + .vspace:last-child{display:none;} +.hline:first-child{border-bottom:1px solid black;border-top:none;} .tabbing-right {text-align:right;} div.float, div.figure {margin-left: auto; margin-right: auto;} div.float img {text-align:center;} @@ -124,15 +132,16 @@ table.pmatrix {width:100%;} span.bar-css {text-decoration:overline;} img.cdots{vertical-align:middle;} .partToc a, .partToc, .likepartToc a, .likepartToc {line-height: 200%; font-weight:bold; font-size:110%;} +.chapterToc a, .chapterToc, .likechapterToc a, .likechapterToc, .appendixToc a, .appendixToc {line-height: 200%; font-weight:bold;} .index-item, .index-subitem, .index-subsubitem {display:block} div.caption {text-indent:-2em; margin-left:3em; margin-right:1em; text-align:left;} div.caption span.id{font-weight: bold; white-space: nowrap; } h1.partHead{text-align: center} p.bibitem { text-indent: -2em; margin-left: 2em; margin-top:0.6em; margin-bottom:0.6em; } p.bibitem-p { text-indent: 0em; margin-left: 2em; margin-top:0.6em; margin-bottom:0.6em; } +.subsubsectionHead, .likesubsubsectionHead { font-size: 1em; } .paragraphHead, .likeparagraphHead { margin-top:2em; font-weight: bold;} .subparagraphHead, .likesubparagraphHead { font-weight: bold;} -.quote {margin-bottom:0.25em; margin-top:0.25em; margin-left:1em; margin-right:1em; text-align:justify;} 
.verse{white-space:nowrap; margin-left:2em} div.maketitle {text-align:center;} h2.titleHead{text-align:center;} @@ -140,121 +149,95 @@ div.maketitle{ margin-bottom: 2em; } div.author, div.date {text-align:center;} div.thanks{text-align:left; margin-left:10%; font-size:85%; font-style:italic; } div.author{white-space: nowrap;} -.quotation {margin-bottom:0.25em; margin-top:0.25em; margin-left:1em; } -.abstract p {margin-left:5%; margin-right:5%;} +div.abstract p {margin-left:5%; margin-right:5%;} div.abstract {width:100%;} +.abstracttitle{text-align:center;margin-bottom:1em;} .subsectionToc, .likesubsectionToc {margin-left:2em;} .subsubsectionToc, .likesubsubsectionToc {margin-left:4em;} +.paragraphToc, .likeparagraphToc {margin-left:6em;} +.subparagraphToc, .likesubparagraphToc {margin-left:8em;} .ovalbox { padding-left:3pt; padding-right:3pt; border:solid thin; } .Ovalbox-thick { padding-left:3pt; padding-right:3pt; border:solid thick; } .shadowbox { padding-left:3pt; padding-right:3pt; border:solid thin; border-right:solid thick; border-bottom:solid thick; } .doublebox { padding-left:3pt; padding-right:3pt; border-style:double; border:solid thick; } .rotatebox{display: inline-block;} +code.lstinline{font-family:monospace,monospace;} +pre.listings{font-family: monospace,monospace; white-space: pre-wrap; margin-top:0.5em; margin-bottom:0.5em; } .lstlisting .label{margin-right:0.5em; } -div.lstlisting{font-family: monospace,monospace; white-space: nowrap; margin-top:0.5em; margin-bottom:0.5em; } -div.lstinputlisting{ font-family: monospace,monospace; white-space: nowrap; } +pre.lstlisting{font-family: monospace,monospace; white-space: pre-wrap; margin-top:0.5em; margin-bottom:0.5em; } +pre.lstinputlisting{ font-family: monospace,monospace; white-space: pre-wrap; } .lstinputlisting .label{margin-right:0.5em;} -#TBL-1 colgroup{border-left: 1px solid black;border-right:1px solid black;} -#TBL-1{border-collapse:collapse;} -#TBL-1 colgroup{border-left: 1px solid 
black;border-right:1px solid black;} -#TBL-1{border-collapse:collapse;} -#TBL-1 colgroup{border-left: 1px solid black;border-right:1px solid black;} -#TBL-1{border-collapse:collapse;} -#TBL-1 colgroup{border-left: 1px solid black;border-right:1px solid black;} -#TBL-1{border-collapse:collapse;} -#TBL-4 colgroup{border-left: 1px solid black;border-right:1px solid black;} -#TBL-4{border-collapse:collapse;} -#TBL-4 colgroup{border-left: 1px solid black;border-right:1px solid black;} -#TBL-4{border-collapse:collapse;} -#TBL-4 colgroup{border-left: 1px solid black;border-right:1px solid black;} -#TBL-4{border-collapse:collapse;} -#TBL-4 colgroup{border-left: 1px solid black;border-right:1px solid black;} -#TBL-4{border-collapse:collapse;} -#TBL-4 colgroup{border-left: 1px solid black;border-right:1px solid black;} -#TBL-4{border-collapse:collapse;} -#TBL-4 colgroup{border-left: 1px solid black;border-right:1px solid black;} -#TBL-4{border-collapse:collapse;} -#TBL-5 colgroup{border-left: 1px solid black;border-right:1px solid black;} -#TBL-5{border-collapse:collapse;} -#TBL-5 colgroup{border-left: 1px solid black;border-right:1px solid black;} -#TBL-5{border-collapse:collapse;} -#TBL-5 colgroup{border-left: 1px solid black;border-right:1px solid black;} -#TBL-5{border-collapse:collapse;} -#TBL-5 colgroup{border-left: 1px solid black;border-right:1px solid black;} -#TBL-5{border-collapse:collapse;} -#TBL-5 colgroup{border-left: 1px solid black;border-right:1px solid black;} -#TBL-5{border-collapse:collapse;} -#TBL-5 colgroup{border-left: 1px solid black;border-right:1px solid black;} -#TBL-5{border-collapse:collapse;} +#TBL-1-1{border-left: 1px solid black;} +#TBL-1-1{border-right:1px solid black;} +#TBL-1-2{border-right:1px solid black;} +#TBL-1-3{border-right:1px solid black;} +#TBL-4-1{border-left: 1px solid black;} +#TBL-4-1{border-right:1px solid black;} +#TBL-4-2{border-right:1px solid black;} +#TBL-4-3{border-right:1px solid black;} +#TBL-4-4{border-right:1px 
solid black;} +#TBL-4-5{border-right:1px solid black;} +#TBL-5-1{border-left: 1px solid black;} +#TBL-5-1{border-right:1px solid black;} +#TBL-5-2{border-right:1px solid black;} +#TBL-5-3{border-right:1px solid black;} +#TBL-5-4{border-right:1px solid black;} +#TBL-5-5{border-right:1px solid black;} td#TBL-5-10-5{border-left:solid black 0.4pt;border-right:solid black 0.4pt;} +td#TBL-5-10-5{border-left:solid black 0.4pt;border-right:solid black 0.4pt;} +td#TBL-5-11-5{border-left:solid black 0.4pt;border-right:solid black 0.4pt;} td#TBL-5-11-5{border-left:solid black 0.4pt;border-right:solid black 0.4pt;} td#TBL-5-12-5{border-left:solid black 0.4pt;border-right:solid black 0.4pt;} -#TBL-6 colgroup{border-left: 1px solid black;border-right:1px solid black;} -#TBL-6{border-collapse:collapse;} -#TBL-6 colgroup{border-left: 1px solid black;border-right:1px solid black;} -#TBL-6{border-collapse:collapse;} -#TBL-6 colgroup{border-left: 1px solid black;border-right:1px solid black;} -#TBL-6{border-collapse:collapse;} -#TBL-6 colgroup{border-left: 1px solid black;border-right:1px solid black;} -#TBL-6{border-collapse:collapse;} -#TBL-6 colgroup{border-left: 1px solid black;border-right:1px solid black;} -#TBL-6{border-collapse:collapse;} -#TBL-6 colgroup{border-left: 1px solid black;border-right:1px solid black;} -#TBL-6{border-collapse:collapse;} +td#TBL-5-12-5{border-left:solid black 0.4pt;border-right:solid black 0.4pt;} +#TBL-6-1{border-left: 1px solid black;} +#TBL-6-1{border-right:1px solid black;} +#TBL-6-2{border-right:1px solid black;} +#TBL-6-3{border-right:1px solid black;} +#TBL-6-4{border-right:1px solid black;} +#TBL-6-5{border-right:1px solid black;} +td#TBL-6-5-5{border-left:solid black 0.4pt;border-right:solid black 0.4pt;} td#TBL-6-5-5{border-left:solid black 0.4pt;border-right:solid black 0.4pt;} td#TBL-6-6-5{border-left:solid black 0.4pt;border-right:solid black 0.4pt;} -#TBL-7 colgroup{border-left: 1px solid black;border-right:1px solid black;} 
-#TBL-7{border-collapse:collapse;} -#TBL-7 colgroup{border-left: 1px solid black;border-right:1px solid black;} -#TBL-7{border-collapse:collapse;} -#TBL-7 colgroup{border-left: 1px solid black;border-right:1px solid black;} -#TBL-7{border-collapse:collapse;} -#TBL-7 colgroup{border-left: 1px solid black;border-right:1px solid black;} -#TBL-7{border-collapse:collapse;} -#TBL-7 colgroup{border-left: 1px solid black;border-right:1px solid black;} -#TBL-7{border-collapse:collapse;} -#TBL-7 colgroup{border-left: 1px solid black;border-right:1px solid black;} -#TBL-7{border-collapse:collapse;} +td#TBL-6-6-5{border-left:solid black 0.4pt;border-right:solid black 0.4pt;} +#TBL-7-1{border-left: 1px solid black;} +#TBL-7-1{border-right:1px solid black;} +#TBL-7-2{border-right:1px solid black;} +#TBL-7-3{border-right:1px solid black;} +#TBL-7-4{border-right:1px solid black;} +#TBL-7-5{border-right:1px solid black;} +td#TBL-7-5-5{border-left:solid black 0.4pt;border-right:solid black 0.4pt;} td#TBL-7-5-5{border-left:solid black 0.4pt;border-right:solid black 0.4pt;} td#TBL-7-6-5{border-left:solid black 0.4pt;border-right:solid black 0.4pt;} +td#TBL-7-6-5{border-left:solid black 0.4pt;border-right:solid black 0.4pt;} +td#TBL-7-7-5{border-left:solid black 0.4pt;border-right:solid black 0.4pt;} td#TBL-7-7-5{border-left:solid black 0.4pt;border-right:solid black 0.4pt;} td#TBL-7-12-5{border-left:solid black 0.4pt;border-right:solid black 0.4pt;} +td#TBL-7-12-5{border-left:solid black 0.4pt;border-right:solid black 0.4pt;} +td#TBL-7-13-5{border-left:solid black 0.4pt;border-right:solid black 0.4pt;} td#TBL-7-13-5{border-left:solid black 0.4pt;border-right:solid black 0.4pt;} -#TBL-8 colgroup{border-left: 1px solid black;border-right:1px solid black;} -#TBL-8{border-collapse:collapse;} -#TBL-8 colgroup{border-left: 1px solid black;border-right:1px solid black;} -#TBL-8{border-collapse:collapse;} -#TBL-8 colgroup{border-left: 1px solid black;border-right:1px solid black;} 
-#TBL-8{border-collapse:collapse;} -#TBL-8 colgroup{border-left: 1px solid black;border-right:1px solid black;} -#TBL-8{border-collapse:collapse;} -#TBL-8 colgroup{border-left: 1px solid black;border-right:1px solid black;} -#TBL-8{border-collapse:collapse;} -#TBL-8 colgroup{border-left: 1px solid black;border-right:1px solid black;} -#TBL-8{border-collapse:collapse;} -#TBL-9 colgroup{border-left: 1px solid black;border-right:1px solid black;} -#TBL-9{border-collapse:collapse;} -#TBL-9 colgroup{border-left: 1px solid black;border-right:1px solid black;} -#TBL-9{border-collapse:collapse;} -#TBL-9 colgroup{border-left: 1px solid black;border-right:1px solid black;} -#TBL-9{border-collapse:collapse;} -#TBL-9 colgroup{border-left: 1px solid black;border-right:1px solid black;} -#TBL-9{border-collapse:collapse;} -#TBL-9 colgroup{border-left: 1px solid black;border-right:1px solid black;} -#TBL-9{border-collapse:collapse;} -#TBL-9 colgroup{border-left: 1px solid black;border-right:1px solid black;} -#TBL-9{border-collapse:collapse;} -#TBL-10 colgroup{border-left: 1px solid black;border-right:1px solid black;} -#TBL-10{border-collapse:collapse;} -#TBL-10 colgroup{border-left: 1px solid black;border-right:1px solid black;} -#TBL-10{border-collapse:collapse;} -#TBL-10 colgroup{border-left: 1px solid black;border-right:1px solid black;} -#TBL-10{border-collapse:collapse;} -#TBL-10 colgroup{border-left: 1px solid black;border-right:1px solid black;} -#TBL-10{border-collapse:collapse;} -#TBL-10 colgroup{border-left: 1px solid black;border-right:1px solid black;} -#TBL-10{border-collapse:collapse;} -#TBL-10 colgroup{border-left: 1px solid black;border-right:1px solid black;} -#TBL-10{border-collapse:collapse;} +#TBL-8-1{border-left: 1px solid black;} +#TBL-8-1{border-right:1px solid black;} +#TBL-8-2{border-right:1px solid black;} +#TBL-8-3{border-right:1px solid black;} +#TBL-8-4{border-right:1px solid black;} +#TBL-8-5{border-right:1px solid black;} +#TBL-9-1{border-left: 1px 
solid black;} +#TBL-9-1{border-right:1px solid black;} +#TBL-9-2{border-right:1px solid black;} +#TBL-9-3{border-right:1px solid black;} +#TBL-9-4{border-right:1px solid black;} +#TBL-9-5{border-right:1px solid black;} +#TBL-10-1{border-left: 1px solid black;} +#TBL-10-1{border-right:1px solid black;} +#TBL-10-2{border-right:1px solid black;} +#TBL-10-3{border-right:1px solid black;} +#TBL-10-4{border-right:1px solid black;} +#TBL-10-5{border-right:1px solid black;} +#TBL-11-1{border-left: 1px solid black;} +#TBL-11-1{border-right:1px solid black;} +#TBL-11-2{border-right:1px solid black;} +#TBL-11-3{border-right:1px solid black;} +#TBL-11-4{border-right:1px solid black;} +#TBL-11-5{border-right:1px solid black;} /* end css.sty */ diff --git a/docs/html/userhtml.html b/docs/html/userhtml.html index 876b4291..8a71fff6 100644 --- a/docs/html/userhtml.html +++ b/docs/html/userhtml.html @@ -29,9 +29,9 @@ class="cmbx-12">Salvatore Filippone
University of Rome Tor-Vergata and IAC-CNR
Software version: 1.0
Software version: 1.2
May 11th, 2021 +class="cmr-12">December 31st, 2024 @@ -45,140 +45,62 @@ class="newline" /> @@ -189,6 +111,9 @@ class="cmr-12">References + + + diff --git a/docs/html/userhtmlli1.html b/docs/html/userhtmlli1.html index dcc83f43..328c05e6 100644 --- a/docs/html/userhtmlli1.html +++ b/docs/html/userhtmlli1.html @@ -77,9 +77,6 @@ class="cmr-12">PSCToolkit (Parallel Sparse Computation Toolkit) software framewo class="cmr-12">of a software development project started in 2007, named MLD2P4, which originally implemented a multilevel version of some domain decomposition preconditioners of - - - additive-Schwarz type, and was based on a parallel decoupled version of the well known and preconditioners are represented as PSBLAS distributed sparse class="cmr-12">AMG4PSBLAS enables the user to easily specify different features of an algebraic multilevel preconditioner, thus allowing to experiment with different preconditioners for + + + the problem and parallel computers at hand.

of AMG4PSBLAS. +

+ + + + + +

id="x3-2000">Contents diff --git a/docs/html/userhtmlli3.html b/docs/html/userhtmlli3.html index e358a8db..eaf2b366 100644 --- a/docs/html/userhtmlli3.html +++ b/docs/html/userhtmlli3.html @@ -1,7 +1,7 @@ -Contributors +References @@ -10,46 +10,739 @@ -

Citing AMG4PSBLAS

When use the library, please cite the following: @@ -41,6 +41,7 @@ class="cmr-12">When use the library, please cite the following:        archivePrefix = {arXiv},        year={2021}      } + @Misc{psctoolkit-web-page,        author = {D’Ambra, Pasqua and Durastante, Fabio and Filippone, Salvatore},        title =  {{PSCToolkit} {W}eb page}, @@ -56,6 +57,9 @@ class="cmr-12">When use the library, please cite the following: + + +

References

@@ -303,22 +303,41 @@ class="cmr-12">  P. D’Ambra, F D’Ambra, F. Durastante, S. Filippone, AMG preconditioners for Linear Solvers towards Extreme Scale, 2020, arXiv:2006.16147v3. +class="cmr-12">, SIAM Journal on Scientific Computing + 43, no. 5 (2021): S679-S703.

[15]   P. D’Ambra, F. Durastante, S. Filippone, S. Massei, S. Thomas, + Optimal Polynomial Smoothers for Parallel AMG, 2024, arXiv:2407.09848. +

+

+ [16]   T.)

[16][17]   SIAM Journal on Matrix Analysis and Applications, 20 (3), 1999, 7

[17][18]   Software, 16 (1) 1990, 1–17.

[18][19]   extended set of FORTRAN Basic Linear Algebra Subprograms< class="cmr-12">, ACM Transactions on Mathematical Software, 14 (1) 1988, 1–17. + + +

[19][20]   Clusters, in Proc. of ParCo 2001, Parallel Computing, Advances and Current Issues, 2002. - - -

[20][21]   .

[21][22]    23.

[22][23]   Transactions on Mathematical Software, 26 (4), 2000, 527–55

[23][24]   2016, 23:501-518

[24][25]   , MIT Press, 1998.

[25][26]   Algebra Subprograms for FORTRAN usage, ACM Transactions on Mathematical Software, 5 (3), 1979, 308–323. + + +

[26][27]   J. Lottes, Optimal polynomial smoothers for multigrid V-cycles, + Numerical Linear Algebra with Applications 30.6 (2023): e2518. +

+

+ [28]   Distributed-memory Sparse Direct Solver for Unsymmetric Linear S class="cmr-12">, ACM Transactions on Mathematical Software, 29 (2), 2003, 110–140. - - -

[27][29]   Numerical Linear Algebra with Applications, 15 (5), 2008, 473R

[28][30]   2003.

[29][31]   University Press, 1996.

[30][32]   Press, 1998.

[31][33]    Oosterlee, Multigrid, Academic Press, 2001.

[32][34]   Aggregation Strategies on Massively Parallel Machines, in J. Donnelley, editor, Proceedings of SuperComputing 2000, Dallas, 2000. + + +

[33][35]   (3) 1996, 179–196.

+ + +

3 Configuring and Building AMG4PSBLAS

In order to build AMG4PSBLAS it is necessary to set up a Makefile with appropriate @@ -64,7 +64,7 @@ class="cmr-12">in both single and double precision.

Building AMG4PSBLAS requires some base libraries (see Section 3.1); interfaces to optional third-party libraries, which extend the functionalities Section 3.2), are also available. A number of Linux distributions (e.g., Ubuntu, the base and optional software used by AMG4PSBLAS is given in the sections.

-

+

3.1 Prerequisites

+

The following base libraries are needed: +

+

+BLAS

+

[18, 19, 26] Many vendors provide optimized versions of BLAS; if no + vendor version is available for a given platform, the ATLAS software + (math-atlas.sourceforge.net) may be employed. The reference BLAS from + Netlib (www.netlib.org/blas) are meant to define the standard behaviour of + the BLAS interface, so they are not optimized for any particular platform, + and should only be used as a last resort. Note that BLAS computations form + + + a relatively small part of the AMG4PSBLAS/PSBLAS computations; however, they are + critical when using preconditioners based on the MUMPS, UMFPACK or + SuperLU third party libraries. UMFPACK requires a full LAPACK library; + our experience is that configuring ATLAS for building full LAPACK does + not always work in the expected way. Our advice is first to download the + LAPACK tarfile from www.netlib.org/lapack and install it independently of + ATLAS. In this case, you need to modify the OPTS and NOOPT definitions + to include the -fPIC compilation option in the make.inc file of the LAPACK + library. +

+

+MPI

+

[25, 32] A version of MPI is available on most high-performance computing + systems. +

+

+PSBLAS

+

[21, 23] Parallel Sparse BLAS + (PSBLAS) is available from psctoolkit.github.io/products/psblas/; version + 3.7.0 (or later) is required. Indeed, all the prerequisites listed so far are also + prerequisites of PSBLAS.

+

Please note that the libraries listed above must have Fortran interfaces compatible with +AMG4PSBLAS; usually this means that they should all be built with the same +compiler used for AMG4PSBLAS. +

If you want to use the PSBLAS support for NVIDIA GPUs, you will also +need a working version of the CUDA Toolkit that is compatible with the +compilers used to build PSBLAS and AMG4PSBLAS. The PSBLAS library +must then be configured and compiled with the +options: + + +

+./configure --enable-cuda --with-cudadir=${CUDA_HOME} --with-cudacc=xx,yy,zz
+
+

Previous versions required you to have the auxiliary libraries SPGPU and +PSBLAS-EXT compiled; this is no longer necessary because they have been integrated +into PSBLAS and are compiled when the flags above are enabled during configuration. +See also Section 4.2. +

+

3.2 Optional third party libraries

+

We provide interfaces to the following third-party software libraries; note that these are +optional, but if you enable them some defaults for multilevel preconditioners may +change to reflect their presence. +

+

+

+UMFPACK

+

[16] A sparse LU factorization package included in the SuiteSparse library, + available from faculty.cse.tamu.edu/davis/suitesparse.html; + it provides sequential factorization and triangular system solution for + double precision real and complex data. We tested version 4.5.4 + of SuiteSparse. Note that for configuring SuiteSparse you should + provide the right path to the BLAS and LAPACK libraries in the + SuiteSparse_config/SuiteSparse_config.mk file. +

+

+MUMPS

+

[2] A sparse LU factorization package available from mumps.enseeiht.fr; + it provides sequential and parallel factorizations and triangular system + solution for single and double precision, real and complex data. We tested + versions 4.10.0 and 5.0.1. +

+

+SuperLU

+ -
+

+SuperLU_Dist

+

[28] A sparse LU factorization package available from the same site as + SuperLU; it provides parallel factorization and triangular system solution + for double precision real and complex data. We tested versions 3.3 and + 4.2. If you installed BLAS from ATLAS, remember to define the BLASLIB + variable in the make.inc file and to add the -std=c99 option to the C + compiler options. Note that this library requires the ParMETIS library for + parallel graph partitioning and fill-reducing matrix ordering, available from + glaros.dtc.umn.edu/gkhome/metis/parmetis/overview.

+

+

3.3 Configuration options

+

In order to build AMG4PSBLAS, the first step is to use the configure script in the +main directory to generate the necessary makefile. +

As a minimal example consider the following: + + + +

+./configure --with-psblas=PSB-INSTALL-DIR
+
+

which assumes that the various MPI compilers and support libraries are available in +the standard directories on the system, and specifies only the PSBLAS install directory +(note that the latter directory must be specified with an absolute path). The full set of +options can be listed by issuing the command ./configure --help, which +produces:

configure configures AMG4PSBLAS 1.0.0 to adapt to many kinds of systems. 
+ 
+Usage: ./configure [OPTION]... [VAR=VALUE]... 
+ 
+To assign environment variables (e.g., CC, CFLAGS...), specify them as 
+VAR=VALUE.  See below for descriptions of some of the useful variables. 
+ 
+Defaults for the options are specified in brackets. 
+ 
+Configuration: 
+  -h, --help              display this help and exit 
+      --help=short        display options specific to this package 
+      --help=recursive    display the short help of all the included packages 
+  -V, --version           display version information and exit 
+  -q, --quiet, --silent   do not print checking ...’ messages 
+      --cache-file=FILE   cache test results in FILE [disabled] 
+  -C, --config-cache      alias for ‘--cache-file=config.cache 
+  -n, --no-create         do not create output files 
+      --srcdir=DIR        find the sources in DIR [configure dir or ‘..’] 
+ 
+Installation directories: 
+  --prefix=PREFIX         install architecture-independent files in PREFIX 
+                          [/usr/local] 
+  --exec-prefix=EPREFIX   install architecture-dependent files in EPREFIX 
+                          [PREFIX] 
+ 
+By default, make install will install all the files in 
+‘/usr/local/bin’, ‘/usr/local/lib’ etc.  You can specify 
+an installation prefix other than ‘/usr/local’ using ‘--prefix’, 
+for instance ‘--prefix=$HOME’. 
+ 
+For better control, use the options below. 
+ 
+Fine tuning of the installation directories: 
+  --bindir=DIR            user executables [EPREFIX/bin] 
+  --sbindir=DIR           system admin executables [EPREFIX/sbin] 
+  --libexecdir=DIR        program executables [EPREFIX/libexec] 
+  --sysconfdir=DIR        read-only single-machine data [PREFIX/etc] 
+  --sharedstatedir=DIR    modifiable architecture-independent data [PREFIX/com] 
+  --localstatedir=DIR     modifiable single-machine data [PREFIX/var] 
+  --libdir=DIR            object code libraries [EPREFIX/lib] 
+  --includedir=DIR        C header files [PREFIX/include] 
+  --oldincludedir=DIR     C header files for non-gcc [/usr/include] 
+  --datarootdir=DIR       read-only arch.-independent data root [PREFIX/share] 
+  --datadir=DIR           read-only architecture-independent data [DATAROOTDIR] 
+  --infodir=DIR           info documentation [DATAROOTDIR/info] 
+  --localedir=DIR         locale-dependent data [DATAROOTDIR/locale] 
+  --mandir=DIR            man documentation [DATAROOTDIR/man] 
+  --docdir=DIR            documentation root [DATAROOTDIR/doc/amg4psblas] 
+  --htmldir=DIR           html documentation [DOCDIR] 
+  --dvidir=DIR            dvi documentation [DOCDIR] 
+  --pdfdir=DIR            pdf documentation [DOCDIR] 
+  --psdir=DIR             ps documentation [DOCDIR] 
+ 
+Program names: 
+  --program-prefix=PREFIX            prepend PREFIX to installed program names 
+  --program-suffix=SUFFIX            append SUFFIX to installed program names 
+  --program-transform-name=PROGRAM   run sed PROGRAM on installed program names 
+ 
+Optional Features: 
+  --disable-option-checking  ignore unrecognized --enable/--with options 
+  --disable-FEATURE       do not include FEATURE (same as --enable-FEATURE=no) 
+  --enable-FEATURE[=ARG]  include FEATURE [ARG=yes] 
+  --enable-silent-rules   less verbose build output (undo: "make V=1") 
+  --disable-silent-rules  verbose build output (undo: "make V=0") 
+  --enable-dependency-tracking 
+                          do not reject slow dependency extractors 
+  --disable-dependency-tracking 
+                          speeds up one-time build 
+  --enable-serial         Specify whether to enable a fake mpi library to run 
+                          in serial mode. 
+ 
+Optional Packages: 
+  --with-PACKAGE[=ARG]    use PACKAGE [ARG=yes] 
+  --without-PACKAGE       do not use PACKAGE (same as --with-PACKAGE=no) 
+  --with-psblas=DIR       The install directory for PSBLAS, for example, 
+                          --with-psblas=/opt/packages/psblas-3.5 
+  --with-psblas-incdir=DIR 
+                          Specify the directory for PSBLAS C includes. 
+  --with-psblas-moddir=DIR 
+                          Specify the directory for PSBLAS Fortran modules. 
+  --with-psblas-libdir=DIR 
+                          Specify the directory for PSBLAS library. 
+  --with-ccopt            additional [CCOPT] flags to be added: will prepend 
+                          to [CCOPT] 
+  --with-fcopt            additional [FCOPT] flags to be added: will prepend 
+                          to [FCOPT] 
+  --with-libs             List additional link flags here. For example, 
+                          --with-libs=-lspecial_system_lib or 
+                          --with-libs=-L/path/to/libs 
+  --with-clibs            additional [CLIBS] flags to be added: will prepend 
+                          to [CLIBS] 
+  --with-flibs            additional [FLIBS] flags to be added: will prepend 
+                          to [FLIBS] 
+  --with-library-path     additional [LIBRARYPATH] flags to be added: will 
+                          prepend to [LIBRARYPATH] 
+  --with-include-path     additional [INCLUDEPATH] flags to be added: will 
+                          prepend to [INCLUDEPATH] 
+  --with-module-path      additional [MODULE_PATH] flags to be added: will 
+                          prepend to [MODULE_PATH] 
+  --with-extra-libs       List additional link flags here. For example, 
+                          --with-extra-libs=-lspecial_system_lib or 
+                          --with-extra-libs=-L/path/to/libs 
+  --with-blas=<lib>       use BLAS library <lib> 
+  --with-blasdir=<dir>    search for BLAS library in <dir> 
+  --with-lapack=<lib>     use LAPACK library <lib> 
+  --with-mumps=LIBNAME    Specify the libname for MUMPS. Default: autodetect 
+                          with minimum "-lmumps_common -lpord" 
+  --with-mumpsdir=DIR     Specify the directory for MUMPS library and 
+                          includes. Note: you will need to add auxiliary 
+                          libraries with --extra-libs; this depends on how 
+                          MUMPS was configured and installed, at a minimum you 
+                          will need SCALAPACK and BLAS 
+  --with-mumpsincdir=DIR  Specify the directory for MUMPS includes. 
+  --with-mumpsmoddir=DIR  Specify the directory for MUMPS Fortran modules. 
+  --with-mumpslibdir=DIR  Specify the directory for MUMPS library. 
+  --with-umfpack=LIBNAME  Specify the library name for UMFPACK and its support 
+                          libraries. Default: "-lumfpack -lamd" 
+  --with-umfpackdir=DIR   Specify the directory for UMFPACK library and 
+                          includes. 
+  --with-umfpackincdir=DIR 
+                          Specify the directory for UMFPACK includes. 
+  --with-umfpacklibdir=DIR 
+                          Specify the directory for UMFPACK library. 
+  --with-superlu=LIBNAME  Specify the library name for SUPERLU library. 
+                          Default: "-lsuperlu" 
+  --with-superludir=DIR   Specify the directory for SUPERLU library and 
+                          includes. 
+  --with-superluincdir=DIR 
+                          Specify the directory for SUPERLU includes. 
+  --with-superlulibdir=DIR 
+                          Specify the directory for SUPERLU library. 
+  --with-superludist=LIBNAME 
+                          Specify the libname for SUPERLUDIST library. 
+                          Requires you also specify SuperLU. Default: 
+                          "-lsuperlu_dist" 
+  --with-superludistdir=DIR 
+                          Specify the directory for SUPERLUDIST library and 
+                          includes. 
+  --with-superludistincdir=DIR 
+                          Specify the directory for SUPERLUDIST includes. 
+  --with-superludistlibdir=DIR 
+                          Specify the directory for SUPERLUDIST library. 
+ 
+Some influential environment variables: 
+  FC          Fortran compiler command 
+  FCFLAGS     Fortran compiler flags 
+  LDFLAGS     linker flags, e.g. -L<lib dir> if you have libraries in a 
+              nonstandard directory <lib dir> 
+  LIBS        libraries to pass to the linker, e.g. -l<library> 
+  CC          C compiler command 
+  CFLAGS      C compiler flags 
+  CPPFLAGS    (Objective) C/C++ preprocessor flags, e.g. -I<include dir> if 
+              you have headers in a nonstandard directory <include dir> 
+  MPICC       MPI C compiler command 
+  MPIFC       MPI Fortran compiler command 
+  CPP         C preprocessor 
+ 
+Use these variables to override the choices made by configure or to help 
+it to find libraries and programs with nonstandard names/locations. 
+ 
+Report bugs to <https://github.com/psctoolkit/psctoolkit/issues>.
+   
+

For instance, if a user has built and installed PSBLAS 3.7 under the /opt directory and is +using the SuiteSparse package (which includes UMFPACK), then AMG4PSBLAS +might be configured with: + + + +

+./configure --with-psblas=/opt/psblas-3.7/ \
+--with-umfpackincdir=/usr/include/suitesparse/
+
+

Once the configure script has completed execution, it will have generated the file +Make.inc which will then be used by all Makefiles in the directory tree; this file will be +copied in the install directory under the name Make.inc.AMG4PSBLAS. +

To use the MUMPS solver package, the user has to add the appropriate options to +the configure script; by default we look for the libraries -ldmumps -lsmumps + -lzmumps -lcmumps -lmumps_common -lpord. MUMPS often uses additional +packages such as ScaLAPACK, ParMETIS, and SCOTCH, and may be built with OpenMP enabled; in +such cases it is necessary to add linker options with the --with-extra-libs configure +option. +
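As a sketch of the MUMPS case, the script below assembles a configure command using only options that appear in the help output above (--with-psblas, --with-mumpsdir, --with-extra-libs). The install paths and the contents of EXTRA_LIBS are hypothetical placeholders: the auxiliary libraries actually needed depend on how MUMPS was configured and installed on your system. The script prints the command for review rather than running it.

```shell
#!/bin/sh
# Hypothetical install locations: replace with the absolute paths on your system.
PSBLAS_DIR="/opt/psblas-3.7"
MUMPS_DIR="/opt/mumps"

# Hypothetical auxiliary libraries; the real list depends on how MUMPS was
# built (the help text names SCALAPACK and BLAS as the minimum).
EXTRA_LIBS="-lscalapack -lblas"

# Assemble the command from options documented by ./configure --help.
CONFIGURE_CMD="./configure --with-psblas=${PSBLAS_DIR} --with-mumpsdir=${MUMPS_DIR} --with-extra-libs=\"${EXTRA_LIBS}\""

# Print the command for inspection instead of executing it.
echo "${CONFIGURE_CMD}"
```

Once the printed command looks right for your installation, it can be pasted into a shell opened in the AMG4PSBLAS main directory.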

To build the library the user will now enter + + + +

+make
+
+

followed (optionally) by + + + +

+make install
+
+
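The three steps (configure, make, make install) can be chained in a small script. The following is a minimal sketch, assuming it is run from the AMG4PSBLAS main directory; the PSBLAS location and the installation prefix are hypothetical defaults that can be overridden from the environment, and BUILD_STATUS is just an illustrative variable name.

```shell
#!/bin/sh
# Hypothetical locations; both can be overridden from the environment.
PSBLAS_DIR="${PSBLAS_DIR:-/opt/psblas-3.7}"
PREFIX="${PREFIX:-${HOME:-/tmp}/amg4psblas}"

if [ -x ./configure ]; then
    # Each step runs only if the previous one succeeded.
    ./configure --with-psblas="${PSBLAS_DIR}" --prefix="${PREFIX}" \
        && make \
        && make install \
        && BUILD_STATUS="installed" \
        || BUILD_STATUS="failed"
else
    # Not in the source tree: do nothing rather than fail obscurely.
    BUILD_STATUS="skipped"
fi

echo "AMG4PSBLAS build: ${BUILD_STATUS}"
```

Chaining with `&&` keeps a failed configure from being followed by a confusing make error; the --prefix option shown here is the standard one listed under "Installation directories" in the help output.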

+

3.4 Bug reporting

+

If you find any bugs in our code, please report them through our issues page +on
https://github.com/psctoolkit/psctoolkit/issues
+

To enable us to track the bug, please provide a log from the failing application, the +test conditions, and ideally a self-contained test program reproducing the +issue. +

+

3.5 Example and test programs

+

The package contains a samples directory, divided into two subdirectories, simple and +advanced; each of them is further divided into fileread and pdegen subdirectories. +Their purpose is as follows: +

+

+simple

+

contains a set of simple example programs with a predefined choice of + preconditioners, selectable via integer values. These are intended to help the user get + acquainted with the multilevel preconditioners available in AMG4PSBLAS. +

+

+advanced

+

contains a set of more sophisticated examples that will allow the user, via + the input files in the runs subdirectories, to experiment with the full range + of preconditioners implemented in the package.

+

The fileread directories contain sample programs that read sparse matrices from files, +according to the Matrix Market or the Harwell-Boeing storage format; the pdegen +programs generate matrices in full parallel mode from the discretization of a sample +partial differential equation. + + + + + + + + + + + + + + + + + + +
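The layout described above can be checked with a short loop. This is only a sketch, assuming it is run from the root of the AMG4PSBLAS source tree; the directory names are taken from the description above, and SAMPLE_DIRS is an illustrative variable name.

```shell
#!/bin/sh
# Enumerate the four sample directories described above and report
# whether each one is present under the current directory.
SAMPLE_DIRS=""
for level in simple advanced; do
    for kind in fileread pdegen; do
        dir="samples/${level}/${kind}"
        SAMPLE_DIRS="${SAMPLE_DIRS}${dir} "
        if [ -d "${dir}" ]; then
            echo "${dir}: found"
        else
            echo "${dir}: not found (run from the AMG4PSBLAS source root)"
        fi
    done
done
```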

+ id="tailuserhtmlse3.html"> diff --git a/docs/html/userhtmlse4.html b/docs/html/userhtmlse4.html index 0a78b57d..4961bca3 100644 --- a/docs/html/userhtmlse4.html +++ b/docs/html/userhtmlse4.html @@ -29,7 +29,7 @@ class="cmr-12">up]

4 Getting Started

This section describes the basics for building and applying AMG4PSBLAS one-level @@ -39,15 +39,16 @@ class="cmr-12">and multilevel (i.e., AMG) preconditioners with the Krylov solver class="cmr-12">PSBLAS [2021].

The following steps are required:

    -
  1. +

    Declare the preconditioner data structure. It is a derived data type, structure is accessed by the user only through the AMG4PSBLAS rou following an object-oriented approach.

  2. -
  3. +

    Allocate and initialize the preconditioner data structure, according to a preconditioner type chosen by the user. This is performed by the routine - initinit, which also sets defaults for each preconditioner type selected by the user. The preconditioner types and the defaults associated with them are given in Table 1, where the strings used by init , where the strings used by init to identify the preconditioner types are also given. Note that these strings are valid also if uppercase letters are substituted by corresponding lowercase ones.

  4. -
  5. +

    Modify the selected preconditioner type, by properly setting preconditioner parameters. This is performed by the routine setThis is performed by the routine set. This routine must be called if the user wants to modify the default values of the parameters associated with the selected preconditioner type, to obtain a variant of that preconditioner. Examples of use of set preconditioner. Examples of use of set are given in Section 4.1; a complete list of all the preconditioner parameters and their allowed and d is provided in Section 5, Tables 2-8. + + +

  6. -
  7. +

    Build the preconditioner for a given matrix. If the selected preconditioner is multilevel, then two steps must be performed, as specified next. - - -

      -
    1. +

      Build the AMG hierarchy for a given matrix. This is performed by the routine hierarchy_buildroutine hierarchy_build.

    2. -
    3. +

      Build the preconditioner for a given matrix. This is performed by the routine smoothers_buildroutine smoothers_build.

    If the selected preconditioner is one-level, it is built in a single step, performed by the routine bldthe routine bld.

  8. -
  9. +

    Apply the preconditioner at each iteration of a Krylov solver. This is performed by the method applythe method apply. When using the PSBLAS Krylov solvers, this step is completely transparent to the user, since apply completely transparent to the user, since apply is called by the PSBLAS routine implementing the Krylov solver (psb_krylovimplementing the Krylov solver (psb_krylov).

  10. -
  11. +

    Free the preconditioner data structure. This is performed by the routine free. This is performed by the routine free. This step is complementary to step 1 and should be performed when the @@ -208,13 +205,13 @@ class="cmr-12">All the previous routines are available as methods of the precond detailed description of them is given in Section 5. Examples showing the basic use of AMG4PSBLAS are reported in Section 4.1.




    Type | String | Default preconditioner
    No preconditioner | NONE | Allows the PSBLAS Krylov solvers to be used with no preconditioner.
    Diagonal | DIAG, JACOBI, L1-JACOBI | Diagonal preconditioner. For any zero diagonal entry of the matrix to be preconditioned, the corresponding entry of the preconditioner is set to 1.
    Gauss-Seidel | GS, L1-GS | Hybrid Gauss-Seidel (forward), that is, global block Jacobi with Gauss-Seidel as local solver.
    Symmetrized Gauss-Seidel | FBGS, L1-FBGS | Symmetrized hybrid Gauss-Seidel, that is, forward Gauss-Seidel followed by backward Gauss-Seidel.
    Block Jacobi | BJAC, L1-BJAC | Block-Jacobi with ILU(0) on the local blocks.
    Additive Schwarz | AS | Additive Schwarz (AS), with overlap 1 and ILU(0) on the local blocks.
    Multilevel | ML | V-cycle with one hybrid forward Gauss-Seidel (GS) sweep as pre-smoother and one hybrid backward GS sweep as post-smoother, smoothed aggregation as coarsening algorithm, and LU (plus triangular solve) as coarsest-level solver. See the default values in Tables 2-8 for further details of the preconditioner.

    Table 1: Preconditioner types, corresponding strings and default choices.

    Note that the module amg_prec_mod, containing the definition of the preconditioner data type and the interfaces to the routines of AMG4PSBLAS, must be used in any program calling such routines. The modules psb_base_mod, for the sparse matrix and communication descriptor data types, and psb_krylov_mod, for interfacing with the Krylov solvers, must also be used (see Section 4.1).
    problems. However, this does not necessarily correspond to the sh on parallel computers. -


    4.1 Examples

    +

    The code reported in Listing 1 shows how to set and apply the default multilevel preconditioner available in the real double precision version of AMG4PSBLAS (see Table 1). This preconditioner is chosen by simply specifying ML as the second argument of P%init (a call to P%set is not needed) and is applied with the flexible CG (FCG) solver provided by PSBLAS (the matrix of the system to be solved is assumed to be symmetric positive definite). As previously observed, the modules psb_base_mod, amg_prec_mod and psb_krylov_mod must be used by the example program.

    The part of the code dealing with reading and assembling the sparse matrix and the right-hand side vector and the deallocation of the relevant data structures, performed through the PSBLAS routines for sparse matrix and vector management, is not reported here for the sake of conciseness. The complete code can be found in the example program file amg_dexample_ml.f90, in the directory samples/simple/fileread of the AMG4PSBLAS implementation (see Section 3.5). A sample test problem along with the relevant input data is available in samples/simple/fileread/runs. For details on the use of the PSBLAS routines, see the PSBLAS User’s Guide [21].

    The setup and application of the default multilevel preconditioner for the real single precision and the complex, single and double precision, versions are obtained with straightforward modifications of the previous example (see Section 5 for details). If these versions are installed, the corresponding codes are available in samples/simple/fileread.


    + + + +
    +

    + + + +

    +  use psb_base_mod
    +  use amg_prec_mod
    +  use psb_krylov_mod
    +... ...
    +!
    +! sparse matrix
    +  type(psb_dspmat_type) :: A
    +! sparse matrix descriptor
    +  type(psb_desc_type)   :: desc_A
    +! preconditioner
    +  type(amg_dprec_type)  :: P
    +! right-hand side and solution vectors
    +  type(psb_d_vect_type) :: b, x
    +... ...
    +!
    +! initialize the parallel environment
    +  call psb_init(ctxt)
    +  call psb_info(ctxt,iam,np)
    +... ...
    +!
    +! read and assemble the spd matrix A and the right-hand side b
    +! using PSBLAS routines for sparse matrix / vector management
    +... ...
    +!
    +! initialize the default multilevel preconditioner, i.e. V-cycle
    +! with basic smoothed aggregation, 1 hybrid forward/backward
    +! GS sweep as pre/post-smoother and UMFPACK as coarsest-level
    +! solver
    +  call P%init(ctxt,'ML',info)
    +!
    +! build the preconditioner
    +  call P%hierarchy_build(A,desc_A,info)
    +  call P%smoothers_build(A,desc_A,info)
    +
    +!
    +! set the solver parameters and the initial guess
    +  ... ...
    +!
    +! solve Ax=b with preconditioned FCG
    +  call psb_krylov('FCG',A,P,b,x,tol,desc_A,info)
    +  ... ...
    +!
    +! deallocate the preconditioner
    +  call P%free(info)
    +!
    +! deallocate other data structures
    +  ... ...
    +!
    +! exit the parallel environment
    +  call psb_exit(ctxt)
    +  stop
    +
    +

    + + + +
    Listing 1: setup and application of the default multilevel preconditioner (example 1). +
    +
    + + + +

    +

    Different versions of the multilevel preconditioner can be obtained by changing the default values of the preconditioner parameters. The code reported in Listing 2 shows how to set a V-cycle preconditioner which applies 1 block-Jacobi sweep as pre- and post-smoother, and solves the coarsest-level system with 8 block-Jacobi sweeps. Note that the ILU(0) factorization (plus triangular solve) is used as local solver for the block-Jacobi sweeps, since this is the default associated with block-Jacobi and set by P%init. Furthermore, specifying block-Jacobi as coarsest-level solver implies that the coarsest-level matrix is distributed among the processes. Listing 3 shows how to set a W-cycle preconditioner using the coarsening based on compatible weighted matching, with aggregates of size at most 8 and smoothed prolongators. It applies 2 hybrid Gauss-Seidel sweeps as pre- and post-smoother, and solves the coarsest-level system with the parallel flexible Conjugate Gradient method (KRM) coupled with the block-Jacobi preconditioner having ILU(0) on the blocks. Default parameters are used for the stopping criterion of the coarsest-level solver. Note that, also in this case, specifying KRM as coarsest-level solver implies that the coarsest-level matrix is distributed among the processes.

    The code fragments shown in Listings 2 and 3 are also included in the example program file amg_dexample_ml.f90.

    Finally, Listing 4 shows the setup of a one-level additive Schwarz preconditioner, i.e., RAS with overlap 2. Note also that a Krylov method different from CG must be used to solve the preconditioned system, since the preconditioner is nonsymmetric. The corresponding example program is available in the file amg_dexample_1lev.f90.

    For all the previous preconditioners, example programs where the sparse matrix and the right-hand side are generated by discretizing a PDE with Dirichlet boundary conditions are also available in the directory samples/simple/pdegen.


    + + + +
    +

    +

    +... ...
    +! build a V-cycle preconditioner with 1 block-Jacobi sweep (with
    +! ILU(0) on the blocks) as pre- and post-smoother, and 8  block-Jacobi
    +! sweeps (with ILU(0) on the blocks) as coarsest-level solver
    +  call P%init(ctxt,'ML',info)
    +  call P%set('SMOOTHER_TYPE','BJAC',info)
    +  call P%set('COARSE_SOLVE','BJAC',info)
    +  call P%set('COARSE_SWEEPS',8,info)
    +  call P%hierarchy_build(A,desc_A,info)
    +  call P%smoothers_build(A,desc_A,info)
    +... ...
    +
    +

    +
    Listing 2: setup of a multilevel preconditioner based on the default decoupled coarsening
    + + + +

    + + + +


    + + + +
    +

    +

    +... ...
    +! build a W-cycle preconditioner based on coupled aggregation with
    +! weighted matching (aggregates of size at most 8), 2 hybrid
    +! Gauss-Seidel sweeps as pre- and post-smoother, a distributed
    +! coarsest matrix, and FCG (KRM) as coarsest-level solver
    +  call P%init(ctxt,'ML',info)
    +  call P%set('PAR_AGGR_ALG','COUPLED',info)
    +  call P%set('AGGR_TYPE','MATCHBOXP',info)
    +  call P%set('AGGR_SIZE',8,info)
    +  call P%set('ML_CYCLE','WCYCLE',info)
    +  call P%set('SMOOTHER_TYPE','FBGS',info)
    +  call P%set('SMOOTHER_SWEEPS',2,info)
    +  call P%set('COARSE_SOLVE','KRM',info)
    +  call P%set('COARSE_MAT','DIST',info)
    +  call P%set('KRM_METHOD','FCG',info)
    +  call P%hierarchy_build(A,desc_A,info)
    +  call P%smoothers_build(A,desc_A,info)
    +... ...
    +
    +

    +
    Listing 3: setup of a multilevel preconditioner based on the coupled coarsening using weighted matching
    + + + +

    + +


    + + + +
    +

    +

    +... ...
    +! set RAS with overlap 2 and ILU(0) on the local blocks
    +  call P%init(ctxt,'AS',info)
    +  call P%set('SUB_OVR',2,info)
    +  call P%bld(A,desc_A,info)
    +... ...
    +! solve Ax=b with preconditioned BiCGSTAB
    +  call psb_krylov('BICGSTAB',A,P,b,x,tol,desc_A,info)
    +
    +

    +
    Listing 4: setup of a one-level Schwarz preconditioner.
    + + + +

    +

    4.2 GPU example

    +

    The code discussed here shows how to set up a program exploiting the combined GPU capabilities of PSBLAS and AMG4PSBLAS. The code example is available in the source distribution directory amg4psblas/examples/gpu.

    First of all, we need to include the appropriate modules and declare some auxiliary variables:


    + + + +
    +

    +

    +program amg_dexample_gpu
    +  use psb_base_mod
    +  use amg_prec_mod
    +  use psb_krylov_mod
    +  use psb_util_mod
    +  use psb_gpu_mod
    +  use data_input
    +  use amg_d_pde_mod
    +  implicit none
    +  .......
    +  ! GPU variables
    +  type(psb_d_hlg_sparse_mat) :: agmold
    +  type(psb_d_vect_gpu)       :: vgmold
    +  type(psb_i_vect_gpu)       :: igmold
    +
    + 
    +
    +

    +
    Listing 5: setup of a GPU-enabled test program part one.
    + + + +

    +

    In this particular example we choose to employ an HLG data structure for sparse matrices on GPUs; for more information please refer to the PSBLAS-EXT users’ guide.

    We then have to initialize the GPU environment, and pass the appropriate MOLD variables to the build methods (see also the PSBLAS and PSBLAS-EXT users’ guides).


    + + + +
    +

    +

    +  call psb_init(ctxt)
    +  call psb_info(ctxt,iam,np)
    +  !
    +  ! BEWARE: if you have NGPUS  per node, the default is to
    +  ! attach to mod(IAM,NGPUS)
    +  !
    +  call psb_gpu_init(ctxt)
    +  ......
    +  t1 = psb_wtime()
    +  call prec%smoothers_build(a,desc_a,info, amold=agmold, vmold=vgmold, imold=igmold)
    +
    + 
    +
    +

    +
    Listing 6: setup of a GPU-enabled test program part two.
    + + + +

    +

    Finally, we convert the input matrix, the descriptor and the vectors to use a GPU-enabled internal storage format. We then preallocate the preconditioner workspace before entering the Krylov method. At the end of the code, we close the GPU environment.


    + + + +
    +

    +

    +  call desc_a%cnv(mold=igmold)
    +  call a%cscnv(info,mold=agmold)
    +  call psb_geasb(x,desc_a,info,mold=vgmold)
    +  call psb_geasb(b,desc_a,info,mold=vgmold)
    +
    +  !
    +  ! iterative method parameters
    +  !
    +  call psb_barrier(ctxt)
    +  call prec%allocate_wrk(info)
    +  t1 = psb_wtime()
    +  call psb_krylov(s_choice%kmethd,a,prec,b,x,s_choice%eps,&
    +       & desc_a,info,itmax=s_choice%itmax,iter=iter,err=err,itrace=s_choice%itrace,&
    +       & istop=s_choice%istopc,irst=s_choice%irst)
    +  call prec%deallocate_wrk(info)
    +  call psb_barrier(ctxt)
    +  tslv = psb_wtime() - t1
    +
    +  ......
    +  call psb_gpu_exit()
    +  call psb_exit(ctxt)
    +  stop
    +
    + 
    +
    +

    +
    Listing 7: setup of a GPU-enabled test program part three.
    + + + +

    +

    It is very important to employ smoothers and coarsest solvers that are suited to the GPU, i.e. methods that do NOT employ triangular system solve kernels. Methods that satisfy this constraint include:

    • JACOBI

    • BJAC with the following methods on the local blocks:

      • INVK

      • INVT

      • AINV

    and their ℓ1 variants.
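    The constraint above can be sketched as a set of parameter choices. The fragment below is only an illustration: it assumes that prec, ctxt and info are declared as in the GPU example, and that the INVK string listed above is accepted by both SUB_SOLVE and COARSE_SUBSOLVE (the latter is listed in Table 5; the former is an assumption made here by analogy).

```fortran
! Sketch only: prec, ctxt and info are assumed declared as in the GPU example.
  call prec%init(ctxt,'ML',info)
! block-Jacobi smoother with an approximate inverse on the local blocks,
! so that no triangular solve kernel is needed on the GPU
  call prec%set('SMOOTHER_TYPE','BJAC',info)
  call prec%set('SUB_SOLVE','INVK',info)
! same kind of choice for the coarsest-level solver
  call prec%set('COARSE_SOLVE','BJAC',info)
  call prec%set('COARSE_SUBSOLVE','INVK',info)
```

    The solver is set after the smoother in each pair, because selecting a smoother resets its local solver to the default.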

    + id="tailuserhtmlse4.html"> diff --git a/docs/html/userhtmlse5.html b/docs/html/userhtmlse5.html index a7e1fb0e..5a4c35b8 100644 --- a/docs/html/userhtmlse5.html +++ b/docs/html/userhtmlse5.html @@ -29,32 +29,24 @@ class="cmr-12">up]

    5 User Interface

    The basic user interface of AMG4PSBLAS consists of eight methods. The six methods init, set, build, hierarchy_build, smoothers_build and apply encapsulate all the functionalities for the setup and the application of any multilevel and one-level preconditioner implemented in the package. The method free deallocates the preconditioner data structure, while descr prints a description of the preconditioner setup by the user. For backward compatibility, methods are also accessible as @@ -67,7 +59,8 @@ real/complex and single/double precision data; arguments with app must be passed to the method, i.e.,

      -
    • +

      the sparse matrix data structure, containing the matrix to be preconditioned, must be of type for complex double precision;

    • -
    • +

      the preconditioner data structure must be of type amg_x, z, according to the sparse matrix data structure;

    • -
    • +

      the arrays containing the vectors v and , class="cmtt-12">z, in a manner completely analogous to the sparse matrix type;

    • -
    • +

      real parameters defining the preconditioner must be declared according to the precision of the sparse matrix and preconditioner data structures (see Section 5.2).

    A description of each method is given in the remainder of this se -

    5.1 Method init
    +

    +

    call p%init(contxt,ptype,info)

    +

    This method allocates and initializes the preconditioner p, according to the preconditioner type chosen by the user.

    Arguments +

    + + + + + +

    contxt

    type(psb_ctxt_type), intent(in).

    The communication context.

    ptype

    character(len=*), intent(in).

    The type of preconditioner. Its values are specified in Table 1.

    Note that strings are case insensitive.

    info

    integer, intent(out).

    Error code. If no error, 0 is returned. See Section 7 for details.

    + + + +

    5.2 Method set

    +
    +

    +

    call p%set(what,val,info [,ilev, ilmax, pos, idx])

    +

    This method sets the parameters defining the preconditioner p. More precisely, the parameter identified by what is assigned the value contained in val.

    Arguments

    what

    character(len=*).

    The parameter to be set. It can be specified through its name; the string is case-insensitive. See Tables 2-8.

    val

    integer or character(len=*) or real(psb_spk_) or real(psb_dpk_), intent(in).

    The value of the parameter to be set. The list of allowed values and the corresponding data types is given in Tables 2-8. When the value is of type character(len=*), it is also treated as case insensitive.

    info

    integer, intent(out).

    Error code. If no error, 0 is returned. See Section 7 for details.

    ilev

    integer, optional, intent(in).

    For the multilevel preconditioner, the level at which the preconditioner parameter has to be set. The levels are numbered in increasing order starting from the finest one, i.e., level 1 is the finest level. If ilev is not present, the parameter identified by what is set at all levels that are appropriate (see Tables 2-8).

    ilmax

    integer, optional, intent(in).

    For the multilevel preconditioner, when both ilev and ilmax are present, the settings are applied at all levels ilev:ilmax. When ilev is present but ilmax is not, then the default is ilmax=ilev. The levels are numbered in increasing order starting from the finest one, i.e., level 1 is the finest level.

    pos

    character(len=*), optional, intent(in).

    Whether the other arguments apply only to the pre-smoother (PRE) or to the post-smoother (POST). If pos is not present, the other arguments are applied to both smoothers. If the preconditioner is one-level or the parameter identified by what does not concern the smoothers, pos is ignored.

    idx

    integer, optional, intent(in).

    An auxiliary input argument that can be passed to the underlying objects.
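    As a hedged illustration of the optional arguments, the fragment below sets different numbers of smoother sweeps for the pre- and post-smoother, and then restricts one setting to a range of levels. It assumes that P, ctxt and info are declared as in Listing 1; the sweep counts and the level range are arbitrary values chosen for this sketch, and it is assumed that the chosen levels exist when set is called.

```fortran
! Sketch only: P, ctxt and info are assumed declared as in Listing 1.
  call P%init(ctxt,'ML',info)
! 1 sweep for the pre-smoother and 2 sweeps for the post-smoother,
! at all levels (ilev is not present)
  call P%set('SMOOTHER_SWEEPS',1,info,pos='PRE')
  call P%set('SMOOTHER_SWEEPS',2,info,pos='POST')
! 4 sweeps for both smoothers (pos is not present), on levels 1 to 2 only
  call P%set('SMOOTHER_SWEEPS',4,info,ilev=1,ilmax=2)
```

    Keyword syntax is used for the optional arguments so that the call does not depend on their positional order.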

    +

    +

    A variety of preconditioners can be obtained by setting the appropriate preconditioner parameters. These parameters can be logically divided into four groups, i.e., parameters defining

    1. the type of multilevel cycle and how many cycles must be applied;

    2. the coarsening algorithm;

    3. the solver at the coarsest level (for multilevel preconditioners only);

    4. the smoother of the multilevel preconditioners, or the one-level preconditioner.

    A list of the parameters that can be set, along with their allowed and default values, is given in Tables 2-8.
    +

    Remark 2. A smoother is usually obtained by combining two objects: a smoother (SMOOTHER_TYPE) and a local solver (SUB_SOLVE), as specified in Tables 7-9. For example, the block-Jacobi smoother using ILU(0) on the blocks is obtained by combining the block-Jacobi smoother object with the ILU(0) solver object. Similarly, the hybrid Gauss-Seidel smoother (see Note in Table 7) is obtained by combining the block-Jacobi smoother object with a single sweep of the Gauss-Seidel solver object, while the point-Jacobi smoother is the result of combining the block-Jacobi smoother object with a single sweep of the point-Jacobi solver object. The ℓ1-versions of the smoothers are obtained in the same way. However, for simplicity, shortcuts are provided to set all versions of point-Jacobi, hybrid (forward) Gauss-Seidel, and hybrid backward Gauss-Seidel, i.e., the previous smoothers can be defined just by setting SMOOTHER_TYPE to certain specific values (see Table 7), without the need to set SUB_SOLVE as well.

    The smoother and solver objects are arranged in a hierarchical manner. When specifying a smoother object, its parameters, including the local solver, are set to their default values, and when a solver object is specified, its defaults are also set, overriding in both cases any previous settings even if explicitly specified. Therefore if the user sets a smoother, and wishes to use a solver different from the default one, the call to set the solver must come after the call to set the smoother.

    Similar considerations apply to the point-Jacobi, Gauss-Seidel and block-Jacobi coarsest-level solvers, and shortcuts are available in this case too (see Table 5).
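    The ordering rule for smoother and solver can be sketched as follows. The fragment assumes that P, ctxt and info are declared as in Listing 1; the string ILUT for SUB_SOLVE is an assumption, taken by analogy with the values listed for COARSE_SUBSOLVE in Table 5.

```fortran
! Sketch only: P, ctxt and info are assumed declared as in Listing 1.
  call P%init(ctxt,'ML',info)
! selecting the smoother resets its local solver to the default, ILU(0)
  call P%set('SMOOTHER_TYPE','BJAC',info)
! so a non-default local solver must be requested afterwards
  call P%set('SUB_SOLVE','ILUT',info)
! swapping the two calls above would restore the default local solver
```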
    +

    Remark 3. The polynomial-accelerated smoother described in Tables 7 and 9 redefines a sweep or iteration as corresponding to the degree of the polynomial used. Consequently, the SMOOTHER_SWEEPS option is overridden by the POLY_DEGREE option. This smoother is paired with a base smoother object, whose iterations are accelerated using the specified polynomial smoothing technique. By default, the ℓ1-Jacobi smoother serves as the base smoother, offering theoretical guarantees on the resulting convergence factor [15, 27]. Alternative combinations are experimental and lack established guarantees.
    +

    Remark 4. Many of the coarsest-level solvers apply to a specific coarsest-matrix layout; therefore, setting the solver after the layout may change the layout to either distributed or replicated. Similarly, setting the layout after the solver may change the solver.

    More precisely, UMFPACK and SuperLU require the coarsest-level matrix to be replicated, while SuperLU_Dist and KRM require it to be distributed. In these cases, setting the coarsest-level solver implies that the layout is redefined according to the solver, overriding any previous settings. MUMPS, point-Jacobi, hybrid Gauss-Seidel and block-Jacobi can be applied to replicated and distributed matrices, thus their choice does not modify any previously specified layout. It is worth noting that, when the matrix is replicated, the point-Jacobi, hybrid Gauss-Seidel and block-Jacobi solvers and their ℓ1-versions reduce to the corresponding local solver objects (see Remark 2). For the point-Jacobi and Gauss-Seidel solvers, these objects correspond to a single point-Jacobi sweep and a single Gauss-Seidel sweep, respectively, which are very poor solvers.

    On the other hand, the distributed layout can be used with any solver but UMFPACK and SuperLU; therefore, if either of these two solvers has already been selected, the coarsest-level solver is changed to block-Jacobi, with the previously chosen solver applied to the local blocks. Likewise, the replicated layout can be used with any solver but SuperLU_Dist and KRM; therefore, if SuperLU_Dist or KRM has been previously set, the coarsest-level solver is changed to the default sequential solver.

    In a parallel setting with many cores, we suggest that users change the default coarsest-level solver to KRM, i.e., a parallel distributed iterative solution of the coarsest system based on Krylov methods.
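    The interaction between solver and layout described in this remark can be sketched with the two calls below. The fragment assumes that P, ctxt and info are declared as in Listing 1; the comments restate the behaviour described in the remark and are not a statement about the library internals.

```fortran
! Sketch only: P, ctxt and info are assumed declared as in Listing 1.
  call P%init(ctxt,'ML',info)
! KRM requires a distributed coarsest matrix: according to the remark,
! this call also redefines the layout as distributed
  call P%set('COARSE_SOLVE','KRM',info)
! pitfall: asking for a replicated layout afterwards would, according
! to the remark, replace KRM with the default sequential solver
! call P%set('COARSE_MAT','REPL',info)
```

    Leaving the second call commented out keeps the KRM choice; it is shown only to mark the order dependence.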

    Remark 5. The argument idx can be used to allow finer control of the underlying solvers; for instance, by specifying the keyword MUMPS_IPAR_ENTRY and an appropriate value for idx, it is possible to set any entry in the MUMPS integer control array. See also Section 6.
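    A minimal sketch of the use of idx follows, assuming that P, ctxt and info are declared as in Listing 1. The entry index 4 and the value 1 are hypothetical, chosen only to show the shape of the call; the meaning of each entry of the integer control array must be taken from the MUMPS documentation.

```fortran
! Sketch only: P, ctxt and info are assumed declared as in Listing 1.
  call P%init(ctxt,'ML',info)
  call P%set('COARSE_SOLVE','MUMPS',info)
! hypothetical values: store 1 in the entry of the MUMPS integer
! control array selected by idx
  call P%set('MUMPS_IPAR_ENTRY',1,info,idx=4)
```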

    + + + +


    + + + +
    +

    +

    + +





    what | data type | val | default | comments
    ML_CYCLE | character(len=*) | VCYCLE, WCYCLE, KCYCLE, ADD | VCYCLE | Multilevel cycle: V-cycle, W-cycle, K-cycle, and additive composition.
    CYCLE_SWEEPS | integer | Any integer number ≥ 1 | 1 | Number of multilevel cycles.

    Table 2: Parameters defining the multilevel cycle and the number of cycles to be applied.
    + + + +

    +
    +
    + + + +


    + + + +
    +

    + + + +

    + + + + + + + +
    what | data type | val | default | comments
    MIN_COARSE_SIZE_PER_PROCESS | integer | Any number > 0 | 200 | Coarse size threshold per process. The aggregation stops if the global number of variables of the computed coarsest matrix is lower than or equal to this threshold multiplied by the number of processes (see Note).
    MIN_COARSE_SIZE | integer | Any number > 0 | -1 | Coarse size threshold. The aggregation stops if the global number of variables of the computed coarsest matrix is lower than or equal to this threshold (see Note). If negative, it is ignored in favour of the default for MIN_COARSE_SIZE_PER_PROCESS.
    MIN_CR_RATIO | real | Any number > 1 | 1.5 | Minimum coarsening ratio. The aggregation stops if the ratio between the global matrix dimensions at two consecutive levels is lower than or equal to this threshold (see Note).
    MAX_LEVS | integer | Any integer number > 1 | 20 | Maximum number of levels. The aggregation stops if the number of levels reaches this value (see Note).
    PAR_AGGR_ALG | character(len=*) | 'DEC', 'SYMDEC', 'COUPLED' | 'DEC' | Parallel aggregation algorithm. The SYMDEC option applies decoupled aggregation to the sparsity pattern of A + A^T.
    AGGR_TYPE | character(len=*) | SOC1, SOC2, MATCHBOXP | SOC1 | Type of aggregation algorithm: currently, for the decoupled aggregation we implement two measures of strength of connection, the one by Vaněk, Mandel and Brezina [35], and the one by Gratton et al. [24]. The coupled aggregation is based on a parallel version of the half-approximate matching implemented in the MatchBox-P software package [9].
    AGGR_SIZE | integer | Any integer power of 2, with aggr_size ≥ 2 | 4 | Maximum size of aggregates when the coupled aggregation based on matching is applied. For aggressive coarsening with size of aggregate larger than 8 we recommend the use of smoothed prolongators. Used only with 'COUPLED' and 'MATCHBOXP'.
    AGGR_PROL | character(len=*) | SMOOTHED, UNSMOOTHED | SMOOTHED | Prolongator used by the aggregation algorithm: smoothed or unsmoothed (i.e., tentative prolongator).

    Note. The aggregation algorithm stops when at least one of the following criteria is met: the coarse size threshold, the minimum coarsening ratio, or the maximum number of levels is reached. Therefore, the actual number of levels may be smaller than the specified maximum number of levels.

    Table 3: Parameters defining the aggregation algorithm.
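    The three stopping criteria of the aggregation can be adjusted together, as in the following sketch. It assumes that P, A, desc_A, ctxt and info are declared as in Listing 1 and that psb_dpk_ is the double precision kind made available by psb_base_mod; the numerical values are illustrative only.

```fortran
! Sketch only: P, A, desc_A, ctxt and info are assumed declared as in Listing 1.
  call P%init(ctxt,'ML',info)
! stop coarsening at 500 unknowns per process (illustrative value) ...
  call P%set('MIN_COARSE_SIZE_PER_PROCESS',500,info)
! ... or when the size ratio between two consecutive levels is 2 or less ...
  call P%set('MIN_CR_RATIO',2.0_psb_dpk_,info)
! ... or when 10 levels have been built
  call P%set('MAX_LEVS',10,info)
  call P%hierarchy_build(A,desc_A,info)
```

    Since the first criterion that is met stops the aggregation, the hierarchy may end up with fewer than 10 levels.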
    +
    + + + +

    +
    +
    + + + +


    + + + +
    +

    +

    + + +
    what | data type | val | default | comments
    AGGR_ORD | character(len=*) | 'NATURAL', 'DEGREE' | 'NATURAL' | Initial ordering of indices for the decoupled aggregation algorithm: either natural ordering or sorted by descending degrees of the nodes in the matrix graph.
    AGGR_THRESH | real(kind_parameter) | Any real number ∈ [0,1] | 0.01 | The threshold θ in the strength of connection algorithm. See also the note at the bottom of this table.
    AGGR_FILTER | character(len=*) | 'FILTER', 'NOFILTER' | 'NOFILTER' | Matrix used in computing the smoothed prolongator: filtered or unfiltered.

    Note. Different thresholds at different levels, such as those used in [35, Section 5.1], can be easily set by invoking the routine set with the parameter ilev.

    Table 4: Parameters defining the aggregation algorithm (continued).
    + + + +

    +
    +
    + + + +


    + + + +
    +

    + + + +

    + + +
    what | data type | val | default | comments
    COARSE_MAT | character(len=*) | DIST, REPL | REPL | Coarsest matrix layout: distributed among the processes or replicated on each of them.
    COARSE_SOLVE | character(len=*) | MUMPS, UMF, SLU, SLUDIST, ILU, JACOBI, GS, BJAC, KRM, L1-JACOBI, L1-BJAC, L1-FBGS | See Note. | Solver used at the coarsest level: sequential LU from MUMPS, UMFPACK, or SuperLU (plus triangular solve); distributed LU from MUMPS or SuperLU_Dist (plus triangular solve); point-Jacobi, hybrid Gauss-Seidel or block-Jacobi and related ℓ1-versions; Krylov Method (flexible Conjugate Gradient) coupled with the block-Jacobi preconditioner with ILU(0) on the blocks. Note that UMF and SLU require the coarsest matrix to be replicated, SLUDIST, JACOBI, GS, BJAC and KRM require it to be distributed, MUMPS can be used with either a replicated or a distributed matrix. When any of the previous solvers is specified, the matrix layout is set to a default value which allows the use of the solver (see Remark 4). Note also that UMFPACK and SuperLU_Dist are available only in double precision.
    COARSE_SUBSOLVE | character(len=*) | ILU, ILUT, MILU, MUMPS, SLU, UMF, INVT, INVK, AINV | See Note. | Solver for the diagonal blocks of the coarsest matrix, in case the block Jacobi solver is chosen as coarsest-level solver: ILU(p), ILU(p,t), MILU(p), LU from MUMPS, SuperLU or UMFPACK (plus triangular solve), Approximate Inverses INVK(p,q), INVT(p1,p2,t1,t2) and AINV(t); note that approximate inverses are specifically suited for GPUs since they do not employ triangular system solve kernels, see [3]. Note that UMFPACK and SuperLU_Dist are available only in double precision.
    COARSE_SWEEPS | integer | Any integer number > 0 | 10 | Number of sweeps when JACOBI, GS or BJAC is chosen as coarsest-level solver.
    COARSE_FILLIN | integer | Any integer number ≥ 0 | 0 | Fill-in level p of the ILU factorizations and first fill-in for the approximate inverses.
    COARSE_ILUTHRS | real(kind_parameter) | Any real number ≥ 0 | 0 | Drop tolerance t in the ILU(p,t) factorization and first drop-tolerance for the approximate inverses.

    Note. Defaults for COARSE_SOLVE and COARSE_SUBSOLVE are chosen in the following order: single precision version: MUMPS if installed, then SLU if installed, ILU otherwise; double precision version: UMF if installed, then MUMPS if installed, then SLU if installed, ILU otherwise.

    Note. Further options for coarse solvers are contained in Table 6. For a first use it is suggested to use the default options obtained by simply selecting the solver type.

    Table 5: Parameters defining the solver at the coarsest level (continued).
    + + + +

    +
    +
    + + + +


    + + + +
    +

    + + + +

    + + + + + + + + + + + + + + +





    what

    data type

    val

    default

    comments






    BJAC_STOP

    character(len=*)

    FALSE +

    TRUE

    FALSE

    Select whether to use a stopping criterion for +the Block-Jacobi method used as a coarse +solver.






    BJAC_TRACE

    character(len=*)

    FALSE +

    TRUE

    FALSE

    Select whether to print a trace for the +calculated residual for the Block-Jacobi +method used as a coarse solver.






    BJAC_ITRACE

    integer

    Any integer +

    > 0

    -1

    Number of iterations after which a trace is to +be printed.






    BJAC_RESCHECK

    integer

    Any integer +

    > 0

    -1

    Number of iterations after which a residual is +to be calculated.






    BJAC_STOPTOL

    real(kind_parameter)

    Any real +

    < 1

    0

    Tolerance for the stopping criterion on the +residual.






    KRM_METHOD

    character(len=*)

    CG +

    FCG +

    CGS +

    CGR +

    BICG +

    BICGSTAB +

    BICGSTABL +

    RGMRES

    FCG

    A string that defines the iterative method to +be used when employing a Krylov method +KRM as a coarse solver. CG the Conjugate +Gradient method; FCG the Flexible Conjugate +Gradient method; CGS the Conjugate Gradient +Stabilized +method; GCR the Generalized Conjugate +Residual method; FCG the Flexible Conjugate +Gradient method; BICG the Bi-Conjugate +Gradient method; BICGSTAB the Bi-Conjugate +Gradient Stabilized method; BICGSTABL the +Bi-Conjugate Gradient Stabilized method +with restarting; RGMRES the Generalized +Minimal Residual method with restarting. +Refer to the PSBLAS guide [21] for further +information.






    KRM_KPREC

    character(len=*)

    Table 1

    BJAC

    Any of the one-level preconditioners listed in Table 1 can be used for the coarse Krylov solver.






    KRM_SUB_SOLVE

    character(len=*)

    Table 5

    ILU

    Solver for the diagonal blocks of the coarsest matrix preconditioner, in case the block Jacobi solver is chosen as KRM_KPREC: ILU(p), ILU(p,t), MILU(p), LU from MUMPS, SuperLU or UMFPACK (plus triangular solve), Approximate Inverses INVK(p,q), INVT(p1,p2,t1,t2) and AINV(t). The same caveat from Table 5 applies here.






    KRM_GLOBAL

    character(len=*)

    TRUE, FALSE

    FALSE

    Choose between a global Krylov solver, with all unknowns on a single node, or a distributed one. The default choice is the distributed solver.






    KRM_EPS

    real(kind_parameter)

    Real < 1

    10^{-6}

    The stopping tolerance.






    KRM_IRST

    integer

    Integer ≥ 1

    30

    An integer specifying the restart parameter. This is employed for the BiCGSTABL or RGMRES methods, otherwise it is ignored.






    KRM_ISTOPC

    integer

    Integers 1, 2, 3

    2

    If 1, the method uses the normwise backward error in the infinity norm; if 2, it uses the relative residual in the 2-norm; if 3, the relative residual reduction in the 2-norm is used instead. Refer to the PSBLAS guide [21] for the details.






    KRM_ITMAX

    integer

    Integer ≥ 1

    40

    The maximum number of iterations to perform.






    KRM_ITRACE

    integer

    Integer ≥ 0

    -1

    If > 0, print an informational message about convergence every KRM_ITRACE iterations. If = 0, print a message in case of convergence failure.






    KRM_FILLIN

    integer

    Integer ≥ 0

    0

    Fill-in level p of the ILU factorizations and first fill-in for the approximate inverses.






    + + + +
    Table 6: Additional parameters defining the solver at the coarsest level.
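    The KRM parameters in Table 6 are all set through the set method. The following is a minimal sketch, assuming the double-precision real version of the library (type amg_dprec_type, as in the example of Section 6) and assuming that the modules psb_base_mod and amg_prec_mod provide the types and kinds used; the keyword COARSE_SOLVE, which is expected to select KRM as the coarsest-level solver, is not listed in this table and is an assumption here, as is the subroutine name.

```fortran
! Sketch only: select a Krylov method (KRM) as coarsest-level solver.
subroutine set_krm_coarse_solver(p, info)
  use psb_base_mod          ! assumed to provide psb_ipk_ and psb_dpk_
  use amg_prec_mod          ! assumed to provide amg_dprec_type
  implicit none
  type(amg_dprec_type), intent(inout) :: p
  integer(psb_ipk_), intent(out)      :: info
  integer(psb_ipk_) :: itmax

  itmax = 40                                   ! same as the documented default

  ! Assumption: 'COARSE_SOLVE' is the keyword that selects KRM.
  call p%set('COARSE_SOLVE', 'KRM', info)
  if (info /= 0) return
  call p%set('KRM_METHOD', 'FCG', info)        ! Flexible Conjugate Gradient
  if (info /= 0) return
  call p%set('KRM_KPREC', 'BJAC', info)        ! block-Jacobi inside the Krylov solver
  if (info /= 0) return
  call p%set('KRM_SUB_SOLVE', 'ILU', info)     ! ILU on the diagonal blocks
  if (info /= 0) return
  call p%set('KRM_EPS', 1.0e-6_psb_dpk_, info) ! stopping tolerance
  if (info /= 0) return
  call p%set('KRM_ITMAX', itmax, info)         ! cap on the coarse iterations
end subroutine set_krm_coarse_solver
```

    The values shown coincide with the documented defaults, so the sketch only makes explicit what a plain KRM selection would be expected to use.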





    what

    data type

    val

    default

    comments






    SMOOTHER_TYPE

    character(len=*)

    JACOBI

    GS

    BGS

    BJAC

    AS

    L1-JACOBI

    L1-BJAC

    L1-FBGS

    POLY

    FBGS

    Type of smoother used in the multilevel preconditioner: point-Jacobi, hybrid (forward) Gauss-Seidel, hybrid backward Gauss-Seidel, block-Jacobi, ℓ1-Jacobi, ℓ1-hybrid (forward) Gauss-Seidel, ℓ1-point-Jacobi and Additive Schwarz, polynomial accelerators; see [15] and Remark 3 (p. 21).

    It is ignored by one-level preconditioners.






    SUB_SOLVE

    character(len=*)

    JACOBI

    GS

    BGS

    ILU

    ILUT

    MILU

    MUMPS

    SLU

    UMF

    INVT

    INVK

    AINV

    GS and BGS for pre- and post-smoothers of multilevel preconditioners, respectively

    ILU for block-Jacobi and Additive Schwarz one-level preconditioners

    The local solver to be used with the smoother or one-level preconditioner (see Remark 2, page 24): point-Jacobi, hybrid (forward) Gauss-Seidel, hybrid backward Gauss-Seidel, ILU(p), ILU(p,t), MILU(p), LU from MUMPS, SuperLU or UMFPACK (plus triangular solve), Approximate Inverses INVK(p,q), INVT(p1,p2,t1,t2) and AINV(t). Note that approximate inverses are specifically suited for GPUs since they do not employ triangular system solve kernels, see [3]. See Note for details on hybrid Gauss-Seidel.






    SMOOTHER_SWEEPS

    integer

    Any integer number ≥ 0

    1

    Number of sweeps of the smoother or one-level preconditioner. In the multilevel case, no pre-smoother or post-smoother is used if this parameter is set to 0 together with pos=PRE or pos=POST, respectively. It is ignored if the smoother is POLY.






    POLY_DEGREE

    integer

    Any integer number between 1 and 30

    1

    Degree of the polynomial accelerator; it is equal to the number of matrix-vector products performed by the smoother. It is ignored if the smoother is not POLY.






    Table 7: Parameters defining the smoother or the details of the one-level preconditioner.
    + + + +
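    The combination of SMOOTHER_TYPE and SUB_SOLVE from Table 7 can be sketched as follows for a block-Jacobi smoother with ILU on the blocks, using different numbers of pre- and post-smoothing sweeps. This is only an illustration under assumptions: the double-precision real version of the library (amg_dprec_type), the modules psb_base_mod and amg_prec_mod as providers of the types and kinds, and a hypothetical subroutine name.

```fortran
! Sketch only: block-Jacobi smoother with ILU local solver.
subroutine set_bjac_ilu_smoother(p, info)
  use psb_base_mod          ! assumed to provide psb_ipk_
  use amg_prec_mod          ! assumed to provide amg_dprec_type
  implicit none
  type(amg_dprec_type), intent(inout) :: p
  integer(psb_ipk_), intent(out)      :: info
  integer(psb_ipk_) :: npre, npost

  npre  = 2                 ! sweeps of the pre-smoother
  npost = 1                 ! sweeps of the post-smoother

  call p%set('SMOOTHER_TYPE', 'BJAC', info)   ! smoother object
  if (info /= 0) return
  call p%set('SUB_SOLVE', 'ILU', info)        ! local solver on each block
  if (info /= 0) return
  ! pos restricts the setting to the pre- or the post-smoother
  call p%set('SMOOTHER_SWEEPS', npre, info, pos='PRE')
  if (info /= 0) return
  call p%set('SMOOTHER_SWEEPS', npost, info, pos='POST')
end subroutine set_bjac_ilu_smoother
```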






    what

    data type

    val

    default

    comments






    SUB_OVR

    integer

    Any integer number ≥ 0

    1

    Number of overlap layers, for Additive Schwarz only.

    SUB_RESTR

    character(len=*)

    HALO

    NONE

    HALO

    Type of restriction operator, for Additive Schwarz only: HALO for taking into account the overlap, NONE for neglecting it.

    Note that HALO must be chosen for the classical Additive Schwarz smoother and its RAS variant.






    SUB_PROL

    character(len=*)

    SUM

    NONE

    NONE

    Type of prolongation operator, for Additive Schwarz only: SUM for adding the contributions from the overlap, NONE for neglecting them.

    Note that SUM must be chosen for the classical Additive Schwarz smoother, and NONE for its RAS variant.






    SUB_FILLIN

    integer

    Any integer number ≥ 0

    0

    Fill-in level p of the incomplete LU factorizations.






    SUB_ILUTHRS

    real(kind_parameter)

    Any real number ≥ 0

    0

    Drop tolerance t in the ILU(p,t) factorization.






    MUMPS_LOC_GLOB

    character(len=*)

    LOCAL_SOLVER

    GLOBAL_SOLVER

    GLOBAL_SOLVER

    Whether MUMPS should be used as a distributed solver, or as a serial solver acting only on the part of the matrix local to each process.






    MUMPS_IPAR_ENTRY

    integer

    Any integer number

    0

    Set an entry in the MUMPS integer control array, as chosen via the idx optional argument.






    MUMPS_RPAR_ENTRY

    real

    Any real number

    0

    Set an entry in the MUMPS real control array, as chosen via the idx optional argument.






    Table 8: Parameters defining the smoother or the details of the one-level preconditioner (continued).
    + + + +
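    The notes on SUB_RESTR and SUB_PROL in Table 8 translate into the following settings for a Restricted Additive Schwarz (RAS) smoother with one layer of overlap. This is a minimal sketch, assuming the double-precision real version of the library (amg_dprec_type) and the modules psb_base_mod and amg_prec_mod as providers of the types and kinds; the subroutine name is hypothetical.

```fortran
! Sketch only: RAS smoother with one overlap layer and ILU(0) on the subdomains.
subroutine set_ras_smoother(p, info)
  use psb_base_mod          ! assumed to provide psb_ipk_
  use amg_prec_mod          ! assumed to provide amg_dprec_type
  implicit none
  type(amg_dprec_type), intent(inout) :: p
  integer(psb_ipk_), intent(out)      :: info
  integer(psb_ipk_) :: novr, fill

  novr = 1                  ! number of overlap layers
  fill = 0                  ! fill-in level p of ILU(p)

  call p%set('SMOOTHER_TYPE', 'AS', info)
  if (info /= 0) return
  call p%set('SUB_OVR', novr, info)
  if (info /= 0) return
  call p%set('SUB_RESTR', 'HALO', info)   ! restriction takes the overlap into account
  if (info /= 0) return
  call p%set('SUB_PROL', 'NONE', info)    ! NONE gives RAS; SUM gives classical AS
  if (info /= 0) return
  call p%set('SUB_SOLVE', 'ILU', info)
  if (info /= 0) return
  call p%set('SUB_FILLIN', fill, info)
end subroutine set_ras_smoother
```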






    what

    data type

    val

    default

    comments






    POLY_VARIANT

    character(len=*)

    CHEB_4

    CHEB_4_OPT

    CHEB_1_OPT

    CHEB_4

    Select the type of polynomial accelerator. The CHEB_4 and CHEB_4_OPT types are those based on the Chebyshev polynomials of the 4th kind described in [27]. The CHEB_1_OPT version is the one described in [15] and based on the Chebyshev polynomials of the 1st kind.






    POLY_RHO_ESTIMATE

    character(len=*)

    POLY_RHO_EST_POWER

    POLY_RHO_EST_POWER

    Algorithm for estimating the spectral radius of the smoother to which the polynomial acceleration is applied. The only implemented algorithm is the power method; see also the two following options.






    POLY_RHO_ESTIMATE_ITERATIONS

    integer

    Any integer number ≥ 1

    20

    Number of iterations for the spectral radius estimate.






    POLY_RHO_BA

    real(kind_parameter)

    Any real number in (0,1]

    1

    Sets an estimate of the spectral radius of the base smoother to which the polynomial accelerator is applied.






    Table 9: Parameters defining the smoother or the details of the one-level preconditioner (continued).
    + + + +
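    The polynomial smoother options of Tables 7 and 9 can be combined as in the sketch below, which selects a degree-4 accelerator based on Chebyshev polynomials of the 4th kind. The sketch assumes the double-precision real version of the library (amg_dprec_type) and the modules psb_base_mod and amg_prec_mod as providers of the types and kinds; the subroutine name and the chosen values are illustrative only.

```fortran
! Sketch only: polynomial-accelerated smoother of degree 4.
subroutine set_poly_smoother(p, info)
  use psb_base_mod          ! assumed to provide psb_ipk_
  use amg_prec_mod          ! assumed to provide amg_dprec_type
  implicit none
  type(amg_dprec_type), intent(inout) :: p
  integer(psb_ipk_), intent(out)      :: info
  integer(psb_ipk_) :: degree, rho_iters

  degree    = 4             ! matrix-vector products per smoother application
  rho_iters = 20            ! power-method iterations (documented default)

  call p%set('SMOOTHER_TYPE', 'POLY', info)
  if (info /= 0) return
  call p%set('POLY_DEGREE', degree, info)      ! replaces SMOOTHER_SWEEPS for POLY
  if (info /= 0) return
  call p%set('POLY_VARIANT', 'CHEB_4', info)
  if (info /= 0) return
  call p%set('POLY_RHO_ESTIMATE_ITERATIONS', rho_iters, info)
end subroutine set_poly_smoother
```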

    +
    + + + +

    5.3 Method hierarchy_build

    +
    +

    +

    call p%hierarchy_build(a,desc_a,info)
    +

    +

    This method builds the hierarchy of matrices and restriction/prolongation operators for the multilevel preconditioner p, according to the requirements made by the user through the methods init and set.

    Arguments +

    + + + + + +

    a

    type(psb_xspmat_type), intent(in).

    The sparse matrix structure containing the local part of the matrix to be preconditioned. Note that x must be chosen according to the real/complex, single/double precision version of AMG4PSBLAS under use. See the PSBLAS User’s Guide for details [21].

    desc_a

    type(psb_desc_type), intent(in).

    The communication descriptor of a. See the PSBLAS User’s Guide for details [21].

    info

    integer, intent(out).

    Error code. If no error, 0 is returned. See Section 7 for details.

    + + + +

    5.4 Method smoothers_build
    +
    +

    +

    call p%smoothers_build(a,desc_a,info[,amold,vmold,imold])
    +

    +

    This method builds the smoothers and the coarsest-level solvers for the multilevel preconditioner p, according to the requirements made by the user through the methods init and set, and based on the aggregation hierarchy produced by a previous call to hierarchy_build (see Section 5.3).

    Arguments +

    + + + + + + + + + + + +

    a

    type(psb_xspmat_type), intent(in).

    The sparse matrix structure containing the local part of the matrix to be preconditioned. Note that x must be chosen according to the real/complex, single/double precision version of AMG4PSBLAS under use. See the PSBLAS User’s Guide for details [21].

    desc_a

    type(psb_desc_type), intent(in).

    The communication descriptor of a. See the PSBLAS User’s Guide for details [21].

    info

    integer, intent(out).

    Error code. If no error, 0 is returned. See Section 7 for details.

    amold

    class(psb_x_base_sparse_mat), intent(in), optional.

    The desired dynamic type for internal matrix components; this allows e.g. running on GPUs. It need not be the same on all processes. See the PSBLAS User’s Guide for details [21].

    vmold

    class(psb_x_base_vect_type), intent(in), optional.

    The desired dynamic type for internal vector components; this allows e.g. running on GPUs.

    imold

    class(psb_i_base_vect_type), intent(in), optional.

    The desired dynamic type for internal integer vector components; this allows e.g. running on GPUs.

    + + + +
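    Splitting the build into hierarchy_build and smoothers_build makes it possible to adjust level-dependent settings once the number of levels is known. The following sketch follows the pattern of the example in Section 6 (get_nlevs, then set with ilev and ilmax). It assumes the double-precision real version of the library and the modules psb_base_mod and amg_prec_mod as providers of the types and kinds; the subroutine name is hypothetical, and the target attributes on a and desc_a are a precaution in case the preconditioner keeps references to them.

```fortran
! Sketch only: two-phase build with a level-dependent setting in between.
subroutine build_two_phase(a, desc_a, p, info)
  use psb_base_mod          ! assumed to provide psb_dspmat_type, psb_desc_type, psb_ipk_
  use amg_prec_mod          ! assumed to provide amg_dprec_type
  implicit none
  type(psb_dspmat_type), intent(inout), target :: a
  type(psb_desc_type), intent(inout), target   :: desc_a
  type(amg_dprec_type), intent(inout)          :: p
  integer(psb_ipk_), intent(out)               :: info
  integer(psb_ipk_) :: nlev, nsweeps

  ! Phase 1: matrices and restriction/prolongation operators
  call p%hierarchy_build(a, desc_a, info)
  if (info /= 0) return

  ! The number of levels is only known after phase 1
  nlev    = p%get_nlevs()
  nsweeps = 2
  ! Two smoother sweeps on every level except the coarsest one
  call p%set('SMOOTHER_SWEEPS', nsweeps, info, &
       & ilev=1_psb_ipk_, ilmax=max(1_psb_ipk_, nlev-1))
  if (info /= 0) return

  ! Phase 2: smoothers and coarsest-level solver
  call p%smoothers_build(a, desc_a, info)
end subroutine build_two_phase
```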

    5.5 Method build

    +
    +

    +

    call p%build(a,desc_a,info[,amold,vmold,imold])
    +

    +

    This method builds the preconditioner p according to the requirements made by the user through the methods init and set (see Sections 5.3 and 5.4 for multilevel preconditioners). It is mostly provided for backward compatibility; indeed, it is internally implemented by invoking the two previous methods hierarchy_build and smoothers_build, whose nomenclature would however be somewhat unnatural when dealing with simple one-level preconditioners.

    Arguments +

    + + + + + + + + + + + +

    a

    type(psb_xspmat_type), intent(in).

    The sparse matrix structure containing the local part of the matrix to be preconditioned. Note that x must be chosen according to the real/complex, single/double precision version of AMG4PSBLAS under use. See the PSBLAS User’s Guide for details [21].

    desc_a

    type(psb_desc_type), intent(in).

    The communication descriptor of a. See the PSBLAS User’s Guide for details [21].

    info

    integer, intent(out).

    Error code. If no error, 0 is returned. See Section 7 for details.

    amold

    class(psb_x_base_sparse_mat), intent(in), optional.

    The desired dynamic type for internal matrix components; this allows e.g. running on GPUs. It need not be the same on all processes. See the PSBLAS User’s Guide for details [21].

    vmold

    class(psb_x_base_vect_type), intent(in), optional.

    The desired dynamic type for internal vector components; this allows e.g. running on GPUs.

    imold

    class(psb_i_base_vect_type), intent(in), optional.

    The desired dynamic type for internal integer vector components; this allows e.g. running on GPUs.

    +

    The method can be used to build multilevel preconditioners too. + + + +
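    For a one-level preconditioner the single build call is the natural choice. The sketch below initializes a block-Jacobi preconditioner and builds it in one step. It assumes the double-precision real version of the library, the modules psb_base_mod and amg_prec_mod as providers of the types and kinds, and that BJAC is one of the preconditioner types of Table 1 (it appears as such in the KRM_KPREC entry of Table 6); the subroutine name is hypothetical.

```fortran
! Sketch only: one-level block-Jacobi preconditioner built in a single call.
subroutine build_one_level(ctxt, a, desc_a, p, info)
  use psb_base_mod          ! assumed to provide the PSBLAS types and psb_ipk_
  use amg_prec_mod          ! assumed to provide amg_dprec_type
  implicit none
  type(psb_ctxt_type), intent(in)              :: ctxt
  type(psb_dspmat_type), intent(inout), target :: a
  type(psb_desc_type), intent(inout), target   :: desc_a
  type(amg_dprec_type), intent(inout)          :: p
  integer(psb_ipk_), intent(out)               :: info

  call p%init(ctxt, 'BJAC', info)      ! choose the preconditioner type
  if (info /= 0) return
  call p%set('SUB_SOLVE', 'ILU', info) ! documented default, set explicitly here
  if (info /= 0) return
  call p%build(a, desc_a, info)        ! hierarchy and smoothers in one step
end subroutine build_one_level
```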

    5.6 Method apply

    +
    +

    +

    call p%apply(x,y,desc_a,info [,trans,work])
    +

    +

    This method computes y = op(B^{-1}) x, where B is a previously built preconditioner, stored into p, and op denotes the preconditioner itself or its transpose, according to the value of trans. Note that, when AMG4PSBLAS is used with a Krylov solver from PSBLAS, p%apply is called within the PSBLAS method psb_krylov and hence it is completely transparent to the user.

    Arguments +

    + + + + + + + + + + + +

    x

    type(kind_parameter), dimension(:), intent(in).

    The local part of the vector x. Note that type and kind_parameter must be chosen according to the real/complex, single/double precision version of AMG4PSBLAS under use.

    y

    type(kind_parameter), dimension(:), intent(out).

    The local part of the vector y. Note that type and kind_parameter must be chosen according to the real/complex, single/double precision version of AMG4PSBLAS under use.

    desc_a

    type(psb_desc_type), intent(in).

    The communication descriptor associated to the matrix to be preconditioned.

    info

    integer, intent(out).

    Error code. If no error, 0 is returned. See Section 7 for details.

    trans

    character(len=1), optional, intent(in).

    If trans = N,n then op(B^{-1}) = B^{-1}; if trans = T,t then op(B^{-1}) = B^{-T} (transpose of B^{-1}); if trans = C,c then op(B^{-1}) = B^{-C} (conjugate transpose of B^{-1}).

    work

    type(kind_parameter), dimension(:), optional, target.

    Workspace. Its size should be at least 4 * psb_cd_get_local_cols(desc_a) (see the PSBLAS User’s Guide). Note that type and kind_parameter must be chosen according to the real/complex, single/double precision version of AMG4PSBLAS under use.

    + + + +
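    When the preconditioner is applied directly rather than through a Krylov solver, the call can be sketched as follows, with a user-provided workspace of the documented minimum size. Assumptions: the double-precision real version of the library, the modules psb_base_mod and amg_prec_mod as providers of the types and kinds, and the descriptor method get_local_cols as the way to obtain the local number of columns (the text above refers to psb_cd_get_local_cols); the subroutine name is hypothetical.

```fortran
! Sketch only: y = B^{-1} x with an explicitly allocated workspace.
subroutine apply_prec(p, x, y, desc_a, info)
  use psb_base_mod          ! assumed to provide psb_desc_type, psb_dpk_, psb_ipk_
  use amg_prec_mod          ! assumed to provide amg_dprec_type
  implicit none
  type(amg_dprec_type), intent(inout) :: p
  real(psb_dpk_), intent(inout)       :: x(:)   ! local part of x (not modified here)
  real(psb_dpk_), intent(inout)       :: y(:)   ! local part of y (result)
  type(psb_desc_type), intent(in)     :: desc_a
  integer(psb_ipk_), intent(out)      :: info
  real(psb_dpk_), allocatable, target :: work(:)
  integer :: ierr

  ! Documented minimum size: 4 times the number of local columns
  allocate(work(4*desc_a%get_local_cols()), stat=ierr)
  if (ierr /= 0) then
    info = ierr
    return
  end if

  ! trans='N' applies B^{-1} itself; 'T' or 'C' would apply its (conjugate) transpose
  call p%apply(x, y, desc_a, info, trans='N', work=work)

  deallocate(work)
end subroutine apply_prec
```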

    5.7 Method free

    +
    +

    +

    call p%free(info)
    +

    +

    This method deallocates the preconditioner data structure p. +

    Arguments +

    + +

    info

    integer, intent(out).

    Error code. If no error, 0 is returned. See Section 7 for details.

    + + +

    5.8 Method descr

    +
    +

    +

    call p%descr(info, [iout, root, verbosity])
    +

    +

    This method prints a description of the preconditioner p to the standard output or to a file. It must be called after hierarchy_build and smoothers_build, or build, have been called.

    Arguments +

    + + + + + + + +

    info

    integer, intent(out).

    Error code. If no error, 0 is returned. See Section 7 for details.

    iout

    integer, intent(in), optional.

    The id of the file where the preconditioner description will be printed; the default is the standard output.

    root

    integer, intent(in), optional.

    The id of the process where the preconditioner description will be printed; the default is psb_root_.

    verbosity

    integer, intent(in), optional.

    The verbosity level of the description. Default value is 0. For values higher than 0, it prints out further information, e.g., for a distributed multilevel preconditioner the size of the coarse matrices on every process.

    +
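    A typical use of descr right after the build phase can be sketched as follows; the optional arguments are passed by keyword. The sketch assumes the double-precision real version of the library and the modules psb_base_mod and amg_prec_mod as providers of the types and kinds; the subroutine name is hypothetical.

```fortran
! Sketch only: print a more detailed description of an already built preconditioner.
subroutine describe_prec(p, info)
  use psb_base_mod          ! assumed to provide psb_ipk_
  use amg_prec_mod          ! assumed to provide amg_dprec_type
  implicit none
  type(amg_dprec_type), intent(inout) :: p
  integer(psb_ipk_), intent(out)      :: info
  integer(psb_ipk_) :: verb

  verb = 1                  ! values above 0 request additional information

  ! iout and root are omitted: standard output, printed by the default root process
  call p%descr(info, verbosity=verb)
  if (info /= 0) write(*,*) 'descr returned error code ', info
end subroutine describe_prec
```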

    +

    5.9 Auxiliary Methods

    +

    Various functionalities are implemented as additional methods of the preconditioner object.

    +

    5.9.1 Method: dump
    +
    +

    +

    call p%dump(info[,istart,iend,prefix,head,ac,rp,smoother,solver,global_num])
    +

    + + +

    Dump the preconditioner data to file.

    Arguments +

    + + + +

    info

    integer, intent(out).

    Error code. If no error, 0 is returned. See Section 7 for details.

    amold

    class(psb_x_base_sparse_mat), intent(in), optional.

    The desired dynamic type for internal matrix components; this allows e.g. running on GPUs. It need not be the same on all processes. See the PSBLAS User’s Guide for details [21].

    +

    +

    5.9.2 Method: clone
    +
    +

    +

    call p%clone(pout,info)
    +

    +

    Create a (deep) copy of the preconditioner object. +

    Arguments +

    + + + +

    pout

    type(amg_xprec_type), intent(out).

    The copy of the preconditioner data structure. Note that x must be chosen according to the real/complex, single/double precision version of AMG4PSBLAS under use.

    info

    integer, intent(out).

    Error code. If no error, 0 is returned. See Section 7 for details.

    +

    +

    5.9.3 Method: sizeof
    +
    +

    +

    sz = p%sizeof([global])
    +

    +
    + +

    global

    logical, optional.

    Whether the global or local preconditioner memory occupation is +desired. Default: .false..

    +Return memory footprint in bytes. + + +
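    The clone and sizeof methods can be used together, for instance to keep a copy of a preconditioner and report its memory footprint. The following is a sketch under assumptions: the double-precision real version of the library, the modules psb_base_mod and amg_prec_mod as providers of the types and kinds, and a result of kind psb_epk_ for sizeof (the kind is not stated above); the subroutine name is hypothetical.

```fortran
! Sketch only: deep-copy a preconditioner and report the size of the copy.
subroutine clone_and_report(p, pcopy, info)
  use psb_base_mod          ! assumed to provide psb_ipk_ and psb_epk_
  use amg_prec_mod          ! assumed to provide amg_dprec_type
  implicit none
  type(amg_dprec_type), intent(inout) :: p
  type(amg_dprec_type), intent(inout) :: pcopy
  integer(psb_ipk_), intent(out)      :: info
  integer(psb_epk_) :: sz_local, sz_global   ! assumed result kind of sizeof

  call p%clone(pcopy, info)
  if (info /= 0) return

  sz_local  = pcopy%sizeof()                 ! footprint on this process (default)
  ! The global value presumably requires communication, so every process calls it
  sz_global = pcopy%sizeof(global=.true.)

  write(*,*) 'local bytes: ', sz_local, ' global bytes: ', sz_global
end subroutine clone_and_report
```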

    +

    5.9.4 Method: allocate_wrk
    +
    +

    +

    call p%allocate_wrk(info[, vmold])
    +

    +

    Allocate internal work vectors. Each application of the preconditioner uses a number of work vectors which are allocated internally as necessary; therefore allocation and deallocation of memory occur multiple times during the execution of a Krylov method. In most cases this strategy is perfectly acceptable, but on some platforms, most notably GPUs, memory allocation is a slow operation, and the default behaviour would lead to a slowdown. This method makes it possible to trade space for time by preallocating the internal workspace outside of the invocation of a Krylov method. When using GPUs or other specialized devices, the vmold argument is also necessary to ensure the internal work vectors are of the appropriate dynamic type to exploit the accelerator hardware; when allocation occurs internally this is taken care of based on the dynamic type of the x argument to the apply method.

    Arguments +

    + + + +

    info

    integer, intent(out).

    Error code. If no error, 0 is returned. See Section 7 for details.

    vmold

    class(psb_x_base_vect_type), intent(in), optional.

    The desired dynamic type for internal vector components; this allows e.g. running on GPUs.

    +

    +

    5.9.5 Method: free_wrk
    +
    +

    +

    call p%free_wrk(info)
    +

    + + +

    Deallocate internal work vectors. +

    Arguments +

    + +

    info

    integer, intent(out).

    Error code. If no error, 0 is returned. See Section 7 for details.
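    The intended usage pattern for allocate_wrk and free_wrk is to bracket a sequence of preconditioner applications, as in the sketch below. Assumptions: the double-precision real version of the library, the modules psb_base_mod and amg_prec_mod as providers of the types and kinds, and an apply variant that accepts PSBLAS encapsulated vectors of type psb_d_vect_type (Section 5.6 documents plain arrays); the subroutine name and the loop are purely illustrative, and in practice the loop would be replaced by a call to a Krylov solver.

```fortran
! Sketch only: preallocate the work vectors once for many applications.
subroutine apply_many(p, x, y, desc_a, nrep, info)
  use psb_base_mod          ! assumed to provide psb_d_vect_type, psb_desc_type, psb_ipk_
  use amg_prec_mod          ! assumed to provide amg_dprec_type
  implicit none
  type(amg_dprec_type), intent(inout)  :: p
  type(psb_d_vect_type), intent(inout) :: x, y
  type(psb_desc_type), intent(in)      :: desc_a
  integer, intent(in)                  :: nrep   ! number of applications
  integer(psb_ipk_), intent(out)       :: info
  integer(psb_ipk_) :: info_free
  integer :: i

  ! Allocate the internal workspace once, outside the loop
  call p%allocate_wrk(info)
  if (info /= 0) return

  do i = 1, nrep
    call p%apply(x, y, desc_a, info)   ! no per-call workspace allocation expected
    if (info /= 0) exit
  end do

  ! Always release the workspace; keep the first error code, if any
  call p%free_wrk(info_free)
  if (info == 0) info = info_free
end subroutine apply_many
```

    On GPUs the vmold argument would also be passed to allocate_wrk, as described above, so that the work vectors have the appropriate dynamic type.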

    + + + + + + + +

    + id="tailuserhtmlse5.html"> diff --git a/docs/html/userhtmlse6.html b/docs/html/userhtmlse6.html index 01eb6d6c..da9e4040 100644 --- a/docs/html/userhtmlse6.html +++ b/docs/html/userhtmlse6.html @@ -29,7 +29,7 @@ class="cmr-12">up]

    6 Adding new smoother and solver objects to AMG4PSBLAS

    Developers can add completely new smoother and/or solver classes derived from the @@ -37,7 +37,7 @@ class="cmr-12">Developers can add completely new smoother and/or solver classes class="cmr-12">base objects in the library (see Remark 2 in Section 5.2), without recompiling the Once the new smoother/solver class has been developed, to use it the multilevel preconditioners it is necessary to:

      -
    • +

      declare in the application program a variable of the new type;

    • -
    • +

      pass that variable as the argument to the set routine as in the following:

      -

      call p%set(smoother,info [,ilev,ilmax,pos])
      -call p%set(solver,info [,ilev,ilmax,pos])

      +

      call p%set(smoother,info [,ilev,ilmax,pos])
      +call p%set(solver,info [,ilev,ilmax,pos])

    • -
    • +

      link the code implementing the various methods into the application executable.

    + + +

    The new solver object is then dynamically included in the preconditioner structure, to which the preconditioner will conform, even though class="cmr-12">the AMG4PSBLAS library has not been modified to account for this new development. - - -

    It is possible to define new values for the keyword WHAT The interfaces for the calls shown above are defined using

    -

    +

    smoother

    class(amg_x_base_smoother_type)

    + style="vertical-align:baseline;" id="TBL-23-3-">

    smoother

    class(amg_x_base_smoother_type)

    The user-defined new smoother to be employed in the preconditioner.

    solver

    class(amg_x_base_solver_type)

    solver

    class(amg_x_base_solver_type)

    The user-defined new solver to be employed in the preconditioner.

    +class="cmr-12">The user-defined new solver to be employed in the preconditioner.

    The other arguments are defined in the way described in Sec. 5.2. As an example, in the pass it as follows: -

    +   
    +
       ! sparse matrix and preconditioner
       type(psb_dspmat_type) :: a
       type(amg_dprec_type)  :: prec
       type(amg_d_tlu_solver_type) :: tlusv
    +
     ......
       !
       !  prepare the preconditioner: an ML with defaults, but with TLU solver at
    @@ -230,6 +191,7 @@ class="cmr-12">pass it as follows:
       nlv = prec%get_nlevs()
       call prec%set(tlusv,   info,ilev=1,ilmax=max(1,nlv-1))
       call prec%smoothers_build(a,desc_a,info)
    +
     

    @@ -241,6 +203,9 @@ class="cmr-12">pass it as follows: + + +

    7 Error Handling

    The error handling in AMG4PSBLAS is based on the PSBLAS error handling. Error @@ -51,8 +51,8 @@ class="cmr-12">an error message should be printed. These options may be set by u class="cmr-12">PSBLAS error handling routines; for further details see the PSBLAS User’s Guide [2021]. @@ -67,7 +67,7 @@ class="cmr-12">. -

    A License

    AMG4PSBLAS is freely distributable under the following copyright terms: -

    +   
    +
                                AMG4PSBLAS  version 1.0
                   Algebraic MultiGrid Preconditioners Package
                  based on PSBLAS (Parallel Sparse BLAS version 3.7)
    +
       (C) Copyright 2021
    +
       Pasqua D’Ambra         IAC-CNR, IT
       Fabio Durastante       University of Pisa and IAC-CNR, IT
       Salvatore Filippone    University of Rome Tor-Vergata and IAC-CNR, IT
    +
       Redistribution and use in source and binary forms, with or without
       modification, are permitted provided that the following conditions
       are met:
    @@ -55,6 +59,7 @@ class="cmr-12">AMG4PSBLAS is freely distributable under the following copyright
         3. The name of the MLD2P4 group or the names of its contributors may
            not be used to endorse or promote products derived from this
            software without specific written permission.
    +
       THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
       ‘‘AS IS’’ AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
       TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
    @@ -66,6 +71,7 @@ class="cmr-12">AMG4PSBLAS is freely distributable under the following copyright
       CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
       ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
       POSSIBILITY OF SUCH DAMAGE.
    +
     

    @@ -78,14 +84,20 @@ class="cmr-12">abide by its terms: -

    +   
    +
    +
                                MLD2P4  version 2.2
       MultiLevel Domain Decomposition Parallel Preconditioners Package
                  based on PSBLAS (Parallel Sparse BLAS version 3.5)
    +
       (C) Copyright 2008-2018
    +
           Salvatore Filippone
           Pasqua D’Ambra
           Daniela di Serafino
    +
    +
       Redistribution and use in source and binary forms, with or without
       modification, are permitted provided that the following conditions
       are met:
    @@ -97,6 +109,7 @@ class="cmr-12">abide by its terms:
         3. The name of the MLD2P4 group or the names of its contributors may
            not be used to endorse or promote products derived from this
            software without specific written permission.
    +
       THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
       ‘‘AS IS’’ AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
       TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
    @@ -108,6 +121,7 @@ class="cmr-12">abide by its terms:
       CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
       ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
       POSSIBILITY OF SUCH DAMAGE.
    +
     

    AMG4PSBLAS is distributed together with (a small part of) the graph-matching @@ -118,7 +132,7 @@ class="cmr-12">AMG4PSBLAS is distributed together with (a small part of) the gra class="cmr-12">library MatchBox-P [9]. Per the license requirements, we reproduce the relevant part @@ -127,7 +141,7 @@ class="cmr-12">here. -

    +   
     // ***********************************************************************
     //
     //        MatchboxP: A C++ library for approximate weighted matching
    @@ -179,6 +193,9 @@ class="cmr-12">here.
                                                                                    
     
                                                                                    
    +                                                                               
    +
    +                                                                               
        
  12. and their variants. + + +

    This method allocates and initializes the preconditioner pThis method allocates and initializes the preconditioner p, according to the preconditioner type chosen by the user.

    Arguments

    +class="td11">

    contxt

    +class="cmr-12">. +class="td11">

    +class="td11">

    info

    +class="cmr-12">for details.

    contxt

    type(psb_ctxt_type), intent(in).

    type(psb_ctxt_type), intent(in).

    The communication context.

    The communication context.

    ptype

    character(len=*), intent(in)

    ptype

    character(len=*), intent(in) .

    The type of preconditioner. Its values are specified in Table 1.

    Note that strings are case insensitive.

    Note that strings are case insensitive.

    info

    integer, intent(out).

    integer, intent(out).

    Error code. If no error, 0 is returned. See Section 7 for details.
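    The init call can be sketched as follows for a multilevel preconditioner with default settings. Assumptions: the double-precision real version of the library (type amg_dprec_type, as in the example of Section 6), the modules psb_base_mod and amg_prec_mod as providers of the types and kinds, and ML as the Table 1 value selecting a multilevel preconditioner; the subroutine name is hypothetical.

```fortran
! Sketch only: initialize a multilevel preconditioner with default settings.
subroutine init_default_ml(ctxt, p, info)
  use psb_base_mod          ! assumed to provide psb_ctxt_type and psb_ipk_
  use amg_prec_mod          ! assumed to provide amg_dprec_type
  implicit none
  type(psb_ctxt_type), intent(in)     :: ctxt   ! communication context
  type(amg_dprec_type), intent(inout) :: p      ! preconditioner to initialize
  integer(psb_ipk_), intent(out)      :: info   ! 0 on success

  ! Strings are case insensitive, so 'ml' would be equivalent
  call p%init(ctxt, 'ML', info)
  if (info /= 0) write(*,*) 'init returned error code ', info
end subroutine init_default_ml
```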

    +class="td11"> diff --git a/docs/html/userhtmlsu9.html b/docs/html/userhtmlsu9.html index 051cae54..1fdaf8f1 100644 --- a/docs/html/userhtmlsu9.html +++ b/docs/html/userhtmlsu9.html @@ -34,40 +34,13 @@ class="cmr-12">Method set

    -

    call p%set(what,val,info [,ilev, ilmax, pos, idx])

    +

    call p%set(what,val,info [,ilev, ilmax, pos, idx])

    This method sets the parameters defining the preconditioner pThis method sets the parameters defining the preconditioner p. More precisely, the parameter identified by what is assigned the value contained in valparameter identified by what is assigned the value contained in val.

    Arguments @@ -75,25 +48,20 @@ class="cmbx-12">Arguments

    +class="td11">

    what

    +class="cmr-12">. + real(psb_dpk_), intent(in). + is of type character(len=*), it is also treated as case insensitive. +class="td11">

    info

    +class="cmr-12">for details. +class="td11">

    ilev

    +class="cmr-12">). +class="td11">

    ilmax

    + one, i.e., level 1 is the finest level. +class="td11">

    pos

    + not concern the smoothers, posis ignored. +class="td11">

    idx

    -

    what

    character(len=*).

    character(len=*).

    The parameter to be set. It can be specified through its name; the -string is case-insensitive. See Tables 2-8.

    val 

    integer or character(len=*) or real(psb_spk_)

    val

    integer or character(len=*) or real(psb_spk_) or -real(psb_dpk_), intent(in).

    The value of the parameter to be set. The list of allowed values and -the corresponding data types is given in Tables -8. When the value -is of type character(len=*), it is also treated as case insensitive.

    info

    integer, intent(out).

    integer, intent(out).

    Error code. If no error, 0 is returned. See Section 7 for details.

    ilev

    integer, optional, intent(in).

    integer, optional, intent(in).

    For the multilevel preconditioner, the level at which the -preconditioner parameter has to be set. The levels are numbered -in increasing order starting from the finest one, i.e., level 1 is the -finest level. If ilev is not present, the parameter identified by what -finest level. If ilev is not present, the parameter identified by what + is set at all levels that are appropriate (see Tables 2-8).

    ilmax

    integer, optional, intent(in).

    integer, optional, intent(in).

    For the multilevel preconditioner, when both ilev and ilmax

    For the multilevel preconditioner, when both ilev and ilmax are -present, the settings are applied at all levels ilev:ilmaxpresent, the settings are applied at all levels ilev:ilmax. When -ilev is present but ilmax is not, then the default is ilmax=ilevilev is present but ilmax is not, then the default is ilmax=ilev. -The levels are numbered in increasing order starting from the finest -one, i.e., level 1 is the finest level.

    pos

    character(len=*), optional, intent(in).

    character(len=*), optional, intent(in).

    Whether the other arguments apply only to the pre-smoother -(PRE) or to the post-smoother (POST). If pos (PRE) or to the post-smoother (POST). If pos is not present, -the other arguments are applied to both smoothers. If the -preconditioner is one-level or the parameter identified by what preconditioner is one-level or the parameter identified by what does -not concern the smoothers, pos is ignored.

    idx

    integer, optional, intent(in).

    integer, optional, intent(in).

    An auxiliary input argument that can be passed to the underlying -objects.

    + objects. +

    A variety of preconditioners can be obtained by setting the appropriate @@ -334,24 +204,28 @@ class="cmr-12">preconditioner parameters. These parameters can be logically divi i.e., parameters defining

      -
    1. +

      the type of multilevel cycle and how many cycles must be applied;

    2. -
    3. +

      the coarsening algorithm;

    4. -
    5. the solver at the coarsest level (for multilevel preconditioners only); -
    6. -
    7. the smoother of the multilevel preconditioners, or the one-level +
    8. +

      the solver at the coarsest level (for multilevel preconditioners only); +

    9. +
    10. +

      the smoother of the multilevel preconditioners, or the one-level preconditioner.

    class="cmbx-12">Remark 2. A smoother is usually obtained by combining two objects: a smoother (SMOOTHER_TYPE) and a local solver (SUB_SOLVEsmoother (SMOOTHER_TYPE) and a local solver (SUB_SOLVE), as specified in Tables 7-89. For example, the block-Jacobi smoother using ILU(0) on the blocks is obtained by combining the block-Jacobi smoother object with the ILU(0) solver @@ -412,20 +280,14 @@ class="cmr-12">smoothers. However, for simplicity, shortcuts are provided to set point-Jacobi, hybrid (forward) Gauss-Seidel, and hybrid backward Gauss-Seidel, i.e., the previous smoothers can be defined just by setting SMOOTHER_TYPE the previous smoothers can be defined just by setting SMOOTHER_TYPE to certain specific values (see Tables 7), without the need to set SUB_SOLVE ), without the need to set SUB_SOLVE as well. @@ -457,6 +319,47 @@ class="cmr-12">). class="newline" />

    Remark 3. The polynomial-accelerated smoother described in Tables 7 and 9 redefines a sweep or iteration as corresponding to the degree of the polynomial used. Consequently, the SMOOTHER_SWEEPS option is overridden by the POLY_DEGREE option. This smoother is paired with a base smoother object, whose iterations are accelerated using the specified polynomial smoothing technique. By default, the ℓ1-Jacobi smoother serves as the base smoother, offering theoretical guarantees on the resulting convergence factor [15, 27]. Alternative combinations are experimental and lack established guarantees.
    +

    Remark 4. Many of the coarsest-level solvers apply to a specific coarsest-matrix layout; therefore, setting the solver after the layout may change the layout to either @@ -464,7 +367,7 @@ class="cmr-12">layout; therefore, setting the solver after the layout may change class="cmr-12">distributed or replicated. Similarly, setting the layout after the solver may change the solver. -

    More precisely, UMFPACK and SuperLU require the coarsest-level matrix to be replicated, while SuperLU [...] 2). For the point-Jacobi and Gauss-Seidel solvers, these objects correspond to a single point-Jacobi sweep and a single Gauss-Seidel sweep, respectively, which are very poor solvers.

    On the other hand, the distributed layout can be used with any solver but UMFPACK and SuperLU; therefore, if any of these two solvers has already been [...] _Dist or KRM have been previously set, the coarsest-level solver is changed to the default sequential solver.

    In a parallel setting with many cores, we suggest that users change the default coarsest solver to KRM, i.e., a parallel distributed iterative solution of the coarsest system based on Krylov methods.
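    The interplay between solver and layout can be pictured with a small lookup. The following is an illustrative Python sketch, not AMG4PSBLAS code: the names ALLOWED_LAYOUTS and layout_after_setting_solver are hypothetical, and only the solver/layout pairs stated in this guide are encoded.

```python
# Layouts accepted by each coarsest-level solver, as stated in this guide.
# Solvers whose layout requirement is not stated here are deliberately left out.
ALLOWED_LAYOUTS = {
    "UMF": {"REPL"},
    "SLU": {"REPL"},
    "SLUDIST": {"DIST"},
    "JACOBI": {"DIST"},
    "GS": {"DIST"},
    "BJAC": {"DIST"},
    "KRM": {"DIST"},
    "MUMPS": {"REPL", "DIST"},
}


def layout_after_setting_solver(current_layout, solver):
    """Return the layout in effect once `solver` has been selected.

    Hypothetical helper: it mimics the documented behaviour that setting a
    solver may silently change a previously chosen layout.
    """
    allowed = ALLOWED_LAYOUTS[solver]
    if current_layout in allowed:
        return current_layout
    # Only reached for solvers that accept exactly one layout.
    return next(iter(allowed))
```

    For instance, choosing KRM after a replicated layout yields a distributed layout in this sketch, while MUMPS leaves either layout untouched.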

    Remark 4. The argument idx can be used to allow finer control for those solvers; for instance, by specifying the keyword MUMPS_IPAR_ENTRY and an appropriate value for idx, it is possible to set any entry in the MUMPS integer control array. See also Sec..







    what

    data type

    val

    default

    comments






    ML_CYCLE

    character(len=*)

    VCYCLE

    WCYCLE

    KCYCLE

    ADD

    VCYCLE

    Multilevel cycle: V-cycle, W-cycle, K-cycle, and additive composition.






    CYCLE_SWEEPS

    integer

    Any integer number ≥ 1

    1

    Number of multilevel cycles.






    Table 2: Parameters defining the multilevel cycle and the number of cycles to be applied.



    Note. The aggregation algorithm stops when at least one of the following criteria is met: the coarse size threshold, the minimum coarsening ratio, or the maximum number of levels is reached. Therefore, the actual number of levels may be smaller than the specified maximum number of levels.





    what

    data type

    val

    default

    comments






    MIN_COARSE_SIZE_PER_PROCESS

    integer

    Any number > 0

    200

    Coarse size threshold per process. The aggregation stops if the global number of variables of the computed coarsest matrix is lower than or equal to this threshold multiplied by the number of processes (see Note).






    MIN_COARSE_SIZE

    integer

    Any number > 0

    -1

    Coarse size threshold. The aggregation stops if the global number of variables of the computed coarsest matrix is lower than or equal to this threshold (see Note). If negative, it is ignored in favour of the default for MIN_COARSE_SIZE_PER_PROCESS.






    MIN_CR_RATIO

    real

    Any number > 1

    1.5

    Minimum coarsening ratio. The aggregation stops if the ratio between the global matrix dimensions at two consecutive levels is lower than or equal to this threshold (see Note).






    MAX_LEVS

    integer

    Any integer number > 1

    20

    Maximum number of levels. The aggregation stops if the number of levels reaches this value (see Note).






    PAR_AGGR_ALG

    character(len=*)

    ’DEC’, ’SYMDEC’, ’COUPLED’

    ’DEC’

    Parallel aggregation algorithm. The SYMDEC option applies decoupled aggregation to the sparsity pattern of A + A^T.






    AGGR_TYPE

    character(len=*)

    SOC1, SOC2, MATCHBOXP

    SOC1

    Type of aggregation algorithm: currently, for the decoupled aggregation we implement two measures of strength of connection, the one by Vaněk, Mandel and Brezina [35], and the one by Gratton et al. [24]. The coupled aggregation is based on a parallel version of the half-approximate matching implemented in the MatchBox-P software package [9].






    AGGR_SIZE

    integer

    Any integer power of 2, with aggr_size ≥ 2

    4

    Maximum size of aggregates when the coupled aggregation based on matching is applied. For aggressive coarsening with size of aggregate larger than 8 we recommend the use of smoothed prolongators. Used only with ’COUPLED’ and ’MATCHBOXP’.






    AGGR_PROL

    character(len=*)

    SMOOTHED, UNSMOOTHED

    SMOOTHED

    Prolongator used by the aggregation algorithm: smoothed or unsmoothed (i.e., tentative prolongator).












    Table 3: Parameters defining the aggregation algorithm.
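    The three stopping criteria of Table 3 can be summarised in a few lines. The following Python sketch is only an illustration under stated assumptions: the function name and its arguments are hypothetical, the coarsening ratio is assumed to be the finer dimension divided by the coarser one, and the defaults are the ones listed in the table.

```python
def aggregation_should_stop(n_coarse, n_prev, n_levels, n_procs,
                            min_coarse_size=-1,
                            min_coarse_size_per_process=200,
                            min_cr_ratio=1.5,
                            max_levs=20):
    """Return True if at least one stopping criterion of Table 3 is met.

    n_coarse : global size of the matrix just computed (coarser level)
    n_prev   : global size of the matrix at the previous (finer) level
    n_levels : number of levels built so far
    n_procs  : number of processes
    """
    # A negative MIN_COARSE_SIZE is ignored in favour of the per-process value.
    if min_coarse_size > 0:
        size_threshold = min_coarse_size
    else:
        size_threshold = min_coarse_size_per_process * n_procs

    small_enough = n_coarse <= size_threshold
    # Assumption: ratio = finer size / coarser size.
    poor_coarsening = (n_prev / n_coarse) <= min_cr_ratio
    level_limit = n_levels >= max_levs
    return small_enough or poor_coarsening or level_limit
```

    With 4 processes and the default settings, the size threshold in this sketch is 800 unknowns, so a coarse matrix with 700 rows stops the aggregation while one with 1000 rows does not, provided the other two criteria are not triggered.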

    Note. Different thresholds at different levels, such as those used in [35, Section 5.1], can be easily set by invoking the routine set with the parameter ilev.





    what

    data type

    val

    default

    comments






    AGGR_ORD

    character(len=*)

    ’NATURAL’

    ’DEGREE’

    ’NATURAL’

    Initial ordering of indices for the decoupled aggregation algorithm: either natural ordering or sorted by descending degrees of the nodes in the matrix graph.






    AGGR_THRESH

    real(kind_parameter)

    Any real number ∈ [0,1]

    0.01

    The threshold θ in the strength of connection algorithm. See also the note at the bottom of this table.






    AGGR_FILTER

    character(len=*)

    ’FILTER’

    ’NOFILTER’

    ’NOFILTER’

    Matrix used in computing the smoothed prolongator: filtered or unfiltered.











    Table 4: Parameters defining the aggregation algorithm (continued).


    Note. Defaults for COARSE_SOLVE and COARSE_SUBSOLVE are chosen in the following order:
    single precision version – MUMPS if installed, then SLU if installed, ILU otherwise;
    double precision version – UMF if installed, then MUMPS if installed, then SLU if installed, ILU otherwise.
    Note. Further options for coarse solvers are contained in Table 6. For a first use it is suggested to use the default options obtained by simply selecting the solver type.





    what

    data type

    val

    default

    comments






    COARSE_MAT

    character(len=*)

    DIST

    REPL

    REPL

    Coarsest matrix layout: distributed among the processes or replicated on each of them.






    COARSE_SOLVE

    character(len=*)

    MUMPS

    UMF

    SLU

    SLUDIST

    ILU

    JACOBI

    GS

    BJAC

    KRM

    L1-JACOBI

    L1-BJAC

    L1-FBGS

    See Note.

    Solver used at the coarsest level: sequential LU from MUMPS, UMFPACK, or SuperLU (plus triangular solve); distributed LU from MUMPS or SuperLU_Dist (plus triangular solve); point-Jacobi, hybrid Gauss-Seidel or block-Jacobi and related ℓ1-versions; Krylov Method (flexible Conjugate Gradient) coupled with the block-Jacobi preconditioner with ILU(0) on the blocks. Note that UMF and SLU require the coarsest matrix to be replicated, SLUDIST, JACOBI, GS, BJAC and KRM require it to be distributed, MUMPS can be used with either a replicated or a distributed matrix. When any of the previous solvers is specified, the matrix layout is set to a default value which allows the use of the solver (see Remark 4, p. 21). Note also that UMFPACK and SuperLU_Dist are available only in double precision.






    COARSE_SUBSOLVE

    character(len=*)

    ILU

    ILUT

    MILU

    MUMPS

    SLU

    UMF

    INVT

    INVK

    AINV

    See Note.

    Solver for the diagonal blocks of the coarsest matrix, in case the block Jacobi solver is chosen as coarsest-level solver: ILU(p), ILU(p,t), MILU(p), LU from MUMPS, SuperLU or UMFPACK (plus triangular solve), Approximate Inverses INVK(p,q), INVT(p1,p2,t1,t2) and AINV(t); note that approximate inverses are specifically suited for GPUs since they do not employ triangular system solve kernels, see [3]. Note that UMFPACK and SuperLU_Dist are available only in double precision.











    what

    data type

    val

    default

    comments






    COARSE_SWEEPS

    integer

    Any integer number > 0

    10

    Number of sweeps when JACOBI, GS or BJAC is chosen as coarsest-level solver.






    COARSE_FILLIN

    integer

    Any integer number ≥ 0

    0

    Fill-in level p of the ILU factorizations and first fill-in for the approximate inverses.






    COARSE_ILUTHRS

    real(kind_parameter)

    Any real number ≥ 0

    0

    Drop tolerance t in the ILU(p,t) factorization and first drop-tolerance for the approximate inverses.











    Table 5: Parameters defining the solver at the coarsest level (continued).
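    The order in which the defaults for COARSE_SOLVE and COARSE_SUBSOLVE are chosen (see the Note of Table 5) amounts to a first-match search. A minimal Python sketch follows; it is not library code, and the function name and the representation of the installed packages as a set of strings are assumptions made for illustration.

```python
def default_coarse_solver(precision, installed):
    """Pick the default solver following the order given in the Note of Table 5.

    precision : "single" or "double"
    installed : set of optional packages available, e.g. {"MUMPS", "SLU"}
    """
    preference = {
        "single": ["MUMPS", "SLU"],
        "double": ["UMF", "MUMPS", "SLU"],
    }[precision]
    for name in preference:
        if name in installed:
            return name
    # ILU is always available and is the final fallback.
    return "ILU"
```

    In this sketch a double precision build with only MUMPS and SuperLU installed selects MUMPS, and a build with no optional package selects ILU.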





    what

    data type

    val

    default

    comments






    BJAC_STOP

    character(len=*)

    FALSE

    TRUE

    FALSE

    Select whether to use a stopping criterion for the Block-Jacobi method used as a coarse solver.






    BJAC_TRACE

    character(len=*)

    FALSE

    TRUE

    FALSE

    Select whether to print a trace for the calculated residual for the Block-Jacobi method used as a coarse solver.






    BJAC_ITRACE

    integer

    Any integer > 0

    -1

    Number of iterations after which a trace is to be printed.






    BJAC_RESCHECK

    integer

    Any integer > 0

    -1

    Number of iterations after which a residual is to be calculated.






    BJAC_STOPTOL

    real(kind_parameter)

    Any real < 1

    0

    Tolerance for the stopping criterion on the residual.






    KRM_METHOD

    character(len=*)

    CG

    FCG

    CGS

    GCR

    BICG

    BICGSTAB

    BICGSTABL

    RGMRES

    FCG

    A string that defines the iterative method to be used when employing a Krylov method KRM as a coarse solver. CG the Conjugate Gradient method; FCG the Flexible Conjugate Gradient method; CGS the Conjugate Gradient Squared method; GCR the Generalized Conjugate Residual method; BICG the Bi-Conjugate Gradient method; BICGSTAB the Bi-Conjugate Gradient Stabilized method; BICGSTABL the Bi-Conjugate Gradient Stabilized method with restarting; RGMRES the Generalized Minimal Residual method with restarting. Refer to the PSBLAS guide [21] for further information.






    KRM_KPREC

    character(len=*)

    Table 1

    BJAC

    The one-level preconditioners from Table 1 can be used for the coarse Krylov solver.






    KRM_SUB_SOLVE

    character(len=*)

    Table 5

    ILU

    Solver for the diagonal blocks of the coarsest matrix preconditioner, in case the block Jacobi solver is chosen as KRM_KPREC: ILU(p), ILU(p,t), MILU(p), LU from MUMPS, SuperLU or UMFPACK (plus triangular solve), Approximate Inverses INVK(p,q), INVT(p1,p2,t1,t2) and AINV(t). The same caveat from Table 5 applies here.






    KRM_GLOBAL

    character(len=*)

    TRUE, FALSE

    FALSE

    Choose between a global Krylov solver, with all unknowns on a single node, or a distributed one. The default choice is the distributed solver.






    KRM_EPS

    real(kind_parameter)

    Real < 1

    10^-6

    The stopping tolerance.






    KRM_IRST

    integer

    Integer ≥ 1

    30

    An integer specifying the restart parameter. This is employed for the BiCGSTABL or RGMRES methods, otherwise it is ignored.






    KRM_ISTOPC

    integer

    Integers 1,2,3

    2

    If 1, the method uses the normwise backward error in the infinity norm; if 2, it uses the relative residual in the 2-norm; if 3, the relative residual reduction in the 2-norm is used instead; refer to the PSBLAS guide [21] for the details.






    KRM_ITMAX

    integer

    Integer ≥ 1

    40

    The maximum number of iterations to perform.






    KRM_ITRACE

    integer

    Integer ≥ 0

    -1

    If > 0 print out an informational message about convergence every KRM_ITRACE iterations. If = 0 print a message in case of convergence failure.






    KRM_FILLIN

    integer

    Integer ≥ 0

    0

    Fill-in level p of the ILU factorizations and first fill-in for the approximate inverses.







    Table 6: Additional parameters defining the solver at the coarsest level.
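    The three values of KRM_ISTOPC select different quantities to compare against KRM_EPS. The Python sketch below illustrates them on small dense data; it is not PSBLAS code. The function names are hypothetical, and the formulas are the usual textbook definitions of these measures, assumed here for illustration; the PSBLAS guide remains the authoritative reference.

```python
import math


def norm_inf_vec(v):
    return max(abs(x) for x in v)


def norm_inf_mat(a):
    # Infinity norm of a dense matrix: maximum absolute row sum.
    return max(sum(abs(x) for x in row) for row in a)


def norm2(v):
    return math.sqrt(sum(x * x for x in v))


def residual(a, x, b):
    # r = b - A x for a dense matrix stored as a list of rows.
    return [bi - sum(aij * xj for aij, xj in zip(row, x))
            for row, bi in zip(a, b)]


def stop_measure(istopc, a, x, b, r0):
    """Quantity compared against the tolerance (assumed definitions)."""
    r = residual(a, x, b)
    if istopc == 1:
        # Normwise backward error in the infinity norm.
        return norm_inf_vec(r) / (norm_inf_mat(a) * norm_inf_vec(x)
                                  + norm_inf_vec(b))
    if istopc == 2:
        # Relative residual in the 2-norm.
        return norm2(r) / norm2(b)
    if istopc == 3:
        # Residual reduction with respect to the initial residual r0.
        return norm2(r) / norm2(r0)
    raise ValueError("istopc must be 1, 2 or 3")
```

    In this sketch the iteration would stop as soon as stop_measure(...) falls below the chosen tolerance.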





    what

    data type

    val

    default

    comments






    SMOOTHER_TYPE

    character(len=*)

    JACOBI

    GS

    BGS

    BJAC

    AS

    L1-JACOBI

    L1-BJAC

    L1-FBGS

    POLY

    FBGS

    Type of smoother used in the multilevel preconditioner: point-Jacobi, hybrid (forward) Gauss-Seidel, hybrid backward Gauss-Seidel, block-Jacobi, ℓ1-Jacobi, ℓ1-hybrid (forward) Gauss-Seidel, ℓ1-point-Jacobi and Additive Schwarz, polynomial accelerators; see [15] and Remark 3 (p. 21).

    It is ignored by one-level preconditioners.






    SUB_SOLVE

    character(len=*)

    JACOBI

    GS

    BGS

    ILU

    ILUT

    MILU

    MUMPS

    SLU

    UMF

    INVT

    INVK

    AINV

    GS and BGS for pre- and post-smoothers of multilevel preconditioners, respectively

    ILU for block-Jacobi and Additive Schwarz one-level preconditioners

    The local solver to be used with the smoother or one-level preconditioner (see Remark 2, page 24): point-Jacobi, hybrid (forward) Gauss-Seidel, hybrid backward Gauss-Seidel, ILU(p), ILU(p,t), MILU(p), LU from MUMPS, SuperLU or UMFPACK (plus triangular solve), Approximate Inverses INVK(p,q), INVT(p1,p2,t1,t2) and AINV(t); note that approximate inverses are specifically suited for GPUs since they do not employ triangular system solve kernels, see [3]. See Note for details on hybrid Gauss-Seidel.






    SMOOTHER_SWEEPS

    integer

    Any integer number ≥ 0

    1

    Number of sweeps of the smoother or one-level preconditioner. In the multilevel case, no pre-smoother or post-smoother is used if this parameter is set to 0 together with pos=PRE or pos=POST, respectively. It is ignored if the smoother is POLY.






    POLY_DEGREE

    integer

    Any integer between 1 and 30

    1

    Degree of the polynomial accelerator; it is equal to the number of matrix-vector products performed by the smoother. It is ignored if the smoother is not POLY.






    Table 7: Parameters defining the smoother or the details of the one-level preconditioner.





    what

    data type

    val

    default

    comments






    SUB_OVR

    integer

    Any integer number ≥ 0

    1

    Number of overlap layers, for Additive Schwarz only.



    SUB_RESTR

    character(len=*)

    HALO

    NONE

    HALO

    Type of restriction operator, for Additive Schwarz only: HALO for taking into account the overlap, NONE for neglecting it.

    Note that HALO must be chosen for the classical Additive Schwarz smoother and its RAS variant.






    SUB_PROL

    character(len=*)

    SUM

    NONE

    NONE

    Type of prolongation operator, for Additive Schwarz only: SUM for adding the contributions from the overlap, NONE for neglecting them.

    Note that SUM must be chosen for the classical Additive Schwarz smoother, and NONE for its RAS variant.






    SUB_FILLIN

    integer

    Any integer number ≥ 0

    0

    Fill-in level p of the incomplete LU factorizations.






    SUB_ILUTHRS

    real(kind_parameter)

    Any real number ≥ 0

    0

    Drop tolerance t in the ILU(p,t) factorization.






    MUMPS_LOC_GLOB

    character(len=*)

    LOCAL_SOLVER

    GLOBAL_SOLVER

    GLOBAL_SOLVER

    Whether MUMPS should be used as a distributed solver, or as a serial solver acting only on the part of the matrix local to each process.






    MUMPS_IPAR_ENTRY

    integer

    Any integer number

    0

    Set an entry in the MUMPS integer control array, as chosen via the idx optional argument.






    MUMPS_RPAR_ENTRY

    real

    Any real number

    0

    Set an entry in the MUMPS real control array, as chosen via the idx optional argument.






    Table 8: Parameters defining the smoother or the details of the one-level preconditioner (continued).





    what

    data type

    val

    default

    comments






    POLY_VARIANT

    character(len=*)

    CHEB_4

    CHEB_4_OPT

    CHEB_1_OPT

    CHEB_4

    Select the type of polynomial accelerator. The CHEB_4 and CHEB_4_OPT types are those based on the Chebyshev polynomials of the 4th kind described in [27]. The CHEB_1_OPT version is the one described in [15] and based on the Chebyshev polynomials of the 1st kind.






    POLY_RHO_ESTIMATE

    character(len=*)

    POLY_RHO_EST_POWER

    POLY_RHO_EST_POWER

    Algorithm for estimating the spectral radius of the smoother to which the polynomial acceleration is applied. The only implemented algorithm is the power method; see also the two following options.






    POLY_RHO_ESTIMATE_ITERATIONS

    integer

    Any integer number ≥ 1

    20

    Number of iterations for the spectral radius estimate.






    POLY_RHO_BA

    real(kind_parameter)

    Any real number ∈ (0,1]

    1

    Sets an estimate of the spectral radius of the base smoother to which the polynomial accelerator is applied.






    Table 9: Parameters defining the smoother or the details of the one-level preconditioner (continued).