diff --git a/.gitignore b/.gitignore
index 6ecf4e1f..d9e7db98 100644
--- a/.gitignore
+++ b/.gitignore
@@ -20,3 +20,5 @@ autom4te.cache
# the executable from tests
runs
+# Documentation temporary files
+docs/src/userguide.pdf
diff --git a/README.md b/README.md
index 53e82e04..37f5ce5f 100644
--- a/README.md
+++ b/README.md
@@ -15,7 +15,13 @@ AMG4PSBLAS enables the user to easily specify different features of an algebraic
The package employs object-oriented design techniques in Fortran 2008, with interfaces to additional third party libraries such as MUMPS, UMFPACK, SuperLU, and SuperLU_Dist, which can be exploited in building multilevel preconditioners. The parallel implementation is based on a Single Program Multiple Data (SPMD) paradigm; the inter-process communication is based on MPI and is managed mainly through PSBLAS.
-## Main refrerences:
+## Main References:
+
+The main reference for this project is
+> D'Ambra, P., Durastante, F., & Filippone, S. (2021). AMG preconditioners for linear solvers towards extreme scale. SIAM Journal on Scientific Computing, 43(5), S679-S703.
+
+AMG4PSBLAS is the suite of preconditioners for the Parallel Sparse Computation Toolkit ([PSCToolkit](https://psctoolkit.github.io/)) suite of libraries. See the paper:
+> D’Ambra, P., Durastante, F., & Filippone, S. (2023). Parallel Sparse Computation Toolkit. Software Impacts, 15, 100463.
The main reference for features inherited from MLD2P4 is
> P. D'Ambra, D. di Serafino, S. Filippone,
@@ -24,12 +30,6 @@ The main reference for features inherited from MLD2P4 is
> ACM Transactions on Mathematical Software, 37 (3), 2010, art. 30,
> doi: 10.1145/1824801.1824808.
-The new features introduced and which led to the library's name change are described in the article
-> D'Ambra, P., Durastante, F., & Filippone, S. (2021). AMG preconditioners for linear solvers towards extreme scale. SIAM Journal on Scientific Computing, 43(5), S679-S703.
-
-AMG4PSBLAS contains the suite of preconditioners for the Parallel Sparse Computation Toolkit ([PSCToolkit](https://psctoolkit.github.io/)) suite of libraries. See the paper:
-> D’Ambra, P., Durastante, F., & Filippone, S. (2023). Parallel Sparse Computation Toolkit. Software Impacts, 15, 100463.
-
## Installing
Installation requires having a working version of the [PSBLAS](https://github.com/sfilippone/psblas3) library installed.
@@ -72,4 +72,5 @@ In the European project “Energy oriented Center of Excellence: toward exascale
- Pasqua D'Ambra (IAC-CNR, Naples, IT)
- Fabio Durastante (University of Pisa and IAC-CNR, IT)
-- Salvatore Filippone (University of Rome Tor Vergata and IAC-CNR)
+- Salvatore Filippone (University of Rome Tor Vergata and IAC-CNR, IT)
+
diff --git a/docs/amg4psblas_1.0-guide.pdf b/docs/amg4psblas_1.0-guide.pdf
index b3c4f559..03d93877 100644
Binary files a/docs/amg4psblas_1.0-guide.pdf and b/docs/amg4psblas_1.0-guide.pdf differ
diff --git a/docs/html/index.html b/docs/html/index.html
index 876b4291..b5db3f13 100644
--- a/docs/html/index.html
+++ b/docs/html/index.html
@@ -59,40 +59,33 @@ class="cmr-12">General Overview
class="cmr-12">2 Code Distribution
-
+
+
+
+
+
+
[Code Distribution
class="cmr-12">3 Configuring and Building AMG4PSBLAS
- Pasqua D’Ambra, IAC-CNR, IT;
- Fabio Durastante, University of Pisa and IAC-CNR, IT;
+ Salvatore Filippone, University of Rome Tor-Vergata and IAC-CNR, IT;
Contributors
-
Citing AMG4PSBLAS
3 Configuring and Building AMG4PSBLAS
-
3.1 Prerequisites
-
3.2 Optional third party libraries
-
3.3 Configuration options
-
3.4 Bug reporting
-
3.5 Example and test programs
@@ -100,13 +93,11 @@ class="cmr-12">Example and test programs
class="cmr-12">4 Getting Started
-
4.1 Examples
-
4.2 GPU example
@@ -114,71 +105,62 @@ class="cmr-12">GPU example
class="cmr-12">5 User Interface
-
5.1 Method init
-
5.2 Method set
-
5.3 Method hierarchy_build
-
5.4 Method smoothers_build
-
5.5 Method build
-
5.6 Method apply
-
5.7 Method free
-
5.8 Method descr
-
5.9 Auxiliary Methods
6 Adding new smoother and solver objects to AMG4PSBLAS
7 Error Handling
A License
B Contributor Covenant Code of Conduct
References
diff --git a/docs/html/userhtml.css b/docs/html/userhtml.css
index a5ede259..c29347d8 100644
--- a/docs/html/userhtml.css
+++ b/docs/html/userhtml.css
@@ -22,6 +22,9 @@
.cmmi-8{font-size:72%;font-style: italic;}
.cmsy-10x-x-120{font-size:109%;}
.cmsy-8{font-size:72%;}
+.cmtt-10{font-size:90%;font-family: monospace,monospace;}
+.cmtt-10{font-family: monospace,monospace;}
+.cmtt-10{font-family: monospace,monospace;}
.tctt-1200{font-size:109%;font-family: monospace,monospace;}
.cmmi-10x-x-109{font-style: italic;}
.cmsy-10x-x-109{}
@@ -29,9 +32,6 @@
.cmtt-10x-x-109{font-family: monospace,monospace;}
.cmtt-10x-x-109{font-family: monospace,monospace;}
.cmcsc-10x-x-109{}
-.cmtt-10{font-size:90%;font-family: monospace,monospace;}
-.cmtt-10{font-family: monospace,monospace;}
-.cmtt-10{font-family: monospace,monospace;}
.cmbx-10x-x-109{ font-weight: bold;}
.cmbx-10x-x-109{ font-weight: bold;}
.cmbx-10x-x-109{ font-weight: bold;}
@@ -42,21 +42,26 @@ p.indent{text-indent:0;}
p + p{margin-top:1em;}
p + div, p + pre {margin-top:1em;}
div + p, pre + p {margin-top:1em;}
+a { overflow-wrap: break-word; word-wrap: break-word; word-break: break-word; hyphens: auto; }
@media print {div.crosslinks {visibility:hidden;}}
+table.tabular{border-collapse: collapse; border-spacing: 0;}
a img { border-top: 0; border-left: 0; border-right: 0; }
center { margin-top:1em; margin-bottom:1em; }
td center { margin-top:0em; margin-bottom:0em; }
.Canvas { position:relative; }
img.math{vertical-align:middle;}
+div.par-math-display, div.math-display{text-align:center;}
li p.indent { text-indent: 0em }
li p:first-child{ margin-top:0em; }
li p:last-child, li div:last-child { margin-bottom:0.5em; }
+li p:first-child{ margin-bottom:0; }
li p~ul:last-child, li p~ol:last-child{ margin-bottom:0.5em; }
.enumerate1 {list-style-type:decimal;}
.enumerate2 {list-style-type:lower-alpha;}
.enumerate3 {list-style-type:lower-roman;}
.enumerate4 {list-style-type:upper-alpha;}
div.newtheorem { margin-bottom: 2em; margin-top: 2em;}
+div.newtheorem .head{font-weight: bold;}
.obeylines-h,.obeylines-v {white-space: nowrap; }
div.obeylines-v p { margin-top:0; margin-bottom:0; }
.overline{ text-decoration:overline; }
@@ -82,6 +87,7 @@ div.flushleft {text-align: left;}
.framebox-r {text-align:right;}
span.thank-mark{ vertical-align: super }
span.footnote-mark sup.textsuperscript, span.footnote-mark a sup.textsuperscript{ font-size:80%; }
+code.verb{font-family:monospace,monospace;}
div.tabular, div.center div.tabular {text-align: center; margin-top:0.5em; margin-bottom:0.5em; }
table.tabular td p{margin-top:0em;}
table.tabular {margin-left: auto; margin-right: auto;}
@@ -100,6 +106,9 @@ table[rules] {border-left:solid black 0.4pt; border-right:solid black 0.4pt; }
.hline hr, .cline hr{ height : 0px; margin:0px; }
.hline td, .cline td{ padding: 0; }
.hline hr, .cline hr{border:none;border-top:1px solid black;}
+.hline {border-top: 1px solid black;}
+.hline + .vspace:last-child{display:none;}
+.hline:first-child{border-bottom:1px solid black;border-top:none;}
.tabbing-right {text-align:right;}
div.float, div.figure {margin-left: auto; margin-right: auto;}
div.float img {text-align:center;}
@@ -130,9 +139,9 @@ div.caption span.id{font-weight: bold; white-space: nowrap; }
h1.partHead{text-align: center}
p.bibitem { text-indent: -2em; margin-left: 2em; margin-top:0.6em; margin-bottom:0.6em; }
p.bibitem-p { text-indent: 0em; margin-left: 2em; margin-top:0.6em; margin-bottom:0.6em; }
+.subsubsectionHead, .likesubsubsectionHead { font-size: 1em; }
.paragraphHead, .likeparagraphHead { margin-top:2em; font-weight: bold;}
.subparagraphHead, .likesubparagraphHead { font-weight: bold;}
-.quote {margin-bottom:0.25em; margin-top:0.25em; margin-left:1em; margin-right:1em; text-align:justify;}
.verse{white-space:nowrap; margin-left:2em}
div.maketitle {text-align:center;}
h2.titleHead{text-align:center;}
@@ -140,121 +149,95 @@ div.maketitle{ margin-bottom: 2em; }
div.author, div.date {text-align:center;}
div.thanks{text-align:left; margin-left:10%; font-size:85%; font-style:italic; }
div.author{white-space: nowrap;}
-.quotation {margin-bottom:0.25em; margin-top:0.25em; margin-left:1em; }
-.abstract p {margin-left:5%; margin-right:5%;}
+div.abstract p {margin-left:5%; margin-right:5%;}
div.abstract {width:100%;}
-.subsectionToc, .likesubsectionToc {margin-left:2em;}
-.subsubsectionToc, .likesubsubsectionToc {margin-left:4em;}
+.abstracttitle{text-align:center;margin-bottom:1em;}
+.subsectionToc, .likesubsectionToc {margin-left:1em;}
+.subsubsectionToc, .likesubsubsectionToc {margin-left:2em;}
+.paragraphToc, .likeparagraphToc {margin-left:3em;}
+.subparagraphToc, .likesubparagraphToc {margin-left:4em;}
.ovalbox { padding-left:3pt; padding-right:3pt; border:solid thin; }
.Ovalbox-thick { padding-left:3pt; padding-right:3pt; border:solid thick; }
.shadowbox { padding-left:3pt; padding-right:3pt; border:solid thin; border-right:solid thick; border-bottom:solid thick; }
.doublebox { padding-left:3pt; padding-right:3pt; border-style:double; border:solid thick; }
.rotatebox{display: inline-block;}
+code.lstinline{font-family:monospace,monospace;}
+pre.listings{font-family: monospace,monospace; white-space: pre-wrap; margin-top:0.5em; margin-bottom:0.5em; }
.lstlisting .label{margin-right:0.5em; }
-div.lstlisting{font-family: monospace,monospace; white-space: nowrap; margin-top:0.5em; margin-bottom:0.5em; }
-div.lstinputlisting{ font-family: monospace,monospace; white-space: nowrap; }
+pre.lstlisting{font-family: monospace,monospace; white-space: pre-wrap; margin-top:0.5em; margin-bottom:0.5em; }
+pre.lstinputlisting{ font-family: monospace,monospace; white-space: pre-wrap; }
.lstinputlisting .label{margin-right:0.5em;}
-#TBL-1 colgroup{border-left: 1px solid black;border-right:1px solid black;}
-#TBL-1{border-collapse:collapse;}
-#TBL-1 colgroup{border-left: 1px solid black;border-right:1px solid black;}
-#TBL-1{border-collapse:collapse;}
-#TBL-1 colgroup{border-left: 1px solid black;border-right:1px solid black;}
-#TBL-1{border-collapse:collapse;}
-#TBL-1 colgroup{border-left: 1px solid black;border-right:1px solid black;}
-#TBL-1{border-collapse:collapse;}
-#TBL-4 colgroup{border-left: 1px solid black;border-right:1px solid black;}
-#TBL-4{border-collapse:collapse;}
-#TBL-4 colgroup{border-left: 1px solid black;border-right:1px solid black;}
-#TBL-4{border-collapse:collapse;}
-#TBL-4 colgroup{border-left: 1px solid black;border-right:1px solid black;}
-#TBL-4{border-collapse:collapse;}
-#TBL-4 colgroup{border-left: 1px solid black;border-right:1px solid black;}
-#TBL-4{border-collapse:collapse;}
-#TBL-4 colgroup{border-left: 1px solid black;border-right:1px solid black;}
-#TBL-4{border-collapse:collapse;}
-#TBL-4 colgroup{border-left: 1px solid black;border-right:1px solid black;}
-#TBL-4{border-collapse:collapse;}
-#TBL-5 colgroup{border-left: 1px solid black;border-right:1px solid black;}
-#TBL-5{border-collapse:collapse;}
-#TBL-5 colgroup{border-left: 1px solid black;border-right:1px solid black;}
-#TBL-5{border-collapse:collapse;}
-#TBL-5 colgroup{border-left: 1px solid black;border-right:1px solid black;}
-#TBL-5{border-collapse:collapse;}
-#TBL-5 colgroup{border-left: 1px solid black;border-right:1px solid black;}
-#TBL-5{border-collapse:collapse;}
-#TBL-5 colgroup{border-left: 1px solid black;border-right:1px solid black;}
-#TBL-5{border-collapse:collapse;}
-#TBL-5 colgroup{border-left: 1px solid black;border-right:1px solid black;}
-#TBL-5{border-collapse:collapse;}
+#TBL-1-1{border-left: 1px solid black;}
+#TBL-1-1{border-right:1px solid black;}
+#TBL-1-2{border-right:1px solid black;}
+#TBL-1-3{border-right:1px solid black;}
+#TBL-4-1{border-left: 1px solid black;}
+#TBL-4-1{border-right:1px solid black;}
+#TBL-4-2{border-right:1px solid black;}
+#TBL-4-3{border-right:1px solid black;}
+#TBL-4-4{border-right:1px solid black;}
+#TBL-4-5{border-right:1px solid black;}
+#TBL-5-1{border-left: 1px solid black;}
+#TBL-5-1{border-right:1px solid black;}
+#TBL-5-2{border-right:1px solid black;}
+#TBL-5-3{border-right:1px solid black;}
+#TBL-5-4{border-right:1px solid black;}
+#TBL-5-5{border-right:1px solid black;}
td#TBL-5-10-5{border-left:solid black 0.4pt;border-right:solid black 0.4pt;}
+td#TBL-5-10-5{border-left:solid black 0.4pt;border-right:solid black 0.4pt;}
+td#TBL-5-11-5{border-left:solid black 0.4pt;border-right:solid black 0.4pt;}
td#TBL-5-11-5{border-left:solid black 0.4pt;border-right:solid black 0.4pt;}
td#TBL-5-12-5{border-left:solid black 0.4pt;border-right:solid black 0.4pt;}
-#TBL-6 colgroup{border-left: 1px solid black;border-right:1px solid black;}
-#TBL-6{border-collapse:collapse;}
-#TBL-6 colgroup{border-left: 1px solid black;border-right:1px solid black;}
-#TBL-6{border-collapse:collapse;}
-#TBL-6 colgroup{border-left: 1px solid black;border-right:1px solid black;}
-#TBL-6{border-collapse:collapse;}
-#TBL-6 colgroup{border-left: 1px solid black;border-right:1px solid black;}
-#TBL-6{border-collapse:collapse;}
-#TBL-6 colgroup{border-left: 1px solid black;border-right:1px solid black;}
-#TBL-6{border-collapse:collapse;}
-#TBL-6 colgroup{border-left: 1px solid black;border-right:1px solid black;}
-#TBL-6{border-collapse:collapse;}
+td#TBL-5-12-5{border-left:solid black 0.4pt;border-right:solid black 0.4pt;}
+#TBL-6-1{border-left: 1px solid black;}
+#TBL-6-1{border-right:1px solid black;}
+#TBL-6-2{border-right:1px solid black;}
+#TBL-6-3{border-right:1px solid black;}
+#TBL-6-4{border-right:1px solid black;}
+#TBL-6-5{border-right:1px solid black;}
+td#TBL-6-5-5{border-left:solid black 0.4pt;border-right:solid black 0.4pt;}
td#TBL-6-5-5{border-left:solid black 0.4pt;border-right:solid black 0.4pt;}
td#TBL-6-6-5{border-left:solid black 0.4pt;border-right:solid black 0.4pt;}
-#TBL-7 colgroup{border-left: 1px solid black;border-right:1px solid black;}
-#TBL-7{border-collapse:collapse;}
-#TBL-7 colgroup{border-left: 1px solid black;border-right:1px solid black;}
-#TBL-7{border-collapse:collapse;}
-#TBL-7 colgroup{border-left: 1px solid black;border-right:1px solid black;}
-#TBL-7{border-collapse:collapse;}
-#TBL-7 colgroup{border-left: 1px solid black;border-right:1px solid black;}
-#TBL-7{border-collapse:collapse;}
-#TBL-7 colgroup{border-left: 1px solid black;border-right:1px solid black;}
-#TBL-7{border-collapse:collapse;}
-#TBL-7 colgroup{border-left: 1px solid black;border-right:1px solid black;}
-#TBL-7{border-collapse:collapse;}
+td#TBL-6-6-5{border-left:solid black 0.4pt;border-right:solid black 0.4pt;}
+#TBL-7-1{border-left: 1px solid black;}
+#TBL-7-1{border-right:1px solid black;}
+#TBL-7-2{border-right:1px solid black;}
+#TBL-7-3{border-right:1px solid black;}
+#TBL-7-4{border-right:1px solid black;}
+#TBL-7-5{border-right:1px solid black;}
+td#TBL-7-5-5{border-left:solid black 0.4pt;border-right:solid black 0.4pt;}
td#TBL-7-5-5{border-left:solid black 0.4pt;border-right:solid black 0.4pt;}
td#TBL-7-6-5{border-left:solid black 0.4pt;border-right:solid black 0.4pt;}
+td#TBL-7-6-5{border-left:solid black 0.4pt;border-right:solid black 0.4pt;}
+td#TBL-7-7-5{border-left:solid black 0.4pt;border-right:solid black 0.4pt;}
td#TBL-7-7-5{border-left:solid black 0.4pt;border-right:solid black 0.4pt;}
td#TBL-7-12-5{border-left:solid black 0.4pt;border-right:solid black 0.4pt;}
+td#TBL-7-12-5{border-left:solid black 0.4pt;border-right:solid black 0.4pt;}
+td#TBL-7-13-5{border-left:solid black 0.4pt;border-right:solid black 0.4pt;}
td#TBL-7-13-5{border-left:solid black 0.4pt;border-right:solid black 0.4pt;}
-#TBL-8 colgroup{border-left: 1px solid black;border-right:1px solid black;}
-#TBL-8{border-collapse:collapse;}
-#TBL-8 colgroup{border-left: 1px solid black;border-right:1px solid black;}
-#TBL-8{border-collapse:collapse;}
-#TBL-8 colgroup{border-left: 1px solid black;border-right:1px solid black;}
-#TBL-8{border-collapse:collapse;}
-#TBL-8 colgroup{border-left: 1px solid black;border-right:1px solid black;}
-#TBL-8{border-collapse:collapse;}
-#TBL-8 colgroup{border-left: 1px solid black;border-right:1px solid black;}
-#TBL-8{border-collapse:collapse;}
-#TBL-8 colgroup{border-left: 1px solid black;border-right:1px solid black;}
-#TBL-8{border-collapse:collapse;}
-#TBL-9 colgroup{border-left: 1px solid black;border-right:1px solid black;}
-#TBL-9{border-collapse:collapse;}
-#TBL-9 colgroup{border-left: 1px solid black;border-right:1px solid black;}
-#TBL-9{border-collapse:collapse;}
-#TBL-9 colgroup{border-left: 1px solid black;border-right:1px solid black;}
-#TBL-9{border-collapse:collapse;}
-#TBL-9 colgroup{border-left: 1px solid black;border-right:1px solid black;}
-#TBL-9{border-collapse:collapse;}
-#TBL-9 colgroup{border-left: 1px solid black;border-right:1px solid black;}
-#TBL-9{border-collapse:collapse;}
-#TBL-9 colgroup{border-left: 1px solid black;border-right:1px solid black;}
-#TBL-9{border-collapse:collapse;}
-#TBL-10 colgroup{border-left: 1px solid black;border-right:1px solid black;}
-#TBL-10{border-collapse:collapse;}
-#TBL-10 colgroup{border-left: 1px solid black;border-right:1px solid black;}
-#TBL-10{border-collapse:collapse;}
-#TBL-10 colgroup{border-left: 1px solid black;border-right:1px solid black;}
-#TBL-10{border-collapse:collapse;}
-#TBL-10 colgroup{border-left: 1px solid black;border-right:1px solid black;}
-#TBL-10{border-collapse:collapse;}
-#TBL-10 colgroup{border-left: 1px solid black;border-right:1px solid black;}
-#TBL-10{border-collapse:collapse;}
-#TBL-10 colgroup{border-left: 1px solid black;border-right:1px solid black;}
-#TBL-10{border-collapse:collapse;}
+#TBL-8-1{border-left: 1px solid black;}
+#TBL-8-1{border-right:1px solid black;}
+#TBL-8-2{border-right:1px solid black;}
+#TBL-8-3{border-right:1px solid black;}
+#TBL-8-4{border-right:1px solid black;}
+#TBL-8-5{border-right:1px solid black;}
+#TBL-9-1{border-left: 1px solid black;}
+#TBL-9-1{border-right:1px solid black;}
+#TBL-9-2{border-right:1px solid black;}
+#TBL-9-3{border-right:1px solid black;}
+#TBL-9-4{border-right:1px solid black;}
+#TBL-9-5{border-right:1px solid black;}
+#TBL-10-1{border-left: 1px solid black;}
+#TBL-10-1{border-right:1px solid black;}
+#TBL-10-2{border-right:1px solid black;}
+#TBL-10-3{border-right:1px solid black;}
+#TBL-10-4{border-right:1px solid black;}
+#TBL-10-5{border-right:1px solid black;}
+#TBL-11-1{border-left: 1px solid black;}
+#TBL-11-1{border-right:1px solid black;}
+#TBL-11-2{border-right:1px solid black;}
+#TBL-11-3{border-right:1px solid black;}
+#TBL-11-4{border-right:1px solid black;}
+#TBL-11-5{border-right:1px solid black;}
/* end css.sty */
diff --git a/docs/html/userhtml.html b/docs/html/userhtml.html
index 876b4291..b5db3f13 100644
--- a/docs/html/userhtml.html
+++ b/docs/html/userhtml.html
@@ -59,40 +59,33 @@ class="cmr-12">General Overview
class="cmr-12">2 Code Distribution
-
Contributors
-
Citing AMG4PSBLAS
3 Configuring and Building AMG4PSBLAS
-
3.1 Prerequisites
-
3.2 Optional third party libraries
-
3.3 Configuration options
-
3.4 Bug reporting
-
3.5 Example and test programs
@@ -100,13 +93,11 @@ class="cmr-12">Example and test programs
class="cmr-12">4 Getting Started
-
4.1 Examples
-
4.2 GPU example
@@ -114,71 +105,62 @@ class="cmr-12">GPU example
class="cmr-12">5 User Interface
-
5.1 Method init
-
5.2 Method set
-
5.3 Method hierarchy_build
-
5.4 Method smoothers_build
-
5.5 Method build
-
5.6 Method apply
-
5.7 Method free
-
5.8 Method descr
-
5.9 Auxiliary Methods
6 Adding new smoother and solver objects to AMG4PSBLAS
7 Error Handling
A License
B Contributor Covenant Code of Conduct
References
diff --git a/docs/html/userhtmlli1.html b/docs/html/userhtmlli1.html
index dcc83f43..27911345 100644
--- a/docs/html/userhtmlli1.html
+++ b/docs/html/userhtmlli1.html
@@ -145,6 +145,13 @@ class="cmr-12">of AMG4PSBLAS.
+
3.1 Prerequisites
-
3.2 Optional third party libraries
-
3.3 Configuration options
-
3.4 Bug reporting
-
3.5 Example and test programs
@@ -72,13 +67,11 @@ class="cmr-12">Example and test programs
class="cmr-12">4 Getting Started
-
4.1 Examples
-
4.2 GPU example
@@ -86,83 +79,64 @@ class="cmr-12">GPU example
class="cmr-12">5 User Interface
-
5.1 Method init
-
5.2 Method set
-
5.3 Method hierarchy_build
-
5.4 Method smoothers_build
-
5.5 Method build
-
5.6 Method apply
-
5.7 Method free
-
5.8 Method descr
-
5.9 Auxiliary Methods
-
5.9.1 Method: dump
-
5.9.2 Method: clone
-
5.9.3 Method: sizeof
-
5.9.4 Method: allocate_wrk
-
5.9.5 Method: free_wrk
up] Contributors
-
diff --git a/docs/html/userhtmlli4.html b/docs/html/userhtmlli4.html
index 2a339025..a943f3f2 100644
--- a/docs/html/userhtmlli4.html
+++ b/docs/html/userhtmlli4.html
@@ -25,7 +25,7 @@ href="userhtmlse2.html#userhtmlli4.html" >up]
When use the library, please cite the following:
@@ -41,6 +41,7 @@ class="cmr-12">When use the library, please cite the following:
archivePrefix = {arXiv},
year={2021}
}
+
@Misc{psctoolkit-web-page,
author = {D’Ambra, Pasqua and Durastante, Fabio and Filippone, Salvatore},
title = {{PSCToolkit} {W}eb page},
@@ -56,6 +57,9 @@ class="cmr-12">When use the library, please cite the following:
+
+
+
[up]
[15] P. D’Ambra, F. Durastante, S. Filippone, S. Massei, S. Thomas + Optimal Polynomial Smoothers for Parallel AMG, 2024, arXiv:2407.09848. +
+[16][17] SIAM Journal on Matrix Analysis and Applications, 20 (3), 1999, 7
[17][18] Software, 16 (1) 1990, 1–17.
[18][19] extended set of FORTRAN Basic Linear Algebra Subprograms< class="cmr-12">, ACM Transactions on Mathematical Software, 14 (1) 1988, 1–17. + + +
[19][20] Clusters, in Proc. of ParCo 2001, Parallel Computing, Advances and Current Issues, 2002. - - -
[20][21] .
[21][22] 23.
[22][23] Transactions on Mathematical Software, 26 (4), 2000, 527–55
[23][24] 2016, 23:501-518
[24][25] , MIT Press, 1998.
[25][26] Algebra Subprograms for FORTRAN usage, ACM Transactions on Mathematical Software, 5 (3), 1979, 308–323. + + +
[26][27] J. Lottes, Optimal polynomial smoothers for multigrid V-cycles, + Numerical Linear Algebra with Applications 30.6 (2023): e2518. +
+[27][29] Numerical Linear Algebra with Applications, 15 (5), 2008, 473R
[28][30] 2003.
[29][31] University Press, 1996.
[30][32] Press, 1998.
[31][33] Oosterlee, Multigrid, Academic Press, 2001.
[32][34] Aggregation Strategies on Massively Parallel Machines, in J. Donnelley, editor, Proceedings of SuperComputing 2000, Dallas, 2000. + + +
[33][35] (3) 1996, 179–196.
[5, 3133]), to be used in the iterative solution of linear systems,
- | + | (1) |
- where A is a square, real or complex, sparse symmetric positive definite (s.p.d)
@@ -121,7 +121,7 @@ class="cmr-12">5 a decoupled version of the smoothed aggregation procedure proposed in [,
3335] a coupled, parallel implementation of the Coarsening based on Compatible
Weighted Matching introduced incomputational framework [2223, 2122]. PSBLAS provides basic linear algebra operators
AMG4PSBLAS has a layered and modular software architecture where
class="cmr-12">layers can be identified. The lower layer consists of the PSBLAS kernels, the middle
one implements the construction and application phases of the preconditioners, and the
-upper one provides a uniform interface to all the preconditioners. This architecture
+upper one provides a uniform interface to all the preconditioners. This architecture
allows for different levels of use of the package: few black-box routines at the upper
.
+
+
+
[2 Code Distribution
AMG4PSBLAS is available from the web site
- where contact points for further information can be also found.
different and more stringent license, most notably the GPL, and t
class="cmr-12">into account when treating derived works.
The library defines a version string with the constant
- whose current value is 1.0whose current value is
In order to build AMG4PSBLAS it is necessary to set up a Makefile with appropriate
system-dependent variables; this is done by means of the configure system-dependent variables; this is done by means of the
The following steps are required:
Declare the preconditioner data structure. It is a derived data type,
- amg_xprec_ type Allocate and initialize the preconditioner data structure, according to a
preconditioner type chosen by the user. This is performed by the routine
- initinit, which also sets defaults for each preconditioner type selected by
the user. The preconditioner types and the defaults associated with them
@@ -98,34 +88,35 @@ class="cmr-12">are given in Table 1, where the strings used by init , where the strings used by Modify the selected preconditioner type, by properly setting preconditioner
parameters. This is performed by the routine setThis is performed by the routine Build the preconditioner for a given matrix. If the selected preconditioner is
multilevel, then two steps must be performed, as specified next.
-
-
-
Build the AMG hierarchy for a given matrix. This is performed by the
routine hierarchy_buildroutine Build the preconditioner for a given matrix. This is performed by the
routine smoothers_buildroutine If the selected preconditioner is one-level, it is built in a single step, performed by
the routine bldthe routine Apply the preconditioner at each iteration of a Krylov solver. This is performed by
the method applythe method Free the preconditioner data structure. This is performed by the routine free. This is performed by the routine
e string g deioner ’NONE’ Considered to use the PSBLAS Krylov
-solvers with no preconditioner. Considered to use the PSBLAS Krylov
+ solvers with no preconditioner.
+ ’DIAG’,
-’JACOBI’,
-’L1-JACOBI’ Diagonal preconditioner. For any zero
-diagonal entry of the matrix to be
-preconditioned, the corresponding entry
-of the preconditioner is set to 1. Diagonal preconditioner. For any zero
+ diagonal entry of the matrix to be
+ preconditioned, the corresponding entry
+ of the preconditioner is set to 1.
+ ’GS’,
-’L1-GS’ Hybrid Gauss-Seidel (forward), that is,
-global block Jacobi with Gauss-Seidel as
-local solver. Hybrid Gauss-Seidel (forward), that is,
+ global block Jacobi with Gauss-Seidel as
+ local solver.
+ ’FBGS’,
-’L1-FBGS’ Symmetrized hybrid Gauss-Seidel, that
-is, forward Gauss-Seidel followed by
-backward Gauss-Seidel. Symmetrized hybrid Gauss-Seidel, that
+ is, forward Gauss-Seidel followed by
+ backward Gauss-Seidel.
+ ’BJAC’,
-’L1-BJAC’ Block-Jacobi with ILU(0) on the local
-blocks. Block-Jacobi with ILU(0) on the local
+ blocks.
+ ’AS’ Additive Schwarz (AS), with overlap 1
-and ILU(0) on the local blocks. Additive Schwarz (AS), with overlap 1
+ and ILU(0) on the local blocks.
+ ’ML’ V-cycle with one hybrid
-forward Gauss-Seidel (GS) sweep as
-pre-smoother and one hybrid backward
-GS sweep as post-smoother, decoupled
-smoothed aggregation as coarsening
-algorithm, and LU (plus triangular solve)
-as coarsest-level solver. See the default
-values in Tables
+ Multilevel V-cycle with one hybrid
+ forward Gauss-Seidel (GS) sweep as
+ pre-smoother and one hybrid backward
+ GS sweep as post-smoother, decoupled
+ smoothed aggregation as coarsening
+ algorithm, and LU (plus triangular solve)
+ as coarsest-level solver. See the default
+ values in Tables 2-8 for further details of
-the preconditioner.
-
1.0
.
configure
script. The
distribution also includes the autoconf and automake sources employed to generate the
@@ -48,12 +47,10 @@ class="cmr-12"> 2003, with some
class="cmr-12">interfaces to external libraries in C; the Fortran compiler must support the
Fortran 2003 standard plus the extension MOLD= 2003 standard plus the extension MOLD=
feature, which enhances the usability
of ALLOCATEof ALLOCATE
. Most Fortran compilers provide this feature; in particular, this is
supported by the GNU Fortran compiler, for which we recommend to use at least
@@ -87,28 +84,23 @@ class="cmr-12">the base and optional software used by AMG4PSBLAS is given in the
class="cmr-12">sections.
3.2 Optional third party libraries
-
3.3 Configuration options
-
3.4 Bug reporting
-
3.5 Example and test programs
diff --git a/docs/html/userhtmlse4.html b/docs/html/userhtmlse4.html
index 0a78b57d..264761de 100644
--- a/docs/html/userhtmlse4.html
+++ b/docs/html/userhtmlse4.html
@@ -40,56 +40,46 @@ class="cmr-12">PSBLAS [2021].
-
amg_
xprec_
type
, where x may be s, d, c or zmay be s
, d
, c
or z
, according to the basic data
type of the sparse matrix (s = real single precision; d type of the sparse matrix (s
= real single precision; d
= real double precision;
- c = complex single precision; z c
= complex single precision; z
= complex double precision). This data
structure is accessed by the user only through the AMG4PSBLAS routines,
following an object-oriented approach.
init
to identify the
preconditioner types are also given. Note that these strings are valid also if
uppercase letters are substituted by corresponding lowercase ones.
set
. This routine must be
called if the user wants to modify the default values of the parameters
associated with the selected preconditioner type, to obtain a variant of that
preconditioner. Examples of use of set preconditioner. Examples of use of set
are given in Section 4.1; a complete
+
+
+
list of all the preconditioner parameters and their allowed and default values
8.
-
hierarchy_build
.
smoothers_build
.bld
.
apply
. When using the PSBLAS Krylov solvers, this step is
completely transparent to the user, since apply completely transparent to the user, since apply
is called by the PSBLAS routine
implementing the Krylov solver (psb_krylovimplementing the Krylov solver (psb_krylov
).
free
.
This step is complementary to step 1 and should be performed when the
@@ -231,28 +217,28 @@ class="cmr-12">. type r No preconditioner
+class="td11">
+ No preconditioner ’NONE’
+
Diagonal
+class="td11">
+ Diagonal ’DIAG’
,
+ ’JACOBI’
,
+ ’L1-JACOBI’
+
Gauss-Seidel
+class="td11">
+ Gauss-Seidel ’GS’
,
+ ’L1-GS’
+
Symmetrized Gauss-Seidel
+class="td11">
+ Symmetrized Gauss-Seidel ’FBGS’
,
+ ’L1-FBGS’
+
Block Jacobi
+class="td11">
+ Block Jacobi ’BJAC’
,
+ ’L1-BJAC’
+
Additive Schwarz
+class="td11">
+ Additive Schwarz ’AS’
+
Multilevel ’ML’
+
+ the preconditioner.
+
@@ -399,18 +362,15 @@ class="content">Preconditioner types, corresponding strings and default choices.
Note that the module amg_prec_modNote that the module amg_prec_mod
, containing the definition of the preconditioner
data type and the interfaces to the routines of AMG4PSBLAS, must be used
in any program calling such routines. The modules psb_base_modin any program calling such routines. The modules psb_base_mod
, for the
sparse matrix and communication descriptor data types, and psb_krylov_modsparse matrix and communication descriptor data types, and psb_krylov_mod
,
for interfacing with the Krylov solvers, must be also used (see Sectionproblems. However, this does not necessarily correspond to the sh
class="cmr-12">on parallel computers.
The basic user interface of AMG4PSBLAS consists of eight methods. The six methods
-init, set, build, hierarchy_build, smoothers_build and apply init, the sparse matrix data structure, containing the matrix to be preconditioned,
must be of type psb_xspmat_type must be of type the preconditioner data structure must be of type the arrays containing the vectors v and B-1v must be of type psb_xvect_type must be of type real parameters defining the preconditioner must be declared according to
the precision of the sparse matrix and preconditioner data structures (see
@@ -160,81 +138,62 @@ class="cmr-12">A description of each method is given in the remainder of this se
declare in the application program a variable of the new type;
pass that variable as the argument to the
- call p%set(smoother,info [,ilev,ilmax,pos]) link the code implementing the various methods into the application
executable. The new solver object is then dynamically included in the preconditioner structure,
to which the preconditioner will conform, even though
the AMG4PSBLAS library has not been modified to account for this new
set, build, hierarchy_build, smoothers_build and apply
encapsulate all the
functionalities for the setup and the application of any multilevel and one-level
preconditioner implemented in the package. The method free deallocates the
preconditioner data structure, while descr prints a description of the preconditioner
setup by the user. For backward compatibility, methods are also accessible as
@@ -67,53 +59,44 @@ class="cmr-12">real/complex and single/double precision data; arguments with app
must be passed to the method, i.e.,
-
+
+
+
psb_xspmat_type with x = s for real single precision, x = d for real double precision, x = c for complex single precision, x = z for complex double precision;
amg_xprec_type, with x = s, d, c, z, according to the sparse matrix data structure;
psb_xvect_type with x = s, d, c, z, in a manner completely analogous to the sparse matrix type;
5.2 Method set
-
5.3 Method hierarchy_build
-
5.4 Method smoothers_build
-
5.5 Method build
-
5.6 Method apply
-
5.7 Method free
-
5.8 Method descr
-
5.9 Auxiliary Methods
-
5.9.1 Method: dump
-
5.9.2 Method: clone
-
5.9.3 Method: sizeof
-
5.9.4 Method: allocate_wrk
-
5.9.5 Method: free
Once the new smoother/solver class has been developed, to use it in
the multilevel preconditioners it is necessary to:
-
set
routine as in the following:
-call p%set(solver,info [,ilev,ilmax,pos])
call p%set(smoother,info [,ilev,ilmax,pos])
+call p%set(solver,info [,ilev,ilmax,pos])
It is possible to define new values for the keyword WHAT in the set routine; if the
library code does not recognize a keyword, it passes it down the composition hierarchy
@@ -147,8 +113,7 @@ class="cmr-12">any keyword/value pair that does not pertain to a given solver is
ignored.
An example is provided in the source code distribution under the folder
tests/newslv. In this example we are implementing a new incomplete factorization
variant (which is simply the ILU(0) factorization under a new name). Because of the
@@ -165,61 +130,51 @@ class="cmr-12">The interfaces for the calls shown above are defined using
smoother | class(amg_x_base_smoother_type) | The user-defined new smoother to be employed in the preconditioner. |
solver | class(amg_x_base_solver_type) | The user-defined new solver to be employed in the preconditioner. |
The other arguments are defined in the way described in Sec. 5.2. As an example, in the
tests/newslv code we define a new object of type amg_d_tlu_solver_type, and we
pass it as follows:
-
+
+
  ! sparse matrix and preconditioner
  type(psb_dspmat_type) :: a
  type(amg_dprec_type) :: prec
  type(amg_d_tlu_solver_type) :: tlusv
+
  ......
  !
  ! prepare the preconditioner: an ML with defaults, but with TLU solver at
@@ -230,6 +185,7 @@ class="cmr-12">pass it as follows:
  nlv = prec%get_nlevs()
  call prec%set(tlusv, info,ilev=1,ilmax=max(1,nlv-1))
  call prec%smoothers_build(a,desc_a,info)
+
@@ -241,6 +197,9 @@ class="cmr-12">pass it as follows:
+
+
+
Error Handling
The error handling in AMG4PSBLAS is based on the PSBLAS error handling. Error
conditions are signaled via an integer argument info; whenever an error condition
is detected, an error trace stack is built by the library up to the top-level, user-callable
@@ -52,7 +51,7 @@ class="cmr-12">PSBLAS error handling routines; for further details see the PSBLA
[21].
@@ -67,6 +66,9 @@ class="cmr-12">.
+
+
+
AMG4PSBLAS is freely distributable under the following copyright
-
+
-
+
   AMG4PSBLAS version 1.0
   Algebraic MultiGrid Preconditioners Package
   based on PSBLAS (Parallel Sparse BLAS version 3.7)
+
   (C) Copyright 2021
+
   Pasqua D’Ambra         IAC-CNR, IT
   Fabio Durastante       University of Pisa and IAC-CNR, IT
   Salvatore Filippone    University of Rome Tor-Vergata and IAC-CNR, IT
+
   Redistribution and use in source and binary forms, with or without
   modification, are permitted provided that the following conditions
   are met:
@@ -55,6 +59,7 @@ class="cmr-12">AMG4PSBLAS is freely distributable under the following copyright
   3. The name of the MLD2P4 group or the names of its contributors may
   not be used to endorse or promote products derived from this
   software without specific written permission.
+
   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
   ‘‘AS IS’’ AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
   TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
@@ -66,6 +71,7 @@ class="cmr-12">AMG4PSBLAS is freely distributable under the following copyright
   CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
   ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
   POSSIBILITY OF SUCH DAMAGE.
+
@@ -78,14 +84,20 @@ class="cmr-12">abide by its terms:
+
+
+
   MLD2P4 version 2.2
   MultiLevel Domain Decomposition Parallel Preconditioners Package
   based on PSBLAS (Parallel Sparse BLAS version 3.5)
+
   (C) Copyright 2008-2018
+
   Salvatore Filippone
   Pasqua D’Ambra
   Daniela di Serafino
+
+
   Redistribution and use in source and binary forms, with or without
   modification, are permitted provided that the following conditions
   are met:
@@ -97,6 +109,7 @@ class="cmr-12">abide by its terms:
   3. The name of the MLD2P4 group or the names of its contributors may
   not be used to endorse or promote products derived from this
   software without specific written permission.
+
   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
   ‘‘AS IS’’ AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
   TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
@@ -108,6 +121,7 @@ class="cmr-12">abide by its terms:
   CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
   ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
   POSSIBILITY OF SUCH DAMAGE.
+
AMG4PSBLAS is distributed together with (a small part of) the graph-matching
@@ -127,7 +141,7 @@ class="cmr-12">here.
-
+
// ***********************************************************************
//
// MatchboxP: A C++ library for approximate weighted matching
@@ -179,6 +193,9 @@ class="cmr-12">here.
+
+
+
Contributor Covenant Code of Conduct
Our Pledge
We as members, contributors, and leaders pledge to make participation in
@@ -55,52 +55,62 @@ class="cmr-12">of behavior that contributes to a positive environment for our co
include:
-
+ + +- +
-Demonstrating empathy and kindness toward other people
- +
-Being respectful of differing opinions, viewpoints, and experiences
- +
-Giving and gracefully accepting constructive feedback
- +
-Accepting responsibility and apologizing to those affected by our mistakes, and learning from the experience
- +
Focusing on what is best not just for us as individuals, but for the overall community
Examples of unacceptable behavior include:
-
- +
-The use of sexualized language or imagery, and sexual attention or advances of any kind - - -
- +
-Trolling, insulting or derogatory comments, and personal or political attacks
- +
-Public or private harassment
- +
-Publishing others’ private information, such as a physical or email address, without their explicit permission
- +
Other conduct which could reasonably be considered inappropriate in a professional setting
Enforcement Responsibilities
Community leaders are responsible for clarifying and enforcing our standards of
@@ -127,7 +137,7 @@ class="cmr-12">appointed representative at an online or offline event. Enforceme
abusive, harassing, or otherwise unacceptable behavior may be reported to the
community leaders responsible for enforcement at eocoe@na.iac.cnr.it. All
@@ -137,8 +147,11 @@ class="cmr-12">complaints will be reviewed and investigated promptly and fairly.
leaders are obligated to respect the privacy and security of the reporter of any
incident.
+
+
+
Enforcement Guidelines
Community leaders will follow these Community Impact Guidelines in
@@ -147,17 +160,15 @@ class="cmr-12">determining the consequences for any action they deem in violatio
Conduct:
-
- +
-Correction
Community Impact: Use of inappropriate language or other behavior deemed unprofessional or unwelcome in the community. - - -
Consequence: A private, written warning from community leaders, providing
@@ -166,8 +177,9 @@ class="cmr-12">clarity around the nature of the violation and an explanation of
behavior was inappropriate. A public apology may be requested.
- +
-Warning
Community Impact: channels like social media. Violating these terms may lead to a t or permanent ban.
- +
-Temporary Ban
Community Impact: public or private interaction with the people involved, including
interaction with those enforcing the Code of Conduct, is allowed during this
period. Violating these terms may lead to a permanent ban.
+
+
+
- +
Permanent Ban
Community Impact: A permanent ban from any sort of public interaction within the community.
Attribution
This Code of Conduct is adapted from the Contributor Covenant, version 2.0,
available at https://www.contributor-covenant.org/version/2/0/code_of. Community
Impact Guidelines were inspired by Mozilla’s co enforcement ladder.
For answers to common questions about this code of conduct, see
the FAQ at https://www.contributor-covenant.org/faq. Translations are available at
@@ -255,6 +272,13 @@ class="cmr-12">.
+
+
+
+
+
+
+
diff --git a/docs/html/userhtmlsu1.html b/docs/html/userhtmlsu1.html
index 8022ba00..f9736269 100644
--- a/docs/html/userhtmlsu1.html
+++ b/docs/html/userhtmlsu1.html
@@ -33,29 +33,29 @@ class="cmr-12">The following base libraries are needed:
BLAS
[18, 19, 26] Many vendors provide optimized versions of BLAS; if no
vendor version is available for a given platform, the ATLAS software
(math-atlas.sourceforge.net) may be employed. The reference BLAS from
Netlib (www.netlib.org/blas) are meant to define the standard behaviour of
@@ -74,7 +74,7 @@ class="cmr-12">our experience is that configuring ATLAS for building full LAPACK
not always work in the expected way. Our advice is first to download the
LAPACK tarfile from www.netlib.org/lapack and install it independently of
@@ -87,14 +87,15 @@ class="cmr-12">library.
MPI
[25, 32] A version of MPI is available on most high-performance computing systems.
PSBLAS
[21, 23] Parallel Sparse BLAS (PSBLAS) is available from
psctoolkit.github.io/products/psblas/; version 3.7.0 (or later) is required.
Indeed, all the prerequisites listed so far are also prerequisites of PSBLAS.
Please note that the four previous libraries must have Fortran interfaces compatible with the compiler being used for AMG4PSBLAS.
If you want to use the PSBLAS support for NVIDIA GPUs, you will also
-need:
-PSBLAS-EXT
- Parallel Sparse BLAS (PSBLAS) Extensions, available from
- psctoolkit.github.io/products/psblasext/; version 1.3.0 (or later).
-SPGPU
- Sparse CUDA kernels for NVIDIA GPUs; available from GitHub, see
- also psctoolkit.github.io/products/psblasext/.
+need a working version of the CUDA Toolkit that is compatible with the
+compiler choice made to compile PSBLAS and AMG4PSBLAS. After that
+you will need to have configured and compiled the PSBLAS library with the
+options:
+
+./configure --enable-cuda --with-cudadir=${CUDA_HOME} --with-cudacc=xx,yy,zz
+
+Previous versions required you to have the auxiliary libraries SPGPU and
+PSBLAS-EXT compiled; this is no longer necessary because they have been integrated
+into PSBLAS and are compiled by activating the previous flags during configuration.
See also Sec .
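The PSBLAS configure options quoted above for CUDA support can be assembled with a small helper. This is only a sketch: the function name `psblas_cuda_configure_cmd` is hypothetical, the path `/usr/local/cuda` is a placeholder, and the list `70,80` is an assumed example of what might replace `xx,yy,zz`; the three flags themselves are the ones shown in the text. The helper prints the command line instead of running it, so it can be inspected before use.

```shell
#!/bin/sh
# Sketch (not part of PSBLAS): build the configure line for a CUDA-enabled
# PSBLAS build from two inputs and print it for inspection.
psblas_cuda_configure_cmd() {
  cuda_home="$1"   # CUDA Toolkit root, e.g. the value of ${CUDA_HOME}
  cuda_cc="$2"     # comma-separated list standing in for xx,yy,zz (assumed format)
  if [ -z "$cuda_home" ] || [ -z "$cuda_cc" ]; then
    echo "usage: psblas_cuda_configure_cmd CUDA_HOME CC_LIST" >&2
    return 1
  fi
  # Only the flags quoted in the documentation are emitted.
  printf './configure --enable-cuda --with-cudadir=%s --with-cudacc=%s\n' \
    "$cuda_home" "$cuda_cc"
}

# Example call with placeholder values (assumptions, adjust to the local setup).
psblas_cuda_configure_cmd "${CUDA_HOME:-/usr/local/cuda}" "70,80"
```

Printing rather than executing keeps the helper harmless on machines without a PSBLAS source tree; the printed line can then be run from the PSBLAS source directory.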