Internal doc changes.

pizdaint-runs
Salvatore Filippone 5 years ago
parent cc9ef42464
commit 6b2fa31ae1

@@ -70,14 +70,28 @@
 ! 3. Perform a local transpose;
 ! 4. Split the matrix: all local entries stay, all halo entries go into
 ! the send buffers, and are converted to global numbering;
-! 5. Do the all-to-all with our simple a2av (the exchange is with the halo
-! pattern, so the full MPI A2AV is almost certainly too heavy)
+! 5. Do the all-to-all (see below for a discussion of the alternative
+! communication strategies)
 ! 6. The receive is in the extra section of the ACOO buffer; convert
 ! the row indices to local numbering, and discard extra ones (there will
 ! be some)
-! 7. If desc_rx was required, make sure to insert the column indices
+! 7. If desc_rx was requested, make sure to insert the (new) column indices
 ! 8. Cleanup and sort the output matrix
 ! 9. Copy back into AIN or ATRANS if requested.
+!
+! There are three possible exchange algorithms:
+! 1. Use MPI_Alltoallv
+! 2. Use psb_simple_a2av
+! 3. Use psb_simple_triad_a2av
+! Default choice is 3. The MPI variant has proved to be inefficient;
+! that is because it is not persistent, therefore you pay the initialization price
+! every time, and it is not optimized for a sparse communication pattern,
+! most MPI implementations assume that all communications are non-empty.
+! The PSB_SIMPLE variants reuse the same communicator, and go for a simplistic
+! sequence of sends/receive that is quite efficient for a sparse communication
+! pattern. To be refined/reviewed in the future to compare with neighbour
+! persistent collectives.
+!
 !
 #undef SP_A2AV_MPI
 #undef SP_A2AV_XI
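
Steps 4 and 6 of the comment above (splitting off the halo entries in global numbering, then mapping received row indices back to local numbering) can be illustrated with a small serial sketch. This is not the library code: the program name, the `loc_to_glob` map, the `glob_to_loc` helper and all index values are hypothetical, and a plain linear search stands in for whatever lookup the descriptor actually provides.

```fortran
program halo_split_sketch
  implicit none
  ! All names and values below are hypothetical, chosen only for illustration.
  integer, parameter :: nrow = 3          ! locally owned indices are 1..nrow
  integer, parameter :: nz   = 5          ! entries left after the local transpose
  ! Local-to-global map: positions 1..3 are owned, positions 4..5 are halo.
  integer, parameter :: loc_to_glob(5) = [11, 12, 13, 27, 40]
  integer :: ia(nz) = [1, 4, 2, 5, 3]     ! local row indices
  integer :: ja(nz) = [2, 1, 3, 3, 1]     ! local column indices
  integer :: snd_ia(nz), snd_ja(nz)       ! send buffer, global numbering
  integer :: k, nkeep, nsnd

  ! Step 4: owned rows are compacted in place, halo rows are moved to the
  ! send buffer with both indices converted to global numbering.
  nkeep = 0
  nsnd  = 0
  do k = 1, nz
    if (ia(k) <= nrow) then
      nkeep = nkeep + 1
      ia(nkeep) = ia(k)
      ja(nkeep) = ja(k)
    else
      nsnd = nsnd + 1
      snd_ia(nsnd) = loc_to_glob(ia(k))
      snd_ja(nsnd) = loc_to_glob(ja(k))
    end if
  end do
  print '(a,i0,a,i0)', 'kept ', nkeep, ', to send ', nsnd

  ! Step 6, seen from the receiving side: a global row index is mapped back
  ! to local numbering; indices that are not owned here would be discarded.
  print '(a,i0)', 'global 12 -> local ', glob_to_loc(12)
  print '(a,i0)', 'global 99 -> local ', glob_to_loc(99)

contains

  ! Linear search over the owned part of the map; returns -1 if not owned.
  integer function glob_to_loc(ig) result(il)
    integer, intent(in) :: ig
    integer :: i
    il = -1
    do i = 1, nrow
      if (loc_to_glob(i) == ig) then
        il = i
        return
      end if
    end do
  end function glob_to_loc
end program halo_split_sketch
```

A negative result from the hypothetical `glob_to_loc` marks an entry as one of the "extra ones" that step 6 discards.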

@@ -31,11 +31,26 @@
 !
 ! File: psb_csphalo.f90
 !
-! Subroutine: psb_csphalo
+! Subroutine: psb_csphalo psb_lcsphalo
 ! This routine does the retrieval of remote matrix rows.
-! Note that retrieval is done through GTBLK, therefore it should work
+! Retrieval is done through GETROW, therefore it should work
 ! for any matrix format in A; as for the output, default is CSR.
 !
+! There is also a specialized version lc_CSR whose interface
+! is adapted for the needs of c_par_csr_spspmm.
+!
+! There are three possible exchange algorithms:
+! 1. Use MPI_Alltoallv
+! 2. Use psb_simple_a2av
+! 3. Use psb_simple_triad_a2av
+! Default choice is 3. The MPI variant has proved to be inefficient;
+! that is because it is not persistent, therefore you pay the initialization price
+! every time, and it is not optimized for a sparse communication pattern,
+! most MPI implementations assume that all communications are non-empty.
+! The PSB_SIMPLE variants reuse the same communicator, and go for a simplistic
+! sequence of sends/receive that is quite efficient for a sparse communication
+! pattern. To be refined/reviewed in the future to compare with neighbour
+! persistent collectives.
 !
 ! Arguments:
 ! a - type(psb_cspmat_type) The local part of input matrix A

@@ -70,14 +70,28 @@
 ! 3. Perform a local transpose;
 ! 4. Split the matrix: all local entries stay, all halo entries go into
 ! the send buffers, and are converted to global numbering;
-! 5. Do the all-to-all with our simple a2av (the exchange is with the halo
-! pattern, so the full MPI A2AV is almost certainly too heavy)
+! 5. Do the all-to-all (see below for a discussion of the alternative
+! communication strategies)
 ! 6. The receive is in the extra section of the ACOO buffer; convert
 ! the row indices to local numbering, and discard extra ones (there will
 ! be some)
-! 7. If desc_rx was required, make sure to insert the column indices
+! 7. If desc_rx was requested, make sure to insert the (new) column indices
 ! 8. Cleanup and sort the output matrix
 ! 9. Copy back into AIN or ATRANS if requested.
+!
+! There are three possible exchange algorithms:
+! 1. Use MPI_Alltoallv
+! 2. Use psb_simple_a2av
+! 3. Use psb_simple_triad_a2av
+! Default choice is 3. The MPI variant has proved to be inefficient;
+! that is because it is not persistent, therefore you pay the initialization price
+! every time, and it is not optimized for a sparse communication pattern,
+! most MPI implementations assume that all communications are non-empty.
+! The PSB_SIMPLE variants reuse the same communicator, and go for a simplistic
+! sequence of sends/receive that is quite efficient for a sparse communication
+! pattern. To be refined/reviewed in the future to compare with neighbour
+! persistent collectives.
+!
 !
 #undef SP_A2AV_MPI
 #undef SP_A2AV_XI
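
The comment above contrasts MPI_Alltoallv with a simple sequence of sends and receives for a sparse pattern. The following is a minimal sketch of that second idea, using only standard MPI calls; it is not the code of psb_simple_a2av or psb_simple_triad_a2av. The program name, the ring-shaped pattern (each rank sends only to its right neighbour), the tag and the message size are all assumptions made for illustration, and the receive counts are taken as already known.

```fortran
program sparse_exchange_sketch
  use mpi
  implicit none
  ! Hypothetical tag and message length, for illustration only.
  integer, parameter :: tag = 100, nmsg = 3
  integer :: ierr, me, np, p, nreq
  integer, allocatable :: sdcnt(:), rvcnt(:), sddsp(:), rvdsp(:)
  integer, allocatable :: sndbuf(:), rcvbuf(:), req(:)

  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, me, ierr)
  call MPI_Comm_size(MPI_COMM_WORLD, np, ierr)

  allocate(sdcnt(0:np-1), rvcnt(0:np-1), sddsp(0:np-1), rvdsp(0:np-1))
  allocate(req(2*np))

  ! Assumed sparse pattern: send nmsg integers to the right neighbour only,
  ! so every other count is zero and generates no message at all.
  sdcnt = 0
  rvcnt = 0
  sdcnt(mod(me+1, np))    = nmsg
  rvcnt(mod(me-1+np, np)) = nmsg

  ! Displacements are running sums of the counts, as for MPI_Alltoallv.
  sddsp(0) = 0
  rvdsp(0) = 0
  do p = 1, np-1
    sddsp(p) = sddsp(p-1) + sdcnt(p-1)
    rvdsp(p) = rvdsp(p-1) + rvcnt(p-1)
  end do

  allocate(sndbuf(sum(sdcnt)), rcvbuf(sum(rvcnt)))
  sndbuf = me
  rcvbuf = -1

  ! Post receives first, then sends, skipping every empty message.
  nreq = 0
  do p = 0, np-1
    if (rvcnt(p) > 0) then
      nreq = nreq + 1
      call MPI_Irecv(rcvbuf(rvdsp(p)+1), rvcnt(p), MPI_INTEGER, p, tag, &
           & MPI_COMM_WORLD, req(nreq), ierr)
    end if
  end do
  do p = 0, np-1
    if (sdcnt(p) > 0) then
      nreq = nreq + 1
      call MPI_Isend(sndbuf(sddsp(p)+1), sdcnt(p), MPI_INTEGER, p, tag, &
           & MPI_COMM_WORLD, req(nreq), ierr)
    end if
  end do
  call MPI_Waitall(nreq, req, MPI_STATUSES_IGNORE, ierr)

  ! Each rank should now hold the rank number of its left neighbour.
  if (all(rcvbuf == mod(me-1+np, np))) then
    print '(a,i0,a)', 'rank ', me, ': received data from left neighbour'
  end if

  call MPI_Finalize(ierr)
end program sparse_exchange_sketch
```

With this pattern each rank posts two requests regardless of the number of processes, whereas MPI_Alltoallv takes count and displacement arrays covering every rank even when most counts are zero.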

@@ -31,11 +31,26 @@
 !
 ! File: psb_dsphalo.f90
 !
-! Subroutine: psb_dsphalo
+! Subroutine: psb_dsphalo psb_ldsphalo
 ! This routine does the retrieval of remote matrix rows.
-! Note that retrieval is done through GTBLK, therefore it should work
+! Retrieval is done through GETROW, therefore it should work
 ! for any matrix format in A; as for the output, default is CSR.
 !
+! There is also a specialized version ld_CSR whose interface
+! is adapted for the needs of d_par_csr_spspmm.
+!
+! There are three possible exchange algorithms:
+! 1. Use MPI_Alltoallv
+! 2. Use psb_simple_a2av
+! 3. Use psb_simple_triad_a2av
+! Default choice is 3. The MPI variant has proved to be inefficient;
+! that is because it is not persistent, therefore you pay the initialization price
+! every time, and it is not optimized for a sparse communication pattern,
+! most MPI implementations assume that all communications are non-empty.
+! The PSB_SIMPLE variants reuse the same communicator, and go for a simplistic
+! sequence of sends/receive that is quite efficient for a sparse communication
+! pattern. To be refined/reviewed in the future to compare with neighbour
+! persistent collectives.
 !
 ! Arguments:
 ! a - type(psb_dspmat_type) The local part of input matrix A

@@ -70,14 +70,28 @@
 ! 3. Perform a local transpose;
 ! 4. Split the matrix: all local entries stay, all halo entries go into
 ! the send buffers, and are converted to global numbering;
-! 5. Do the all-to-all with our simple a2av (the exchange is with the halo
-! pattern, so the full MPI A2AV is almost certainly too heavy)
+! 5. Do the all-to-all (see below for a discussion of the alternative
+! communication strategies)
 ! 6. The receive is in the extra section of the ACOO buffer; convert
 ! the row indices to local numbering, and discard extra ones (there will
 ! be some)
-! 7. If desc_rx was required, make sure to insert the column indices
+! 7. If desc_rx was requested, make sure to insert the (new) column indices
 ! 8. Cleanup and sort the output matrix
 ! 9. Copy back into AIN or ATRANS if requested.
+!
+! There are three possible exchange algorithms:
+! 1. Use MPI_Alltoallv
+! 2. Use psb_simple_a2av
+! 3. Use psb_simple_triad_a2av
+! Default choice is 3. The MPI variant has proved to be inefficient;
+! that is because it is not persistent, therefore you pay the initialization price
+! every time, and it is not optimized for a sparse communication pattern,
+! most MPI implementations assume that all communications are non-empty.
+! The PSB_SIMPLE variants reuse the same communicator, and go for a simplistic
+! sequence of sends/receive that is quite efficient for a sparse communication
+! pattern. To be refined/reviewed in the future to compare with neighbour
+! persistent collectives.
+!
 !
 #undef SP_A2AV_MPI
 #undef SP_A2AV_XI

@@ -31,11 +31,26 @@
 !
 ! File: psb_ssphalo.f90
 !
-! Subroutine: psb_ssphalo
+! Subroutine: psb_ssphalo psb_lssphalo
 ! This routine does the retrieval of remote matrix rows.
-! Note that retrieval is done through GTBLK, therefore it should work
+! Retrieval is done through GETROW, therefore it should work
 ! for any matrix format in A; as for the output, default is CSR.
 !
+! There is also a specialized version ls_CSR whose interface
+! is adapted for the needs of s_par_csr_spspmm.
+!
+! There are three possible exchange algorithms:
+! 1. Use MPI_Alltoallv
+! 2. Use psb_simple_a2av
+! 3. Use psb_simple_triad_a2av
+! Default choice is 3. The MPI variant has proved to be inefficient;
+! that is because it is not persistent, therefore you pay the initialization price
+! every time, and it is not optimized for a sparse communication pattern,
+! most MPI implementations assume that all communications are non-empty.
+! The PSB_SIMPLE variants reuse the same communicator, and go for a simplistic
+! sequence of sends/receive that is quite efficient for a sparse communication
+! pattern. To be refined/reviewed in the future to compare with neighbour
+! persistent collectives.
 !
 ! Arguments:
 ! a - type(psb_sspmat_type) The local part of input matrix A

@@ -70,14 +70,28 @@
 ! 3. Perform a local transpose;
 ! 4. Split the matrix: all local entries stay, all halo entries go into
 ! the send buffers, and are converted to global numbering;
-! 5. Do the all-to-all with our simple a2av (the exchange is with the halo
-! pattern, so the full MPI A2AV is almost certainly too heavy)
+! 5. Do the all-to-all (see below for a discussion of the alternative
+! communication strategies)
 ! 6. The receive is in the extra section of the ACOO buffer; convert
 ! the row indices to local numbering, and discard extra ones (there will
 ! be some)
-! 7. If desc_rx was required, make sure to insert the column indices
+! 7. If desc_rx was requested, make sure to insert the (new) column indices
 ! 8. Cleanup and sort the output matrix
 ! 9. Copy back into AIN or ATRANS if requested.
+!
+! There are three possible exchange algorithms:
+! 1. Use MPI_Alltoallv
+! 2. Use psb_simple_a2av
+! 3. Use psb_simple_triad_a2av
+! Default choice is 3. The MPI variant has proved to be inefficient;
+! that is because it is not persistent, therefore you pay the initialization price
+! every time, and it is not optimized for a sparse communication pattern,
+! most MPI implementations assume that all communications are non-empty.
+! The PSB_SIMPLE variants reuse the same communicator, and go for a simplistic
+! sequence of sends/receive that is quite efficient for a sparse communication
+! pattern. To be refined/reviewed in the future to compare with neighbour
+! persistent collectives.
+!
 !
 #undef SP_A2AV_MPI
 #undef SP_A2AV_XI

@@ -31,11 +31,26 @@
 !
 ! File: psb_zsphalo.f90
 !
-! Subroutine: psb_zsphalo
+! Subroutine: psb_zsphalo psb_lzsphalo
 ! This routine does the retrieval of remote matrix rows.
-! Note that retrieval is done through GTBLK, therefore it should work
+! Retrieval is done through GETROW, therefore it should work
 ! for any matrix format in A; as for the output, default is CSR.
 !
+! There is also a specialized version lz_CSR whose interface
+! is adapted for the needs of z_par_csr_spspmm.
+!
+! There are three possible exchange algorithms:
+! 1. Use MPI_Alltoallv
+! 2. Use psb_simple_a2av
+! 3. Use psb_simple_triad_a2av
+! Default choice is 3. The MPI variant has proved to be inefficient;
+! that is because it is not persistent, therefore you pay the initialization price
+! every time, and it is not optimized for a sparse communication pattern,
+! most MPI implementations assume that all communications are non-empty.
+! The PSB_SIMPLE variants reuse the same communicator, and go for a simplistic
+! sequence of sends/receive that is quite efficient for a sparse communication
+! pattern. To be refined/reviewed in the future to compare with neighbour
+! persistent collectives.
 !
 ! Arguments:
 ! a - type(psb_zspmat_type) The local part of input matrix A
