180 lines
7.6 KiB
Plaintext
180 lines
7.6 KiB
Plaintext
Design principles for this directory.
|
|
|
|
1. What is a sparse matrix?
|
|
It is an object which does have some properties (number of rows,
|
|
number of columns, whether it is a triangle, and in that case
|
|
upper/lower, unit/nonunit), a state (null, build, assembled,
|
|
update), a type (real/complex, single/double), and a storage
|
|
format.
|
|
Thus we have a three-level inheritance chain:
|
|
i. The base object, defining the methods to set/query the various
|
|
properties, and allocate and free. Some of the property
|
|
getters/setters, allocate and free depend on the storage
|
|
format, so at this level they will just throw an error.
|
|
ii. The X_base_object, where X=s,d,c,z thus defining the
|
|
type. At this level we define the computational interfaces to
|
|
MV and SV, since they require the type of the vectors/scalars
|
|
involved (should also add NRMI here!!!!), but again they will
|
|
be empty shells. We also define the interface to CSPUT,
|
|
required to build the object, and TO_COO,FROM_COO (see
|
|
below).
|
|
iii. The X_YYY_object where the real implementation of the
|
|
MV/SV/NRMI/CSPUT/ALLOCATE/FREE/TO_COO/FROM_COO takes place.
|
|
|
|
2. What is a sparse matrix (take 2)?
|
|
The above structure by itself does not allow a sparse matrix to
|
|
switch among different storage formats during its life. To do this,
|
|
we define all of the above to be INNER objects, encapsulated in an
|
|
OUTER object which is what the rest of the library sees, as
|
|
follows:
|
|
|
|
type :: psbn_d_sparse_mat
|
|
|
|
class(psbn_d_base_sparse_mat), allocatable :: a
|
|
|
|
end type psbn_d_sparse_mat
|
|
type(psbn_d_sparse_mat) :: a
|
|
|
|
In this way we can have an outer object whose type is stable
|
|
both statically (at compile time) and at runtime, while at runtime
|
|
the type of the inner object switches from COO to CSR to whatever as
|
|
needed. All of the methods are simply thrown onto the corresponding
|
|
methods of the (allocatable, polymorphic) component A%A as needed
|
|
(provided the component is allocated, that is).
|
|
This is what is called a STATE design pattern (different from the
|
|
internal state we discussed above).
|
|
|
|
As an example, consider the allocate/build/assembly cycle:
|
|
the outer code would do the following:
|
|
1. Allocate(psbn_d_coo_sparse_mat :: a%a)
|
|
|
|
2. During the build loop a call to A%CSINS() gets translated into
|
|
CALL A%A%CSINS()
|
|
|
|
3. At assembly time the code would do the following
|
|
subroutine psb_spasb(a,....)
|
|
type(psbn_d_sparse_mat), intent(inout) :: a
|
|
|
|
class(psbn_d_base_sparse_mat), allocatable :: temp
|
|
|
|
select case (TYPE)
|
|
case('CSR')
|
|
allocate(psbn_d_csr_sparse_mat :: temp, stat=info)
|
|
end select
|
|
call temp%from_coo(a%a)
|
|
call a%a%free()
|
|
call move_alloc(temp,a%a)
|
|
|
|
|
|
4. Note in the above that to_coo, from_coo are defined so that every
|
|
conceivable storage representation provides just 2 conversion
|
|
routines, avoiding quadratic explosion. But since all have to
|
|
provide them, the to_coo/from_coo is defined in d_base_mat_mod
|
|
together with d_coo_sparse_mat, which enjoys the "crown prince"
|
|
status with respect to all the other types derived from
|
|
d_base_sparse_mat (its "siblings").
|
|
|
|
5. How does a user add a new storage format? Very simple. After
|
|
deriving the class and implementing all the necessary methods,
|
|
the user declares in the program a dummy variable of the new
|
|
inner type
|
|
|
|
type(X_YYY_sparse_mat) :: reftype
|
|
|
|
|
|
then calls
|
|
call psb_spasb(a,....,mold=reftype)
|
|
|
|
In psb_spasb we have
|
|
class(psbn_d_base_sparse_mat), intent(in), optional :: mold
|
|
|
|
if (present(mold)) then
|
|
allocate(temp,source=mold,stat=info)
|
|
end select
|
|
call temp%from_coo(a%a)
|
|
call a%a%free()
|
|
call move_alloc(temp,a%a)
|
|
|
|
AND IT'S DONE! Nothing else in the library requires the explicit
|
|
knowledge of type of MOLD.
|
|
|
|
|
|
|
|
3. Data precisoin (aka KIND / aka byte size)
|
|
Data precision is a bit of a thorny issue here, because it is used
|
|
by the Fortran language to disambiguate generic interfaces. This
|
|
means that we must be careful when choosing precision for data
|
|
structures. On the other hand, we want to have some freedom of
|
|
choice. The sticky point here is how to deal with integers, because
|
|
real and complex are already standardized on S/D/C/Z.
|
|
Integers are tricky because we do not want to use large integer
|
|
sizes (read: 8 bytes) unless they are really necessary; moreover,
|
|
the GPU code currently use 4 byte integers (and with good
|
|
reason). However if we want to tackle large index spaces, we will
|
|
need at some point 8-byte integers.
|
|
|
|
So, here is the plan.
|
|
A. We have two basic integer kinds, 4-byte PSB_MPK_ which takes its
|
|
name from being the kind that is going to be passed to MPI for
|
|
all arguments other than data buffers, and 8-byte PSB_EPK_, to
|
|
be used as necessary; the PSB_SIZEOF functions which return
|
|
data structure sizes (sometimes summed over all processes) are
|
|
always 8-byte.
|
|
|
|
B. At all levels where a function/subroutine is supposed to be
|
|
interfaced with an array of integers, there should be two
|
|
versions, distinguished by an M or an E in the specific name,
|
|
all adding to a generic set. This applies to the internal
|
|
utilities, such as sorting and reallocation.
|
|
|
|
C. For computation we have I and L, as in psb_ipk_ and
|
|
psb_lpk_. The idea is that I<=L, and I is used for almost
|
|
everything, e.g. for the integer parts of the sparse matrix data
|
|
structures. L is only used for a very small subset of data, and
|
|
specifically for the indices in *GLOBAL* numbering mode, hence
|
|
the I<=L constraint.
|
|
|
|
D. The values for I and L can be remapped independently at
|
|
configure time over M and E; thus, if a sparse matrix routine is
|
|
reallocating integer data through the generic names of the
|
|
utilities, the PSB_IPK_ is remapped at compile time onto
|
|
PSB_MPK_ or PSB_EPK_ as needed.
|
|
|
|
E. Because we must have I<=L, this means that supported
|
|
configurations are (I=4,L=4), (I=4,L=8), (I=8,L=8). Default
|
|
is (I=4,L=8), because it allows us to go to multi-billion linear
|
|
systems while still keeping all local data structures on 4-byte
|
|
integers.
|
|
|
|
F. Thus, care must be taken in defining specific interfaces: to
|
|
reiterate, if we are dealing with an interface which accepts an
|
|
integer array, it should be defined with M and E (which are
|
|
always distinct) and not I/L (which might be
|
|
indistinguishable). Example in case:
|
|
interface psb_realloc
|
|
Subroutine psb_r_m_m_rk1(len,rrax,info,pad,lb)
|
|
integer(psb_mpk_),Intent(in) :: len
|
|
integer(psb_mpk_), allocatable, intent(inout) :: rrax(:)
|
|
integer(psb_ipk_) :: info
|
|
integer(psb_mpk_), optional, intent(in) :: pad
|
|
integer(psb_mpk_), optional, intent(in) :: lb
|
|
|
|
G. The INFO argument and others related to error handling should
|
|
always be PSB_IPK_;
|
|
|
|
H. Arguments related to MPI interfacing should always be PSB_MPK_
|
|
|
|
I. Encapsulated types such as psb_i_base_vect_mod can still be I
|
|
and L, because if the name of the type is different, the types
|
|
are interpreted as distinguishable even when the contents are
|
|
identical.
|
|
|
|
L. This means that most user-level interfaces will deal in I and L,
|
|
not M and E, which are going to be used mostly in the
|
|
internals.
|
|
|
|
M. Actually, the user will probably never see M, but will (for
|
|
sizeof & friends) see E.
|
|
|
|
|