5 Communication routines

The routines in this chapter implement various global communication operators on vectors associated with a discretization mesh. For auxiliary communication routines not tied to a discretization space see 6.

5.1 psb_halo — Halo Data Communication

These subroutines gathers the values of the halo elements:

x ← x

where:

x

is a global dense submatrix.




α, x Subroutine


Integer psb_halo
Short Precision Real psb_halo
Long Precision Real psb_halo
Short Precision Complexpsb_halo
Long Precision Complex psb_halo



Table 17: Data types

call psb_halo(x, desc_a, info)
call psb_halo(x, desc_a, info, work, data)

Type:

Synchronous.

On Entry

x

global dense matrix x.
Scope: local
Type: required
Intent: inout.
Specified as: a rank one or two array or an object of type psb_T_vect_type containing numbers of type specified in Table 17.

desc_a

contains data structures for communications.
Scope: local
Type: required
Intent: in.
Specified as: a structured data of type psb_desc_type.

work

the work array.
Scope: local
Type: optional
Intent: inout.
Specified as: a rank one array of the same type of x.

data

index list selector.
Scope: global
Type: optional
Specified as: an integer. Values:psb_comm_halo_,psb_comm_mov_, psb_comm_ext_, default: psb_comm_halo_. Chooses the index list on which to base the data exchange.

On Return

x

global dense result matrix x.
Scope: local
Type: required
Intent: inout.
Returned as: a rank one or two array containing numbers of type specified in Table 17.

info

the local portion of result submatrix y.
Scope: local
Type: required
Intent: out.
An integer value that contains an error code.


PIC


Figure 3: Sample discretization mesh.


Usage Example Consider the discretization mesh depicted in fig. 3, partitioned among two processes as shown by the dashed line; the data distribution is such that each process will own 32 entries in the index space, with a halo made of 8 entries placed at local indices 33 through 40. If process 0 assigns an initial value of 1 to its entries in the x vector, and process 1 assigns a value of 2, then after a call to psb_halo the contents of the local vectors will be the following:


Process 0
Process 1
IGLOB(I)X(I) IGLOB(I)X(I)
1 1 1.0 1 33 2.0
2 2 1.0 2 34 2.0
3 3 1.0 3 35 2.0
4 4 1.0 4 36 2.0
5 5 1.0 5 37 2.0
6 6 1.0 6 38 2.0
7 7 1.0 7 39 2.0
8 8 1.0 8 40 2.0
9 9 1.0 9 41 2.0
10 10 1.010 42 2.0
11 11 1.011 43 2.0
12 12 1.012 44 2.0
13 13 1.013 45 2.0
14 14 1.014 46 2.0
15 15 1.015 47 2.0
16 16 1.016 48 2.0
17 17 1.017 49 2.0
18 18 1.018 50 2.0
19 19 1.019 51 2.0
20 20 1.020 52 2.0
21 21 1.021 53 2.0
22 22 1.022 54 2.0
23 23 1.023 55 2.0
24 24 1.024 56 2.0
25 25 1.025 57 2.0
26 26 1.026 58 2.0
27 27 1.027 59 2.0
28 28 1.028 60 2.0
29 29 1.029 61 2.0
30 30 1.030 62 2.0
31 31 1.031 63 2.0
32 32 1.032 64 2.0
33 33 2.033 25 1.0
34 34 2.034 26 1.0
35 35 2.035 27 1.0
36 36 2.036 28 1.0
37 37 2.037 29 1.0
38 38 2.038 30 1.0
39 39 2.039 31 1.0
40 40 2.040 32 1.0

5.2 psb_ovrl — Overlap Update

These subroutines applies an overlap operator to the input vector:

x ← Qx

where:

x

is the global dense submatrix x

Q

is the overlap operator; it is the composition of two operators Pa and PT.




x Subroutine


Short Precision Real psb_ovrl
Long Precision Real psb_ovrl
Short Precision Complexpsb_ovrl
Long Precision Complex psb_ovrl



Table 18: Data types

call psb_ovrl(x, desc_a, info)
call psb_ovrl(x, desc_a, info, update=update_type, work=work)

Type:

Synchronous.

On Entry

x

global dense matrix x.
Scope: local
Type: required
Intent: inout.
Specified as: a rank one or two array or an object of type psb_T_vect_type containing numbers of type specified in Table 18.

desc_a

contains data structures for communications.
Scope: local
Type: required
Intent: in.
Specified as: a structured data of type psb_desc_type.

update

Update operator.

update = psb_none_

Do nothing;

update = psb_add_

Sum overlap entries, i.e. apply PT;

update = psb_avg_

Average overlap entries, i.e. apply PaPT;

Scope: global
Intent: in.
Default: update_type = psb_avg_
Scope: global
Specified as: a integer variable.

work

the work array.
Scope: local
Type: optional
Intent: inout.
Specified as: a one dimensional array of the same type of x.

On Return

x

global dense result matrix x.
Scope: local
Type: required
Intent: inout.
Specified as: an array of rank one or two containing numbers of type specified in Table 18.

info

Error code.
Scope: local
Type: required
Intent: out.
An integer value; 0 means no error has been detected.

Notes

  1. If there is no overlap in the data distribution associated with the descriptor, no operations are performed;

  2. The operator PT performs the reduction sum of overlap elements; it is a “prolongation” operator PT that replicates overlap elements, accounting for the physical replication of data;

  3. The operator Pa performs a scaling on the overlap elements by the amount of replication; thus, when combined with the reduction operator, it implements the average of replicated elements over all of their instances.


PIC


Figure 4: Sample discretization mesh.


Example of use Consider the discretization mesh depicted in fig. 4, partitioned among two processes as shown by the dashed lines, with an overlap of 1 extra layer with respect to the partition of fig. 3; the data distribution is such that each process will own 40 entries in the index space, with an overlap of 16 entries placed at local indices 25 through 40; the halo will run from local index 41 through local index 48.. If process 0 assigns an initial value of 1 to its entries in the x vector, and process 1 assigns a value of 2, then after a call to psb_ovrl with psb_avg_ and a call to psb_halo_ the contents of the local vectors will be the following (showing a transition among the two subdomains)


Process 0
Process 1
IGLOB(I)X(I) IGLOB(I)X(I)
1 1 1.0 1 33 1.5
2 2 1.0 2 34 1.5
3 3 1.0 3 35 1.5
4 4 1.0 4 36 1.5
5 5 1.0 5 37 1.5
6 6 1.0 6 38 1.5
7 7 1.0 7 39 1.5
8 8 1.0 8 40 1.5
9 9 1.0 9 41 2.0
10 10 1.010 42 2.0
11 11 1.011 43 2.0
12 12 1.012 44 2.0
13 13 1.013 45 2.0
14 14 1.014 46 2.0
15 15 1.015 47 2.0
16 16 1.016 48 2.0
17 17 1.017 49 2.0
18 18 1.018 50 2.0
19 19 1.019 51 2.0
20 20 1.020 52 2.0
21 21 1.021 53 2.0
22 22 1.022 54 2.0
23 23 1.023 55 2.0
24 24 1.024 56 2.0
25 25 1.525 57 2.0
26 26 1.526 58 2.0
27 27 1.527 59 2.0
28 28 1.528 60 2.0
29 29 1.529 61 2.0
30 30 1.530 62 2.0
31 31 1.531 63 2.0
32 32 1.532 64 2.0
33 33 1.533 25 1.5
34 34 1.534 26 1.5
35 35 1.535 27 1.5
36 36 1.536 28 1.5
37 37 1.537 29 1.5
38 38 1.538 30 1.5
39 39 1.539 31 1.5
40 40 1.540 32 1.5
41 41 2.041 17 1.0
42 42 2.042 18 1.0
43 43 2.043 19 1.0
44 44 2.044 20 1.0
45 45 2.045 21 1.0
46 46 2.046 22 1.0
47 47 2.047 23 1.0
48 48 2.048 24 1.0

5.3 psb_gather — Gather Global Dense Matrix

These subroutines collect the portions of global dense matrix distributed over all process into one single array stored on one process.

glob-x ← collect(loc-xi)

where:

glob_x

is the global submatrix glob_x1:m,1:n

loc_xi

is the local portion of global dense matrix on process i.

collect

is the collect function.




xi,y Subroutine


Integer psb_gather
Short Precision Real psb_gather
Long Precision Real psb_gather
Short Precision Complexpsb_gather
Long Precision Complex psb_gather



Table 19: Data types

call psb_gather(glob_x, loc_x, desc_a, info, root) call psb_gather(glob_x, loc_x, desc_a, info, root)

Type:

Synchronous.

On Entry

loc_x

the local portion of global dense matrix glob_x.
Scope: local
Type: required
Intent: in.
Specified as: a rank one or two array or an object of type psb_T_vect_type indicated in Table 19.

desc_a

contains data structures for communications.
Scope: local
Type: required
Intent: in.
Specified as: a structured data of type psb_desc_type.

root

The process that holds the global copy. If root = -1 all the processes will have a copy of the global vector.
Scope: global
Type: optional
Intent: in.
Specified as: an integer variable -1 root np - 1, default -1.

On Return

glob_x

The array where the local parts must be gathered.
Scope: global
Type: required
Intent: out.
Specified as: a rank one or two array with the ALLOCATABLE attribute.

info

Error code.
Scope: local
Type: required
Intent: out.
An integer value; 0 means no error has been detected.

5.4 psb_scatter — Scatter Global Dense Matrix

These subroutines scatters the portions of global dense matrix owned by a process to all the processes in the processes grid.

loc-xi ← scatter(glob-x)

where:

glob_x

is the global matrix glob_x1:m,1:n

loc_xi

is the local portion of global dense matrix on process i.

scatter

is the scatter function.




xi,y Subroutine


Integer psb_scatter
Short Precision Real psb_scatter
Long Precision Real psb_scatter
Short Precision Complexpsb_scatter
Long Precision Complex psb_scatter



Table 20: Data types

call psb_scatter(glob_x, loc_x, desc_a, info, root, mold)

Type:

Synchronous.

On Entry

glob_x

The array that must be scattered into local pieces.
Scope: global
Type: required
Intent: in.
Specified as: a rank one or two array.

desc_a

contains data structures for communications.
Scope: local
Type: required
Intent: in.
Specified as: a structured data of type psb_desc_type.

root

The process that holds the global copy. If root = -1 all the processes have a copy of the global vector.
Scope: global
Type: optional
Intent: in.
Specified as: an integer variable -1 root np - 1, default psb_root_, i.e. process 0.

mold

The desired dynamic type for the internal vector storage.
Scope: local.
Type: optional.
Intent: in.
Specified as: an object of a class derived from psb_T_base_vect_type; this is only allowed when loc_x is of type psb_T_vect_type.

On Return

loc_x

the local portion of global dense matrix glob_x.
Scope: local
Type: required
Intent: out.
Specified as: a rank one or two ALLOCATABLE array or an object of type psb_T_vect_type containing numbers of the type indicated in Table 20.

info

Error code.
Scope: local
Type: required
Intent: out.
An integer value; 0 means no error has been detected.