Initial revision

psblas3-type-indexed
Salvatore Filippone 20 years ago
parent 97306b6d74
commit 960e13a6b4

@ -0,0 +1,114 @@
Changelog. A lot less detailed than usual, at least for past
history.
2005/05/04: Now enabled SuperLU complete factorization as basis for AS.
2005/04/29: First version with decoupled 2-level.
2005/04/06: Started work on decoupling the preconditioner aggregation
for 2-level from the main factorization.
2005/03/30: First version of new DSC/SP allocate/insert/assembly
routines.
2005/03/17: First version of RGMRES. To be refined.
2005/03/08: dSwapTran aligned with dSwapData. Taken out SwapOverlap.
also moved onto iSwapX.
2005/03/07: dSwapData rewritten to achieve: 1. better performance;
2. more flexible functionality. It is now possible to
avoid SwapOvrlap entirely, relying on just SwapData.
SwapTran is still alive, since it reads the descriptors in
"transpose" mode. Also, added work areas to preconditioner
routine, to avoid excessive allocation in the halo/overlap
exchange.
2005/03/04: Had to put in a workaround for a gfortran bug:
tolower/toupper cannot be functions.
2005/02/09: Explicit storage choice for the smoother. This seems
to be changing a little bit the actual preconditioner.
To be evaluated further.
2005/02/08: Renamed F90_PSPREC to PSB_PRCAPLY and Preconditioner to
PSB_PRCBLD. Changed the way PRCAPLY decides what to do.
Still needs a PSB_PRCSET to be called before PRCBLD.
2005/01/28: Started moving functionalities to a SERIAL F90 layer. Also
defined a new COMM layer, to enable implementing SPMM
directly in F90.
2005/01/20: Finally taken out a direct call to the F77 DCSDP from
SPASB.
2005/01/18: After much work, we now have 2-level Additive Schwarz
prototype implemented and working. We now start a major
code cleanup that will take some time. Mainly we want to
move a lot of the serial F77 functionality into a new F95
serial layer, to simplify the parallel F95 code.
2004/11/25: Following the introduction of Additive Shwarz and
variants, we have now renamed DECOMP_ and friends as
DESC_; this makes things more readable. Sooner or later
we're going to merge this into mainline, but this version
is still very much in a state of flux.
2004/07/18: For use with gfortran we need to declare the pointer
components with NULL() initialization. This rules out
VAST and PGI.
2004/07/15: First development version with gfortran from the current
snapshot of gcc 3.5.0.
It is now possible in PSI_dSwapData to opt for
SEND|RECEIVE|SYNC data exchange; plan is to extend to all
data exchange functions, plus making it available as an
option from the F90 level.
2004/07/06: Merged in a lot of stuff coming mainly from the ASM
development; full merge will have to wait a little more.
Among other things:
use of psimod
new choice parms for overlap
new data exchange for swapdata, to be extended.
multicolumn CSMM.
use psrealloc
new format for marking a matrix as suitable for update.
2003/12/09: Changed DSALLOC and DSASB to make sure whenever a dense
matrix is allocated it is also zeroed out.
2003/10/13: Added call to BLACS_SET in the solvers to ensure global
heterogeneous coherence in the combine operations.
2003/09/30: Added LOC_TO_GLOB and GLOB_TO_LOC support routines.
2003/09/30: Changed interface for smart update capabilities: choose
with optional parameters in ASB routines.
2003/09/16: IFC 7.0 had a strange behaviour in the test programs:
sometimes the declaration of PARTS dummy argument with an
INTERFACE would not work, requiring an EXTERNAL
declaration. The proper INTERFACE works now with 7.1.
2003/03/10: Halo data exchange in F90_PSHALO can now be applied to
integer data; create appropriate support routines.
2002/12/05: Initial version of Fileread sample programs.
2002/11/19: Fixes for JAD preconditioner.
2002/11/19: Methods for patterns: create a descriptor without a
matrix.
2001/11/16: Reviewed the interfaces: in the tools section we really
need the POINTER attribute for dense vectors, but not in
the computational routines; taking it out allows more
flexibility.
2001/09/16: Smart update capabilities.
2001/03/16: Renumbering routines.
2001/01/14: Added extensions to compute multiple DOTs and AMAXs at once;

@ -0,0 +1,79 @@
.mod=.mod
.SUFFIXES: .f90 $(.mod)
####################### Section 1 #######################
# Define your compilers and compiler flags here #
##########################################################
F90=ifort
FC=ifort
CC=icc
F77=$(FC)
F90COPT=-g -CB -no_cpprt
FCOPT=-g -CB -no_cpprt
CCOPT=-g -CB -no_cpprt
####################### Section 2 #######################
# Define your linker and linker flags here #
##########################################################
F90LINK=/usr/local/mpich-intel/bin/mpif90 -g -CB -no_cpprt
FLINK=mpif77 -g -CB -no_cpprt
MPF90=/usr/local/mpich-intel/bin/mpif90 -g -CB -no_cpprt
MPCC=/usr/local/mpich-intel/bin/mpicc -g -CB -no_cpprt
####################### Section 3 #######################
# Specify paths to libraries #
##########################################################
BLAS=-lblas-intel -L$(HOME)/NUMERICAL/LIB
BLACS=-lmpiblacs-intel -L$(HOME)/NUMERICAL/LIB
####################### Section 4 #######################
# Other useful tools&defines #
##########################################################
CDEFINES=-DAdd_
AR=ar -cur
RANLIB=ranlib
####################### Section 5 #######################
# Do not edit this #
##########################################################
LIBDIR = lib
PSBLASLIB = libpsblas.a
TOOLSLIB = libpsbtools.a
COMMLIB = libpsbcomm.a
METHDLIB = libpsbmethd.a
PRECLIB = libpsbprec.a
TYPEMODS = psb_spmat_type$(.mod) psb_desc_type$(.mod) psb_prec_type$(.mod) psb_realloc_mod$(.mod)
CONSTMODS = psb_tools_const$(.mod)
BLASMODS = $(TYPEMODS) psb_psblas_mod$(.mod) psb_comm_mod$(.mod)
METHDMODS = psb_methd_mod$(.mod)
TOOLSMODS = $(CONSTMODS) psi_mod$(.mod) psb_tools_mod$(.mod) psb_serial_mod$(.mod)
PRECMODS = psb_prec_mod$(.mod)
ERRORMODS = psb_error_mod$(.mod)
F90MODS= $(BLASMODS) $(PRECMODS) $(METHDMODS) $(TOOLSMODS) $(ERRORMODS) string$(.mod)
MODS=$(LIBDIR)/psb_tools_const$(.mod) $(LIBDIR)/psb_spmat_type$(.mod) $(LIBDIR)/psb_realloc_mod$(.mod) \
$(LIBDIR)/psb_desc_type$(.mod) $(LIBDIR)/psb_prec_type$(.mod) $(LIBDIR)/parts.f90 \
$(LIBDIR)/psb_serial_mod$(.mod) $(LIBDIR)/psb_comm_mod$(.mod) $(LIBDIR)/psb_error_mod$(.mod)
# Under Linux/gmake there is a rule interpreting .mod as Modula source!
$(.mod).o:
.f.o:
$(FC) $(FCOPT) -I $(INCDIRS) -c $<
.c.o:
$(CC) $(CCOPT) -I $(INCDIRS) $(CDEFINES) -c $<
.f$(.mod):
$(F90) $(FCOPT) -I $(INCDIRS) -c $<
.f90$(.mod):
$(F90) $(F90COPT) -I $(INCDIRS) -c $<
.f90.o:
$(F90) $(F90COPT) -I $(INCDIRS) -c $<

@ -0,0 +1,54 @@
# Using GNU gfortran (from GCC 3.5.0)
.mod=.mod
.SUFFIXES: .f90 $(.mod)
F90=/usr/local/g95-install/bin/g95
FC=/usr/local/g95-install/bin/g95
F77=$(FC)
F90COPT=-O3 -ggdb -fbounds-check
FCOPT=-O3 -ggdb -fbounds-check
CC=gcc
CCOPT=-O3 -ggdb
F90LINK=/usr/local/mpich-g95/bin/mpif90
FLINK=mpif77
MPF90=$(F90LINK)
MPCC=/usr/local/mpich-g95/bin/mpicc
#
#
BLAS=-lblas -L$(HOME)/LIB
BLACS=-lmpiblacsg95 -L$(HOME)/LIB
#
CDEFINES=-DAddDouble_
AR=ar -cur
RANLIB=ranlib
TYPEMODS = typesp$(.mod) typedesc$(.mod) typeprec$(.mod) realloc$(.mod)
CONSTMODS = tools_const$(.mod)
BLASMODS = $(TYPEMODS) f90psblas$(.mod) f90comm$(.mod)
METHDMODS = f90methd$(.mod)
TOOLSMODS = $(CONSTMODS) psimod$(.mod) f90tools$(.mod) f90serial$(.mod) string$(.mod)
PRECMODS = f90prec$(.mod)
F90MODS= $(BLASMODS) $(PRECMODS) $(METHDMODS) $(TOOLSMODS) f90sparse$(.mod)
MODS=$(LIBDIR)/tools_const$(.mod) $(LIBDIR)/typesp$(.mod) $(LIBDIR)/realloc$(.mod) \
$(LIBDIR)/typedesc$(.mod) $(LIBDIR)/typeprec$(.mod) $(LIBDIR)/parts.f90 \
$(LIBDIR)/f90serial$(.mod) $(LIBDIR)/f90comm$(.mod) $(LIBDIR)/string$(.mod)
# Under Linux/gmake there is a rule interpreting .mod as Modula source!
$(.mod).o:
.f.o:
$(FC) $(FCOPT) $(INCDIRS) -c $<
.c.o:
$(CC) $(CCOPT) $(INCDIRS) $(CDEFINES) -c $<
.f$(.mod):
$(F90) $(FCOPT) $(INCDIRS) -c $<
.f90$(.mod):
$(F90) $(F90COPT) $(INCDIRS) -c $<
.f90.o:
$(F90) $(F90COPT) $(INCDIRS) -c $<

@ -0,0 +1,62 @@
# Using GNU gfortran (from GCC 3.5.0)
.mod=.mod
.SUFFIXES: .f90 $(.mod)
F90=/usr/local/gfortran/bin/gfortran
FC=/usr/local/gfortran/bin/gfortran
F77=$(FC)
F90COPT=-O3 -ffast-math -march=pentium4 -msse2 -mfpmath=sse
FCOPT=-O3 -ffast-math -march=pentium4 -msse2 -mfpmath=sse
CC=/usr/local/gfortran/bin/gcc
CCOPT=-O3 -ffast-math -march=pentium4 -msse2 -mfpmath=sse
F90LINK=/usr/local/mpich-gfortran/bin/mpif90
FLINK=mpif77
MPF90=$(F90LINK)
MPCC=/usr/local/mpich-gfortran/bin/mpicc
#
#
BLAS=-lblas -L$(HOME)/LIB
BLACS=-lmpiblacs-gfortran -L$(HOME)/LIB
#
# Comment these, and uncomment SLUDEF below if you don't want SuperLU.
SLUDIR=/usr/local/SuperLU_3.0
SLU=-lslu_lx_gfort -L$(SLUDIR)
SLUDEF=-DHave_SLU_ -I$(SLUDIR)
# SLUDEF=
CDEFINES=-DAdd_ $(SLUDEF)
AR=ar -cur
RANLIB=ranlib
TYPEMODS = typesp$(.mod) typedesc$(.mod) typeprec$(.mod) realloc$(.mod)
CONSTMODS = tools_const$(.mod)
BLASMODS = $(TYPEMODS) f90psblas$(.mod) f90comm$(.mod)
METHDMODS = f90methd$(.mod)
TOOLSMODS = $(CONSTMODS) psimod$(.mod) f90tools$(.mod) f90serial$(.mod) string$(.mod)
PRECMODS = f90prec$(.mod)
ERRORMODS = errormod$(.mod)
F90MODS= $(BLASMODS) $(PRECMODS) $(METHDMODS) $(TOOLSMODS) f90sparse$(.mod) $(ERRORMODS)
MODS=$(LIBDIR)/tools_const$(.mod) $(LIBDIR)/typesp$(.mod) $(LIBDIR)/realloc$(.mod) \
$(LIBDIR)/typedesc$(.mod) $(LIBDIR)/typeprec$(.mod) $(LIBDIR)/parts.f90 \
$(LIBDIR)/f90serial$(.mod) $(LIBDIR)/f90comm$(.mod) $(LIBDIR)/string$(.mod)\
$(LIBDIR)/errormod$(.mod)
# Under Linux/gmake there is a rule interpreting .mod as Modula source!
$(.mod).o:
.f.o:
$(FC) $(FCOPT) $(INCDIRS) -c $<
.c.o:
$(CC) $(CCOPT) $(INCDIRS) $(CDEFINES) -c $<
.f$(.mod):
$(F90) $(FCOPT) $(INCDIRS) -c $<
.f90$(.mod):
$(F90) $(F90COPT) $(INCDIRS) -c $<
.f90.o:
$(F90) $(F90COPT) $(INCDIRS) -c $<

@ -0,0 +1,54 @@
# Using Intel Fortran compiler version 7.0
.mod=.mod
.SUFFIXES: .f90 $(.mod)
F90=${IFC7}/bin/ifc
FC=${IFC7}/bin/ifc
F77=$(FC)
F90COPT=-O3
FCOPT=-O3
CC=gcc
CCOPT=-O3
F90LINK=/usr/local/mpich-ifc71/bin/mpif90
FLINK=mpif77
MPF90=/usr/local/mpich-ifc71/bin/mpif90
MPCC=/usr/local/mpich-ifc71/bin/mpicc
#
#
BLAS=-lblas -L$(HOME)/LIB
BLACS=-lmpiblacsifc71 -L$(HOME)/LIB
SLU=-lslu_lx_ifc8
#
CDEFINES=-DAdd_
AR=ar -cur
RANLIB=ranlib
TYPEMODS = TYPESP$(.mod) TYPEDESC$(.mod) TYPEPREC$(.mod) REALLOC$(.mod)
CONSTMODS = TOOLS_CONST$(.mod)
BLASMODS = $(TYPEMODS) F90PSBLAS$(.mod) F90COMM$(.mod)
METHDMODS = F90METHD$(.mod)
TOOLSMODS = $(CONSTMODS) PSIMOD$(.mod) F90TOOLS$(.mod) F90SERIAL$(.mod) STRING$(.mod)
PRECMODS = F90PREC$(.mod)
F90MODS= $(BLASMODS) $(PRECMODS) $(METHDMODS) $(TOOLSMODS) F90SPARSE$(.mod)
MODS=$(LIBDIR)/TOOLS_CONST$(.mod) $(LIBDIR)/TYPESP$(.mod) $(LIBDIR)/REALLOC$(.mod) \
$(LIBDIR)/TYPEDESC$(.mod) $(LIBDIR)/parts.f90 $(LIBDIR)/STRING$(.mod)
# Under Linux/gmake there is a rule interpreting .mod as Modula source!
$(.mod).o:
.f.o:
$(FC) $(FCOPT) $(INCDIRS) -c $<
.c.o:
$(CC) $(CCOPT) $(INCDIRS) $(CDEFINES) -c $<
.f$(.mod):
$(F90) $(FCOPT) $(INCDIRS) -c $<
.f90$(.mod):
$(F90) $(F90COPT) $(INCDIRS) -c $<
.f90.o:
$(F90) $(F90COPT) $(INCDIRS) -c $<

@ -0,0 +1,63 @@
# Using Intel Fortran compiler version 8.0
.mod=.mod
.SUFFIXES: .f90 $(.mod)
F90=${IFC8}/bin/ifort
FC=${IFC8}/bin/ifort
F77=$(FC)
F90COPT=-O3
FCOPT=-O3
CC=gcc
CCOPT=-O3 -g
F90LINK=/usr/local/mpich-ifc80/bin/mpif90
FLINK=mpif77
MPF90=/usr/local/mpich-ifc80/bin/mpif90
MPCC=/usr/local/mpich-ifc80/bin/mpicc
#
#
BLAS=-lblas -L$(HOME)/LIB
BLACS=-lmpiblacsifc80 -L$(HOME)/LIB
# Comment these, and uncomment SLUDEF below if you don't want SuperLU.
SLUDIR=/usr/local/SuperLU_3.0
SLU=-lslu_lx_ifc8 -L$(SLUDIR)
SLUDEF=-DHave_SLU_ -I$(SLUDIR)
# SLUDEF=
CDEFINES=-DAdd_ $(SLUDEF)
AR=ar -cur
RANLIB=ranlib
TYPEMODS = typesp$(.mod) typedesc$(.mod) typeprec$(.mod) realloc$(.mod)
CONSTMODS = tools_const$(.mod)
BLASMODS = $(TYPEMODS) f90psblas$(.mod) f90comm$(.mod)
METHDMODS = f90methd$(.mod)
TOOLSMODS = $(CONSTMODS) psimod$(.mod) f90tools$(.mod) f90serial$(.mod) string$(.mod)
PRECMODS = f90prec$(.mod)
ERRORMODS = errormod$(.mod)
F90MODS= $(BLASMODS) $(PRECMODS) $(METHDMODS) $(TOOLSMODS) f90sparse$(.mod) $(ERRORMODS)
MODS=$(LIBDIR)/tools_const$(.mod) $(LIBDIR)/typesp$(.mod) $(LIBDIR)/realloc$(.mod) \
$(LIBDIR)/typedesc$(.mod) $(LIBDIR)/typeprec$(.mod) $(LIBDIR)/parts.f90 \
$(LIBDIR)/f90serial$(.mod) $(LIBDIR)/f90comm$(.mod) $(LIBDIR)/string$(.mod)\
$(LIBDIR)/errormod$(.mod)
# Under Linux/gmake there is a rule interpreting .mod as Modula source!
$(.mod).o:
.f.o:
$(FC) $(FCOPT) $(INCDIRS) -c $<
.c.o:
$(CC) $(CCOPT) $(INCDIRS) $(CDEFINES) -c $<
.f$(.mod):
$(F90) $(FCOPT) $(INCDIRS) -c $<
.f90$(.mod):
$(F90) $(F90COPT) $(INCDIRS) -c $<
.f90.o:
$(F90) $(F90COPT) $(INCDIRS) -c $<

@ -0,0 +1,52 @@
# Using Lahey F95
.mod=.mod
.SUFFIXES: .f90 $(.mod)
F90=lf95
FC=lf95
F77=$(FC)
FCOPT= -O
F90COPT= -O
CC=gcc
CCOPT=-O2 -g -ggdb -pg
F90LINK=mpif90
FLINK=mpif77
#
#
BLAS=-lblas -L$(HOME)/LIB
BLACS=-lmpiblacslh -L$(HOME)/LIB
#
CDEFINES=-DAdd_
AR=ar -cur
RANLIB=ranlib
TYPEMODS = typesp$(.mod) typedesc$(.mod)
CONSTMODS = tools_const$(.mod)
BLASMODS = $(TYPEMODS) f90psblas$(.mod)
METHDMODS = f90methd$(.mod)
TOOLSMODS = $(CONSTMODS) f90tools$(.mod)
F90MODS= $(BLASMODS) $(METHDMODS) $(TOOLSMODS) f90sparse$(.mod)
MODS=$(LIBDIR)/tools_const$(.mod) $(LIBDIR)/typesp$(.mod) \
$(LIBDIR)/typedesc$(.mod) $(LIBDIR)/parts.f90
# Under Linux/gmake there is a rule interpreting .mod as Modula source!
$(.mod).o:
.f.o:
$(FC) $(FCOPT) $(INCDIRS) -c $<
.c.o:
$(CC) $(CCOPT) $(INCDIRS) $(CDEFINES) -c $<
.f$(.mod):
$(F90) $(FCOPT) $(INCDIRS) -c $<
.f90$(.mod):
$(F90) $(F90COPT) $(INCDIRS) -c $<
.f90.o:
$(F90) $(F90COPT) $(INCDIRS) -c $<

@ -0,0 +1,58 @@
# Using GNU gfortran (from GCC 3.5.0)
.mod=.mod
.SUFFIXES: .f90 $(.mod)
F90=/opt/nag/bin/f95
FC=/opt/nag/bin/f95
F77=$(FC)
F90COPT=-O3 -mismatch
FCOPT=-O3 -dusty
CC=gcc
CCOPT=-O3
F90LINK=$(HOME)/mpich-nag/bin/mpif90
FLINK=$(HOME)/mpich-nag/bin/mpif77
MPF90=$(F90LINK) -mismatch
MPCC=$(HOME)/mpich-nag/bin/mpicc
#
#
BLAS=-lblasnag -L$(HOME)/LIB
BLACS=-lmpiblacs-nag -L$(HOME)/LIB
#
CDEFINES=-DAdd_ -DHave_SLU_
SLUDIR=$(HOME)/SuperLU_3.0
SLUINC=-I$(SLUDIR)
SLU=-lslu_lx_nag -L$(SLUDIR)
AR=ar -cur
RANLIB=ranlib
TYPEMODS = typesp$(.mod) typedesc$(.mod) typeprec$(.mod) realloc$(.mod)
CONSTMODS = tools_const$(.mod)
BLASMODS = $(TYPEMODS) f90psblas$(.mod) f90comm$(.mod)
METHDMODS = f90methd$(.mod)
TOOLSMODS = $(CONSTMODS) psimod$(.mod) f90tools$(.mod) f90serial$(.mod) string$(.mod)
PRECMODS = f90prec$(.mod)
F90MODS= $(BLASMODS) $(PRECMODS) $(METHDMODS) $(TOOLSMODS) f90sparse$(.mod)
MODS=$(LIBDIR)/tools_const$(.mod) $(LIBDIR)/typesp$(.mod) $(LIBDIR)/realloc$(.mod) \
$(LIBDIR)/typedesc$(.mod) $(LIBDIR)/typeprec$(.mod) $(LIBDIR)/parts.f90 \
$(LIBDIR)/f90serial$(.mod) $(LIBDIR)/f90comm$(.mod) $(LIBDIR)/string$(.mod)
# Under Linux/gmake there is a rule interpreting .mod as Modula source!
$(.mod).o:
.f.o:
$(FC) $(FCOPT) $(INCDIRS) -c $<
.c.o:
$(CC) $(CCOPT) $(INCDIRS) $(CDEFINES) -c $<
.f$(.mod):
$(F90) $(FCOPT) $(INCDIRS) -c $<
.f90$(.mod):
$(F90) $(F90COPT) $(INCDIRS) -c $<
.f90.o:
$(F90) $(F90COPT) $(INCDIRS) -c $<

@ -0,0 +1,54 @@
# Using PGI Fortran compilers
.mod=.mod
.SUFFIXES: .f90 $(.mod)
F90=pgf90
FC=pgf90
F77=$(FC)
F90COPT=-fast -g77libs
FCOPT=-fast -g77libs
CC=gcc
CCOPT=-O2 -g -ggdb -pg
F90LINK=/usr/local/mpich-pgi/bin/mpif90
FLINK=/usr/local/mpich-pgi/bin/mpif77
MPF90=/usr/local/mpich-pgi/bin/mpif90
MPCC=/usr/local/mpich-pgi/bin/mpicc
#
#
BLAS=-lblas -L$(HOME)/LIB
BLACS=-lmpiblacspgi -L$(HOME)/LIB
#
CDEFINES=-DAdd_
AR=ar -cur
RANLIB=ranlib
TYPEMODS = typesp$(.mod) typedesc$(.mod) typeprec$(.mod) realloc$(.mod)
CONSTMODS = tools_const$(.mod)
BLASMODS = $(TYPEMODS) f90psblas$(.mod)
METHDMODS = f90methd$(.mod)
TOOLSMODS = $(CONSTMODS) psimod$(.mod) f90tools$(.mod)
PRECMODS = f90prec$(.mod)
F90MODS= $(BLASMODS) $(PRECMODS) $(METHDMODS) $(TOOLSMODS) f90sparse$(.mod)
MODS=$(LIBDIR)/tools_const$(.mod) $(LIBDIR)/typesp$(.mod) $(LIBDIR)/realloc$(.mod) \
$(LIBDIR)/typedesc$(.mod) $(LIBDIR)/typeprec$(.mod) $(LIBDIR)/parts.f90
# Under Linux/gmake there is a rule interpreting .mod as Modula source!
$(.mod).o:
.f.o:
$(FC) $(FCOPT) $(INCDIRS) -c $<
.c.o:
$(CC) $(CCOPT) $(INCDIRS) $(CDEFINES) -c $<
.f$(.mod):
$(F90) $(FCOPT) $(INCDIRS) -c $<
.f90$(.mod):
$(F90) $(F90COPT) $(INCDIRS) -c $<
.f90.o:
$(F90) $(F90COPT) $(INCDIRS) -c $<

@ -0,0 +1,50 @@
# Using XLF
.mod=.mod
.SUFFIXES: .f90 $(.mod)
F90=xlf95 -qsuffix=f=f90
FC=xlf
F77=$(FC)
FCOPT=-O3
CC=xlc
CCOPT=-O3
F90LINK=mpxlf90
MPF90=mpxlf95 -qsuffix=f=f90
FLINK=mpxlf77
MPCC=mpxlc
#
#
BLAS=-lessl
BLACS=-lmpiblacs -L$(HOME)/LIB
#
CDEFINES=-DNoChange
AR=ar -cur
RANLIB=ranlib
TYPEMODS = typesp$(.mod) typedesc$(.mod) typeprec$(.mod) realloc$(.mod)
CONSTMODS = tools_const$(.mod)
BLASMODS = $(TYPEMODS) f90psblas$(.mod) f90comm$(.mod)
METHDMODS = f90methd$(.mod)
TOOLSMODS = $(CONSTMODS) psimod$(.mod) f90tools$(.mod) f90serial$(.mod)
PRECMODS = f90prec$(.mod)
F90MODS= $(BLASMODS) $(PRECMODS) $(METHDMODS) $(TOOLSMODS) f90sparse$(.mod)
MODS=$(LIBDIR)/tools_const$(.mod) $(LIBDIR)/typesp$(.mod) $(LIBDIR)/realloc$(.mod) \
$(LIBDIR)/typedesc$(.mod) $(LIBDIR)/typeprec$(.mod) $(LIBDIR)/parts.f90 \
$(LIBDIR)/f90serial$(.mod) $(LIBDIR)/f90comm$(.mod)
.f.o:
$(FC) $(FCOPT) $(INCDIRS) -c $<
.c.o:
$(CC) $(CCOPT) $(INCDIRS) $(CDEFINES) -c $<
.f$(.mod):
$(FC) $(FCOPT) $(INCDIRS) -c $<
.f90$(.mod):
$(F90) $(FCOPT) $(INCDIRS) -c $<
.f90.o:
$(F90) $(FCOPT) $(INCDIRS) -c $<

@ -0,0 +1,53 @@
# Using VAST F90
.mod=.vo
.SUFFIXES: .f90 $(.mod)
F90=/usr/local/VASTF90/f90
FC=/usr/local/VASTF90/f90
F77=$(FC)
FCOPT=-O2 -ff90 -g -ggdb -pg
#-march=pentium4 -mfpmath=sse #You may want to use these
CC=gcc
CCOPT=-O2 -g -ggdb -pg
F90LINK=/usr/local/mpich-vast/bin/mpif90
FLINK=mpif77
MPF90=/usr/local/mpich-vast/bin/mpif90
MPCC=/usr/local/mpich-vast/bin/mpicc
#
#
BLAS=-lblas -L$(HOME)/LIB
BLACS=-lmpiblacsvast -L$(HOME)/LIB
#
CDEFINES=-DAdd_
AR=ar -cur
RANLIB=ranlib
TYPEMODS = typesp$(.mod) typedesc$(.mod) typeprec$(.mod) realloc$(.mod)
CONSTMODS = tools_const$(.mod)
BLASMODS = $(TYPEMODS) f90psblas$(.mod)
METHDMODS = f90methd$(.mod)
TOOLSMODS = $(CONSTMODS) psimod$(.mod) f90tools$(.mod)
PRECMODS = f90prec$(.mod)
F90MODS= $(BLASMODS) $(PRECMODS) $(METHDMODS) $(TOOLSMODS) f90sparse$(.mod)
MODS=$(LIBDIR)/tools_const$(.mod) $(LIBDIR)/typesp$(.mod) $(LIBDIR)/realloc$(.mod) \
$(LIBDIR)/typedesc$(.mod) $(LIBDIR)/typeprec$(.mod) $(LIBDIR)/parts.f90
# Under Linux/gmake there is a rule interpreting .mod as Modula source!
$(.mod).o:
.f.o:
$(FC) $(FCOPT) $(INCDIRS) -c $<
.c.o:
$(CC) $(CCOPT) $(INCDIRS) $(CDEFINES) -c $<
.f$(.mod):
$(F90) $(FCOPT) $(INCDIRS) -c $<
.f90$(.mod):
$(F90) $(FCOPT) $(INCDIRS) -c $<
.f90.o:
$(F90) $(FCOPT) $(INCDIRS) -c $<

@ -0,0 +1,11 @@
include Make.inc
lib:
( [ -d lib ] || mkdir lib)
(cd src; make lib)
clean:
(cd src; make clean)
veryclean:
(cd src; make veryclean)
(cd lib; /bin/rm -f *.a *$(.mod) V*.inc *.pc *.pcl)

143
README

@ -0,0 +1,143 @@
This directory contains the PSBLAS library, version 1.0
The library is described in:
S. Filippone, M. Colajanni
PSBLAS: A library for parallel linear algebra computation on sparse matrices
ACM Trans. on Math. Software, 26(4), Dec. 2000, pp. 527-550.
PLATFORMS:
For the F77 compiler, we assume it supports DOUBLE COMPLEX and DO WHILE/ENDDO.
Practically all compilers do nowadays.
The compilation process relies on the choice of an appropriate
Make.inc file; we have tested with AIX XLF, Intel ifc/Linux, Lahey
F95/Linux and Nag f95/Linux. If you succeed in compiling with other
compiler/operating systems please let us know.
IBM SP2.
The library has been tested on an IBM SP2 with XLC and XLF
compilers, and a version of the BLACS based on MPI.
The rather baroque setting
F90=xlf90 -qsuffix=f=f90
in Make.inc.rs6k takes care of the f90 extension.
WARNING: xlf 8.1 introduced a performance bug, whereas a Fortan 90
code calling a Fortan 77 code would incur spurious array copies;
please make sure your system has the PTF xlf 8102 installed.
LINUX:
There finally exist a GNU Fortran 95 implementation: we are using the
development snapshots from GCC 3.5.0 since July 2004, and it appears
to work. This is on its way to become the reference platform.
We compiled with egcs G77/GCC and mpich-1.2.0, 1.2.1, 1.2.2 and 1.2.4.
With some versions of GCC, g77 chokes on SRC/SERIAL/dcsdp.f; this
problem seems to have disappeared in recent GCC distributions.
Vast F90/Linux and PGI f90/Linux are now obsolete, in that we have a
NULL() initialization for pointers that is outsied strict F90; they
cannot be used any longer.
For the PGI compilers, we used them in conjunction with gcc, NOT
pgcc. Note that with pgi 3.6 we have horrible performance, due to
spurious array copies when calling Fortran 77 codes from Fortran 90;
this is fixed in version 4.
The Lahey version we got access to (6.0 and 6.1) seems to suffer from
the same extra copies problem; this is most apparent in the matrix
build process.
For the Intel compilers, we used ifc version 7.0 and 7.1; with version 6.0
you need to change the way modules are handled, but we recommend to migrate
to the new version anyway. Moreover, with versions prior to 7.1, there
is a strange error in pargen/ppde90: the compiler did not like the
INTERFACE for the dummy argument subroutine PARTS, it wanted an
EXTERNAL specification. Again, please move to 7.1.
Testing of NAG f95 versions is still incomplete.
DOCUMENTATION
See userguidef90.ps.
Please consult the sample programs, especially TEST/pargen/ppde90.f90.
OTHER SOFTWARE CREDITS
We include our modified implementation of some of the Sparker (serial
sparse BLAS) material, e.g. Jagged diagonal, plus a number of
extensions of our own design. The original file spblas.f can be
downloaded from matisa.cc.rl.ac.uk; of course any bugs in our
implementation are our own to fix. The main reference for the serial
sparse BLAS is:
Duff, I., Marrone, M., Radicati, G., and Vittoli, C.
Level 3 basic linear algebra subprograms for sparse matrices: a user
level interface
ACM Trans. Math. Softw., 23(3), 379-401, 1997.
We have had good results with the METIS library, which can be
obtained from
http://www-users.cs.umn.edu/~karypis/metis/metis/main.html
NEW:
- Reviewed the interfaces: in the tools section we really need the
POINTER attribute for dense vectors, but not in the computational
routines; taking it out allows more flexibility. Besides, this acts
as a workaround for a bug on Linux/VAST: as of version
"vf90 Personal V3.4N5" the DOT and NRM functions had to be
transformed into subroutines whenever the vectors had the POINTER
attribute; now both subroutine and function versions work.
- Added more methods: CGS, BiCGSTAB(L), BiCG
- We now have a new Preconditioner F90 routine; diagonal scaling works
with COO, CSR and JAD; ILU works with JAD, but is more expensive in terms
of memory space
- We added some extensions to compute multiple DOTs and AMAXs at once;
they can be useful in solving vector equations (e.g. momentum
equation in fluid dynamics).
- Halo data exchange in F90_PSHALO can now be applied to integer data;
- There is an update capability for cases where the same sparsity
pattern is reused (e.g. multiple time steps over a discretization
mesh which maintains a constant topology)
- We added a test program to read and solve sparse matrices from files in
Matrix-Market format (for details see http://math.nist.gov/MatrixMarket/)
TODO:
1. Extend the C/F77 interface for character data (some Fortran
compilers don't just pass a pointer and a hidden length!)
2. Provide more general factorizations.
The PSBLAS team.
Credits:
Salvatore Filippone
Michele Colajanni
Alfredo Buttari
Fabio Cerioni
Stefano Maiolatesi
Dario Pascucci

File diff suppressed because it is too large Load Diff

@ -0,0 +1,5 @@
integer, parameter :: nohalo_=0, halo_=4
integer, parameter :: none_=0,sum_=1,avg_=2,square_root_=3
integer, parameter :: swap_send=1, swap_recv=2
integer, parameter :: swap_sync=4,swap_mpi=8
character, parameter :: all='A',topdef=' '

@ -0,0 +1,6 @@
integer, parameter :: deadlock_check=0,local_mtrx_check=1
integer, parameter :: local_comm_check=2,consistency_check=3
integer, parameter :: global_check=4,order_communication=5
integer, parameter :: change_represent=6,loc_to_glob_check=7
integer, parameter :: convert_halo=1,convert_ovrlap=2
integer, parameter :: act_ret=0, act_abort=1, no_err=0

@ -0,0 +1,13 @@
integer, parameter :: dec_type_=1,m_=2,n_=3
integer, parameter :: n_row_=4,n_col_=5,ctxt_=6
integer, parameter :: loc_to_glob_=7,mpi_c_=9,mdata_size=10
integer, parameter :: desc_asb=3099, desc_bld=desc_asb+1
integer, parameter :: desc_upd=desc_bld+1
integer, parameter :: desc_upd_asb=desc_upd+1
integer, parameter :: upd_glb=998, upd_loc=997
integer, parameter :: proc_id_=0,n_elem_recv_=1
integer, parameter :: elem_recv_=2,n_elem_send_=2
integer, parameter :: elem_send_=3,n_ovrlp_elem_=1
integer, parameter :: ovrlp_elem_to_=2
integer, parameter :: ovrlp_elem_=0, n_dom_ovr_=1

Binary file not shown.

@ -0,0 +1,7 @@
interface
!.....user passed subroutine.....
subroutine parts(glob_index,nrow,np,pv,nv)
integer, intent (in) :: glob_index,np,nrow
integer, intent (out) :: nv, pv(*)
end subroutine parts
end interface

Binary file not shown.

@ -0,0 +1,38 @@
integer, parameter :: psb_nohalo_=0, psb_halo_=4
integer, parameter :: psb_none_=0,psb_sum_=1
integer, parameter :: psb_avg_=2,psb_square_root_=3
integer, parameter :: psb_swap_send_=1,psb_swap_recv_=2
integer, parameter :: psb_swap_sync_=4,psb_swap_mpi_=8
integer, parameter :: psb_deadlock_check_=0
integer, parameter :: psb_local_mtrx_check_=1
integer, parameter :: psb_local_comm_check_=2
integer, parameter :: psb_consistency_check_=3
integer, parameter :: psb_global_check_=4
integer, parameter :: psb_order_communication_=5
integer, parameter :: psb_change_represent_=6
integer, parameter :: psb_loc_to_glob_check_=7
integer, parameter :: psb_convert_halo_=1
integer, parameter :: psb_convert_ovrlap_=2
integer, parameter :: psb_act_ret_=0
integer, parameter :: psb_act_abort_=1, no_err_=0
integer, parameter :: psb_dec_type_=1,psb_m_=2,psb_n_=3
integer, parameter :: psb_n_row_=4,psb_n_col_=5,psb_ctxt_=6
integer, parameter :: psb_loc_to_glob_=7
integer, parameter :: psb_mpi_c_=9,psb_mdata_size_=10
integer, parameter :: psb_desc_asb_=3099
integer, parameter :: psb_desc_bld_=psb_desc_asb_+1
integer, parameter :: psb_desc_upd_=psb_desc_bld_+1
integer, parameter :: psb_desc_upd_asb_=psb_desc_upd_+1
integer, parameter :: psb_upd_glb_=998,psb_upd_loc_=997
integer, parameter :: psb_proc_id_=0,psb_n_elem_recv_=1
integer, parameter :: psb_elem_recv_=2,psb_n_elem_send_=2
integer, parameter :: psb_elem_send_=3,psb_n_ovrlp_elem_=1
integer, parameter :: psb_ovrlp_elem_to_=2,psb_ovrlp_elem_=0
integer, parameter :: psb_nnz_=1, psb_n_dom_ovr_=1
integer, parameter :: psb_no_comm_=-1, psb_nzsizereq_=3
integer, parameter :: ione=1, done=1.d0,izero=0, dzero=0.d0
integer, parameter :: itwo=2, ithree=3,root=0, act_abort=1
integer, parameter :: psb_nztotreq_=1,psb_nzrowreq_=2
character, parameter :: psb_all_='A',psb_topdef_=' '

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

@ -0,0 +1,3 @@
include 'desc.fh'
include 'comm.fh'
include 'const.fh'

@ -0,0 +1,819 @@
/* ---------------------------------------------------------------------
*
* -- PSBLAS routine (version 1.0) --
*
* ---------------------------------------------------------------------
*/
/*
* This file includes the standard C libraries, as well as system
* dependent include files. All PSBLAS routines include this file.
*/
#include <string.h>
#ifndef PSBLASH
#define PSBLASH
/*
* ========================================================================
* Machine Specific PBLAS macros
* ========================================================================
*/
/* This is a debugging option.
#define PS_CONTROL_LEVEL */
#define _HAL_ 0
#define _T3D_ 1
#ifdef T3D
#define _MACH_ _T3D_
#endif
#ifndef _MACH_
#define _MACH_ _HAL_
#endif
/*
* ========================================================================
* Include files
* ========================================================================
*/
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#if( _MACH_ == _T3D_ )
#include <fortran.h>
#endif
#ifdef USE_FBLACS
#ifndef CTOF_BLACS
#include "ctof_blacs.h"
#endif
#endif
/*
* ========================================================================
* FORTRAN <-> C interface
* ========================================================================
*
* These macros define how the PBLAS will be called. _F2C_ADD_ assumes
* that they will be called by FORTRAN, which expects C routines to have
* an underscore postfixed to the name (Suns, and Intel machines expect
* this). _F2C_NOCHANGE indicates that FORTRAN will be calling, and that
* it expects the name called by FORTRAN to be identical to that compiled
* by the C (RS6K's do this). _F2C_UPCASE says it expects C routines
* called by FORTRAN to be in all upcase (CRAY wants this).
*/
#define _F2C_ADD_ 0
#define _F2C_NOCHANGE 1
#define _F2C_UPCASE 2
#ifdef UpCase
#define _F2C_CALL_ _F2C_UPCASE
#endif
#ifdef NoChange
#define _F2C_CALL_ _F2C_NOCHANGE
#endif
#ifdef Add_
#define _F2C_CALL_ _F2C_ADD_
#endif
#ifndef _F2C_CALL_
#define _F2C_CALL_ _F2C_ADD_
#endif
/*
* ========================================================================
* TYPE DEFINITIONS AND CONVERSION UTILITIES
* ========================================================================
*/
typedef struct { float re, im; } complex;
typedef struct { double re, im; } complex16;
#if( _MACH_ == _T3D_ )
/* Type of character argument in a FORTRAN call */
#define F_CHAR _fcd
/* Character conversion utilities */
#define F2C_CHAR(a) ( _fcdtocp( (a) ) )
#define C2F_CHAR(a) ( _cptofcd( (a), 1 ) )
/* Type of FORTRAN functions */
#define F_VOID_FCT void fortran /* Subroutine */
#define F_INTG_FCT int fortran /* INTEGER function */
#define F_DBLE_FCT double fortran /* DOUBLE PRECISION function */
#else
/* Type of character argument in a FORTRAN call */
typedef char * F_CHAR;
/* Character conversion utilities */
#define F2C_CHAR(a) (a)
#define C2F_CHAR(a) (a)
/* Type of FORTRAN functions */
#define F_VOID_FCT void /* Subroutine */
#define F_INTG_FCT int /* INTEGER function */
#define F_DBLE_FCT double /* DOUBLE PRECISION function */
#endif
/*
* ======================================================================
* FUNCTIONS PROTOTYPES
* ======================================================================
*/
void DVSct(int n, int k,int idx[],int flag, double X[], int lx,
double beta, double Y[], int ly);
void DVGth(int n, int k,int idx[],int flag, double X[], int lx,double Y[], int ly);
void IVSct(int n, int k,int idx[],int flag, int X[], int lx,
int beta, int Y[], int ly);
void IVGth(int n, int k,int idx[],int flag, int X[], int lx,int Y[], int ly);
void PSI_dSwapData(int iflag, int n, double beta, double Y[], int ly,
int desc_data[], int desc_halo[],
double *work, int *lwork, int *ierror);
void PSI_dSwapTran(int flag, int n, double beta, double Y[], int ly,
int desc_data[], int desc_halo[],
double *work, int *lwork, int *ierror);
void PSI_zSwapData(int n, double Y[], int ly, int desc_data[], int desc_halo[],
double *work, int *lwork, int *ierror);
void PSI_zSwapOverlap(double Y[], double Sum_Ovrlap[], int desc_data[],
int desc_ovrlap[], double work[], int *lwork, int *ierror);
void PSI_iSwapData(int iflag, int n, int beta, int Y[], int ly,
int desc_data[], int desc_halo[],
int *work, int *lwork, int *ierror);
void PSI_iSwapTran(int flag, int n, int beta, int Y[], int ly,
int desc_data[], int desc_halo[],
int *work, int *lwork, int *ierror);
/*
* ========================================================================
* #DEFINE MACRO CONSTANTS
* ========================================================================
*/
/* MACRO max */
#define max(x,y) ((x)>(y)?(x):(y))
/*MACRO for ovrlap update*/
#define NOHALO_ 0
#define HALO_ 4
#define NONE_ 0
#define SUM_ 1
#define AVG_ 2
#define SQUARE_ROOT_ 3
/* Bit fields to control swapdata/ovrlap behaviour.
BEWARE: check consistency with tools_const.f.
Should it be automated? */
#define SWAP_SEND 1
#define SWAP_RECV 2
#define SWAP_SYNC 4
#define SWAP_MPI 8
/* Macro for MATRIX_DATA array */
#define DEC_TYPE_ 0 /* The type of decomposition of global
matrix A. */
#define M_ 1 /* Number of equations */
#define N_ 2 /* Number of variables */
#define N_ROW_ 3 /* The number of row of local matrix. */
#define N_COL_ 4 /* The number of columns of local
matrix. */
#define CTXT_ 5 /* The BLACS context handle, indicating
the global context of the operation
on the matrix.
The context itself is global. */
#define LOC_TO_GLOB_ 6 /* The pointer to the array
loc_to_glob */
#define MPI_C_ 8 /* The MPI Fortran handle */
/* values for DEC_TYPE_ */
#define DESC_ASB 3099
#define DESC_BLD (DESC_ASB+1)
/* Macro for HALO array */
#define PROC_ID_ 0 /* The identifier of domain. */
#define N_ELEM_RECV_ 1 /* The number of elements to receive*/
#define ELEM_RECV_ 2 /* The first index of local elements */
#define N_ELEM_SEND_ 2 /* The number of elements to send */
#define ELEM_SEND_ 3 /* The first index of local elements */
/* Macro for OVERLAP array */
#define N_OVRLP_ELEM_ 1 /* The number of overlap elements to recv/send */
#define OVRLP_ELEM_TO_ 2 /* The first index of local elements */
/* Macro for OVR_ELEM_D array */
#define OVRLP_ELEM_ 0
#define N_DOM_OVR_ 1
#define BROADCAST "B" /* Blacs operation definitions */
#define COMBINE "C"
#define ALL "A" /* Scope definitions */
#define COLUMN "C"
#define ROW "R"
#define TOPDEF " " /* Default BLACS topology, PB-BLAS routines */
#define CTOPDEF ' '
#define TOPGET "!"
#define YES "Y"
#define NO "N"
#define MULLENFAC 2
#define ONE 1.0
#define ZERO 0.0
/* Integer values for error checking */
#define no_err 0
#define act_ret 0
#define act_abort 1
/*
* ========================================================================
* PREPROCESSOR MACRO FUNCTIONS USED FOR OPTIMIZATION & CONVENIENCE
* ========================================================================
*/
#define ABS(a) ((a > 0) ? (a) : (-a))
#define MIN(a,b) ((a < b) ? (a) : (b))
#define MAX(a,b) ((a > b) ? (a) : (b))
#define CEIL(a,b) ( (a+b-1) / (b) )
#define Mlowcase(C) ( ((C) > 64 && (C) < 91) ? (C) | 32 : (C) )
#define Mupcase(C) ( ((C) > 96 && (C) < 123) ? (C) & 0xDF : (C) )
#define INDXG2L( iglob, nb, iproc, isrcproc, nprocs )\
( (nb) * ( ( (iglob)-1) / ( (nb) * (nprocs) ) ) +\
( ( (iglob) - 1 ) % (nb) ) + 1 )
#define INDXL2G( iloc, nb, iproc, isrcproc, nprocs )\
( (nprocs) * (nb) * ( ( (iloc) - 1 ) / (nb) ) +\
( ( (iloc) - 1 ) % (nb) ) +\
( ( (nprocs) + (iproc) - (isrcproc) ) % (nprocs) ) * (nb) + 1 )
#define INDXG2P( iglob, nb, iproc, isrcproc, nprocs ) \
( ( (isrcproc) + ( (iglob) - 1 ) / (nb) ) % (nprocs) )
#define MYROC0( nblocks, n, nb, nprocs )\
( ( (nblocks) % (nprocs) ) ? ( ( (nblocks) / (nprocs) ) * (nb) + (nb) )\
: ( ( (nblocks) / (nprocs) )* (nb) + ( (n) % (nb) ) ) )
#if( _F2C_CALL_ == _F2C_ADD_ )
/*
* These defines set up the naming scheme required to have a FORTRAN
* routine call a C routine (which is what the PBLAS are written in).
* No redefinition necessary to have following FORTRAN to C interface:
* FORTRAN CALL C DECLARATION
* call pdgemm(...) void pdgemm_(...)
*
* This is the default.
*/
#define dcsmm dcsmm_
#define dcssm dcssm_
#define dcsnmi dcsnmi_
#define idamax idamax_
#define izamax izamax_
#define ddot ddot_
#define dasum dasum_
#define daxpby daxpby_
#define dscal dscal_
#define zcsmm zcsmm_
#define zcssm zcssm_
#define zcsnmi zcsnmi_
#define zdot zdot_
#define dzasum dzasum_
#define zaxpby zaxpby_
#define zscal zscal_
#define pbchkvectf pbchkvectf_
#define fcpsb_errcomm fcpsb_errcomm_
#define fcpsb_erractionsave fcpsb_erractionsave_
#define fcpsb_erractionrestore fcpsb_erractionrestore_
#define fcpsb_perror fcpsb_perror_
#define fcpsb_serror fcpsb_serror_
#define fcpsb_errpush fcpsb_errpush_
#endif
#if( _F2C_CALL_ == _F2C_UPCASE )
/*
* These defines set up the naming scheme required to have a FORTRAN
* routine call a C routine (which is what the PBLAS are written in)
* following FORTRAN to C interface:
* FORTRAN CALL C DECLARATION
* call pdgemm(...) void PDGEMM(...)
*/
#define pbchkvectf PBCHKVECTF /* PSBLAS */
#define psddot_ PSDDOT
#define psdmdot_ PSDMDOT
#define psddot_sub_ PSDDOT_SUB
#define psdaxpby_ PSDAXPBY
#define psdamax_ PSDAMAX
#define psdmamax_ PSDMAMAX
#define psdasum_ PSDASUM
#define psdnrm2_ PSDNRM2
#define psdnrmi_ PSDNRMI
#define psdnrmisym_ PSDNRMISYM
#define psdhalo_ PSDHALO
#define psihalo_ PSIHALO
#define psdhred_ PSDHRED
#define psdovrl_ PSDOVRL
#define psdspmm_ PSDSPMM
#define psdswaptran_ PSDSWAPTRAN
#define psdspmmsym_ PSDSPMMSYM
#define psdspsm_ PSDSPSM
#define psderror_ PSDERROR
#define psdverify_ PSDVERIFY
#define psdscatterm_ PSDSCATTERM
#define psdgatherm PSDGATHERM
/* PSBLAS */
#define pszdotc_ PSZDOTC
#define pszdotu_ PSZDOTU
#define pszmdot_ PSZMDOT
#define pszaxpby_ PSZAXPBY
#define pszamax_ PSZAMAX
#define pszmamax_ PSZMAMAX
#define pszasum_ PSZASUM
#define psznrm2_ PSZNRM2
#define psznrmi_ PSZNRMI
#define psznrmisym_ PSZNRMISYM
#define pszhalo_ PSZHALO
#define pszovrl_ PSZOVRL
#define pszspmm_ PSZSPMM
#define pszspmmsym_ PSZSPMMSYM
#define pszspsm_ PSZSPSM
#define pszerror_ PSZERROR
#define pszverify_ PSZVERIFY
#define pszscatterm_ PSZSCATTERM
#define pszgatherm_ PSZGATHERM
/* BLACS */
#define blacs_abort_ BLACS_ABORT
#define blacs_gridinfo_ BLACS_GRIDINFO
#define igesd2d_ IGESD2D
#define igebs2d_ IGEBS2D
#define itrsd2d_ ITRSD2D
#define itrbs2d_ ITRBS2D
#define igerv2d_ IGERV2D
#define igebr2d_ IGEBR2D
#define itrrv2d_ ITRRV2D
#define itrbr2d_ ITRBR2D
#define igamx2d_ IGAMX2D
#define igamn2d_ IGAMN2D
#define igsum2d_ IGSUM2D
#define sgesd2d_ SGESD2D
#define sgebs2d_ SGEBS2D
#define strsd2d_ STRSD2D
#define strbs2d_ STRBS2D
#define sgerv2d_ SGERV2D
#define sgebr2d_ SGEBR2D
#define strrv2d_ STRRV2D
#define strbr2d_ STRBR2D
#define sgamx2d_ SGAMX2D
#define sgamn2d_ SGAMN2D
#define sgsum2d_ SGSUM2D
#define dgesd2d_ DGESD2D
#define dgebs2d_ DGEBS2D
#define dtrsd2d_ DTRSD2D
#define dtrbs2d_ DTRBS2D
#define dgerv2d_ DGERV2D
#define dgebr2d_ DGEBR2D
#define dtrrv2d_ DTRRV2D
#define dtrbr2d_ DTRBR2D
#define dgamx2d_ DGAMX2D
#define dgamn2d_ DGAMN2D
#define dgsum2d_ DGSUM2D
#define cgesd2d_ CGESD2D
#define cgebs2d_ CGEBS2D
#define ctrsd2d_ CTRSD2D
#define ctrbs2d_ CTRBS2D
#define cgerv2d_ CGERV2D
#define cgebr2d_ CGEBR2D
#define ctrrv2d_ CTRRV2D
#define ctrbr2d_ CTRBR2D
#define cgamx2d_ CGAMX2D
#define cgamn2d_ CGAMN2D
#define cgsum2d_ CGSUM2D
#define zgesd2d_ ZGESD2D
#define zgebs2d_ ZGEBS2D
#define ztrsd2d_ ZTRSD2D
#define ztrbs2d_ ZTRBS2D
#define zgerv2d_ ZGERV2D
#define zgebr2d_ ZGEBR2D
#define ztrrv2d_ ZTRRV2D
#define ztrbr2d_ ZTRBR2D
#define zgamx2d_ ZGAMX2D
#define zgamn2d_ ZGAMN2D
#define zgsum2d_ ZGSUM2D
/* Level-1 BLAS */
#define srotg_ SROTG
#define srotmg_ SROTMG
#define srot_ SROT
#define srotm_ SROTM
#define sswap_ SSWAP
#define sscal_ SSCAL
#define scopy_ SCOPY
#define saxpy_ SAXPY
#define ssdot_ SSDOT
#define isamax_ ISAMAX
#define drotg_ DROTG
#define drotmg_ DROTMG
#define drot_ DROT
#define drotm_ DROTM
#define dswap_ DSWAP
#define dscal_ DSCAL
#define dcopy_ DCOPY
#define daxpy_ DAXPY
#define dddot_ DDDOT
#define dnrm2_ DNRM2
#define dsnrm2_ DSNRM2
#define dasum_ DASUM
#define dsasum_ DSASUM
#define idamax_ IDAMAX
#define daxpby_ DAXPBY
#define zaxpby_ ZAXPBY /* to match added internal function */
#define cswap_ CSWAP
#define cscal_ CSCAL
#define csscal_ CSSCAL
#define ccopy_ CCOPY
#define caxpy_ CAXPY
#define ccdotu_ CCDOTU
#define ccdotc_ CCDOTC
#define icamax_ ICAMAX
#define zswap_ ZSWAP
#define zscal_ ZSCAL
#define zdscal_ ZDSCAL
#define zcopy_ ZCOPY
#define zaxpy_ ZAXPY
#define zzdotu_ ZZDOTU
#define zzdotc_ ZZDOTC
#define dscnrm2_ DSCNRM2
#define dznrm2_ DZNRM2
#define dscasum_ DSCASUM
#define dzasum_ DZASUM
#define izamax_ IZAMAX
/* Level-2 BLAS */
#define sgemv_ SGEMV
#define ssymv_ SSYMV
#define strmv_ STRMV
#define strsv_ STRSV
#define sger_ SGER
#define ssyr_ SSYR
#define ssyr2_ SSYR2
#define dgemv_ DGEMV
#define dsymv_ DSYMV
#define dtrmv_ DTRMV
#define dtrsv_ DTRSV
#define dger_ DGER
#define dsyr_ DSYR
#define dsyr2_ DSYR2
#define cgemv_ CGEMV
#define chemv_ CHEMV
#define ctrmv_ CTRMV
#define ctrsv_ CTRSV
#define cgeru_ CGERU
#define cgerc_ CGERC
#define cher_ CHER
#define cher2_ CHER2
#define zgemv_ ZGEMV
#define zhemv_ ZHEMV
#define ztrmv_ ZTRMV
#define ztrsv_ ZTRSV
#define zgeru_ ZGERU
#define zgerc_ ZGERC
#define zher_ ZHER
#define zher2_ ZHER2
/* Level-3 BLAS */
#define sgemm_ SGEMM
#define ssymm_ SSYMM
#define ssyrk_ SSYRK
#define ssyr2k_ SSYR2K
#define strmm_ STRMM
#define strsm_ STRSM
#define dgemm_ DGEMM
#define dsymm_ DSYMM
#define dsyrk_ DSYRK
#define dsyr2k_ DSYR2K
#define dtrmm_ DTRMM
#define dtrsm_ DTRSM
#define cgemm_ CGEMM
#define chemm_ CHEMM
#define csymm_ CSYMM
#define csyrk_ CSYRK
#define cherk_ CHERK
#define csyr2k_ CSYR2K
#define cher2k_ CHER2K
#define ctrmm_ CTRMM
#define ctrsm_ CTRSM
#define zgemm_ ZGEMM
#define zhemm_ ZHEMM
#define zsymm_ ZSYMM
#define zsyrk_ ZSYRK
#define zherk_ ZHERK
#define zsyr2k_ ZSYR2K
#define zher2k_ ZHER2K
#define ztrmm_ ZTRMM
#define ztrsm_ ZTRSM
/* Auxilliary PBLAS */
#define pberror_ PBERROR
#define pbfreebuf_ PBFREEBUF
#define dcsmm DCSMM
#define dcssm DCSSM
#define dcsnmi DCSNMI
#define zcsnmi ZCSNMI
#endif
#if( _F2C_CALL_ == _F2C_NOCHANGE )
/*
* These defines set up the naming scheme required to have a FORTRAN
* routine call a C routine (which is what the PBLAS are written in)
* for following FORTRAN to C interface:
* FORTRAN CALL C DECLARATION
* call pdgemm(...) void pdgemm(...)
*/
/* PSBLAS */
#define psddot_ psddot
#define psdmdot_ psdmdot
#define psdaxpby_ psdaxpby
#define psdamax_ psdamax
#define psdmamax_ psdmamax
#define psdasum_ psdasum
#define psdnrm2_ psdnrm2
#define psdnrmi_ psdnrmi
#define psdnrmisym_ psdnrmisym
#define psdhalo_ psdhalo
#define psihalo_ psihalo
#define psdhred_ psdhred
#define psdovrl_ psdovrl
#define psdspmm_ psdspmm
#define psdswaptran_ psdswaptran
#define psdspmmsym_ psdspmmsym
#define psdspsm_ psdspsm
#define psderror_ psderror
#define psdverify_ psdverify
#define psdscatterm_ psdscatterm
#define psdgatherm_ psdgatherm
#define pszmdot_ pszmdot
#define pszdotc_ pszdotc
#define pszdotu_ pszdotu
#define pszaxpby_ pszaxpby
#define pszamax_ pszamax
#define pszmamax_ pszmamax
#define pszasum_ pszasum
#define psznrm2_ psznrm2
#define psznrmi_ psznrmi
#define psznrmisym_ psznrmisym
#define pszhalo_ pszhalo
#define pszovrl_ pszovrl
#define pszspmm_ pszspmm
#define pszspmmsym_ pszspmmsym
#define pszspsm_ pszspsm
#define pszerror_ pszerror
#define pszverify_ pszverify
#define pszscatterm_ pszscatterm
#define pszgatherm_ pszgatherm
/* BLACS */
#define blacs_abort_ blacs_abort
#define blacs_gridinfo_ blacs_gridinfo
#define igesd2d_ igesd2d
#define igebs2d_ igebs2d
#define itrsd2d_ itrsd2d
#define itrbs2d_ itrbs2d
#define igerv2d_ igerv2d
#define igebr2d_ igebr2d
#define itrrv2d_ itrrv2d
#define itrbr2d_ itrbr2d
#define igamx2d_ igamx2d
#define igamn2d_ igamn2d
#define igsum2d_ igsum2d
#define sgesd2d_ sgesd2d
#define sgebs2d_ sgebs2d
#define strsd2d_ strsd2d
#define strbs2d_ strbs2d
#define sgerv2d_ sgerv2d
#define sgebr2d_ sgebr2d
#define strrv2d_ strrv2d
#define strbr2d_ strbr2d
#define sgamx2d_ sgamx2d
#define sgamn2d_ sgamn2d
#define sgsum2d_ sgsum2d
#define dgesd2d_ dgesd2d
#define dgebs2d_ dgebs2d
#define dtrsd2d_ dtrsd2d
#define dtrbs2d_ dtrbs2d
#define dgerv2d_ dgerv2d
#define dgebr2d_ dgebr2d
#define dtrrv2d_ dtrrv2d
#define dtrbr2d_ dtrbr2d
#define dgamx2d_ dgamx2d
#define dgamn2d_ dgamn2d
#define dgsum2d_ dgsum2d
#define cgesd2d_ cgesd2d
#define cgebs2d_ cgebs2d
#define ctrsd2d_ ctrsd2d
#define ctrbs2d_ ctrbs2d
#define cgerv2d_ cgerv2d
#define cgebr2d_ cgebr2d
#define ctrrv2d_ ctrrv2d
#define ctrbr2d_ ctrbr2d
#define cgamx2d_ cgamx2d
#define cgamn2d_ cgamn2d
#define cgsum2d_ cgsum2d
#define zgesd2d_ zgesd2d
#define zgebs2d_ zgebs2d
#define ztrsd2d_ ztrsd2d
#define ztrbs2d_ ztrbs2d
#define zgerv2d_ zgerv2d
#define zgebr2d_ zgebr2d
#define ztrrv2d_ ztrrv2d
#define ztrbr2d_ ztrbr2d
#define zgamx2d_ zgamx2d
#define zgamn2d_ zgamn2d
#define zgsum2d_ zgsum2d
/* Level-1 BLAS */
#define srotg_ srotg
#define srotmg_ srotmg
#define srot_ srot
#define srotm_ srotm
#define sswap_ sswap
#define sscal_ sscal
#define scopy_ scopy
#define saxpy_ saxpy
#define ssdot_ ssdot
#define isamax_ isamax
#define drotg_ drotg
#define drotmg_ drotmg
#define drot_ drot
#define drotm_ drotm
#define dswap_ dswap
#define dscal_ dscal
#define dcopy_ dcopy
#define daxpy_ daxpy
#define dddot_ dddot
#define dnrm2_ dnrm2
#define dsnrm2_ dsnrm2
#define dasum_ dasum
#define dsasum_ dsasum
#define idamax_ idamax
#define daxpby_ daxpby
#define zaxpby_ zaxpby
#define cswap_ cswap
#define cscal_ cscal
#define csscal_ csscal
#define ccopy_ ccopy
#define caxpy_ caxpy
#define ccdotu_ ccdotu
#define ccdotc_ ccdotc
#define icamax_ icamax
#define zswap_ zswap
#define zscal_ zscal
#define zdscal_ zdscal
#define zcopy_ zcopy
#define zaxpy_ zaxpy
#define zzdotu_ zzdotu
#define zzdotc_ zzdotc
#define dscnrm2_ dscnrm2
#define dznrm2_ dznrm2
#define dscasum_ dscasum
#define dzasum_ dzasum
#define izamax_ izamax
/* Level-2 BLAS */
#define sgemv_ sgemv
#define ssymv_ ssymv
#define strmv_ strmv
#define strsv_ strsv
#define sger_ sger
#define ssyr_ ssyr
#define ssyr2_ ssyr2
#define dgemv_ dgemv
#define dsymv_ dsymv
#define dtrmv_ dtrmv
#define dtrsv_ dtrsv
#define dger_ dger
#define dsyr_ dsyr
#define dsyr2_ dsyr2
#define cgemv_ cgemv
#define chemv_ chemv
#define ctrmv_ ctrmv
#define ctrsv_ ctrsv
#define cgeru_ cgeru
#define cgerc_ cgerc
#define cher_ cher
#define cher2_ cher2
#define zgemv_ zgemv
#define zhemv_ zhemv
#define ztrmv_ ztrmv
#define ztrsv_ ztrsv
#define zgeru_ zgeru
#define zgerc_ zgerc
#define zher_ zher
#define zher2_ zher2
/* Level-3 BLAS */
#define sgemm_ sgemm
#define ssymm_ ssymm
#define ssyrk_ ssyrk
#define ssyr2k_ ssyr2k
#define strmm_ strmm
#define strsm_ strsm
#define dgemm_ dgemm
#define dsymm_ dsymm
#define dsyrk_ dsyrk
#define dsyr2k_ dsyr2k
#define dtrmm_ dtrmm
#define dtrsm_ dtrsm
#define cgemm_ cgemm
#define chemm_ chemm
#define csymm_ csymm
#define csyrk_ csyrk
#define cherk_ cherk
#define csyr2k_ csyr2k
#define cher2k_ cher2k
#define ctrmm_ ctrmm
#define ctrsm_ ctrsm
#define zgemm_ zgemm
#define zhemm_ zhemm
#define zsymm_ zsymm
#define zsyrk_ zsyrk
#define zherk_ zherk
#define zsyr2k_ zsyr2k
#define zher2k_ zher2k
#define ztrmm_ ztrmm
#define ztrsm_ ztrsm
/* Auxilliary PBLAS */
#define pberror_ pberror
#define pbfreebuf_ pbfreebuf
#endif
#endif
void pbchkvect( int, int, int, int, int, int, int *, int, int, int *, int *,
int *) ;
void pbchkmat( int, int, int, int, int, int, int *, int, int, int *, int *, int *);

Binary file not shown.

@ -0,0 +1,26 @@
INTEGER MINJDROWS, MAXJDROWS
PARAMETER (MINJDROWS=4, MAXJDROWS=8)
DOUBLE PRECISION PERCENT
INTEGER DBLEINT_
INTEGER DCMPLXINT_
C ... This parameter represent sizeof(DOUBLE)/sizeof(INTEGER) ...
PARAMETER (PERCENT=0.7,DBLEINT_=2)
PARAMETER (DCMPLXINT_ = 4)
character fidef*5
parameter (fidef='CSR')
integer, parameter :: nnz_=1
integer, parameter :: del_bnd_=6, srtd_=7
integer, parameter :: state_=8, upd_=9
integer, parameter :: upd_pnt_=10, ifasize_=10
integer, parameter :: spmat_null=0, spmat_bld=1
integer, parameter :: spmat_asb=2, spmat_upd=4
integer perm_update
parameter (perm_update=98765)
integer isrtdcoo
parameter (isrtdcoo=98764)
integer ireg_flgs
parameter (ireg_flgs=10)
integer ip2_, iflag_, ipc_, ichk_, nnzt_, zero_
parameter (ip2_=0, iflag_=2, ichk_=3)
parameter ( nnzt_=4, zero_=5,ipc_=6)

@ -0,0 +1,118 @@
L'inizializzazione del sistema prevede, ora, che l'assemblaggio del
descrittore e quello della matrice possano essere eseguiti
indipendentemente.
Durante la sua vita, il descrittore può trovarsi in due differenti
stati:
1. bld: stato di build. in questo stato è possibile aggiornare il
contenuto del descrittore attraverso la routine psb_dscins.
2. asb: stato assembled. questo è lo stato della rappresentazione
finale del descrittore ed è raggiunto a valle di una chiamata alla
routine psb_dscasb.
Durante la sua vita, la matrice può trovarsi in tre differenti
stati:
1. bld: stato di build. in questo stato è possibile aggiornare il
contenuto dela matrice attraverso la routine psb_spins.
2. asb: stato assembled. questo è lo stato della rappresentazione
finale della matrice ed è raggiunto a valle di una chiamata alla
routine psb_spasb.
3. upd: stato di update. è lo stato in cui è possibile (attraverso
una chiamata alla routine psb_spasb) rigenerare la matrice.
- Assemblaggio contestuale di matrice e descrittore
Il procedimento da seguire prevede il seguente ordine di chiamate:
1. psb_dscall: allocazione del descrittore. Alla fine di questo
step lo stato del descrittore sarà bld
2. psb_spall: allocazione della matrice. Alla fine di questo
step lo stato della matrice sarà bld
3. psb_spins: in questo caso sia il descrittore che la matrice
saranno nello stato bld. Quindi la psb_spins invoca la
psb_dscins per portare il descrittore in uno stato pre-asb e
poi effettivamente inserisce i coefficienti nella
matrice (che quindi sarà anch'essa in uno stato
pre-asb). Dunque, nel caso di costruzione/assemblaggio
contestuale di matrise e descrittore, il contenuto del
descrittore è implicitamente aggiornato da questa
chiamata. (nel caso separato bisognerà esplicitamente
prevedere questa fase attraverso una chiamata alla psb_dscins)
4. psb_dscasb: il descrittore viene assemblato e quindi portato
allo stato asb.
5. psb_spasb: la matrice viene assemblata e quindi portata
allo stato asb.
- Assemblaggio di descrittore e matrice indipendenti
Il procedimento da seguire per costruire/assemblare il descrittore
prevede il seguente ordine di chiamate:
1. psb_dscall: allocazione del descrittore. Alla fine di questo
step lo stato del descrittore sarà bld
2. psb_dscins: il descrittore viene inizializzato a partire dal
pattern di sparsità della matrice e dal partizionamento. Alla
fine di questo step sarà in uno stato pre-asb
3. psb_dscasb: il descrittore viene assemblato e quindi portato
allo stato asb.
Il procedimento da seguire per costruire/assemblare la matrice
prevede il seguente ordine di chiamate:
1. psb_spall: allocazione della matrice. Alla fine di questo
step lo stato della matrice sarà bld
2. psb_spins: i coefficienti vengono effettivamente inseriti
nella matrice che sarà portata ad uno stato pre-asb.
3. psb_spasb: la matrice viene assemblata e quindi portata
allo stato asb.
- Aggiornamento della matrice
Se il pattern di sparsità della matrice non cambia, la matrice può
essere aggiornata attraverso il seguente procedimento:
1. psb_sprn: reinizializza la matrice. Alla fine di questo step
la matrice sarà nello stato upd
2. psb_spins: i coefficienti della matrice vengono reinseriti
3. psb_spasb: la matrice viene assemblata e riportata nello stato
asb.
La gestione degli errori
La nuova gestione degli errori prevede la creazione di uno stack di
messaggi di errore che possa consentire di seguire a ritroso la
sequenza di chiamate di routine fino ad arrivare a quella in cui
l'errore è stato rilevato. Tutte le nuove interfacce prevedono un
argomento "info" il quale ritorna un valore > 0 se all'interno della
routine chiamata è stato rilevato un errore. Dunque ogni volta che
si rileva una condizione di errore (o per verifica diretta o perchè
una routine chiamata ha ritornato info>0) occorre mettere l'errore
in cima allo stack per mezzo della routine
psb_errpush(info,name,i_err,a_err) in cui:
info: codice di errore (si veda SRC/F90/errormod.f90 per una
corrispondenza codice-messaggiodierrore)
name: stringa di lunghezza 20 contenente il nome della routine
che invoca la psb_errpush()
i_err: opzionale. E' un array di 5 interi contenente informazioni
aggiuntive per il messaggio di errore (si veda errormod.f90)
a_err: opzionale. E' una stringa di 20 contenente informazioni
aggiuntive per il messaggio di errore (si veda errormod.f90)
attraverso la routine psb_seterrverbosity si può impostare la
verbosità del messaggio d'errore (se =1 viene stampato solo l'errore
in cima allo stack; se >1 vengono stampati tutti)
la routine psb_error(ictxt) provoca la stampa degli (dell') errori
(errore) sullo stack ed, eventualmente, stronca il set di processi.
l'argomento ictxt è opzionale: se è assente viene semplicemente
stampato il messaggio d'errore altrimenti viene anche abortita
l'esecuzione di tutti i processi.
la routine psb_seterraction(action) determina quale azione deve essere
intrapresa a fronte del rilevamento di un errore:
action =0 : la routine in cui è stato rilevato un errore (e quindi
dopo che l'errore stesso sia stato inserito sullo stack)
semplicemente ritorna al chiamante un codice di errore
action =1 : la routine in cui è stato rilevato un errore (e quindi
dopo che l'errore stesso sia stato inserito sullo stack)
prima di ritornare invoca la psb_error (e quindi, può, eventualmente
stroncare l'esecuzione di tutti i processi).

73
notes

@ -0,0 +1,73 @@
Struttura:
psblas:
psblas/src: la directory contenente il codice sorgente
psblas/src/comm: contiene tutte le routine preposte allo scambio di
dati
psblas/src/internals: contiene una serie di routine utilizzate per
l'assemblaggio dei descrittori di comunicazione e per
lo scambio di dati (psi_dswap_data e psi_dswap_tran)
psblas/src/methd: contiene l'implementazione dei metodi iterativi
psblas/src/modules: contiene i moduli con le interfacce, le
definizioni di tipi e di costanti
psblas/src/prec: contiene tutte le routine preposte alla generazione e
applicazione dei precondizionatori
psblas/src/psblas: contiene le routine algebriche parallele
psblas/src/serial: contiene l'implementazione seriale di routine
algebriche e ausiliarie
psblas/src/serial/aux: routine ausiliarie (in realtà c'è rimasto be poco)
psblas/src/serial/coo: routine relative al formato coo
psblas/src/serial/csr: routine relative al formato csr
psblas/src/serial/dp: routine per l'assemblaggio e la conversione da
un formato all'altro
psblas/src/serial/f77: si tratta delle routine algebriche. queste
vengono chiamate all'interno delle routine in
psblas/src/psblas.
psblas/src/serial/jad: routine relative al formato jad
psblas/src/tools: tutte le routine per la generazione e rigenerazione
di descrittori e matrici
psblas/test: programmi di test
Schema di nomenclatura:
tutti i simboli (quindi routine, tipi dato, costanti, moduli etc...)
devono avere il prefisso psb_. I tipi hanno il suffisso "_type" (quindi
quello che prima era d_spmat adesso diventa psb_dspmat_type), tutti i
moduli hanno il suffisso "_mod".
Interfacce subroutine & argomenti:
l'articolo di Carney et al. richiede questa convenzione per l'ordine
degli argomenti:
1- arguments specifying options (tipo TRANS, UNITD etc...)
2- arguments specifying problem dimensions
3- input scalar associated with input matrices
4- description of sparse input matrices (che sarebbe i nostri FIDA e
DESCRA)
5- description of dense input matrices
6- input scalar associated with input-output matrices
7- description of input-output matrices
8- error processing informations
9- workspace
10- length of workspace
tutto questo va rivisto nell'ottica f90 e, quindi, con i tipi dato
user-defined e con gli argomenti opzionali.
Attualmente, in linea di massima, le interfacce delle routine
algebriche hanno sempre desc_a (il descrittore) e info (il codice di
errore riportato) nelle ultime posizioni prima dei parametri
opzionali. I parametri opzionali contengono sempre gli argomenti al
punto 1 oltre che jx, jy e k che definiscono il sottopreblema su cui
effettuare l'operazione richiesta. Per tutto il resto non c'è uno
schema ben definito e viene, generalmente seguito l'ordine con cui gli
operandi appaiono nella scrittura della formula matematica relativa
all'operazione implementata nella subroutine. si potrebbe pensare a
come risistemare la cosa ma secondo me è abbastanza chiara.

@ -0,0 +1,30 @@
include ../Make.inc
lib:
(cd modules; make lib)
(cd comm; make lib)
(cd internals; make lib)
(cd tools; make lib)
(cd serial; make lib)
(cd psblas; make lib)
(cd prec; make lib)
(cd methd; make lib)
clean:
(cd modules; make clean)
(cd comm; make clean)
(cd internals; make clean)
(cd tools; make clean)
(cd serial; make clean)
(cd psblas; make clean)
(cd prec; make clean)
(cd methd; make clean)
veryclean:
(cd modules; make veryclean)
(cd comm; make veryclean)
(cd internals; make veryclean)
(cd tools; make veryclean)
(cd serial; make veryclean)
(cd psblas; make veryclean)
(cd prec; make veryclean)
(cd methd; make veryclean)

@ -0,0 +1,17 @@
include ../../Make.inc
OBJS = psb_dgather.o psb_dhalo.o psb_dovrl.o \
psb_ihalo.o
MPFOBJS = psb_dscatter.o
INCDIRS = ../../lib
lib: mpfobjs $(OBJS)
mpfobjs:
(make $(MPFOBJS) F90="$(MPF90)" FC="$(MPF90)" FCOPT="$(F90COPT)")
clean:
/bin/rm -f $(MPFOBJS) $(OBJS)

@ -0,0 +1,321 @@
! File: psb_dgather.f90
!
! Subroutine: psb_dgatherm
! This subroutine gathers pieces of a distributed dense matrix into a local one.
!
! Parameters:
! globx - real,dimension(:,:). The local matrix into which gather the distributed pieces.
! locx - real,dimension(:,:). The local piece of the ditributed matrix to be gathered.
! desc_a - type(<psb_desc_type>). The communication descriptor.
! info - integer. Eventually returns an error code.
! iroot - integer. The process that has to own the global matrix. If -1 all
! the processes will have a copy.
! iiglobx - integer(optional). The starting row of the global matrix.
! ijglobx - integer(optional). The starting column of the global matrix.
! iilocx - integer(optional). The starting row of the local piece of matrix.
! ijlocx - integer(optional). The starting column of the local piece of matrix.
! ik - integer(optional). The number of columns to gather.
!
subroutine psb_dgatherm(globx, locx, desc_a, info, iroot,&
& iiglobx, ijglobx, iilocx,ijlocx,ik)
use psb_descriptor_type
use psb_error_mod
implicit none
real(kind(1.d0)), intent(in) :: locx(:,:)
real(kind(1.d0)), intent(out) :: globx(:,:)
type(psb_desc_type), intent(in) :: desc_a
integer, intent(out) :: info
integer, intent(in), optional :: iroot, iiglobx, ijglobx, iilocx, ijlocx, ik
! locals
integer :: int_err(5), icontxt, nprow, npcol, myrow, mycol,&
& err_act, n, iix, jjx, temp(2), root, iiroot, ilocx, iglobx, jlocx,&
& jglobx, lda_locx, lda_globx, m, lock, globk, maxk, k, jlx, ilx, i, j, idx
real(kind(1.d0)) :: locmax(2), amax
real(kind(1.d0)),pointer :: tmpx(:)
character(len=20) :: name, ch_err
name='psb_dgatherm'
info=0
call psb_erractionsave(err_act)
icontxt=desc_a%matrix_data(psb_ctxt_)
! check on blacs grid
call blacs_gridinfo(icontxt, nprow, npcol, myrow, mycol)
if (nprow == -1) then
info = 2010
call psb_errpush(info,name)
goto 9999
else if (npcol /= 1) then
info = 2030
int_err(1) = npcol
call psb_errpush(info,name)
goto 9999
endif
if (present(iroot)) then
root = iroot
if((root.lt.-1).or.(root.gt.nprow)) then
info=30
int_err(1:2)=(/5,root/)
call psb_errpush(info,name,i_err=int_err)
goto 9999
end if
else
root = -1
end if
if (root==-1) then
iiroot=0
endif
if (present(iiglobx)) then
iglobx = iiglobx
else
iglobx = 1
end if
if (present(ijglobx)) then
jglobx = ijglobx
else
jglobx = 1
end if
if (present(iilocx)) then
ilocx = iilocx
else
ilocx = 1
end if
if (present(ijlocx)) then
jlocx = ijlocx
else
jlocx = 1
end if
lda_globx = size(globx,1)
lda_locx = size(locx, 1)
m = desc_a%matrix_data(psb_m_)
n = desc_a%matrix_data(psb_n_)
lock=size(locx,2)-jlocx+1
globk=size(globx,2)-jglobx+1
maxk=min(lock,globk)
if(present(ik)) then
if(ik.gt.maxk) then
k=maxk
else
k=ik
end if
else
k = maxk
end if
if (myrow == iiroot) then
call igebs2d(icontxt, 'all', ' ', 1, 1, k, 1)
else
call igebr2d(icontxt, 'all', ' ', 1, 1, k, 1, iiroot, 0)
end if
! there should be a global check on k here!!!
call psb_chkglobvect(m,n,size(globx,1),iglobx,jglobx,desc_a%matrix_data,info)
call psb_chkvect(m,n,size(locx,1),ilocx,jlocx,desc_a%matrix_data,info,ilx,jlx)
if(info.ne.0) then
info=4010
ch_err='psb_chk(glob)vect'
call psb_errpush(info,name,a_err=ch_err)
goto 9999
end if
if ((ilx.ne.1).or.(iglobx.ne.1)) then
info=3040
call psb_errpush(info,name)
goto 9999
end if
globx(:,:)=0.d0
do j=1,k
do i=1,desc_a%matrix_data(psb_n_row_)
idx = desc_a%loc_to_glob(i)
globx(idx,jglobx+j-1) = locx(i,jlx+j-1)
end do
! adjust overlapped elements
i=0
do while (desc_a%ovrlap_elem(i).ne.-1)
idx=desc_a%ovrlap_elem(i+psb_ovrlp_elem_)
idx=desc_a%loc_to_glob(idx)
globx(idx,jglobx+j-1) = globx(idx,jglobx+j-1)/desc_a%ovrlap_elem(i+psb_n_dom_ovr_)
i=i+2
end do
end do
call dgsum2d(icontxt,'a',' ',m,k,globx(1,jglobx),size(globx,1),root,mycol)
call psb_erractionrestore(err_act)
return
9999 continue
call psb_erractionrestore(err_act)
if (err_act.eq.act_abort) then
call psb_error(icontxt)
return
end if
return
end subroutine psb_dgatherm
! Subroutine: psb_dgatherv
! This subroutine gathers pieces of a distributed dense vector into a local one.
!
! Parameters:
! globx - real,dimension(:). The local vector into which gather the distributed pieces.
! locx - real,dimension(:). The local piece of the ditributed vector to be gathered.
! desc_a - type(<psb_desc_type>). The communication descriptor.
! info - integer. Eventually returns an error code.
! iroot - integer. The process that has to own the global vector. If -1 all
! the processes will have a copy.
! iiglobx - integer(optional). The starting row of the global vector.
! iilocx - integer(optional). The starting row of the local piece of vector.
!
subroutine psb_dgatherv(globx, locx, desc_a, info, iroot,&
& iiglobx, iilocx)
use psb_descriptor_type
use psb_error_mod
implicit none
real(kind(1.d0)), intent(in) :: locx(:)
real(kind(1.d0)), intent(out) :: globx(:)
type(psb_desc_type), intent(in) :: desc_a
integer, intent(out) :: info
integer, intent(in), optional :: iroot, iiglobx, iilocx
! locals
integer :: int_err(5), icontxt, nprow, npcol, myrow, mycol,&
& err_act, n, iix, jjx, temp(2), root, iiroot, ilocx, iglobx, jlocx,&
& jglobx, lda_locx, lda_globx, lock, maxk, globk, m, k, jlx, ilx, i, j, idx
real(kind(1.d0)) :: locmax(2), amax
real(kind(1.d0)),pointer :: tmpx(:)
character(len=20) :: name, ch_err
name='psb_dgatherv'
info=0
call psb_erractionsave(err_act)
icontxt=desc_a%matrix_data(psb_ctxt_)
! check on blacs grid
call blacs_gridinfo(icontxt, nprow, npcol, myrow, mycol)
if (nprow == -1) then
info = 2010
call psb_errpush(info,name)
goto 9999
else if (npcol /= 1) then
info = 2030
int_err(1) = npcol
call psb_errpush(info,name)
goto 9999
endif
if (present(iroot)) then
root = iroot
if((root.lt.-1).or.(root.gt.nprow)) then
info=30
int_err(1:2)=(/5,root/)
call psb_errpush(info,name,i_err=int_err)
goto 9999
end if
else
root = -1
end if
if (root==-1) then
iiroot=0
endif
jglobx=1
if (present(iiglobx)) then
iglobx = iiglobx
else
iglobx = 1
end if
jlocx=1
if (present(iilocx)) then
ilocx = iilocx
else
ilocx = 1
end if
lda_globx = size(globx)
lda_locx = size(locx)
m = desc_a%matrix_data(psb_m_)
n = desc_a%matrix_data(psb_n_)
k = 1
if (myrow == iiroot) then
call igebs2d(icontxt, 'all', ' ', 1, 1, k, 1)
else
call igebr2d(icontxt, 'all', ' ', 1, 1, k, 1, iiroot, 0)
end if
! there should be a global check on k here!!!
call psb_chkglobvect(m,n,size(globx),iglobx,jglobx,desc_a%matrix_data,info)
call psb_chkvect(m,n,size(locx),ilocx,jlocx,desc_a%matrix_data,info,ilx,jlx)
if(info.ne.0) then
info=4010
ch_err='psb_chk(glob)vect'
call psb_errpush(info,name,a_err=ch_err)
goto 9999
end if
if ((ilx.ne.1).or.(iglobx.ne.1)) then
info=3040
call psb_errpush(info,name)
goto 9999
end if
globx(:)=0.d0
do i=1,desc_a%matrix_data(psb_n_row_)
idx = desc_a%loc_to_glob(i)
globx(idx) = locx(i)
end do
! adjust overlapped elements
i=0
do while (desc_a%ovrlap_elem(i).ne.-1)
idx=desc_a%ovrlap_elem(i+psb_ovrlp_elem_)
idx=desc_a%loc_to_glob(idx)
globx(idx) = globx(idx)/desc_a%ovrlap_elem(i+psb_n_dom_ovr_)
i=i+2
end do
call dgsum2d(icontxt,'a',' ',m,k,globx,size(globx),root,mycol)
call psb_erractionrestore(err_act)
return
9999 continue
call psb_erractionrestore(err_act)
if (err_act.eq.act_abort) then
call psb_error(icontxt)
return
end if
return
end subroutine psb_dgatherv

@ -0,0 +1,323 @@
! File: psb_dhalo.f90
!
! Subroutine: psb_dhalom
! This subroutine performs the exchange of the halo elements in a distributed dense matrix between all the processes.
!
! Parameters:
! x - real,dimension(:,:). The local part of the dense matrix.
! desc_a - type(<psb_desc_type>). The communication descriptor.
! info - integer. Eventually returns an error code.
! alpha - real(optional). ???.
! jx - integer(optional). The starting column of the global matrix.
! ik - integer(optional). The number of columns to gather.
! work - real(optional). A working area.
! tran - character(optional). ???.
! mode - integer(optional).
!
subroutine psb_dhalom(x,desc_a,info,alpha,jx,ik,work,tran,mode)
use psb_descriptor_type
use psb_const_mod
use psi_mod
use psb_error_mod
implicit none
real(kind(1.d0)), intent(inout), target :: x(:,:)
type(psb_desc_type), intent(in) :: desc_a
integer, intent(out) :: info
real(kind(1.d0)), intent(in), optional :: alpha
real(kind(1.d0)), intent(inout), target, optional :: work(:)
integer, intent(in), optional :: mode,jx,ik
character, intent(in), optional :: tran
! locals
integer :: int_err(5), icontxt, nprow, npcol, myrow, mycol,&
& err_act, m, n, iix, jjx, temp(2), ix, ijx, k, maxk, nrow, imode, i,&
& err, liwork, ncol
real(kind(1.d0)),pointer :: iwork(:), xp(:,:)
character :: ltran
character(len=20) :: name, ch_err
name='psb_dhalom'
info=0
call psb_erractionsave(err_act)
icontxt=desc_a%matrix_data(psb_ctxt_)
! check on blacs grid
call blacs_gridinfo(icontxt, nprow, npcol, myrow, mycol)
if (nprow == -1) then
info = 2010
call psb_errpush(info,name)
goto 9999
else if (npcol /= 1) then
info = 2030
int_err(1) = npcol
call psb_errpush(info,name)
goto 9999
endif
ix = 1
if (present(jx)) then
ijx = jx
else
ijx = 1
endif
m = desc_a%matrix_data(psb_m_)
n = desc_a%matrix_data(psb_n_)
nrow = desc_a%matrix_data(psb_n_row_)
maxk=size(x,2)-jx+1
if(present(ik)) then
if(ik.gt.maxk) then
k=maxk
else
k=ik
end if
else
k = maxk
end if
if (present(tran)) then
ltran = tran
else
ltran = 'N'
endif
if (present(mode)) then
imode = mode
else
imode = IOR(psb_swap_send_,psb_swap_recv_)
endif
! check vector correctness
call psb_chkvect(m,1,size(x,1),ix,ijx,desc_a%matrix_data,info,iix,jjx)
if(info.ne.0) then
info=4010
ch_err='psb_chkvect'
call psb_errpush(info,name,a_err=ch_err)
end if
if (iix.ne.1) then
info=3040
call psb_errpush(info,name)
end if
err=info
call psb_errcomm(icontxt,err)
if(err.ne.0) goto 9999
if(present(alpha)) then
if(alpha.ne.1.d0) then
do i=0, k-1
call dscal(nrow,alpha,x(1,jjx+i),1)
end do
end if
end if
liwork=ncol
if (present(work)) then
if(size(work).lt.liwork) then
call psrealloc(liwork,work,info)
if(info.ne.0) then
info=4010
ch_err='psrealloc'
call psb_errpush(info,name,a_err=ch_err)
goto 9999
end if
end if
iwork => work
else
call psrealloc(liwork,iwork,info)
if(info.ne.0) then
info=4010
ch_err='psrealloc'
call psb_errpush(info,name,a_err=ch_err)
goto 9999
end if
end if
! exchange halo elements
xp => x(iix:size(x,1),jjx:jjx+k-1)
if(ltran.eq.'N') then
call psi_swapdata(imode,k,0.d0,xp,&
& desc_a,iwork,info)
!!$ call PSI_dSwapData(imode,k,0.d0,x(1,jjx),&
!!$ & size(x,1),desc_a%matrix_data,&
!!$ & desc_a%halo_index,iwork,liwork,info)
else if((ltran.eq.'T').or.(ltran.eq.'H')) then
call spi_swaptran(imode,k,1.d0,xp,&
&desc_a,iwork,info)
!!$ call PSI_dSwapTran(imode,k,1.d0,x(1,jjx),&
!!$ & size(x,1),desc_a%matrix_data,&
!!$ & desc_a%halo_index,iwork,liwork,info)
end if
if(info.ne.0) then
call psb_errpush(4010,name,a_err='PSI_dSwap...')
goto 9999
end if
call psb_erractionrestore(err_act)
return
9999 continue
call psb_erractionrestore(err_act)
if (err_act.eq.act_abort) then
call psb_error(icontxt)
return
end if
return
end subroutine psb_dhalom
! Subroutine: psb_dhalov
! This subroutine performs the exchange of the halo elements in a distributed dense vector between all the processes.
!
! Parameters:
! x - real,dimension(:). The local part of the dense vector.
! desc_a - type(<psb_desc_type>). The communication descriptor.
! info - integer. Eventually returns an error code.
! alpha - real(optional). ???.
! work - real(optional). A working area.
! tran - character(optional). ???.
! mode - integer(optional).
!
subroutine psb_dhalov(x,desc_a,info,alpha,work,tran,mode)
use psb_descriptor_type
use psb_const_mod
use psi_mod
use psb_error_mod
implicit none
real(kind(1.d0)), intent(inout) :: x(:)
type(psb_desc_type), intent(in) :: desc_a
integer, intent(out) :: info
real(kind(1.d0)), intent(in), optional :: alpha
real(kind(1.d0)), intent(inout), target, optional :: work(:)
integer, intent(in), optional :: mode
character, intent(in), optional :: tran
! locals
integer :: int_err(5), icontxt, nprow, npcol, myrow, mycol,&
& err_act, m, n, iix, jjx, temp(2), ix, ijx, k, maxk, nrow, imode, i,&
& err, liwork, ncol
real(kind(1.d0)),pointer :: iwork(:)
character :: ltran
character(len=20) :: name, ch_err
name='psb_dhalom'
info=0
call psb_erractionsave(err_act)
icontxt=desc_a%matrix_data(psb_ctxt_)
! check on blacs grid
call blacs_gridinfo(icontxt, nprow, npcol, myrow, mycol)
if (nprow == -1) then
info = 2010
call psb_errpush(info,name)
goto 9999
else if (npcol /= 1) then
info = 2030
int_err(1) = npcol
call psb_errpush(info,name)
goto 9999
endif
ix = 1
ijx = 1
m = desc_a%matrix_data(psb_m_)
n = desc_a%matrix_data(psb_n_)
nrow = desc_a%matrix_data(psb_n_row_)
if (present(tran)) then
ltran = tran
else
ltran = 'N'
endif
if (present(mode)) then
imode = mode
else
imode = IOR(psb_swap_send_,psb_swap_recv_)
endif
! check vector correctness
call psb_chkvect(m,1,size(x,1),ix,ijx,desc_a%matrix_data,info,iix,jjx)
if(info.ne.0) then
info=4010
ch_err='psb_chkvect'
call psb_errpush(info,name,a_err=ch_err)
end if
if (iix.ne.1) then
info=3040
call psb_errpush(info,name)
end if
err=info
call psb_errcomm(icontxt,err)
if(err.ne.0) goto 9999
if(present(alpha)) then
if(alpha.ne.1.d0) then
call dscal(nrow,alpha,x,1)
end if
end if
liwork=ncol
if (present(work)) then
if(size(work).lt.liwork) then
call psrealloc(liwork,work,info)
if(info.ne.0) then
info=4010
ch_err='psrealloc'
call psb_errpush(info,name,a_err=ch_err)
goto 9999
end if
end if
iwork => work
else
call psrealloc(liwork,iwork,info)
if(info.ne.0) then
info=4010
ch_err='psrealloc'
call psb_errpush(info,name,a_err=ch_err)
goto 9999
end if
end if
! exchange halo elements
if(ltran.eq.'N') then
call psi_swapdata(imode,0.d0,x(iix:size(x)),&
& desc_a,iwork,info)
else if((ltran.eq.'T').or.(ltran.eq.'H')) then
call psi_swaptran(imode,1.d0,x(iix:size(x)),&
& desc_a,iwork,info)
end if
if(info.ne.0) then
call psb_errpush(4010,name,a_err='PSI_dSwap...')
goto 9999
end if
call psb_erractionrestore(err_act)
return
9999 continue
call psb_erractionrestore(err_act)
if (err_act.eq.act_abort) then
call psb_error(icontxt)
return
end if
return
end subroutine psb_dhalov

@ -0,0 +1,358 @@
! File: psb_dovrl.f90
!
! Subroutine: psb_dovrlm
! This subroutine performs the exchange of the overlap elements in a distributed dense matrix between all the processes.
!
! Parameters:
! x - real,dimension(:,:). The local part of the dense matrix.
! desc_a - type(<psb_desc_type>). The communication descriptor.
! info - integer. Eventually returns an error code.
! jx - integer(optional). The starting column of the global matrix.
! ik - integer(optional). The number of columns to gather.
! work - real(optional). A working area.
! choice - logical(optional). ???.
! update_type - integer(optional). ???.
!
subroutine psb_dovrlm(x,desc_a,info,jx,ik,work,choice,update_type)
use psb_descriptor_type
use psb_const_mod
use psb_error_mod
implicit none
real(kind(1.d0)), intent(inout) :: x(:,:)
type(psb_desc_type), intent(in) :: desc_a
integer, intent(out) :: info
real(kind(1.d0)), intent(inout), optional, target :: work(:)
logical, intent(in), optional :: choice
integer, intent(in), optional :: update_type,jx,ik
! locals
integer :: int_err(5), icontxt, nprow, npcol, myrow, mycol,&
& err_act, m, n, iix, jjx, temp(2), ix, ijx, nrow, ncol, k, maxk, iupdate,&
& imode, err, liwork, i
real(kind(1.d0)),pointer :: iwork(:)
logical :: ichoice
character(len=20) :: name, ch_err
name='psb_dovrlm'
info=0
call psb_erractionsave(err_act)
icontxt=desc_a%matrix_data(psb_ctxt_)
! check on blacs grid
call blacs_gridinfo(icontxt, nprow, npcol, myrow, mycol)
if (nprow == -1) then
info = 2010
call psb_errpush(info,name)
goto 9999
else if (npcol /= 1) then
info = 2030
int_err(1) = npcol
call psb_errpush(info,name)
goto 9999
endif
ix = 1
if (present(jx)) then
ijx = jx
else
ijx = 1
endif
m = desc_a%matrix_data(psb_m_)
n = desc_a%matrix_data(psb_n_)
nrow = desc_a%matrix_data(psb_n_row_)
ncol = desc_a%matrix_data(psb_n_col_)
maxk=size(x,2)-jx+1
if(present(ik)) then
if(ik.gt.maxk) then
k=maxk
else
k=ik
end if
else
k = maxk
end if
if (present(choice)) then
ichoice = choice
else
ichoice = .true.
endif
if (present(update_type)) then
iupdate = update_type
else
iupdate = psb_none_
endif
imode = IOR(psb_swap_send_,psb_swap_recv_)
! check vector correctness
call psb_chkvect(m,1,size(x,1),ix,ijx,desc_a%matrix_data,info,iix,jjx)
if(info.ne.0) then
info=4010
ch_err='psb_chkvect'
call psb_errpush(info,name,a_err=ch_err)
end if
if (iix.ne.1) then
info=3040
call psb_errpush(info,name)
end if
err=info
call psb_errcomm(icontxt,err)
if(err.ne.0) goto 9999
! check for presence/size of a work area
liwork=ncol
if (present(work)) then
if(size(work).lt.liwork) then
call psrealloc(liwork,work,info)
if(info.ne.0) then
info=4010
ch_err='psrealloc'
call psb_errpush(info,name,a_err=ch_err)
goto 9999
end if
end if
iwork => work
else
call psrealloc(liwork,iwork,info)
if(info.ne.0) then
info=4010
ch_err='psrealloc'
call psb_errpush(info,name,a_err=ch_err)
goto 9999
end if
end if
! exchange overlap elements
if(ichoice) then
call PSI_dSwapData(imode,k,1.d0,x(1,jjx),&
& size(x,1),desc_a%matrix_data,&
& desc_a%halo_index,iwork,liwork,info)
end if
if(info.ne.0) then
call psb_errpush(4010,name,a_err='PSI_dSwapData')
goto 9999
end if
i=0
! switch on update type
select case (iupdate)
case(psb_square_root_)
do while(desc_a%ovrlap_elem(i).ne.-ione)
x(desc_a%ovrlap_elem(i+psb_ovrlp_elem_),:) =&
& x(desc_a%ovrlap_elem(i+psb_ovrlp_elem_),:)/&
& sqrt(real(desc_a%ovrlap_elem(i+psb_n_dom_ovr_)))
i = i+2
end do
case(psb_avg_)
do while(desc_a%ovrlap_elem(i).ne.-ione)
x(desc_a%ovrlap_elem(i+psb_ovrlp_elem_),:) =&
& x(desc_a%ovrlap_elem(i+psb_ovrlp_elem_),:)/&
& real(desc_a%ovrlap_elem(i+psb_n_dom_ovr_))
i = i+2
end do
case(psb_sum_)
! do nothing
case default
! wrong value for choice argument
info = 70
int_err=(/10,iupdate,0,0,0/)
call psb_errpush(info,name,i_err=int_err)
goto 9999
end select
call psb_erractionrestore(err_act)
return
9999 continue
call psb_erractionrestore(err_act)
if (err_act.eq.act_abort) then
call psb_error(icontxt)
return
end if
return
end subroutine psb_dovrlm
! Subroutine: psb_dovrlv
! This subroutine performs the exchange of the overlap elements in a distributed dense vector between all the processes.
!
! Parameters:
! x - real,dimension(:). The local part of the dense vector.
! desc_a - type(<psb_desc_type>). The communication descriptor.
! info - integer. Eventually returns an error code.
! work - real(optional). A working area.
! choice - logical(optional). ???.
! update_type - integer(optional). ???.
!
subroutine psb_dovrlv(x,desc_a,info,work,choice,update_type)
use psb_descriptor_type
use psb_const_mod
use psb_error_mod
implicit none
real(kind(1.d0)), intent(inout) :: x(:)
type(psb_desc_type), intent(in) :: desc_a
integer, intent(out) :: info
real(kind(1.d0)), intent(inout), optional, target :: work(:)
logical, intent(in), optional :: choice
integer, intent(in), optional :: update_type
! locals
integer :: int_err(5), icontxt, nprow, npcol, myrow, mycol,&
& err_act, m, n, iix, jjx, temp(2), ix, ijx, nrow, ncol, k, maxk, iupdate,&
& imode, err, liwork, i
real(kind(1.d0)),pointer :: iwork(:)
logical :: ichoice
character(len=20) :: name, ch_err
name='psb_dovrlv'
info=0
call psb_erractionsave(err_act)
icontxt=desc_a%matrix_data(psb_ctxt_)
! check on blacs grid
call blacs_gridinfo(icontxt, nprow, npcol, myrow, mycol)
if (nprow == -1) then
info = 2010
call psb_errpush(info,name)
goto 9999
else if (npcol /= 1) then
info = 2030
int_err(1) = npcol
call psb_errpush(info,name)
goto 9999
endif
ix = 1
ijx = 1
m = desc_a%matrix_data(psb_m_)
n = desc_a%matrix_data(psb_n_)
nrow = desc_a%matrix_data(psb_n_row_)
ncol = desc_a%matrix_data(psb_n_col_)
k = 1
if (present(choice)) then
ichoice = choice
else
ichoice = .true.
endif
if (present(update_type)) then
iupdate = update_type
else
iupdate = psb_none_
endif
imode = IOR(psb_swap_send_,psb_swap_recv_)
! check vector correctness
call psb_chkvect(m,1,x,1,ix,ijx,desc_a%matrix_data,info,iix,jjx)
if(info.ne.0) then
info=4010
ch_err='psb_chkvect'
call psb_errpush(info,name,a_err=ch_err)
end if
if (iix.ne.1) then
info=3040
call psb_errpush(info,name)
end if
err=info
call psb_errcomm(icontxt,err)
if(err.ne.0) goto 9999
! check for presence/size of a work area
liwork=ncol
if (present(work)) then
if(size(work).lt.liwork) then
call psrealloc(liwork,work,info)
if(info.ne.0) then
info=4010
ch_err='psrealloc'
call psb_errpush(info,name,a_err=ch_err)
goto 9999
end if
end if
iwork => work
else
call psrealloc(liwork,iwork,info)
if(info.ne.0) then
info=4010
ch_err='psrealloc'
call psb_errpush(info,name,a_err=ch_err)
goto 9999
end if
end if
! exchange overlap elements
if(ichoice) then
call PSI_dSwapData(imode,k,1.d0,x,&
& x,desc_a%matrix_data,&
& desc_a%halo_index,iwork,liwork,info)
end if
if(info.ne.0) then
call psb_errpush(4010,name,a_err='PSI_dSwapData')
goto 9999
end if
i=0
! switch on update type
select case (iupdate)
case(psb_square_root_)
do while(desc_a%ovrlap_elem(i).ne.-ione)
x(desc_a%ovrlap_elem(i+psb_ovrlp_elem_)) =&
& x(desc_a%ovrlap_elem(i+psb_ovrlp_elem_))/&
& sqrt(real(desc_a%ovrlap_elem(i+psb_n_dom_ovr_)))
i = i+2
end do
case(psb_avg_)
do while(desc_a%ovrlap_elem(i).ne.-ione)
x(desc_a%ovrlap_elem(i+psb_ovrlp_elem_)) =&
& x(desc_a%ovrlap_elem(i+psb_ovrlp_elem_))/&
& real(desc_a%ovrlap_elem(i+psb_n_dom_ovr_))
i = i+2
end do
case(psb_sum_)
! do nothing
case default
! wrong value for choice argument
info = 70
int_err=(/10,iupdate,0,0,0/)
call psb_errpush(info,name,i_err=int_err)
goto 9999
end select
call psb_erractionrestore(err_act)
return
9999 continue
call psb_erractionrestore(err_act)
if (err_act.eq.act_abort) then
call psb_error(icontxt)
return
end if
return
end subroutine psb_dovrlv

@ -0,0 +1,387 @@
! File: psb_dscatter.f90
!
! Subroutine: psb_dscatterm
! This subroutine scatters a global matrix locally owned by one process
! into pieces that are local to alle the processes.
!
! Parameters:
! globx - real,dimension(:,:). The global matrix to scatter.
! locx - real,dimension(:,:). The local piece of the ditributed matrix.
! desc_a - type(<psb_desc_type>). The communication descriptor.
! info - integer. Eventually returns an error code.
! iroot - integer(optional). The process that owns the global matrix. If -1 all
! the processes have a copy.
! iiglobx - integer(optional). The starting row of the global matrix.
! ijglobx - integer(optional). The starting column of the global matrix.
! iilocx - integer(optional). The starting row of the local piece of matrix.
! ijlocx - integer(optional). The starting column of the local piece of matrix.
! ik - integer(optional). The number of columns to gather.
!
subroutine psb_dscatterm(globx, locx, desc_a, info, iroot,&
& iiglobx, ijglobx, iilocx,ijlocx,ik)
use psb_descriptor_type
use psb_error_mod
implicit none
include 'mpif.h'
real(kind(1.d0)), intent(out) :: locx(:,:)
real(kind(1.d0)), intent(in) :: globx(:,:)
type(psb_desc_type), intent(in) :: desc_a
integer, intent(out) :: info
integer, intent(in), optional :: iroot,iiglobx,&
& ijglobx,iilocx,ijlocx,ik
! locals
integer :: int_err(5), icontxt, nprow, npcol, myrow, mycol,&
& err_act, m, n, iix, jjx, temp(2), i, j, idx, nrow, iiroot, iglobx, jglobx,&
& ilocx, jlocx, lda_locx, lda_globx, lock, globk, icomm, k, maxk, root, ilx,&
& jlx, myrank, rootrank, c, pos
real(kind(1.d0)) :: locmax(2), amax
real(kind(1.d0)),pointer :: scatterv(:)
integer, pointer :: displ(:), l_t_g_all(:), all_dim(:)
integer :: blacs_pnum
character(len=20) :: name, ch_err
name='psb_scatterm'
info=0
call psb_erractionsave(err_act)
icontxt=desc_a%matrix_data(psb_ctxt_)
! check on blacs grid
call blacs_gridinfo(icontxt, nprow, npcol, myrow, mycol)
if (nprow == -1) then
info = 2010
call psb_errpush(info,name)
goto 9999
else if (npcol /= 1) then
info = 2030
int_err(1) = npcol
call psb_errpush(info,name)
goto 9999
endif
if (present(iroot)) then
root = iroot
if((root.lt.-1).or.(root.gt.nprow)) then
info=30
int_err(1:2)=(/5,root/)
call psb_errpush(info,name,i_err=int_err)
goto 9999
end if
else
root = -1
end if
if (root==-1) then
iiroot=0
endif
if (present(iiglobx)) then
iglobx = iiglobx
else
iglobx = 1
end if
if (present(ijglobx)) then
jglobx = ijglobx
else
jglobx = 1
end if
if (present(iilocx)) then
ilocx = iilocx
else
ilocx = 1
end if
if (present(ijlocx)) then
jlocx = ijlocx
else
jlocx = 1
end if
lda_globx = size(globx,1)
lda_locx = size(locx, 1)
m = desc_a%matrix_data(psb_m_)
n = desc_a%matrix_data(psb_n_)
lock=size(locx,2)-jlocx+1
globk=size(globx,2)-jglobx+1
maxk=min(lock,globk)
if(present(ik)) then
if(ik.gt.maxk) then
k=maxk
else
k=ik
end if
else
k = maxk
end if
call blacs_get(icontxt,10,icomm)
myrank = blacs_pnum(icontxt,myrow,mycol)
lda_globx = size(globx)
lda_locx = size(locx)
m = desc_a%matrix_data(psb_m_)
n = desc_a%matrix_data(psb_n_)
if (myrow == iiroot) then
call igebs2d(icontxt, 'all', ' ', 1, 1, k, 1)
else
call igebr2d(icontxt, 'all', ' ', 1, 1, k, 1, iiroot, 0)
end if
! there should be a global check on k here!!!
call psb_chkglobvect(m,n,size(globx),iglobx,jglobx,desc_a%matrix_data,info)
call psb_chkvect(m,n,size(locx),ilocx,jlocx,desc_a%matrix_data,info,ilx,jlx)
if(info.ne.0) then
info=4010
ch_err='psb_chk(glob)vect'
call psb_errpush(info,name,a_err=ch_err)
goto 9999
end if
if ((ilx.ne.1).or.(iglobx.ne.1)) then
info=3040
call psb_errpush(info,name)
goto 9999
end if
nrow=desc_a%matrix_data(psb_n_row_)
if(root.eq.-1) then
! extract my chunk
do j=1,k
do i=1, nrow
idx=desc_a%loc_to_glob(i)
locx(i,jlocx+j-1)=globx(idx,jglobx+j-1)
end do
end do
else
rootrank = blacs_pnum(icontxt,root,mycol)
end if
! root has to gather size information
allocate(displ(nprow),all_dim(nprow))
call mpi_gather(nrow,1,mpi_integer,all_dim,&
& nprow,mpi_integer,rootrank,icomm,info)
displ(1)=1
displ(2:)=all_dim(1:nprow-1)+1
! root has to gather loc_glob from each process
if(myrow.eq.root) then
allocate(l_t_g_all(sum(all_dim)),scatterv(sum(all_dim)))
end if
call mpi_gatherv(desc_a%loc_to_glob,nrow,&
& mpi_integer,l_t_g_all,sum(all_dim),&
& displ,mpi_integer,rootrank,icomm,info)
do c=1, k
! prepare vector to scatter
if(myrow.eq.root) then
do i=1,nprow
pos=displ(i)
do j=1, all_dim(i)
idx=l_t_g_all(pos+j-1)
scatterv(pos+j-1)=globx(idx,jglobx+c-1)
end do
end do
end if
! scatter !!!
call mpi_scatterv(scatterv,sum(all_dim),displ,&
& mpi_double_precision,locx(1,jlocx+c-1),nrow,&
& mpi_double_precision,rootrank,icomm,info)
end do
deallocate(all_dim, l_t_g_all, displ, scatterv)
call psb_erractionrestore(err_act)
return
9999 continue
call psb_erractionrestore(err_act)
if (err_act.eq.act_abort) then
call psb_error(icontxt)
return
end if
return
end subroutine psb_dscatterm
! Subroutine: psb_dscatterv
! This subroutine scatters a global vector locally owned by one process
! into pieces that are local to alle the processes.
!
! Parameters:
! globx - real,dimension(:). The global vector to scatter.
! locx - real,dimension(:). The local piece of the ditributed vector.
! desc_a - type(<psb_desc_type>). The communication descriptor.
! info - integer. Eventually returns an error code.
! iroot - integer(optional). The process that owns the global vector. If -1 all
! the processes have a copy.
!
subroutine psb_dscatterv(globx, locx, desc_a, info, iroot)
use psb_descriptor_type
use psb_error_mod
implicit none
include 'mpif.h'
real(kind(1.d0)), intent(out) :: locx(:)
real(kind(1.d0)), intent(in) :: globx(:)
type(psb_desc_type), intent(in) :: desc_a
integer, intent(out) :: info
integer, intent(in), optional :: iroot
! locals
integer :: int_err(5), icontxt, nprow, npcol, myrow, mycol,&
& err_act, m, n, iix, jjx, temp(2), i, j, idx, nrow, iiroot, iglobx, jglobx,&
& ilocx, jlocx, lda_locx, lda_globx, lock, globk, root, k, maxk, icomm, myrank,&
& rootrank, c, pos, ilx, jlx
real(kind(1.d0)) :: locmax(2), amax
real(kind(1.d0)),pointer :: scatterv(:)
integer, pointer :: displ(:), l_t_g_all(:), all_dim(:)
integer :: blacs_pnum
character(len=20) :: name, ch_err
name='psb_scatterv'
info=0
call psb_erractionsave(err_act)
icontxt=desc_a%matrix_data(psb_ctxt_)
! check on blacs grid
call blacs_gridinfo(icontxt, nprow, npcol, myrow, mycol)
if (nprow == -1) then
info = 2010
call psb_errpush(info,name)
goto 9999
else if (npcol /= 1) then
info = 2030
int_err(1) = npcol
call psb_errpush(info,name)
goto 9999
endif
if (present(iroot)) then
root = iroot
if((root.lt.-1).or.(root.gt.nprow)) then
info=30
int_err(1:2)=(/5,root/)
call psb_errpush(info,name,i_err=int_err)
goto 9999
end if
else
root = -1
end if
call blacs_get(icontxt,10,icomm)
myrank = blacs_pnum(icontxt,myrow,mycol)
lda_globx = size(globx)
lda_locx = size(locx)
m = desc_a%matrix_data(psb_m_)
n = desc_a%matrix_data(psb_n_)
k = 1
if (myrow == iiroot) then
call igebs2d(icontxt, 'all', ' ', 1, 1, k, 1)
else
call igebr2d(icontxt, 'all', ' ', 1, 1, k, 1, iiroot, 0)
end if
! there should be a global check on k here!!!
call psb_chkglobvect(m,n,size(globx),iglobx,jglobx,desc_a%matrix_data,info)
call psb_chkvect(m,n,size(locx),ilocx,jlocx,desc_a%matrix_data,info,ilx,jlx)
if(info.ne.0) then
info=4010
ch_err='psb_chk(glob)vect'
call psb_errpush(info,name,a_err=ch_err)
goto 9999
end if
if ((ilx.ne.1).or.(iglobx.ne.1)) then
info=3040
call psb_errpush(info,name)
goto 9999
end if
nrow=desc_a%matrix_data(psb_n_row_)
if(root.eq.-1) then
! extract my chunk
do i=1, nrow
idx=desc_a%loc_to_glob(i)
locx(i)=globx(idx)
end do
else
rootrank = blacs_pnum(icontxt,root,mycol)
end if
! root has to gather size information
allocate(displ(nprow),all_dim(nprow))
call mpi_gather(nrow,1,mpi_integer,all_dim,&
& nprow,mpi_integer,rootrank,icomm,info)
displ(1)=1
displ(2:)=all_dim(1:nprow-1)+1
! root has to gather loc_glob from each process
if(myrow.eq.root) then
allocate(l_t_g_all(sum(all_dim)),scatterv(sum(all_dim)))
end if
call mpi_gatherv(desc_a%loc_to_glob,nrow,&
& mpi_integer,l_t_g_all,sum(all_dim),&
& displ,mpi_integer,rootrank,icomm,info)
! prepare vector to scatter
if(myrow.eq.root) then
do i=1,nprow
pos=displ(i)
do j=1, all_dim(i)
idx=l_t_g_all(pos+j-1)
scatterv(pos+j-1)=globx(idx)
end do
end do
end if
call mpi_scatterv(scatterv,sum(all_dim),displ,&
& mpi_double_precision,locx,nrow,&
& mpi_double_precision,rootrank,icomm,info)
deallocate(all_dim, l_t_g_all, displ, scatterv)
call psb_erractionrestore(err_act)
return
9999 continue
call psb_erractionrestore(err_act)
if (err_act.eq.act_abort) then
call psb_error(icontxt)
return
end if
return
end subroutine psb_dscatterv

@ -0,0 +1,319 @@
! File: psb_ihalo.f90
!
! Subroutine: psb_ihalom
! This subroutine performs the exchange of the halo elements in a distributed dense matrix between all the processes.
!
! Parameters:
! x - integer,dimension(:,:). The local part of the dense matrix.
! desc_a - type(<psb_desc_type>). The communication descriptor.
! info - integer. Eventually returns an error code.
! alpha - real(optional). ???.
! jx - integer(optional). The starting column of the global matrix.
! ik - integer(optional). The number of columns to gather.
! work - integer(optional). A working area.
! tran - character(optional). ???.
! mode - integer(optional).
!
subroutine psb_ihalom(x,desc_a,info,alpha,jx,ik,work,tran,mode)
use psb_descriptor_type
use psb_const_mod
use psi_mod
use psb_error_mod
implicit none
integer, intent(inout), target :: x(:,:)
type(psb_desc_type), intent(in) :: desc_a
integer, intent(out) :: info
real(kind(1.d0)), intent(in), optional :: alpha
integer, intent(inout), optional, target :: work(:)
integer, intent(in), optional :: mode,jx,ik
character, intent(in), optional :: tran
! locals
integer :: int_err(5), icontxt, nprow, npcol, myrow, mycol,&
& err_act, m, n, iix, jjx, temp(2), ix, ijx, nrow, ncol, k, maxk, liwork,&
& imode, err
integer, pointer :: xp(:,:), iwork(:)
character :: ltran
character(len=20) :: name, ch_err
name='psb_ihalom'
info=0
call psb_erractionsave(err_act)
icontxt=desc_a%matrix_data(psb_ctxt_)
! check on blacs grid
call blacs_gridinfo(icontxt, nprow, npcol, myrow, mycol)
if (nprow == -1) then
info = 2010
call psb_errpush(info,name)
goto 9999
else if (npcol /= 1) then
info = 2030
int_err(1) = npcol
call psb_errpush(info,name)
goto 9999
endif
ix = 1
if (present(jx)) then
ijx = jx
else
ijx = 1
endif
m = desc_a%matrix_data(psb_m_)
n = desc_a%matrix_data(psb_n_)
nrow = desc_a%matrix_data(psb_n_row_)
maxk=size(x,2)-jx+1
if(present(ik)) then
if(ik.gt.maxk) then
k=maxk
else
k=ik
end if
else
k = maxk
end if
if (present(tran)) then
ltran = tran
else
ltran = 'N'
endif
if (present(mode)) then
imode = mode
else
imode = IOR(psb_swap_send_,psb_swap_recv_)
endif
! check vector correctness
call psb_chkvect(m,1,size(x,1),ix,ijx,desc_a%matrix_data,info,iix,jjx)
if(info.ne.0) then
info=4010
ch_err='psb_chkvect'
call psb_errpush(info,name,a_err=ch_err)
end if
if (iix.ne.1) then
info=3040
call psb_errpush(info,name)
end if
err=info
call psb_errcomm(icontxt,err)
if(err.ne.0) goto 9999
! we should write an "iscal"
!!$ if(present(alpha)) then
!!$ if(alpha.ne.1.d0) then
!!$ do i=0, k-1
!!$ call iscal(nrow,alpha,x(1,jjx+i),1)
!!$ end do
!!$ end if
!!$ end if
liwork=ncol
if (present(work)) then
if(size(work).lt.liwork) then
call psrealloc(liwork,work,info)
if(info.ne.0) then
info=4010
ch_err='psrealloc'
call psb_errpush(info,name,a_err=ch_err)
goto 9999
end if
end if
iwork => work
else
call psrealloc(liwork,iwork,info)
if(info.ne.0) then
info=4010
ch_err='psrealloc'
call psb_errpush(info,name,a_err=ch_err)
goto 9999
end if
end if
xp => x(iix:size(x,1),jjx:jjx+k-1)
! exchange halo elements
if(ltran.eq.'N') then
call psi_swapdata(imode,k,0,xp,&
& desc_a,iwork,info)
else if((ltran.eq.'T').or.(ltran.eq.'H')) then
call psi_swaptran(imode,k,1,xp,&
& desc_a,iwork,info)
end if
if(info.ne.0) then
call psb_errpush(4010,name,a_err='PSI_iSwap...')
goto 9999
end if
call psb_erractionrestore(err_act)
return
9999 continue
call psb_erractionrestore(err_act)
if (err_act.eq.act_abort) then
call psb_error(icontxt)
return
end if
return
end subroutine psb_ihalom
! Subroutine: psb_ihalov
! This subroutine performs the exchange of the halo elements in a distributed dense matrix between all the processes.
!
! Parameters:
! x - integer,dimension(:). The local part of the dense matrix.
! desc_a - type(<psb_desc_type>). The communication descriptor.
! info - integer. Eventually returns an error code.
! alpha - real(optional). ???.
! work - integer(optional). A working area.
! tran - character(optional). ???.
! mode - integer(optional).
!
subroutine psb_ihalov(x,desc_a,info,alpha,work,tran,mode)
use psb_descriptor_type
use psb_const_mod
use psi_mod
use psb_error_mod
implicit none
integer, intent(inout) :: x(:)
type(psb_desc_type), intent(in) :: desc_a
integer, intent(out) :: info
real(kind(1.d0)), intent(in), optional :: alpha
integer, intent(inout), optional, target :: work(:)
integer, intent(in), optional :: mode
character, intent(in), optional :: tran
! locals
integer :: int_err(5), icontxt, nprow, npcol, myrow, mycol,&
& err_act, m, n, iix, jjx, temp(2), ix, ijx, nrow, ncol, k, maxk, imode,&
& err, liwork
integer,pointer :: iwork(:)
character :: ltran
character(len=20) :: name, ch_err
name='psb_ihalov'
info=0
call psb_erractionsave(err_act)
icontxt=desc_a%matrix_data(psb_ctxt_)
! check on blacs grid
call blacs_gridinfo(icontxt, nprow, npcol, myrow, mycol)
if (nprow == -1) then
info = 2010
call psb_errpush(info,name)
goto 9999
else if (npcol /= 1) then
info = 2030
int_err(1) = npcol
call psb_errpush(info,name)
goto 9999
endif
ix = 1
ijx = 1
m = desc_a%matrix_data(psb_m_)
n = desc_a%matrix_data(psb_n_)
nrow = desc_a%matrix_data(psb_n_row_)
if (present(tran)) then
ltran = tran
else
ltran = 'N'
endif
if (present(mode)) then
imode = mode
else
imode = IOR(psb_swap_send_,psb_swap_recv_)
endif
! check vector correctness
call psb_chkvect(m,1,size(x,1),ix,ijx,desc_a%matrix_data,info,iix,jjx)
if(info.ne.0) then
info=4010
ch_err='psb_chkvect'
call psb_errpush(info,name,a_err=ch_err)
end if
if (iix.ne.1) then
info=3040
call psb_errpush(info,name)
end if
err=info
call psb_errcomm(icontxt,err)
if(err.ne.0) goto 9999
!!$ if(present(alpha)) then
!!$ if(alpha.ne.1.d0) then
!!$ call dscal(nrow,alpha,x,1)
!!$ end if
!!$ end if
liwork=ncol
if (present(work)) then
if(size(work).lt.liwork) then
call psb_realloc(liwork,work,info)
if(info.ne.0) then
info=4010
ch_err='psb_realloc'
call psb_errpush(info,name,a_err=ch_err)
goto 9999
end if
end if
iwork => work
else
call psrealloc(liwork,iwork,info)
if(info.ne.0) then
info=4010
ch_err='psrealloc'
call psb_errpush(info,name,a_err=ch_err)
goto 9999
end if
end if
! exchange halo elements
if(ltran.eq.'N') then
call psi_swapdata(imode,0,x(iix:size(x)),&
& desc_a,iwork,info)
else if((ltran.eq.'T').or.(ltran.eq.'H')) then
call psi_swaptran(imode,1,x(iix:size(x)),&
& desc_a,iwork,info)
end if
if(info.ne.0) then
call psb_errpush(4010,name,a_err='PSI_iSwap...')
goto 9999
end if
call psb_erractionrestore(err_act)
return
9999 continue
call psb_erractionrestore(err_act)
if (err_act.eq.act_abort) then
call psb_error(icontxt)
return
end if
return
end subroutine psb_ihalov

@ -0,0 +1,21 @@
include ../../Make.inc
FOBJS = psi_compute_size.o psi_crea_bnd_elem.o psi_crea_index.o \
psi_crea_ovr_elem.o psi_dl_check.o \
psi_exist_ovr_elem.o psi_gthsct.o \
psi_list_search.o psi_sort_dl.o srtlist.o
COBJS = avltree.o
MPFOBJS = psi_dswapdata.o psi_dswaptran.o psi_iswapdata.o \
psi_iswaptran.o psi_extrct_dl.o psi_desc_index.o
INCDIRS = ../../lib .
lib: mpfobjs $(FOBJS) $(COBJS)
mpfobjs:
(make $(MPFOBJS) F90="$(MPF90)" FC="$(MPF90)" FCOPT="$(F90COPT)")
clean:
/bin/rm -f $(MPFOBJS) $(FOBJS) $(COBJS)

@ -0,0 +1,782 @@
/*****************************************************************/
/* */
/* avltree.c: balanced AVL tree search and insertion */
/* written by: Salvatore Filippone */
/* */
/* Last updated: Mar 09 2004 */
/* */
/* Referrences: [1] D. E. Knuth */
/* The Art of Computer Programming */
/* Vol. 3: Sorting and Searching, sec. 6.2.3 */
/* Addison-Wesley */
/* */
/* General description: */
/* */
/* Build and maintain a balanced binary search tree with */
/* arbitrary keys. The user is responsible for providing */
/* compare functions operating on the keys themselves. */
/* Key pointers are stored into nodes that are managed */
/* by the subroutine calls; the user should never examine */
/* nodes directly. */
/* The nodes for user items are allocated in batches, */
/* and the batches are kept as a doubly linked list. */
/* */
/* Data types: */
/* AVLTree: structure containing pointers to the list */
/* of node batches and to the root of the binary tree */
/* structure */
/* */
/* AVLNode: binary tree node, containing link pointers */
/* a reserved field, and a pointer to user data */
/* */
/* */
/* User callable functions: */
/* */
/* AVLTreePtr GetAVLTree() */
/* Purpose: allocate a new tree; */
/* Function value: a fresh AVL tree pointer; */
/* returns NULL in case of a memory failure*/
/* */
/* */
/* int AVLTreeReInit(AVLTreePtr Tree) */
/* Purpose: reinitialize an existing AVL Tree, reusing */
/* node batches already allocated. */
/* Input: 1. Tree */
/* A pointer to an existing tree structure */
/* Function value: 0 Normal termination */
/* -1 Invalid input pointer */
/* -3 Memory allocation failure */
/* */
/* AVLNodePtr AVLTreeSearch(AVLTreePtr Tree, void *key, */
/* int (*comp)(void*, void*)) */
/* Purpose: search an existing AVL Tree for a key */
/* Input: 1. Tree */
/* A valid pointer to a Tree */
/* 2. key */
/* The item being searched for */
/* 3. comp */
/* A comparison function: */
/* a<b => comp(a,b)<0 */
/* a==b => comp(a,b)=0 */
/* a>b => comp(a,b)>0 */
/* The function is always invoked as: */
/* comp(user_key,tree_key); */
/* */
/* */
/* Function value: NULL: input error or item not found */
/* valid pointer: pointer to a node */
/* containing the key */
/* */
/* int AVLTreeInsert(AVLTreePtr Tree, void *key, */
/* int (*comp)(void*,void*), */
/* void (*update)(void*,void*)) */
/* */
/* Purpose: Insert an item into an existing (possibly */
/* empty) tree. */
/* */
/* Input: 1. Tree */
/* The (existing) tree */
/* 2. key */
/* The (new) item to be inserted */
/* 3. comp */
/* comparison function (as in AVLTreeSearch) */
/* 4. update */
/* A user provided function to be called when */
/* the key is already present in the tree */
/* with the calling sequence: */
/* update(new_key,existing_key) */
/* enables the user to specify an arbitrary */
/* update procedure. */
/* */
/* */
/* */
/* AVLNodePtr AVLTreeUserInsert(AVLTreePtr Tree, void *key, */
/* int (*comp)(void*,void*)) */
/* */
/* Purpose: Insert an item into an existing (possibly */
/* empty) tree; returns a pointer to a node */
/* containing the item, even when that node */
/* was already existing; does no update */
/* */
/* Input: 1. Tree */
/* The (existing) tree */
/* 2. key */
/* The (new) item to be inserted */
/* 3. comp */
/* comparison function (as in AVLTreeSearch) */
/* */
/* Function value: Valid pointer: pointer to a node */
/* containing the item (possibly */
/* was already there) */
/* NULL input error or memory failure */
/* */
/* */
/* int HowManyKeys(AVLTreePtr Tree) */
/* Purpose: how many keys does Tree contain? */
/* Function value: >=0 */
/* */
/* */
/* void AVLTreeInorderTraverse(AVLTreePtr Tree, */
/* void (*func)( void*, void*), void *data) */
/* */
/* Purpose: visit the nodes of the binary tree in their */
/* natural order, performing an arbitrary */
/* task upon visit. */
/* Input: 1. Tree */
/* A tree pointer */
/* 2. func */
/* A function performing a user specified */
/* task on each node; the fuction is invoked as */
/* func( key,data) */
/* where data is parm. 3 */
/* 3. data */
/* Auxiliary data to be passed to func upon */
/* each visit */
/* */
/* int AVLTreeInorderTraverseWithDelims(AVLTreePtr Tree, */
/* void *first, void *last, int (*comp)(void*,void*) */
/* void (*func)( void*, void*), void *data) */
/* */
/* Purpose: visit the nodes of the binary tree in their */
/* natural order, performing an arbitrary */
/* task upon visit, but only on nodes */
/* with their key within a specified range. */
/* */
/* Input: 1. Tree */
/* A tree pointer */
/* 2. first */
/* Visit nodes with first <= node->key */
/* 3. last */
/* Visit nodes with node->key <= last */
/* 4. comp */
/* comparison function (as in AVLTreeSearch) */
/* 5. func */
/* A function performing a user specified */
/* task on each node; the fuction is invoked as */
/* func( key,data) */
/* where data is parm. 3 */
/* 6. data */
/* Auxiliary data to be passed to func upon */
/* each visit */
/* Function value: total number of nodes visited (>=0) */
/* */
/* */
/* */
/* void AVLTreeFree(AVLTreePtr Tree, void (*ffree)(void*)) */
/* Purpose: free up tree data storage */
/* Does NOT free the Tree pointer itself, */
/* rather all the structures that it points to */
/* Input: 1. Tree */
/* A tree pointer */
/* 2. ffree */
/* A user specified function invoked on each */
/* key pointer contained in the tree to free */
/* its memory (if necessary). Can be NULL. */
/* */
/* */
/*****************************************************************/
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include "avltree.h"
#define POOLSIZE 4096
#define MAXSTACK 64
#define MAX(a,b) ((a)>=(b) ? (a) : (b))
typedef struct avltvect {
AVLNode pool[POOLSIZE];
int avail;
AVLTVectPtr previous, next;
} AVLTVect;
int HowManyItems(AVLTreePtr Tree)
{
if (Tree==NULL) {
return(0);
} else {
return(Tree->nnodes);
}
}
AVLTreePtr GetAVLTree()
{
AVLTreePtr tree;
if ((tree=(AVLTreePtr) malloc(sizeof(AVLTree)))!=NULL){
memset(tree,'\0',sizeof(AVLTree));
AVLTreeInit(tree);
}
return(tree);
}
int AVLTreeInit(AVLTreePtr Tree)
{
AVLTVectPtr current;
if (Tree==NULL) {
fprintf(stderr,"Cannot initialize a NULL Tree pointer\n");
return(-1);
}
if (Tree->first!=NULL) {
fprintf(stderr,"Cannot initialize a nonempty Tree: call AVLTreeFree first\n");
return(-2);
}
if ((current=(AVLTVectPtr)malloc(sizeof(AVLTVect)))==NULL) {
fprintf(stderr,"Memory allocation failure\n");
return(-3);
}
memset(current,'\0',sizeof(AVLTVect));
Tree->first=Tree->current=current;
Tree->nnodes=0;
Tree->root=NULL;
return(0);
}
int AVLTreeReInit(AVLTreePtr Tree)
{
AVLTVectPtr current /* , next */ ;
if (Tree==NULL) {
fprintf(stderr,"Cannot ReInitialize a NULL Tree pointer\n");
return(-1);
}
if (Tree->first!=NULL) {
current=Tree->first;
while (current!=NULL) {
current->avail=0;
memset(current->pool,'\0',POOLSIZE*sizeof(AVLNode));
current=current->next;
}
} else {
if ((current=(AVLTVectPtr)malloc(sizeof(AVLTVect)))==NULL) {
fprintf(stderr,"Memory allocation failure\n");
return(-3);
}
current->avail=0;
current->previous=current->next=NULL;
Tree->first=current;
}
Tree->current=Tree->first;
Tree->nnodes=0;
Tree->root=NULL;
return(0);
}
AVLNodePtr AVLTreeSearch(AVLTreePtr Tree, void *key,
int (*comp)(void *, void *))
{
AVLNodePtr current;
int icmp;
if (Tree==NULL) return(NULL);
current = Tree->root;
while (current != NULL) {
icmp = (*comp)(key,current->key);
if (icmp<0) {
current = current->llink;
} else if (icmp==0){
return(current);
} else if (icmp>0) {
current = current->rlink;
}
}
return(current);
}
void AVLTreeInorderTraverse(AVLTreePtr Tree, void (*func)(void *, void *),
void *data)
{
int lev;
AVLNodePtr root;
AVLNodePtr stack[MAXSTACK+2];
int choice[MAXSTACK+2];
root=Tree->root;
if (root == NULL) return;
lev=0;
stack[lev] = root;
choice[lev] = -1;
while (lev>=0) {
if (stack[lev]==NULL) {
lev--;
} else {
if (choice[lev]==-1) {
stack[lev+1] = stack[lev]->llink;
choice[lev+1] = -1;
choice[lev] += 1;
lev++;
} else if (choice[lev]==0) {
(*func)(stack[lev]->key,data);
stack[lev+1] = stack[lev]->rlink;
choice[lev+1] = -1;
choice[lev] += 1;
lev++;
} else {
lev--;
}
}
}
}
int AVLTreeInorderTraverseWithDelims(AVLTreePtr Tree, void *first, void *last,
int (*comp)(void*, void*),
void (*func)(void *, void *),
void *data)
{
AVLNodePtr root, current;
int lev, nvisit, icmp;
AVLNodePtr stack[MAXSTACK+2];
int choice[MAXSTACK+2];
root=Tree->root;
if (root == NULL) return(0);
nvisit=0;
lev=0;
current = root;
while (current != NULL) {
stack[lev] = current;
icmp = (*comp)(first,current->key);
if (icmp<=0) {
choice[lev]=0;
current = current->llink;
} else if (icmp>0) {
current = current->rlink;
choice[lev]=1;
}
lev++;
}
lev--;
while (lev>=0) {
if (stack[lev]==NULL) {
lev--;
} else {
if (choice[lev]==-1) {
stack[lev+1] = stack[lev]->llink;
choice[lev+1] = -1;
choice[lev] += 1;
lev++;
} else if (choice[lev]==0) {
if (((*comp)(last,stack[lev]->key))<0) {
lev--;
} else {
(*func)(stack[lev]->key,data);
nvisit++;
stack[lev+1] = stack[lev]->rlink;
choice[lev+1] = -1;
choice[lev] += 1;
lev++;
}
} else {
lev--;
}
}
}
return(nvisit);
}
void AVLTreePreorderTraverse(AVLTreePtr Tree, void (*func)(void *, void *),
void *data)
{
AVLNodePtr root;
int lev;
AVLNodePtr stack[MAXSTACK+2];
int choice[MAXSTACK+2];
root=Tree->root;
if (root == NULL) return;
lev=0;
stack[lev] = root;
choice[lev] = -1;
while (lev>=0) {
if (stack[lev]==NULL) {
lev--;
} else {
if (choice[lev]==-1) {
(*func)(stack[lev]->key,data);
stack[lev+1] = stack[lev]->llink;
choice[lev+1] = -1;
choice[lev] += 1;
lev++;
} else if (choice[lev]==0) {
stack[lev+1] = stack[lev]->rlink;
choice[lev+1] = -1;
choice[lev] += 1;
lev++;
} else {
lev--;
}
}
}
}
void AVLTreeFree(AVLTreePtr Tree, void (*ffree)(void *))
{
AVLTVectPtr current, next;
int i;
if (Tree == NULL) return;
current=Tree->first;
while (current != NULL) {
next=current->next;
if (*ffree != NULL) {
for (i=0; i<current->avail; i++)
(*ffree)((current->pool[i]).key);
}
free(current);
current=next;
}
Tree->nnodes=0;
Tree->first=Tree->current=NULL;
return;
}
AVLNodePtr GetAVLNode(AVLTreePtr Tree)
{
AVLTVectPtr current, new;
AVLNodePtr newnode;
if (Tree==NULL) {
return(NULL);
}
if ((current=Tree->current)==NULL) {
return(NULL);
}
while ((current->avail>=POOLSIZE)&&(current->next!=NULL))
current=current->next;
if (current->avail<POOLSIZE) {
newnode=&(current->pool[current->avail]);
current->avail += 1;
} else {
if ((new=(AVLTVectPtr)malloc(sizeof(AVLTVect)))==NULL) {
fprintf(stderr,"Memory allocation failure\n");
return(NULL);
}
memset(new,'\0',sizeof(AVLTVect));
newnode=&(new->pool[0]);
new->avail = 1;
current->next=new;
new->previous=current;
new->next=NULL;
Tree->current=new;
}
return(newnode);
}
int AVLTreeInsert(AVLTreePtr Tree, void *key,int (*comp)(void *, void *),
void (*update)(void *, void *))
{
AVLNodePtr root, t, s, p, q, r;
int search, bal, icmp;
if (Tree==NULL) {
fprintf(stderr,"Fatal error: null tree pointer\n");
return(-1);
}
if ((root = Tree->root) == NULL) {
if ((t=GetAVLNode(Tree))==NULL) {
return(-2);
}
t->key = key;
t->rlink=t->llink=NULL;
t->bal=0;
Tree->root = t;
Tree->nnodes=1;
return(0);
}
t = NULL;
s = root;
p = root;
search=1;
while (search) {
icmp = (*comp)(key,p->key);
if (icmp<0) {
if ((q=p->llink)==NULL) {
if ((q=GetAVLNode(Tree))==NULL) {
return(-2);
}
p->llink=q;
search=0;
} else {
if (q->bal != 0) {
t=p;
s=q;
}
}
} else if (icmp == 0) {
(*update)(key,p->key);
return(1);
} else {
if ((q=p->rlink)==NULL) {
if ((q=GetAVLNode(Tree))==NULL) {
return(-2);
}
p->rlink=q;
search=0;
} else {
if (q->bal != 0) {
t=p;
s=q;
}
}
}
p=q;
}
q->key=key;
q->llink=q->rlink=NULL;
q->bal=0;
Tree->nnodes += 1;
if ((*comp)(key,s->key)<0) {
r=p=s->llink;
} else {
r=p=s->rlink;
}
while (p!=q) {
if ((*comp)(key,p->key)<0) {
p->bal=-1;
p = p->llink;
} else {
p->bal=1;
p=p->rlink;
}
}
if ((*comp)(key,s->key)<0) {
bal=-1;
} else {
bal=1;
}
if (s->bal == 0) {
s->bal=bal;
return (0);
} else if (s->bal == -bal) {
s->bal=0;
return (0);
} else if (s->bal == bal) {
if (r->bal == bal) {
/* single rotation */
p=r;
if (bal>0) {
s->rlink=r->llink;
r->llink=s;
} else {
s->llink=r->rlink;
r->rlink=s;
}
s->bal=r->bal=0;
} else if (r->bal == -bal) {
/* double rotation */
if (bal>0) {
p=r->llink;
r->llink=p->rlink;
p->rlink=r;
s->rlink=p->llink;
p->llink=s;
} else {
p=r->rlink;
r->rlink=p->llink;
p->llink=r;
s->llink=p->rlink;
p->rlink=s;
}
if (p->bal == bal) {
s->bal=-bal;
r->bal=0;
} else if (p->bal==0) {
s->bal=r->bal=0;
} else {
r->bal=bal;
s->bal=0;
}
p->bal=0;
}
if (t==NULL) {
root=p;
} else {
if (t->rlink==s) {
t->rlink=p;
} else {
t->llink=p;
}
}
Tree->root=root;
return(0);
}
return(0);
}
AVLNodePtr AVLTreeUserInsert(AVLTreePtr Tree, void *key,
int (*comp)(void *, void *))
{
AVLNodePtr root, t, s, p, q, r;
int search, bal, icmp;
if (Tree==NULL) {
fprintf(stderr,"Fatal error: null tree pointer\n");
return(NULL);
}
if ((root = Tree->root) == NULL) {
if ((t=GetAVLNode(Tree))==NULL) {
return(NULL);
}
t->key = key;
t->rlink=t->llink=NULL;
t->bal=0;
Tree->root = t;
Tree->nnodes=1;
return(t);
}
t = NULL;
s = root;
p = root;
search=1;
while (search) {
icmp = (*comp)(key,p->key);
if (icmp<0) {
if ((q=p->llink)==(AVLNodePtr) NULL) {
if ((q=GetAVLNode(Tree))==NULL) {
return(NULL);
}
p->llink=q;
search=0;
} else {
if (q->bal != 0) {
t=p;
s=q;
}
}
} else if (icmp == 0) {
return(p);
} else {
if ((q=p->rlink)==NULL) {
if ((q=GetAVLNode(Tree))==NULL) {
return(NULL);
}
p->rlink=q;
search=0;
} else {
if (q->bal != 0) {
t=p;
s=q;
}
}
}
p=q;
}
q->key=key;
q->llink=q->rlink=NULL;
q->bal=0;
Tree->nnodes += 1;
if ((*comp)(key,s->key)<0) {
r=p=s->llink;
} else {
r=p=s->rlink;
}
while (p!=q) {
if ((*comp)(key,p->key)<0) {
p->bal=-1;
p = p->llink;
} else {
p->bal=1;
p=p->rlink;
}
}
if ((*comp)(key,s->key)<0) {
bal=-1;
} else {
bal=1;
}
if (s->bal == 0) {
s->bal=bal;
return (q);
} else if (s->bal == -bal) {
s->bal=0;
return (q);
} else if (s->bal == bal) {
if (r->bal == bal) {
/* single rotation */
p=r;
if (bal>0) {
s->rlink=r->llink;
r->llink=s;
} else {
s->llink=r->rlink;
r->rlink=s;
}
s->bal=r->bal=0;
} else if (r->bal == -bal) {
/* double rotation */
if (bal>0) {
p=r->llink;
r->llink=p->rlink;
p->rlink=r;
s->rlink=p->llink;
p->llink=s;
} else {
p=r->rlink;
r->rlink=p->llink;
p->llink=r;
s->llink=p->rlink;
p->rlink=s;
}
if (p->bal == bal) {
s->bal=-bal;
r->bal=0;
} else if (p->bal==0) {
s->bal=r->bal=0;
} else {
r->bal=bal;
s->bal=0;
}
p->bal=0;
}
if (t==NULL) {
root=p;
} else {
if (t->rlink==s) {
t->rlink=p;
} else {
t->llink=p;
}
}
Tree->root=root;
return(q);
}
return(q);
}

@ -0,0 +1,38 @@
/* Type definitions for balanced AVL tree search and insertion */
/* See avltree.c for a full definition of the subroutines */
/* */
typedef struct avlnode *AVLNodePtr;
typedef struct avlnode {
AVLNodePtr llink,rlink;
int bal;
void *key;
} AVLNode;
typedef struct avltvect *AVLTVectPtr;
typedef struct avltree *AVLTreePtr;
typedef struct avltree {
AVLTVectPtr first, current;
AVLNodePtr root;
int nnodes;
} AVLTree;
AVLNodePtr AVLTreeSearch(AVLTreePtr, void *, int (*)(void *, void *));
AVLNodePtr GetAVLNode(AVLTreePtr);
int AVLTreeInit(AVLTreePtr);
int AVLTreeReInit(AVLTreePtr);
AVLTreePtr GetAVLTree();
int AVLTreeInsert(AVLTreePtr, void *, int (*)(void *, void *),
void (*)(void *, void *));
AVLNodePtr AVLTreeUserInsert(AVLTreePtr, void *, int (*)(void *, void *));
void AVLTreeInorderTraverse(AVLTreePtr, void (*)(void *, void *), void *);
void AVLTreePreorderTraverse(AVLTreePtr, void (*)(void *, void *), void *);
void AVLTreeFree(AVLTreePtr, void (*)(void *));
int HowManyItems(AVLTreePtr);
int AVLTreeInorderTraverseWithDelims(AVLTreePtr,void*, void*, int (*)(void*,void*),
void (*)(void *, void *), void *);

@ -0,0 +1,314 @@
/* This header file replaces every call to a BLACS routine by C interface
with the same call performed by Fortran interface */
#ifndef CTOF_BLACS
#define CTOF_BLACS
#endif
/* Variables necessary for invocations where
constant arguments are used */
static int i1, i2, i3, i4, i5, i6, i7;
/* Support routines:
Initialization */
#define Cblacs_pinfo(mypnum, nprocs) \
blacs_pinfo_(mypnum, nprocs)
#define Cblacs_setup(mypnum, nprocs) \
blacs_setup_(mypnum, nprocs)
#define Cblacs_get(icontxt, what, val) \
{i1 = icontxt; i2 = what; \
blacs_get_(&i1, &i2,val);}
#define Cblacs_set(icontxt, what, val) \
{i1 = icontxt; i2 = what; \
blacs_set_(&i1, &i2, &val);}
#define Cblacs_gridinit(icontxt, order, nprow, npcol) \
{i1 = nprow; i2 = npcol; \
blacs_gridinit_(icontxt, order, &i1, &i2);}
#define Cblacs_gridmap(icontxt, pmap, ldpmap, nprow, npcol) \
{i1 = ldpmap; i2 = nprow; i3 = npcol; \
blacs_gridmap_(icontxt, pmap, &i1, &i2, &i3);}
/* Support routines:
Destruction */
#define Cblacs_freebuff(icontxt, wait) \
{i1 = icontxt; i2 = wait; \
blacs_freebuff_(&i1, &i2);}
#define Cblacs_gridexit(icontxt) \
{i1 = icontxt; \
blacs_gridexit_(&i1);}
#define Cblacs_abort(icontxt, errornum) \
{i1 = icontxt; i2 = errornum; \
blacs_abort_(&i1, &i2);}
#define Cblacs_exit(doneflag) \
{i1 = doneflag; \
blacs_exit_(&i1);}
/* Support routines:
Informational and Miscellaneous */
#define Cblacs_gridinfo(icontxt,nprow,npcol,myprow,mypcol) \
{i1 = icontxt; \
blacs_gridinfo_(&i1, nprow, npcol, myprow, mypcol);}
#define Cblacs_pnum(icontxt, prow, pcol) \
{i1 = icontxt; i2 = prow; i3 = pcol; \
blacs_pnum_(&i1, &i2, &i3);}
#define Cblacs_pcoord(icontxt, pnum, prow, pcol) \
{i1 = icontxt; i2 = pnum; \
blacs_pcoord_(&i1, &i2, prow, pcol);}
#define Cblacs_barrier(icontxt, scope) \
{i1 = icontxt; \
blacs_barrier_(&i1, scope);}
/* Support routines:
Unofficial */
#define Csetpvmtids(ntasks, tids) \
{i1 = ntasks; \
setpvmtids_(&i1, tids);}
#define Cdcputime() \
dcputime_()
#define Cdwalltime() \
dwalltime_()
#define Cksendid(icontxt, rdest, cdest) \
{i1 = icontxt; i2 = rdest; i3 = cdest; \
ksendid_(&i1, &i2, &i3);}
#define Ckrecvid(icontxt, rsrc, csrc) \
{i1 = icontxt; i2 = rsrc; i3 = csrc; \
krecvid_(&i1, &i2, &i3);}
#define Ckbsid(icontxt, scope) \
{i1 = icontxt; \
kbsid_(&i1, scope);}
#define Ckbrid(icontxt, scope, rsrc, csrc) \
{i1 = icontxt; i2 = rsrc; i3 = csrc; \
kbrid_(&i1, scope, &i2, &i3);}
/* Point to Point :
Integer */
#define Cigesd2d(icontxt, m, n, A, lda, rdest, cdest) \
{i1 = icontxt; i2 = m; i3 = n; i4 = lda; i5 = rdest; i6 = cdest; \
igesd2d_(&i1, &i2, &i3, A, &i4, &i5, &i6);}
#define Cigerv2d(icontxt, m, n, A, lda, rsrc, csrc) \
{i1 = icontxt; i2 = m; i3 = n; i4 = lda; i5 = rsrc; i6 = csrc; \
igerv2d_(&i1, &i2, &i3, A, &i4, &i5, &i6);}
#define Citrsd2d(icontxt, uplo, diag, m, n, A, lda, rdest, cdest) \
{i1 = icontxt; i2 = m; i3 = n; i4 = lda; i5 = rdest; i6 = cdest; \
itrsd2d_(&i1, uplo, diag, &i2, &i3, A, &i4, &i5, &i6);}
#define Citrrv2d(icontxt, uplo, diag, m, n, A, lda, rsrc, csrc) \
{i1 = icontxt; i2 = m; i3 = n; i4 = lda; i5 = rsrc; i6 = csrc; \
itrsd2d_(&i1, uplo, diag, &i2, &i3, A, &i4, &i5, &i6);}
/* Point to Point :
Single precision real */
#define Csgesd2d(icontxt, m, n, A, lda, rdest, cdest) \
{i1 = icontxt; i2 = m; i3 = n; i4 = lda; i5 = rdest; i6 = cdest; \
sgesd2d_(&i1, &i2, &i3, A, &i4, &i5, &i6);}
#define Csgerv2d(icontxt, m, n, A, lda, rsrc, csrc) \
{i1 = icontxt; i2 = m; i3 = n; i4 = lda; i5 = rsrc; i6 = csrc; \
sgerv2d_(&i1, &i2, &i3, A, &i4, &i5, &i6);}
#define Cstrsd2d(icontxt, uplo, diag, m, n, A, lda, rdest, cdest) \
{i1 = icontxt; i2 = m; i3 = n; i4 = lda; i5 = rdest; i6 = cdest; \
strsd2d_(&i1, uplo, diag, &i2, &i3, A, &i4, &i5, &i6);}
#define Cstrrv2d(icontxt, uplo, diag, m, n, A, lda, rsrc, csrc) \
{i1 = icontxt; i2 = m; i3 = n; i4 = lda; i5 = rsrc; i6 = csrc; \
strsd2d_(&i1, uplo, diag, &i2, &i3, A, &i4, &i5, &i6);}
/* Point to Point :
Double precision real */
#define Cdgesd2d(icontxt, m, n, A, lda, rdest, cdest) \
{i1 = icontxt; i2 = m; i3 = n; i4 = lda; i5 = rdest; i6 = cdest; \
dgesd2d_(&i1, &i2, &i3, A, &i4, &i5, &i6);}
#define Cdgerv2d(icontxt, m, n, A, lda, rsrc, csrc) \
{i1 = icontxt; i2 = m; i3 = n; i4 = lda; i5 = rsrc; i6 = csrc; \
dgerv2d_(&i1, &i2, &i3, A, &i4, &i5, &i6);}
#define Cdtrsd2d(icontxt, uplo, diag, m, n, A, lda, rdest, cdest) \
{i1 = icontxt; i2 = m; i3 = n; i4 = lda; i5 = rdest; i6 = cdest; \
dtrsd2d_(&i1, uplo, diag, &i2, &i3, A, &i4, &i5, &i6);}
#define Cdtrrv2d(icontxt, uplo, diag, m, n, A, lda, rsrc, csrc) \
{i1 = icontxt; i2 = m; i3 = n; i4 = lda; i5 = rsrc; i6 = csrc; \
dtrsd2d_(&i1, uplo, diag, &i2, &i3, A, &i4, &i5, &i6);}
/* Point to Point :
Single precision complex */
#define Ccgesd2d(icontxt, m, n, A, lda, rdest, cdest) \
{i1 = icontxt; i2 = m; i3 = n; i4 = lda; i5 = rdest; i6 = cdest; \
cgesd2d_(&i1, &i2, &i3, A, &i4, &i5, &i6);}
#define Ccgerv2d(icontxt, m, n, A, lda, rsrc, csrc) \
{i1 = icontxt; i2 = m; i3 = n; i4 = lda; i5 = rsrc; i6 = csrc; \
cgerv2d_(&i1, &i2, &i3, A, &i4, &i5, &i6);}
#define Cctrsd2d(icontxt, uplo, diag, m, n, A, lda, rdest, cdest) \
{i1 = icontxt; i2 = m; i3 = n; i4 = lda; i5 = rdest; i6 = cdest; \
ctrsd2d_(&i1, uplo, diag, &i2, &i3, A, &i4, &i5, &i6);}
#define Cctrrv2d(icontxt, uplo, diag, m, n, A, lda, rsrc, csrc) \
{i1 = icontxt; i2 = m; i3 = n; i4 = lda; i5 = rsrc; i6 = csrc; \
ctrsd2d_(&i1, uplo, diag, &i2, &i3, A, &i4, &i5, &i6);}
/* Point to Point :
Double precision complex */
#define Czgesd2d(icontxt, m, n, A, lda, rdest, cdest) \
{i1 = icontxt; i2 = m; i3 = n; i4 = lda; i5 = rdest; i6 = cdest; \
zgesd2d_(&i1, &i2, &i3, A, &i4, &i5, &i6);}
#define Czgerv2d(icontxt, m, n, A, lda, rsrc, csrc) \
{i1 = icontxt; i2 = m; i3 = n; i4 = lda; i5 = rsrc; i6 = csrc; \
zgerv2d_(&i1, &i2, &i3, A, &i4, &i5, &i6);}
#define Cztrsd2d(icontxt, uplo, diag, m, n, A, lda, rdest, cdest) \
{i1 = icontxt; i2 = m; i3 = n; i4 = lda; i5 = rdest; i6 = cdest; \
ztrsd2d_(&i1, uplo, diag, &i2, &i3, A, &i4, &i5, &i6);}
#define Cztrrv2d(icontxt, uplo, diag, m, n, A, lda, rsrc, csrc) \
{i1 = icontxt; i2 = m; i3 = n; i4 = lda; i5 = rsrc; i6 = csrc; \
ztrsd2d_(&i1, uplo, diag, &i2, &i3, A, &i4, &i5, &i6);}
/* Broadcasts :
Integer */
#define Cigebs2d(icontxt, scope, top, m, n, A, lda) \
{i1 = icontxt; i2 = m; i3 = n; i4 = lda; \
igebs2d_(&i1, scope, top, &i2, &i3, A, &i4);}
#define Cigebr2d(icontxt, scope, top, m, n, A, lda, rsrc, csrc) \
{i1 = icontxt; i2 = m; i3 = n; i4 = lda; i5 = rsrc; i6 = csrc; \
igebr2d_(&i1, scope, top, &i2, &i3, A, &i4, &i5, &i6);}
#define Citrbs2d(icontxt, scope, top, uplo, diag, m, n, A, lda) \
{i1 = icontxt; i2 = m; i3 = n; i4 = lda; \
itrbs2d_(&i1, scope, top, uplo, diag, &i2, &i3, A, &i4);}
#define Citrbr2d(icontxt, uplo, diag, m, n, A, lda, rsrc, csrc) \
{i1 = icontxt; i2 = m; i3 = n; i4 = lda; i5 = rsrc; i6 = csrc; \
igebr2d_(&i1, uplo, diag, &i2, &i3, A, &i4, &i5, &i6);}
/* Broadcasts :
Single precision real */
#define Csgebs2d(icontxt, scope, top, m, n, A, lda) \
{i1 = icontxt; i2 = m; i3 = n; i4 = lda; \
sgebs2d_(&i1, scope, top, &i2, &i3, A, &i4);}
#define Csgebr2d(icontxt, scope, top, m, n, A, lda, rsrc, csrc) \
{i1 = icontxt; i2 = m; i3 = n; i4 = lda; i5 = rsrc; i6 = csrc; \
sgebr2d_(&i1, scope, top, &i2, &i3, A, &i4, &i5, &i6);}
#define Cstrbs2d(icontxt, scope, top, uplo, diag, m, n, A, lda) \
{i1 = icontxt; i2 = m; i3 = n; i4 = lda; \
strbs2d_(&i1, scope, top, uplo, diag, &i2, &i3, A, &i4);}
#define Cstrbr2d(icontxt, uplo, diag, m, n, A, lda, rsrc, csrc) \
{i1 = icontxt; i2 = m; i3 = n; i4 = lda; i5 = rsrc; i6 = csrc; \
sgebr2d_(&i1, uplo, diag, &i2, &i3, A, &i4, &i5, &i6);}
/* Broadcasts :
Double precision real */
#define Cdgebs2d(icontxt, scope, top, m, n, A, lda) \
{i1 = icontxt; i2 = m; i3 = n; i4 = lda; \
dgebs2d_(&i1, scope, top, &i2, &i3, A, &i4);}
#define Cdgebr2d(icontxt, scope, top, m, n, A, lda, rsrc, csrc) \
{i1 = icontxt; i2 = m; i3 = n; i4 = lda; i5 = rsrc; i6 = csrc; \
dgebr2d_(&i1, scope, top, &i2, &i3, A, &i4, &i5, &i6);}
#define Cdtrbs2d(icontxt, scope, top, uplo, diag, m, n, A, lda) \
{i1 = icontxt; i2 = m; i3 = n; i4 = lda; \
dtrbs2d_(&i1, scope, top, uplo, diag, &i2, &i3, A, &i4);}
#define Cdtrbr2d(icontxt, uplo, diag, m, n, A, lda, rsrc, csrc) \
{i1 = icontxt; i2 = m; i3 = n; i4 = lda; i5 = rsrc; i6 = csrc; \
dgebr2d_(&i1, uplo, diag, &i2, &i3, A, &i4, &i5, &i6);}
/* Broadcasts :
Single precision complex */
#define Ccgebs2d(icontxt, scope, top, m, n, A, lda) \
{i1 = icontxt; i2 = m; i3 = n; i4 = lda; \
cgebs2d_(&i1, scope, top, &i2, &i3, A, &i4);}
#define Ccgebr2d(icontxt, scope, top, m, n, A, lda, rsrc, csrc) \
{i1 = icontxt; i2 = m; i3 = n; i4 = lda; i5 = rsrc; i6 = csrc; \
cgebr2d_(&i1, scope, top, &i2, &i3, A, &i4, &i5, &i6);}
#define Cctrbs2d(icontxt, scope, top, uplo, diag, m, n, A, lda) \
{i1 = icontxt; i2 = m; i3 = n; i4 = lda; \
ctrbs2d_(&i1, scope, top, uplo, diag, &i2, &i3, A, &i4);}
#define Cctrbr2d(icontxt, uplo, diag, m, n, A, lda, rsrc, csrc) \
{i1 = icontxt; i2 = m; i3 = n; i4 = lda; i5 = rsrc; i6 = csrc; \
cgebr2d_(&i1, uplo, diag, &i2, &i3, A, &i4, &i5, &i6);}
/* Broadcasts :
Double precision complex */
#define Czgebs2d(icontxt, scope, top, m, n, A, lda) \
{i1 = icontxt; i2 = m; i3 = n; i4 = lda; \
zgebs2d_(&i1, scope, top, &i2, &i3, A, &i4);}
#define Czgebr2d(icontxt, scope, top, m, n, A, lda, rsrc, csrc) \
{i1 = icontxt; i2 = m; i3 = n; i4 = lda; i5 = rsrc; i6 = csrc; \
zgebr2d_(&i1, scope, top, &i2, &i3, A, &i4, &i5, &i6);}
#define Cztrbs2d(icontxt, scope, top, uplo, diag, m, n, A, lda) \
{i1 = icontxt; i2 = m; i3 = n; i4 = lda; \
ztrbs2d_(&i1, scope, top, uplo, diag, &i2, &i3, A, &i4);}
#define Cztrbr2d(icontxt, uplo, diag, m, n, A, lda, rsrc, csrc) \
{i1 = icontxt; i2 = m; i3 = n; i4 = lda; i5 = rsrc; i6 = csrc; \
zgebr2d_(&i1, uplo, diag, &i2, &i3, A, &i4, &i5, &i6);}
/* Combines:
Integer */
#define Cigsum2d(icontxt, scope, top, m, n, A, lda, rdest, cdest) \
{i1 = icontxt; i2 = m; i3 = n; i4 = lda; i5 = rdest; i6 = cdest; \
igsum2d_(&i1, scope, top, &i2, &i3, A, &i4, &i5, &i6);}
#define Cigamx2d(icontxt, scope, top, m, n, A, lda, RA, CA, RCflag, rdest, cdest) \
{i1 = icontxt; i2 = m; i3 = n; i4 = lda; i5 = RCflag; i6 = rdest; i7 = cdest; \
igamx2d_(&i1, scope, top, &i2, &i3, A, &i4, RA, CA, &i5, &i6, &i7);}
#define Cigamn2d(icontxt, scope, top, m, n, A, lda, RA, CA, RCflag, rdest, cdest) \
{i1 = icontxt; i2 = m; i3 = n; i4 = lda; i5 = RCflag; i6 = rdest; i7 = cdest; \
igamn2d_(&i1, scope, top, &i2, &i3, A, &i4, RA, CA, &i5, &i6, &i7);}
/* Combines:
Single precision real */
#define Csgsum2d(icontxt, scope, top, m, n, A, lda, rdest, cdest) \
{i1 = icontxt; i2 = m; i3 = n; i4 = lda; i5 = rdest; i6 = cdest; \
sgsum2d_(&i1, scope, top, &i2, &i3, A, &i4, &i5, &i6);}
#define Csgamx2d(icontxt, scope, top, m, n, A, lda, RA, CA, RCflag, rdest, cdest) \
{i1 = icontxt; i2 = m; i3 = n; i4 = lda; i5 = RCflag; i6 = rdest; i7 = cdest; \
sgamx2d_(&i1, scope, top, &i2, &i3, A, &i4, RA, CA, &i5, &i6, &i7);}
#define Csgamn2d(icontxt, scope, top, m, n, A, lda, RA, CA, RCflag, rdest, cdest) \
{i1 = icontxt; i2 = m; i3 = n; i4 = lda; i5 = RCflag; i6 = rdest; i7 = cdest; \
sgamn2d_(&i1, scope, top, &i2, &i3, A, &i4, RA, CA, &i5, &i6, &i7);}
/* Combines:
Double precision real */
#define Cdgsum2d(icontxt, scope, top, m, n, A, lda, rdest, cdest) \
{i1 = icontxt; i2 = m; i3 = n; i4 = lda; i5 = rdest; i6 = cdest; \
dgsum2d_(&i1, scope, top, &i2, &i3, A, &i4, &i5, &i6);}
#define Cdgamx2d(icontxt, scope, top, m, n, A, lda, RA, CA, RCflag, rdest, cdest) \
{i1 = icontxt; i2 = m; i3 = n; i4 = lda; i5 = RCflag; i6 = rdest; i7 = cdest; \
dgamx2d_(&i1, scope, top, &i2, &i3, A, &i4, RA, CA, &i5, &i6, &i7);}
#define Cdgamn2d(icontxt, scope, top, m, n, A, lda, RA, CA, RCflag, rdest, cdest) \
{i1 = icontxt; i2 = m; i3 = n; i4 = lda; i5 = RCflag; i6 = rdest; i7 = cdest; \
dgamn2d_(&i1, scope, top, &i2, &i3, A, &i4, RA, CA, &i5, &i6, &i7);}
/* Combines:
Single precision complex */
#define Ccgsum2d(icontxt, scope, top, m, n, A, lda, rdest, cdest) \
{i1 = icontxt; i2 = m; i3 = n; i4 = lda; i5 = rdest; i6 = cdest; \
cgsum2d_(&i1, scope, top, &i2, &i3, A, &i4, &i5, &i6);}
#define Ccgamx2d(icontxt, scope, top, m, n, A, lda, RA, CA, RCflag, rdest, cdest) \
{i1 = icontxt; i2 = m; i3 = n; i4 = lda; i5 = RCflag; i6 = rdest; i7 = cdest; \
cgamx2d_(&i1, scope, top, &i2, &i3, A, &i4, RA, CA, &i5, &i6, &i7);}
#define Ccgamn2d(icontxt, scope, top, m, n, A, lda, RA, CA, RCflag, rdest, cdest) \
{i1 = icontxt; i2 = m; i3 = n; i4 = lda; i5 = RCflag; i6 = rdest; i7 = cdest; \
cgamn2d_(&i1, scope, top, &i2, &i3, A, &i4, RA, CA, &i5, &i6, &i7);}
/* Combines:
Double precision complex */
#define Czgsum2d(icontxt, scope, top, m, n, A, lda, rdest, cdest) \
{i1 = icontxt; i2 = m; i3 = n; i4 = lda; i5 = rdest; i6 = cdest; \
zgsum2d_(&i1, scope, top, &i2, &i3, A, &i4, &i5, &i6);}
#define Czgamx2d(icontxt, scope, top, m, n, A, lda, RA, CA, RCflag, rdest, cdest) \
{i1 = icontxt; i2 = m; i3 = n; i4 = lda; i5 = RCflag; i6 = rdest; i7 = cdest; \
zgamx2d_(&i1, scope, top, &i2, &i3, A, &i4, RA, CA, &i5, &i6, &i7);}
#define Czgamn2d(icontxt, scope, top, m, n, A, lda, RA, CA, RCflag, rdest, cdest) \
{i1 = icontxt; i2 = m; i3 = n; i4 = lda; i5 = RCflag; i6 = rdest; i7 = cdest; \
zgamn2d_(&i1, scope, top, &i2, &i3, A, &i4, RA, CA, &i5, &i6, &i7);}

@ -0,0 +1,292 @@
/* ---------------------------------------------------------------------
*
* -- PSBLAS routine (version 1.0) --
*
* ---------------------------------------------------------------------
*/
/*
* This file includes the standard C libraries, as well as system
* dependent include files. All PSBLAS routines include this file.
*/
#include <string.h>
#ifndef PSBLASH
#define PSBLASH
/*
* ========================================================================
* Machine Specific PBLAS macros
* ========================================================================
*/
/* This is a debugging option.
#define PS_CONTROL_LEVEL */
#define _HAL_ 0
#define _T3D_ 1
#ifdef T3D
#define _MACH_ _T3D_
#endif
#ifndef _MACH_
#define _MACH_ _HAL_
#endif
/*
* ========================================================================
* Include files
* ========================================================================
*/
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#if( _MACH_ == _T3D_ )
#include <fortran.h>
#endif
#ifdef USE_FBLACS
#ifndef CTOF_BLACS
#include "ctof_blacs.h"
#endif
#endif
/*
* ========================================================================
* FORTRAN <-> C interface
* ========================================================================
*
* These macros define how the PBLAS will be called. _F2C_ADD_ assumes
* that they will be called by FORTRAN, which expects C routines to have
* an underscore postfixed to the name (Suns, and Intel machines expect
* this). _F2C_NOCHANGE indicates that FORTRAN will be calling, and that
* it expects the name called by FORTRAN to be identical to that compiled
* by the C (RS6K's do this). _F2C_UPCASE says it expects C routines
* called by FORTRAN to be in all upcase (CRAY wants this).
*/
#define _F2C_ADD_ 0
#define _F2C_NOCHANGE 1
#define _F2C_UPCASE 2
#ifdef UpCase
#define _F2C_CALL_ _F2C_UPCASE
#endif
#ifdef NoChange
#define _F2C_CALL_ _F2C_NOCHANGE
#endif
#ifdef Add_
#define _F2C_CALL_ _F2C_ADD_
#endif
#ifndef _F2C_CALL_
#define _F2C_CALL_ _F2C_ADD_
#endif
/*
* ========================================================================
* TYPE DEFINITIONS AND CONVERSION UTILITIES
* ========================================================================
*/
typedef struct { float re, im; } complex;
typedef struct { double re, im; } complex16;
#if( _MACH_ == _T3D_ )
/* Type of character argument in a FORTRAN call */
#define F_CHAR _fcd
/* Character conversion utilities */
#define F2C_CHAR(a) ( _fcdtocp( (a) ) )
#define C2F_CHAR(a) ( _cptofcd( (a), 1 ) )
/* Type of FORTRAN functions */
#define F_VOID_FCT void fortran /* Subroutine */
#define F_INTG_FCT int fortran /* INTEGER function */
#define F_DBLE_FCT double fortran /* DOUBLE PRECISION function */
#else
/* Type of character argument in a FORTRAN call */
typedef char * F_CHAR;
/* Character conversion utilities */
#define F2C_CHAR(a) (a)
#define C2F_CHAR(a) (a)
/* Type of FORTRAN functions */
#define F_VOID_FCT void /* Subroutine */
#define F_INTG_FCT int /* INTEGER function */
#define F_DBLE_FCT double /* DOUBLE PRECISION function */
#endif
/*
* ======================================================================
* FUNCTIONS PROTOTYPES
* ======================================================================
*/
void DVSct(int n, int k,int idx[],int flag, double X[], int lx,
double beta, double Y[], int ly);
void DVGth(int n, int k,int idx[],int flag, double X[], int lx,double Y[], int ly);
void IVSct(int n, int k,int idx[],int flag, int X[], int lx,
int beta, int Y[], int ly);
void IVGth(int n, int k,int idx[],int flag, int X[], int lx,int Y[], int ly);
void PSI_dSwapData(int iflag, int n, double beta, double Y[], int ly,
int desc_data[], int desc_halo[],
double *work, int *lwork, int *ierror);
void PSI_dSwapTran(int flag, int n, double beta, double Y[], int ly,
int desc_data[], int desc_halo[],
double *work, int *lwork, int *ierror);
void PSI_zSwapData(int n, double Y[], int ly, int desc_data[], int desc_halo[],
double *work, int *lwork, int *ierror);
void PSI_zSwapOverlap(double Y[], double Sum_Ovrlap[], int desc_data[],
int desc_ovrlap[], double work[], int *lwork, int *ierror);
void PSI_iSwapData(int iflag, int n, int beta, int Y[], int ly,
int desc_data[], int desc_halo[],
int *work, int *lwork, int *ierror);
void PSI_iSwapTran(int flag, int n, int beta, int Y[], int ly,
int desc_data[], int desc_halo[],
int *work, int *lwork, int *ierror);
/*
* ========================================================================
* #DEFINE MACRO CONSTANTS
* ========================================================================
*/
/* MACRO max */
#define max(x,y) ((x)>(y)?(x):(y))
/*MACRO for ovrlap update*/
#define NOHALO_ 0
#define HALO_ 4
#define NONE_ 0
#define SUM_ 1
#define AVG_ 2
#define SQUARE_ROOT_ 3
/* Bit fields to control swapdata/ovrlap behaviour.
BEWARE: check consistency with tools_const.f.
Should it be automated? */
#define SWAP_SEND 1
#define SWAP_RECV 2
#define SWAP_SYNC 4
#define SWAP_MPI 8
/* Macro for MATRIX_DATA array */
#define DEC_TYPE_ 0 /* The type of decomposition of global
matrix A. */
#define M_ 1 /* Number of equations */
#define N_ 2 /* Number of variables */
#define N_ROW_ 3 /* The number of row of local matrix. */
#define N_COL_ 4 /* The number of columns of local
matrix. */
#define CTXT_ 5 /* The BLACS context handle, indicating
the global context of the operation
on the matrix.
The context itself is global. */
#define LOC_TO_GLOB_ 6 /* The pointer to the array
loc_to_glob */
#define MPI_C_ 8 /* The MPI Fortran handle */
/* values for DEC_TYPE_ */
#define DESC_ASB 3099
#define DESC_BLD (DESC_ASB+1)
/* Macro for HALO array */
#define PROC_ID_ 0 /* The identifier of domain. */
#define N_ELEM_RECV_ 1 /* The number of elements to receive*/
#define ELEM_RECV_ 2 /* The first index of local elements */
#define N_ELEM_SEND_ 2 /* The number of elements to send */
#define ELEM_SEND_ 3 /* The first index of local elements */
/* Macro for OVERLAP array */
#define N_OVRLP_ELEM_ 1 /* The number of overlap elements to recv/send */
#define OVRLP_ELEM_TO_ 2 /* The first index of local elements */
/* Macro for OVR_ELEM_D array */
#define OVRLP_ELEM_ 0
#define N_DOM_OVR_ 1
#define BROADCAST "B" /* Blacs operation definitions */
#define COMBINE "C"
#define ALL "A" /* Scope definitions */
#define COLUMN "C"
#define ROW "R"
#define TOPDEF " " /* Default BLACS topology, PB-BLAS routines */
#define CTOPDEF ' '
#define TOPGET "!"
#define YES "Y"
#define NO "N"
#define MULLENFAC 2
#define ONE 1.0
#define ZERO 0.0
/* Integer values for error checking */
#define no_err 0
#define act_ret 0
#define act_abort 1
/*
* ========================================================================
* PREPROCESSOR MACRO FUNCTIONS USED FOR OPTIMIZATION & CONVENIENCE
* ========================================================================
*/
#define ABS(a) ((a > 0) ? (a) : (-a))
#define MIN(a,b) ((a < b) ? (a) : (b))
#define MAX(a,b) ((a > b) ? (a) : (b))
#define CEIL(a,b) ( (a+b-1) / (b) )
#define Mlowcase(C) ( ((C) > 64 && (C) < 91) ? (C) | 32 : (C) )
#define Mupcase(C) ( ((C) > 96 && (C) < 123) ? (C) & 0xDF : (C) )
#define INDXG2L( iglob, nb, iproc, isrcproc, nprocs )\
( (nb) * ( ( (iglob)-1) / ( (nb) * (nprocs) ) ) +\
( ( (iglob) - 1 ) % (nb) ) + 1 )
#define INDXL2G( iloc, nb, iproc, isrcproc, nprocs )\
( (nprocs) * (nb) * ( ( (iloc) - 1 ) / (nb) ) +\
( ( (iloc) - 1 ) % (nb) ) +\
( ( (nprocs) + (iproc) - (isrcproc) ) % (nprocs) ) * (nb) + 1 )
#define INDXG2P( iglob, nb, iproc, isrcproc, nprocs ) \
( ( (isrcproc) + ( (iglob) - 1 ) / (nb) ) % (nprocs) )
#define MYROC0( nblocks, n, nb, nprocs )\
( ( (nblocks) % (nprocs) ) ? ( ( (nblocks) / (nprocs) ) * (nb) + (nb) )\
: ( ( (nblocks) / (nprocs) )* (nb) + ( (n) % (nb) ) ) )
#if( _F2C_CALL_ == _F2C_ADD_ )
/*
* These defines set up the naming scheme required to have a FORTRAN
* routine call a C routine (which is what the PBLAS are written in).
* No redefinition necessary to have following FORTRAN to C interface:
* FORTRAN CALL C DECLARATION
* call pdgemm(...) void pdgemm_(...)
*
* This is the default.
*/
#define pbchkvectf pbchkvectf_
#define fcpsb_errcomm fcpsb_errcomm_
#define fcpsb_erractionsave fcpsb_erractionsave_
#define fcpsb_erractionrestore fcpsb_erractionrestore_
#define fcpsb_perror fcpsb_perror_
#define fcpsb_serror fcpsb_serror_
#define fcpsb_errpush fcpsb_errpush_
#endif

@ -0,0 +1,102 @@
subroutine psi_compute_size(desc_data,&
& index_in, dl_lda, info)
use psb_const_mod
use psb_error_mod
implicit none
! ....scalars parameters....
integer :: info, dl_lda
! .....array parameters....
integer :: desc_data(:), index_in(:)
! ....local scalars....
integer :: i,npcol,nprow,mycol,myrow,proc,counter, max_index
integer :: icontxt, err, err_act, np
! ...local array...
integer :: exch(2)
integer :: int_err(5)
integer, pointer :: counter_recv(:), counter_dl(:)
! ...parameters
logical, parameter :: debug=.false.
character(len=20) :: name
name='psi_compute_size'
call psb_get_erraction(err_act)
info = 0
icontxt = desc_data(psb_ctxt_)
call blacs_gridinfo(icontxt,nprow,npcol,myrow,mycol)
if (nprow == -1) then
info = 2010
call psb_errpush(info,name)
goto 9999
else if (npcol /= 1) then
info = 2030
int_err(1) = npcol
call psb_errpush(info,name)
goto 9999
endif
np=npcol
allocate(counter_dl(0:np-1),counter_recv(0:np-1))
! ..initialize counters...
do i=0,np-1
counter_recv(i)=0
counter_dl(i)=0
enddo
! ....verify local correctness of halo_in....
i=1
do while (index_in(i).ne.-1)
proc=index_in(i)
if ((proc.gt.np-1).or.(proc.lt.0)) then
info = 115
int_err(1) = 11
int_err(2) = proc
call psb_errpush(info,name,i_err=int_err)
goto 9999
endif
counter_dl(proc)=1
! ..update no of elements to receive from proc proc..
counter_recv(proc)=counter_recv(proc)+&
& index_in(i+1)
i=i+index_in(i+1)+2
enddo
! ...computing max_halo: max halo points to be received from
! same processor
max_index=0
dl_lda=0
do i=0,np-1
if (counter_recv(i).gt.max_index) max_index = counter_recv(i)
if (counter_dl(i).eq.1) dl_lda = dl_lda+1
enddo
! computing max global value of dl_lda
call igamx2d(icontxt, psb_all_, psb_topdef_, 1, ione, dl_lda, &
&1, counter, counter, -ione ,-ione,-ione)
if (debug) then
write(0,*) 'psi_compute_size: ',dl_lda
endif
call psb_erractionrestore(err_act)
return
9999 continue
call psb_erractionrestore(err_act)
if (err_act.eq.act_abort) then
call psb_serror(icontxt)
return
end if
return
end subroutine psi_compute_size

@ -0,0 +1,73 @@
subroutine psi_crea_bnd_elem(desc_a,info)
use psb_descriptor_type
use psb_error_mod
implicit none
type(psb_desc_type) :: desc_a
integer, intent(out) :: info
integer, pointer :: work(:)
integer :: i, j, nr, ns, k, irv, err_act
character(len=20) :: name, ch_err
info = 0
name='psi_crea_bnd_elem'
call psb_erractionsave(err_act)
allocate(work(size(desc_a%halo_index)),stat=info)
if (info /= 0 ) then
info = 4000
call psb_errpush(info,name)
goto 9999
end if
i=0
j=1
do while(desc_a%halo_index(j) /= -1)
nr = desc_a%halo_index(j+1)
ns = desc_a%halo_index(j+1+nr+1)
do k=1, ns
i = i + 1
work(i) = desc_a%halo_index(j+1+nr+1+k)
enddo
j = j + 1 + ns + 1 + nr + 1
enddo
if (i>0) then
call isr(i,work)
j=1
irv = work(1)
do k=2, i
if (work(k) /= irv) then
irv = work(k)
j = j + 1
work(j) = work(k)
endif
enddo
else
j = 0
endif
allocate(desc_a%bnd_elem(j+1))
if (.false.) then
desc_a%bnd_elem(1) = j
desc_a%bnd_elem(2:j+1) = work(1:j)
else
desc_a%bnd_elem(1:j) = work(1:j)
desc_a%bnd_elem(j+1) = -1
endif
deallocate(work)
call psb_erractionrestore(err_act)
return
9999 continue
call psb_erractionrestore(err_act)
if (err_act.eq.act_abort) then
call psb_serror()
return
end if
return
end subroutine psi_crea_bnd_elem

@ -0,0 +1,88 @@
subroutine psi_crea_index(desc_a,index_in,index_out,glob_idx,info)
use psb_realloc_mod
use psb_descriptor_type
use psb_error_mod
implicit none
type(psb_desc_type), intent(in) :: desc_a
integer, intent(out) :: info
integer, intent(in) :: index_in(:)
integer, intent(out) :: index_out(:)
logical :: glob_idx
! ....local scalars...
integer :: me,npcol,mycol,nprow,i,j,k,&
& mode, int_err(5), err, err_act, np,&
& dl_lda, icontxt
! ...parameters...
integer, pointer :: dep_list(:,:), length_dl(:)
integer,parameter :: root=0,no_comm=-1
logical,parameter :: debug=.false.
character(len=20) :: name, ch_err
info = 0
name='psi_crea_index'
call psb_erractionsave(err_act)
icontxt = desc_a%matrix_data(psb_ctxt_)
call blacs_gridinfo(icontxt,np,npcol,me,mycol)
if (np == -1) then
info = 2010
call psb_errpush(info,name)
goto 9999
else if (npcol /= 1) then
info = 2030
int_err(1) = npcol
call psb_errpush(info,name)
goto 9999
endif
! allocate dependency list
call psi_compute_size(desc_a%matrix_data, index_in, dl_lda, info)
allocate(dep_list(dl_lda,0:np-1),length_dl(0:np-1))
! ...extract dependence list (ordered list of identifer process
! which every process must communcate with...
if (debug) write(*,*) 'crea_halo: calling extract_dep_list'
mode = 1
call psi_extract_dep_list(desc_a%matrix_data,index_in,&
& dep_list,length_dl,np,dl_lda,mode,info)
if(info /= 0) then
call psb_errpush(4010,name,a_err='extrct_dl')
goto 9999
end if
if (debug) write(*,*) 'crea_index: from extract_dep_list',&
& me,length_dl(0),index_in(1), ':',dep_list(:length_dl(me),me)
! ...now process root contains dependence list of all processes...
if (debug) write(*,*) 'crea_halo: root sorting dep list'
! ....i must order communication in in halo
call psi_dl_check(dep_list,dl_lda,np,length_dl)
! ....now i can sort dependence list......
call psi_sort_dl(dep_list,length_dl,np,info)
if(info.ne.0) then
call psb_errpush(4010,name,a_err='psi_sort_dl')
goto 9999
end if
! ...create desc_halo array.....
if(debug) write(0,*)'in psi_crea_index calling psi_desc_index',&
& size(index_out)
call psi_desc_index(desc_a%matrix_data,index_in,dep_list(1,me),&
& length_dl(me),desc_a%loc_to_glob,desc_a%glob_to_loc,&
& index_out,glob_idx)
deallocate(dep_list,length_dl)
call psb_erractionrestore(err_act)
return
9999 continue
call psb_erractionrestore(err_act)
if (err_act.eq.act_abort) then
call psb_serror(icontxt)
return
end if
return
end subroutine psi_crea_index

@ -0,0 +1,63 @@
subroutine psi_crea_ovr_elem(desc_overlap,ovr_elem)
use psb_realloc_mod
implicit none
! ...parameter arrays....
integer :: desc_overlap(:)
integer, pointer :: ovr_elem(:)
! ...local scalars...
integer :: i,pnt_new_elem,ret,j, info
integer :: dim_ovr_elem
! ...external function...
integer :: psi_exist_ovr_elem,dim
external :: psi_exist_ovr_elem
logical, parameter :: usetree=.true.
i=1
pnt_new_elem=1
if (usetree) call initpairsearchtree(info)
do while (desc_overlap(i).ne.-1)
! ...loop over all procs of desc_overlap list....
i=i+1
do j=1,desc_overlap(i)
! ....loop over all overlap indices referred to act proc.....
if (usetree) then
call searchinskeyval(desc_overlap(i+j),pnt_new_elem,&
& ret,info)
if (ret == pnt_new_elem) ret=-1
else
ret=psi_exist_ovr_elem(ovr_elem,pnt_new_elem-2,&
& desc_overlap(i+j))
endif
if (ret.eq.-1) then
! ...this point not exist in ovr_elem list:
! add to it.............................
ovr_elem(pnt_new_elem)=desc_overlap(i+j)
ovr_elem(pnt_new_elem+1)=2
pnt_new_elem=pnt_new_elem+2
! ...check if overflow element_d array......
if (pnt_new_elem.gt.dim_ovr_elem) then
dim=(3*size(ovr_elem))/2+2
write(0,*) 'calling realloc crea_ovr_elem',dim
call psrealloc(dim,ovr_elem,info)
endif
else
! ....this point already exist in ovr_elem list
! its position is ret............................
ovr_elem(ret+1)=ovr_elem(ret+1)+1
endif
enddo
i=i+2*desc_overlap(i)+2
enddo
! ...add -1 at the end of output list......
ovr_elem(pnt_new_elem)=-1
if (usetree) call freepairsearchtree()
end subroutine psi_crea_ovr_elem

@ -0,0 +1,232 @@
subroutine psi_desc_index(desc_data,index_in,dep_list,&
& length_dl,loc_to_glob,glob_to_loc,desc_index,&
& isglob_in,info)
use psb_realloc_mod
use psb_error_mod
use psb_const_mod
implicit none
include 'mpif.h'
!c ...array parameters.....
integer :: desc_data(:),index_in(:),dep_list(:)
integer :: loc_to_glob(:),glob_to_loc(:)
integer,pointer :: desc_index(:)
integer :: length_dl, info
logical :: isglob_in
!c ....local scalars...
integer :: j,me,np,npcol,mycol,i,proc,dim
!c ...parameters...
integer, parameter :: ione=1
integer :: icontxt
integer :: no_comm,err
parameter (no_comm=-1)
!c ...local arrays..
integer :: int_err(5)
integer,pointer :: brvindx(:),rvsz(:),&
& bsdindx(:),sdsz(:), sndbuf(:), rcvbuf(:)
integer :: ihinsz,ntot,k,err_act,&
& idxr, idxs, iszs, iszr, nesd, nerv, icomm, iret
logical,parameter :: debug=.false., usempi=.true.
character(len=20) :: name, ch_err
info = 0
name='psi_desc_index'
call psb_erractionsave(err_act)
!c if mode == 1 then we can use glob_to_loc array
!c else we can't utilize it
icontxt=desc_data(psb_ctxt_)
call blacs_gridinfo(icontxt,np,npcol,me,mycol)
if (np == -1) then
info = 2010
call psb_errpush(info,name)
goto 9999
else if (npcol /= 1) then
info = 2030
int_err(1) = npcol
call psb_errpush(info,name)
goto 9999
endif
if (debug) then
write(0,*) me,'start desc_index'
call blacs_barrier(icontxt,'all')
endif
call blacs_get(icontxt,10,icomm)
!c
!c first, find out the total sizes to be exchanged.
!c note: things marked here as sndbuf/rcvbuf (for mpi) corresponds to things
!c to be received/sent (in the final psblas descriptor).
!c be careful of the inversion
!c
allocate(sdsz(np),rvsz(np),bsdindx(np),brvindx(np),stat=info)
if(info /= 0) then
info=4000
call psb_errpush(info,name)
goto 9999
end if
sdsz(:) = 0
rvsz(:) = 0
bsdindx(:) = 0
brvindx(:) = 0
i = 1
do
if (index_in(i) == -1) exit
proc = index_in(i)
i = i + 1
nerv = index_in(i)
sdsz(proc+1) = sdsz(proc+1) + nerv
i = i + nerv + 1
end do
ihinsz=i
call mpi_alltoall(sdsz,1,mpi_integer,rvsz,1,mpi_integer,icomm,info)
if(info /= 0) then
call psb_errpush(4010,name,a_err='mpi_alltoall')
goto 9999
end if
i = 1
idxs = 0
idxr = 0
do i=1, length_dl
proc = dep_list(i)
bsdindx(proc+1) = idxs
idxs = idxs + sdsz(proc+1)
brvindx(proc+1) = idxr
idxr = idxr + rvsz(proc+1)
end do
iszs = sum(sdsz)
iszr = sum(rvsz)
if ((iszs /= idxs).or.(iszr /= idxr)) then
write(0,*) 'strange results???', iszs,idxs,iszr,idxr
end if
if (debug) then
write(0,*) me,'computed sizes ',iszr,iszs
call blacs_barrier(icontxt,'all')
endif
ntot = (3*(max(count(sdsz>0),count(rvsz>0)))+ iszs + iszr) + 1
if (size(desc_index) < ntot) then
!c$$$ write(0,*) 'potential error on desc_index :',
!c$$$ + length_dh, size(desc_index),ntot
write(0,*) 'calling irealloc psi_desc_index ',ntot
call psrealloc(ntot,desc_index,info)
endif
if (info /= 0) then
call psb_errpush(4010,name,a_err='psrealloc')
goto 9999
end if
if (debug) then
write(0,*) me,'computed allocated workspace ',iszr,iszs
call blacs_barrier(icontxt,'all')
endif
allocate(sndbuf(iszs),rcvbuf(iszr),stat=info)
if(info /= 0) then
info=4000
call psb_errpush(info,name)
goto 9999
end if
i = 1
do
if (i > ihinsz) then
write(0,*) me,' did not find index_in end??? ',i,ihinsz
exit
end if
if (index_in(i) == -1) exit
proc = index_in(i)
i = i + 1
nerv = index_in(i)
! c
! c note that here bsdinx is zero-based, hence the following loop
! c
if (isglob_in) then
do j=1, nerv
sndbuf(bsdindx(proc+1)+j) = (index_in(i+j))
end do
else
do j=1, nerv
sndbuf(bsdindx(proc+1)+j) = loc_to_glob(index_in(i+j))
end do
endif
bsdindx(proc+1) = bsdindx(proc+1) + nerv
i = i + nerv + 1
end do
if (debug) then
write(0,*) me,' prepared send buffer '
call blacs_barrier(icontxt,'all')
endif
!c
!c now have to regenerate bsdindx
!c
idxs = 0
idxr = 0
do i=1, length_dl
proc = dep_list(i)
bsdindx(proc+1) = idxs
idxs = idxs + sdsz(proc+1)
brvindx(proc+1) = idxr
idxr = idxr + rvsz(proc+1)
end do
call mpi_alltoallv(sndbuf,sdsz,bsdindx,mpi_integer,&
& rcvbuf,rvsz,brvindx,mpi_integer,icomm,info)
if(info /= 0) then
call psb_errpush(4010,name,a_err='mpi_alltoallv')
goto 9999
end if
!c
!c at this point we can finally build the output desc_index. beware
!c of snd/rcv inversion.
!c
i = 1
do k = 1, length_dl
proc = dep_list(k)
desc_index(i) = proc
i = i + 1
nerv = sdsz(proc+1)
desc_index(i) = nerv
do j=1, nerv
desc_index(i+j) = glob_to_loc(sndbuf(bsdindx(proc+1)+j))
end do
i = i + nerv + 1
nesd = rvsz(proc+1)
desc_index(i) = nesd
do j=1, nesd
desc_index(i+j) = glob_to_loc(rcvbuf(brvindx(proc+1)+j))
end do
i = i + nesd + 1
end do
desc_index(i) = - 1
deallocate(sdsz,rvsz,bsdindx,brvindx,sndbuf,rcvbuf,stat=info)
if (info /= 0) then
info=4000
call psb_errpush(info,name)
goto 9999
end if
if (debug) then
write(0,*) me,'end desc_index'
call blacs_barrier(icontxt,'all')
endif
call psb_erractionrestore(err_act)
return
9999 continue
call psb_erractionrestore(err_act)
if (err_act.eq.act_abort) then
call psb_serror(icontxt)
return
end if
return
end subroutine psi_desc_index

@ -0,0 +1,45 @@
subroutine psi_dl_check(dep_list,dl_lda,np,length_dl)
use psb_const_mod
implicit none
integer :: np,dl_lda,length_dl(:)
integer :: dep_list(dl_lda,0:np-1)
! locals
integer :: proc, proc2, i, j
! ...i must order communication in in halo
! ...if in dep_list of process i there is j
! and in dep_list of process j there isn't i,
! add to it process i...
do proc=0,np-1
i=1
do while (i.le.length_dl(proc))
proc2=dep_list(i,proc)
if (proc2.ne.psb_no_comm_) then
! ...search proc in proc2's dep_list....
j=1
do while ((j.le.length_dl(proc2).and.&
& dep_list(j,proc2).ne.proc))
j=j+1
enddo
if ((dep_list(j,proc2).ne.proc).or.&
& (j.gt.length_dl(proc2))) then
! ...proc not found...
! ...add proc to proc2's dep_list.....
length_dl(proc2)=length_dl(proc2)+1
if (length_dl(proc2).gt.size(dep_list,1)) then
write(0,*)'error in crea_halo', proc2,&
& length_dl(proc2),'>',size(dep_list,1)
endif
dep_list(length_dl(proc2),proc2)=proc
endif
endif
i=i+1
enddo
enddo
end subroutine psi_dl_check

@ -0,0 +1,739 @@
subroutine psi_dswapdatam(flag,n,beta,y,desc_a,work,info)
use psb_error_mod
use psb_descriptor_type
implicit none
include 'mpif.h'
integer, intent(in) :: flag, n
integer, intent(out) :: info
real(kind(1.d0)) :: y(:,:), beta
real(kind(1.d0)), target ::work(:)
type(psb_desc_type) :: desc_a
! locals
integer :: icontxt, nprow, npcol, myrow,&
& mycol, point_to_proc, nesd, nerv,&
& proc_to_comm, p2ptag, icomm, p2pstat,&
& idxs, idxr, iret, errlen, ifcomm, rank,&
& err_act, totxch, ixrec, i, lw, idx_pt,&
& snd_pt, rcv_pt
integer :: blacs_pnum, krecvid, ksendid
integer, pointer, dimension(:) :: bsdidx, brvidx,&
& sdsz, rvsz, prcid, ptp, rvhd, h_idx
integer :: int_err(5)
logical :: swap_mpi, swap_sync, swap_send, swap_recv
real(kind(1.d0)), pointer, dimension(:) :: sndbuf, rcvbuf
character(len=20) :: name, ch_err
info = 0
name='psi_dswap_data'
call psb_erractionsave(err_act)
icontxt=desc_a%matrix_data(psb_ctxt_)
call blacs_gridinfo(icontxt,nprow,npcol,myrow,mycol)
if (nprow == -1) then
info = 2010
call psb_errpush(info,name)
goto 9999
else if (npcol /= 1) then
info = 2030
int_err(1) = npcol
call psb_errpush(info,name)
goto 9999
endif
call blacs_get(icontxt,10,icomm)
allocate(sdsz(0:nprow-1), rvsz(0:nprow-1), bsdidx(0:nprow-1),&
& brvidx(0:nprow-1), rvhd(0:nprow-1), prcid(0:nprow-1),&
& ptp(0:nprow-1), stat=info)
if(info.ne.0) then
call psb_errpush(4000,name)
goto 9999
end if
swap_mpi = iand(flag,psb_swap_mpi_).ne.0
swap_sync = iand(flag,psb_swap_sync_).ne.0
swap_send = iand(flag,psb_swap_send_).ne.0
swap_recv = iand(flag,psb_swap_recv_).ne.0
h_idx => desc_a%halo_index
idxs = 0
idxr = 0
totxch = 0
point_to_proc = 1
rvhd(:) = mpi_request_null
! prepare info for communications
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm.ne.-1)
if(proc_to_comm .ne. myrow) totxch = totxch+1
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
prcid(proc_to_comm) = blacs_pnum(icontxt,proc_to_comm,mycol)
ptp(proc_to_comm) = point_to_proc
brvidx(proc_to_comm) = idxr
rvsz(proc_to_comm) = n*nerv
idxr = idxr+rvsz(proc_to_comm)
bsdidx(proc_to_comm) = idxs
sdsz(proc_to_comm) = n*nesd
idxs = idxs+sdsz(proc_to_comm)
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
if((idxr+idxs).lt.size(work)) then
sndbuf => work(1:idxs)
rcvbuf => work(idxs+1:idxs+idxr)
else
allocate(sndbuf(idxs),rcvbuf(idxr), stat=info)
if(info.ne.0) then
call psb_errpush(4000,name)
goto 9999
end if
end if
! Case SWAP_MPI
if(swap_mpi) then
! gather elements into sendbuffer for swapping
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
idx_pt = point_to_proc+nerv+psb_elem_send_
snd_pt = bsdidx(proc_to_comm)
call psi_gth(nesd,n,h_idx(idx_pt:idx_pt+nesd-1),&
& y,sndbuf(snd_pt:snd_pt+nesd*n-1))
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
! swap elements using mpi_alltoallv
call mpi_alltoallv(sndbuf,sdsz,bsdidx,&
& mpi_double_precision,rcvbuf,rvsz,&
& brvidx,mpi_double_precision,icomm,iret)
if(iret.ne.mpi_success) then
int_err(1) = iret
info=400
call psb_errpush(info,name,i_err=int_err)
goto 9999
end if
! scatter elements from receivebuffer after swapping
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
idx_pt = point_to_proc+psb_elem_recv_
rcv_pt = brvidx(proc_to_comm)
call psi_sct(nerv,n,h_idx(idx_pt:idx_pt+nerv-1),&
& rcvbuf(rcv_pt:rcv_pt+n*nerv-1),beta,y)
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
else if (swap_sync) then
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
if (proc_to_comm .lt. myrow) then
! First I send
idx_pt = point_to_proc+nerv+psb_elem_send_
snd_pt = bsdidx(proc_to_comm)
call psi_gth(nesd,n,h_idx(idx_pt:idx_pt+nesd-1),&
& y,sndbuf(snd_pt:snd_pt+nesd*n-1))
call dgesd2d(icontxt,nesd,n,sndbuf(snd_pt),nesd,proc_to_comm,0)
! Then I receive
rcv_pt = brvidx(proc_to_comm)
call dgerv2d(icontxt,nerv,n,rcvbuf(rcv_pt),nerv,proc_to_comm,0)
else if (proc_to_comm .gt. myrow) then
! First I receive
rcv_pt = brvidx(proc_to_comm)
call dgerv2d(icontxt,nerv,n,rcvbuf(rcv_pt),nerv,proc_to_comm,0)
! Then I send
idx_pt = point_to_proc+nerv+psb_elem_send_
snd_pt = bsdidx(proc_to_comm)
call psi_gth(nesd,n,h_idx(idx_pt:idx_pt+nesd-1),&
& y,sndbuf(snd_pt:snd_pt+nesd*n-1))
call dgesd2d(icontxt,nesd,n,sndbuf(snd_pt),nesd,proc_to_comm,0)
else if (proc_to_comm .eq. myrow) then
! I send to myself
idx_pt = point_to_proc+nerv+psb_elem_send_
snd_pt = bsdidx(proc_to_comm)
call psi_gth(nesd,n,h_idx(idx_pt:idx_pt+nesd-1),&
& y,sndbuf(snd_pt:snd_pt+nesd*n-1))
end if
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
if(proc_to_comm.ne.myrow) then
idx_pt = point_to_proc+psb_elem_recv_
rcv_pt = brvidx(proc_to_comm)
call psi_sct(nerv,n,h_idx(idx_pt:idx_pt+nerv-1),&
& rcvbuf(rcv_pt:rcv_pt+n*nerv-1),beta,y)
else
idx_pt = point_to_proc+psb_elem_recv_
snd_pt = bsdidx(proc_to_comm)
call psi_sct(nerv,n,h_idx(idx_pt:idx_pt+nerv-1),&
& sndbuf(snd_pt:snd_pt+n*nesd-1),beta,y)
end if
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
else if (swap_send .and. swap_recv) then
! First I post all the non blocking receives
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
if(proc_to_comm.ne.myrow) then
p2ptag = krecvid(icontxt,proc_to_comm,myrow)
rcv_pt = brvidx(proc_to_comm)
call mpi_irecv(rcvbuf(rcv_pt),rvsz(proc_to_comm),&
& mpi_double_precision,prcid(proc_to_comm),&
& p2ptag, icomm,rvhd(proc_to_comm),iret)
if(iret.ne.mpi_success) then
int_err(1) = iret
info=400
call psb_errpush(info,name,i_err=int_err)
goto 9999
end if
end if
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
! Then I post all the blocking sends
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
idx_pt = point_to_proc+nerv+psb_elem_send_
snd_pt = bsdidx(proc_to_comm)
call psi_gth(nesd,n,h_idx(idx_pt:idx_pt+nesd-1),&
& y,sndbuf(snd_pt:snd_pt+nesd*n-1))
if(proc_to_comm .ne. myrow) then
p2ptag=ksendid(icontxt,proc_to_comm,myrow)
call mpi_send(sndbuf(snd_pt),sdsz(proc_to_comm),&
& mpi_double_precision,prcid(proc_to_comm),&
& p2ptag,icomm,iret)
if(iret.ne.mpi_success) then
int_err(1) = iret
info=400
call psb_errpush(info,name,i_err=int_err)
goto 9999
end if
end if
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
do i=1, totxch
call mpi_waitany(nprow,rvhd,ixrec,p2pstat,iret)
if(iret.ne.mpi_success) then
int_err(1) = iret
info=400
call psb_errpush(info,name,i_err=int_err)
goto 9999
end if
if (ixrec .ne. mpi_undefined) then
ixrec=ixrec-1 ! mpi_waitany returns an 1 to nprow index
point_to_proc = ptp(ixrec)
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
idx_pt = point_to_proc+psb_elem_recv_
rcv_pt = brvidx(proc_to_comm)
call psi_sct(nerv,n,h_idx(idx_pt:idx_pt+nerv-1),&
& rcvbuf(rcv_pt:rcv_pt+n*nerv-1),beta,y)
else
int_err(1) = ixrec
info=400
call psb_errpush(info,name,i_err=int_err)
goto 9999
end if
end do
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
if(proc_to_comm .eq. myrow) then
idx_pt = point_to_proc+psb_elem_recv_
snd_pt = bsdidx(proc_to_comm)
call psi_sct(nerv,n,h_idx(idx_pt:idx_pt+nerv-1),&
& sndbuf(snd_pt:snd_pt+n*nesd-1),beta,y)
end if
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
else if (swap_send) then
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
idx_pt = point_to_proc+nerv+psb_elem_send_
snd_pt = bsdidx(proc_to_comm)
call psi_gth(nesd,n,h_idx(idx_pt:idx_pt+nesd-1),&
& y,sndbuf(snd_pt:snd_pt+nesd*n-1))
call dgesd2d(icontxt,nesd,n,sndbuf(snd_pt),nesd,proc_to_comm,0)
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
else if (swap_recv) then
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
if(proc_to_comm.ne.myrow) then
rcv_pt = brvidx(proc_to_comm)
call dgerv2d(icontxt,nerv,n,rcvbuf(rcv_pt),nerv,proc_to_comm,0)
idx_pt = point_to_proc+psb_elem_recv_
rcv_pt = brvidx(proc_to_comm)
call psi_sct(nerv,n,h_idx(idx_pt:idx_pt+nerv-1),&
& rcvbuf(rcv_pt:rcv_pt+n*nerv-1),beta,y)
else
idx_pt = point_to_proc+psb_elem_recv_
snd_pt = bsdidx(proc_to_comm)
call psi_sct(nerv,n,h_idx(idx_pt:idx_pt+nerv-1),&
& sndbuf(snd_pt:snd_pt+n*nesd-1),beta,y)
end if
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
end if
call psb_erractionrestore(err_act)
return
9999 continue
call psb_erractionrestore(err_act)
if (err_act.eq.act_abort) then
call psb_error(icontxt)
return
end if
return
end subroutine psi_dswapdatam
subroutine psi_dswapdatav(flag,beta,y,desc_a,work,info)
use psb_error_mod
use psb_descriptor_type
implicit none
include 'mpif.h'
integer, intent(in) :: flag
integer, intent(out) :: info
real(kind(1.d0)) :: y(:), beta
real(kind(1.d0)), target :: work(:)
type(psb_desc_type) :: desc_a
! locals
integer :: icontxt, nprow, npcol, myrow,&
& mycol, point_to_proc, nesd, nerv,&
& proc_to_comm, p2ptag, icomm, p2pstat,&
& idxs, idxr, iret, errlen, ifcomm, rank,&
& err_act, totxch, ixrec, i, lw, idx_pt,&
& snd_pt, rcv_pt, n
integer, pointer, dimension(:) :: bsdidx, brvidx,&
& sdsz, rvsz, prcid, ptp, rvhd, h_idx
integer :: blacs_pnum, krecvid, ksendid
integer :: int_err(5)
logical :: swap_mpi, swap_sync, swap_send, swap_recv
real(kind(1.d0)), pointer, dimension(:) :: sndbuf, rcvbuf
character(len=20) :: name, ch_err
info = 0
name='psi_dswap_datav'
call psb_erractionsave(err_act)
icontxt=desc_a%matrix_data(psb_ctxt_)
call blacs_gridinfo(icontxt,nprow,npcol,myrow,mycol)
if (nprow == -1) then
info = 2010
call psb_errpush(info,name)
goto 9999
else if (npcol /= 1) then
info = 2030
int_err(1) = npcol
call psb_errpush(info,name)
goto 9999
endif
call blacs_get(icontxt,10,icomm)
allocate(sdsz(0:nprow-1), rvsz(0:nprow-1), bsdidx(0:nprow-1),&
& brvidx(0:nprow-1), rvhd(0:nprow-1), prcid(0:nprow-1),&
& ptp(0:nprow-1), stat=info)
if(info.ne.0) then
call psb_errpush(4000,name)
goto 9999
end if
swap_mpi = iand(flag,psb_swap_mpi_).ne.0
swap_sync = iand(flag,psb_swap_sync_).ne.0
swap_send = iand(flag,psb_swap_send_).ne.0
swap_recv = iand(flag,psb_swap_recv_).ne.0
h_idx => desc_a%halo_index
idxs = 0
idxr = 0
totxch = 0
point_to_proc = 1
rvhd(:) = mpi_request_null
n=1
! prepare info for communications
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm.ne.-1)
if(proc_to_comm .ne. myrow) totxch = totxch+1
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
prcid(proc_to_comm) = blacs_pnum(icontxt,proc_to_comm,mycol)
ptp(proc_to_comm) = point_to_proc
brvidx(proc_to_comm) = idxr
rvsz(proc_to_comm) = n*nerv
idxr = idxr+rvsz(proc_to_comm)
bsdidx(proc_to_comm) = idxs
sdsz(proc_to_comm) = n*nesd
idxs = idxs+sdsz(proc_to_comm)
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
if((idxr+idxs).lt.size(work)) then
sndbuf => work(1:idxs)
rcvbuf => work(idxs+1:idxs+idxr)
else
allocate(sndbuf(idxs),rcvbuf(idxr), stat=info)
if(info.ne.0) then
call psb_errpush(4000,name)
goto 9999
end if
end if
! Case SWAP_MPI
if(swap_mpi) then
! gather elements into sendbuffer for swapping
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
idx_pt = point_to_proc+nerv+psb_elem_send_
snd_pt = bsdidx(proc_to_comm)
call psi_gth(nesd,h_idx(idx_pt:idx_pt+nesd-1),&
& y,sndbuf(snd_pt:snd_pt+nesd-1))
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
! swap elements using mpi_alltoallv
call mpi_alltoallv(sndbuf,sdsz,bsdidx,&
& mpi_double_precision,rcvbuf,rvsz,&
& brvidx,mpi_double_precision,icomm,iret)
if(iret.ne.mpi_success) then
int_err(1) = iret
info=400
call psb_errpush(info,name,i_err=int_err)
goto 9999
end if
! scatter elements from receivebuffer after swapping
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
idx_pt = point_to_proc+psb_elem_recv_
rcv_pt = brvidx(proc_to_comm)
call psi_sct(nerv,h_idx(idx_pt:idx_pt+nerv-1),&
& rcvbuf(rcv_pt:rcv_pt+nerv-1),beta,y)
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
else if (swap_sync) then
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
if (proc_to_comm .lt. myrow) then
! First I send
idx_pt = point_to_proc+nerv+psb_elem_send_
snd_pt = bsdidx(proc_to_comm)
call psi_gth(nesd,h_idx(idx_pt:idx_pt+nesd-1),&
& y,sndbuf(snd_pt:snd_pt+nesd-1))
call dgesd2d(icontxt,nesd,1,sndbuf(snd_pt),nesd,proc_to_comm,0)
! Then I receive
rcv_pt = brvidx(proc_to_comm)
call dgerv2d(icontxt,nerv,1,rcvbuf(rcv_pt),nerv,proc_to_comm,0)
else if (proc_to_comm .gt. myrow) then
! First I receive
rcv_pt = brvidx(proc_to_comm)
call dgerv2d(icontxt,nerv,1,rcvbuf(rcv_pt),nerv,proc_to_comm,0)
! Then I send
idx_pt = point_to_proc+nerv+psb_elem_send_
snd_pt = bsdidx(proc_to_comm)
call psi_gth(nesd,h_idx(idx_pt:idx_pt+nesd-1),&
& y,sndbuf(snd_pt:snd_pt+nesd-1))
call dgesd2d(icontxt,nesd,1,sndbuf(snd_pt),nesd,proc_to_comm,0)
else if (proc_to_comm .eq. myrow) then
! I send to myself
idx_pt = point_to_proc+nerv+psb_elem_send_
snd_pt = bsdidx(proc_to_comm)
call psi_gth(nesd,h_idx(idx_pt:idx_pt+nesd-1),&
& y,sndbuf(snd_pt:snd_pt+nesd-1))
end if
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
if(proc_to_comm.ne.myrow) then
idx_pt = point_to_proc+psb_elem_recv_
rcv_pt = brvidx(proc_to_comm)
call psi_sct(nerv,h_idx(idx_pt:idx_pt+nerv-1),&
& rcvbuf(rcv_pt:rcv_pt+nerv-1),beta,y)
else
idx_pt = point_to_proc+psb_elem_recv_
snd_pt = bsdidx(proc_to_comm)
call psi_sct(nerv,h_idx(idx_pt:idx_pt+nerv-1),&
& sndbuf(snd_pt:snd_pt+nesd-1),beta,y)
end if
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
else if (swap_send .and. swap_recv) then
! First I post all the non blocking receives
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
if(proc_to_comm.ne.myrow) then
p2ptag = krecvid(icontxt,proc_to_comm,myrow)
rcv_pt = brvidx(proc_to_comm)
call mpi_irecv(rcvbuf(rcv_pt),rvsz(proc_to_comm),&
& mpi_double_precision,prcid(proc_to_comm),&
& p2ptag, icomm,rvhd(proc_to_comm),iret)
if(iret.ne.mpi_success) then
int_err(1) = iret
info=400
call psb_errpush(info,name,i_err=int_err)
goto 9999
end if
end if
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
! Then I post all the blocking sends
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
idx_pt = point_to_proc+nerv+psb_elem_send_
snd_pt = bsdidx(proc_to_comm)
call psi_gth(nesd,h_idx(idx_pt:idx_pt+nesd-1),&
& y,sndbuf(snd_pt:snd_pt+nesd-1))
if(proc_to_comm .ne. myrow) then
p2ptag=ksendid(icontxt,proc_to_comm,myrow)
call mpi_send(sndbuf(snd_pt),sdsz(proc_to_comm),&
& mpi_double_precision,prcid(proc_to_comm),&
& p2ptag,icomm,iret)
if(iret.ne.mpi_success) then
int_err(1) = iret
info=400
call psb_errpush(info,name,i_err=int_err)
goto 9999
end if
end if
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
do i=1, totxch
call mpi_waitany(nprow,rvhd,ixrec,p2pstat,iret)
if(iret.ne.mpi_success) then
int_err(1) = iret
info=400
call psb_errpush(info,name,i_err=int_err)
goto 9999
end if
if (ixrec .ne. mpi_undefined) then
ixrec=ixrec-1 ! mpi_waitany returns an 1 to nprow index
point_to_proc = ptp(ixrec)
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
idx_pt = point_to_proc+psb_elem_recv_
snd_pt = bsdidx(proc_to_comm)
call psi_sct(nerv,h_idx(idx_pt:idx_pt+nerv-1),&
& sndbuf(snd_pt:snd_pt+nesd-1),beta,y)
else
int_err(1) = ixrec
info=400
call psb_errpush(info,name,i_err=int_err)
goto 9999
end if
end do
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
if(proc_to_comm .eq. myrow) then
idx_pt = point_to_proc+psb_elem_recv_
snd_pt = bsdidx(proc_to_comm)
call psi_sct(nerv,h_idx(idx_pt:idx_pt+nerv-1),&
& sndbuf(snd_pt:snd_pt+nesd-1),beta,y)
end if
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
else if (swap_send) then
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
idx_pt = point_to_proc+nerv+psb_elem_send_
snd_pt = bsdidx(proc_to_comm)
call psi_gth(nesd,h_idx(idx_pt:idx_pt+nesd-1),&
& y,sndbuf(snd_pt:snd_pt+nesd-1))
call dgesd2d(icontxt,nesd,1,sndbuf(snd_pt),nesd,proc_to_comm,0)
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
else if (swap_recv) then
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
if(proc_to_comm.ne.myrow) then
rcv_pt = brvidx(proc_to_comm)
call dgerv2d(icontxt,nerv,1,rcvbuf(rcv_pt),nerv,proc_to_comm,0)
idx_pt = point_to_proc+psb_elem_recv_
rcv_pt = brvidx(proc_to_comm)
call psi_sct(nerv,h_idx(idx_pt:idx_pt+nerv-1),&
& rcvbuf(rcv_pt:rcv_pt+nerv-1),beta,y)
else
idx_pt = point_to_proc+psb_elem_recv_
snd_pt = bsdidx(proc_to_comm)
call psi_sct(nerv,h_idx(idx_pt:idx_pt+nerv-1),&
& sndbuf(snd_pt:snd_pt+nesd-1),beta,y)
end if
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
end if
call psb_erractionrestore(err_act)
return
9999 continue
call psb_erractionrestore(err_act)
if (err_act.eq.act_abort) then
call psb_error(icontxt)
return
end if
return
end subroutine psi_dswapdatav

@ -0,0 +1,735 @@
subroutine psi_dswaptranm(flag,n,beta,y,desc_a,work,info)
use psb_error_mod
use psb_descriptor_type
implicit none
include 'mpif.h'
integer, intent(in) :: flag, n
integer, intent(out) :: info
real(kind(1.d0)) :: y(:,:), beta
real(kind(1.d0)), target :: work(:)
type(psb_desc_type) :: desc_a
! locals
integer :: icontxt, nprow, npcol, myrow,&
& mycol, point_to_proc, nesd, nerv,&
& proc_to_comm, p2ptag, icomm, p2pstat,&
& idxs, idxr, iret, errlen, ifcomm, rank,&
& err_act, totxch, ixrec, i, lw, idx_pt,&
& snd_pt, rcv_pt
integer, pointer, dimension(:) :: bsdidx, brvidx,&
& sdsz, rvsz, prcid, ptp, rvhd, h_idx
integer :: int_err(5)
integer :: blacs_pnum, krecvid, ksendid
logical :: swap_mpi, swap_sync, swap_send, swap_recv
real(kind(1.d0)), pointer, dimension(:) :: sndbuf, rcvbuf
character(len=20) :: name, ch_err
info = 0
name='psi_dswaptranm'
call psb_erractionsave(err_act)
icontxt=desc_a%matrix_data(psb_ctxt_)
call blacs_gridinfo(icontxt,nprow,npcol,myrow,mycol)
if (nprow == -1) then
info = 2010
call psb_errpush(info,name)
goto 9999
else if (npcol /= 1) then
info = 2030
int_err(1) = npcol
call psb_errpush(info,name)
goto 9999
endif
call blacs_get(icontxt,10,icomm)
allocate(sdsz(0:nprow-1), rvsz(0:nprow-1), bsdidx(0:nprow-1),&
& brvidx(0:nprow-1), rvhd(0:nprow-1), prcid(0:nprow-1),&
& ptp(0:nprow-1), stat=info)
if(info.ne.0) then
call psb_errpush(4000,name)
goto 9999
end if
swap_mpi = iand(flag,psb_swap_mpi_).ne.0
swap_sync = iand(flag,psb_swap_sync_).ne.0
swap_send = iand(flag,psb_swap_send_).ne.0
swap_recv = iand(flag,psb_swap_recv_).ne.0
h_idx => desc_a%halo_index
idxs = 0
idxr = 0
totxch = 0
point_to_proc = 1
rvhd(:) = mpi_request_null
! prepare info for communications
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm.ne.-1)
if(proc_to_comm .ne. myrow) totxch = totxch+1
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
prcid(proc_to_comm) = blacs_pnum(icontxt,proc_to_comm,mycol)
ptp(proc_to_comm) = point_to_proc
brvidx(proc_to_comm) = idxr
rvsz(proc_to_comm) = n*nerv
idxr = idxr+rvsz(proc_to_comm)
bsdidx(proc_to_comm) = idxs
sdsz(proc_to_comm) = n*nesd
idxs = idxs+sdsz(proc_to_comm)
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
if((idxr+idxs).lt.size(work)) then
sndbuf => work(1:idxs)
rcvbuf => work(idxs+1:idxs+idxr)
else
allocate(sndbuf(idxs),rcvbuf(idxr), stat=info)
if(info.ne.0) then
call psb_errpush(4000,name)
goto 9999
end if
end if
! Case SWAP_MPI
if(swap_mpi) then
! gather elements into sendbuffer for swapping
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
idx_pt = point_to_proc+psb_elem_recv_
rcv_pt = brvidx(proc_to_comm)
call psi_gth(nerv,n,h_idx(idx_pt:idx_pt+nerv-1),&
& y,rcvbuf(rcv_pt:rcv_pt+nerv*n-1))
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
! swap elements using mpi_alltoallv
call mpi_alltoallv(rcvbuf,rvsz,brvidx,&
& mpi_double_precision,sndbuf,sdsz,&
& bsdidx,mpi_double_precision,icomm,iret)
if(iret.ne.mpi_success) then
int_err(1) = iret
info=400
call psb_errpush(info,name,i_err=int_err)
goto 9999
end if
! scatter elements from receivebuffer after swapping
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
idx_pt = point_to_proc+nerv+psb_elem_send_
snd_pt = bsdidx(proc_to_comm)
call psi_sct(nesd,n,h_idx(idx_pt:idx_pt+nesd-1),&
& sndbuf(snd_pt:snd_pt+n*nesd-1),beta,y)
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
else if (swap_sync) then
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
if (proc_to_comm .lt. myrow) then
! First I send
idx_pt = point_to_proc+psb_elem_recv_
rcv_pt = brvidx(proc_to_comm)
call psi_gth(nerv,n,h_idx(idx_pt:idx_pt+nerv-1),&
& y,rcvbuf(rcv_pt:rcv_pt+nerv*n-1))
call dgesd2d(icontxt,nerv,n,rcvbuf(rcv_pt),nerv,proc_to_comm,0)
! Then I receive
snd_pt = brvidx(proc_to_comm)
call dgerv2d(icontxt,nesd,n,sndbuf(snd_pt),nesd,proc_to_comm,0)
else if (proc_to_comm .gt. myrow) then
! First I receive
snd_pt = bsdidx(proc_to_comm)
call dgerv2d(icontxt,nesd,n,sndbuf(snd_pt),nesd,proc_to_comm,0)
! Then I send
idx_pt = point_to_proc+psb_elem_recv_
rcv_pt = brvidx(proc_to_comm)
call psi_gth(nerv,n,h_idx(idx_pt:idx_pt+nerv-1),&
& y,rcvbuf(rcv_pt:rcv_pt+nerv*n-1))
call dgesd2d(icontxt,nerv,n,rcvbuf(rcv_pt),nerv,proc_to_comm,0)
else if (proc_to_comm .eq. myrow) then
! I send to myself
idx_pt = point_to_proc+psb_elem_recv_
rcv_pt = bsdidx(proc_to_comm)
call psi_gth(nerv,n,h_idx(idx_pt:idx_pt+nerv-1),&
& y,rcvbuf(rcv_pt:rcv_pt+nerv*n-1))
end if
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
if(proc_to_comm.ne.myrow) then
idx_pt = point_to_proc+nerv+psb_elem_send_
snd_pt = bsdidx(proc_to_comm)
call psi_sct(nesd,n,h_idx(idx_pt:idx_pt+nesd-1),&
& sndbuf(snd_pt:snd_pt+n*nesd-1),beta,y)
else
idx_pt = point_to_proc+nerv+psb_elem_send_
rcv_pt = brvidx(proc_to_comm)
call psi_sct(nesd,n,h_idx(idx_pt:idx_pt+nesd-1),&
& rcvbuf(rcv_pt:rcv_pt+n*nerv-1),beta,y)
end if
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
else if (swap_send .and. swap_recv) then
! First I post all the non blocking receives
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
if(proc_to_comm.ne.myrow) then
p2ptag = krecvid(icontxt,proc_to_comm,myrow)
snd_pt = brvidx(proc_to_comm)
call mpi_irecv(sndbuf(rcv_pt),sdsz(proc_to_comm),&
& mpi_double_precision,prcid(proc_to_comm),&
& p2ptag, icomm,rvhd(proc_to_comm),iret)
if(iret.ne.mpi_success) then
int_err(1) = iret
info=400
call psb_errpush(info,name,i_err=int_err)
goto 9999
end if
end if
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
! Then I post all the blocking sends
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
idx_pt = point_to_proc+psb_elem_recv_
rcv_pt = brvidx(proc_to_comm)
call psi_gth(nerv,n,h_idx(idx_pt:idx_pt+nerv-1),&
& y,rcvbuf(rcv_pt:rcv_pt+nerv*n-1))
if(proc_to_comm .ne. myrow) then
p2ptag=ksendid(icontxt,proc_to_comm,myrow)
call mpi_send(rcvbuf(rcv_pt),rvsz(proc_to_comm),&
& mpi_double_precision,prcid(proc_to_comm),&
& p2ptag,icomm,iret)
if(iret.ne.mpi_success) then
int_err(1) = iret
info=400
call psb_errpush(info,name,i_err=int_err)
goto 9999
end if
end if
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
do i=1, totxch
call mpi_waitany(nprow,rvhd,ixrec,p2pstat,iret)
if(iret.ne.mpi_success) then
int_err(1) = iret
info=400
call psb_errpush(info,name,i_err=int_err)
goto 9999
end if
if (ixrec .ne. mpi_undefined) then
ixrec=ixrec-1 ! mpi_waitany returns an 1 to nprow index
point_to_proc = ptp(ixrec)
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
idx_pt = point_to_proc+nerv+psb_elem_send_
snd_pt = bsdidx(proc_to_comm)
call psi_sct(nesd,n,h_idx(idx_pt:idx_pt+nesd-1),&
& sndbuf(snd_pt:snd_pt+n*nesd-1),beta,y)
else
int_err(1) = ixrec
info=400
call psb_errpush(info,name,i_err=int_err)
goto 9999
end if
end do
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
if(proc_to_comm .eq. myrow) then
idx_pt = point_to_proc+nerv+psb_elem_send_
rcv_pt = brvidx(proc_to_comm)
call psi_sct(nesd,n,h_idx(idx_pt:idx_pt+nesd-1),&
& rcvbuf(rcv_pt:rcv_pt+n*nerv-1),beta,y)
end if
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
else if (swap_send) then
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
idx_pt = point_to_proc+psb_elem_recv_
rcv_pt = brvidx(proc_to_comm)
call psi_gth(nerv,n,h_idx(idx_pt:idx_pt+nerv-1),&
& y,rcvbuf(rcv_pt:rcv_pt+nerv*n-1))
call dgesd2d(icontxt,nerv,n,rcvbuf(rcv_pt),nerv,proc_to_comm,0)
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
else if (swap_recv) then
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
if(proc_to_comm.ne.myrow) then
snd_pt = bsdidx(proc_to_comm)
call dgerv2d(icontxt,nesd,n,sndbuf(snd_pt),nesd,proc_to_comm,0)
idx_pt = point_to_proc+nerv+psb_elem_send_
call psi_sct(nesd,n,h_idx(idx_pt:idx_pt+nesd-1),&
& sndbuf(snd_pt:snd_pt+n*nesd-1),beta,y)
else
idx_pt = point_to_proc+nerv+psb_elem_send_
rcv_pt = brvidx(proc_to_comm)
call psi_sct(nesd,n,h_idx(idx_pt:idx_pt+nesd-1),&
& rcvbuf(rcv_pt:rcv_pt+n*nerv-1),beta,y)
end if
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
end if
call psb_erractionrestore(err_act)
return
9999 continue
call psb_erractionrestore(err_act)
if (err_act.eq.act_abort) then
call psb_error(icontxt)
return
end if
return
end subroutine psi_dswaptranm
subroutine psi_dswaptranv(flag,beta,y,desc_a,work,info)
use psb_error_mod
use psb_descriptor_type
implicit none
include 'mpif.h'
integer, intent(in) :: flag
integer, intent(out) :: info
real(kind(1.d0)) :: y(:), beta
real(kind(1.d0)), target :: work(:)
type(psb_desc_type) :: desc_a
! locals
integer :: icontxt, nprow, npcol, myrow,&
& mycol, point_to_proc, nesd, nerv,&
& proc_to_comm, p2ptag, icomm, p2pstat,&
& idxs, idxr, iret, errlen, ifcomm, rank,&
& err_act, totxch, ixrec, i, lw, idx_pt,&
& snd_pt, rcv_pt, n
integer, pointer, dimension(:) :: bsdidx, brvidx,&
& sdsz, rvsz, prcid, ptp, rvhd, h_idx
integer :: int_err(5)
integer :: blacs_pnum, krecvid, ksendid
logical :: swap_mpi, swap_sync, swap_send, swap_recv
real(kind(1.d0)), pointer, dimension(:) :: sndbuf, rcvbuf
character(len=20) :: name, ch_err
info = 0
name='psi_dswaptranv'
call psb_erractionsave(err_act)
icontxt=desc_a%matrix_data(psb_ctxt_)
call blacs_gridinfo(icontxt,nprow,npcol,myrow,mycol)
if (nprow == -1) then
info = 2010
call psb_errpush(info,name)
goto 9999
else if (npcol /= 1) then
info = 2030
int_err(1) = npcol
call psb_errpush(info,name)
goto 9999
endif
call blacs_get(icontxt,10,icomm)
allocate(sdsz(0:nprow-1), rvsz(0:nprow-1), bsdidx(0:nprow-1),&
& brvidx(0:nprow-1), rvhd(0:nprow-1), prcid(0:nprow-1),&
& ptp(0:nprow-1), stat=info)
if(info.ne.0) then
call psb_errpush(4000,name)
goto 9999
end if
swap_mpi = iand(flag,psb_swap_mpi_).ne.0
swap_sync = iand(flag,psb_swap_sync_).ne.0
swap_send = iand(flag,psb_swap_send_).ne.0
swap_recv = iand(flag,psb_swap_recv_).ne.0
h_idx => desc_a%halo_index
idxs = 0
idxr = 0
totxch = 0
point_to_proc = 1
rvhd(:) = mpi_request_null
n=1
! prepare info for communications
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm.ne.-1)
if(proc_to_comm .ne. myrow) totxch = totxch+1
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
prcid(proc_to_comm) = blacs_pnum(icontxt,proc_to_comm,mycol)
ptp(proc_to_comm) = point_to_proc
brvidx(proc_to_comm) = idxr
rvsz(proc_to_comm) = nerv
idxr = idxr+rvsz(proc_to_comm)
bsdidx(proc_to_comm) = idxs
sdsz(proc_to_comm) = nesd
idxs = idxs+sdsz(proc_to_comm)
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
if((idxr+idxs).lt.size(work)) then
sndbuf => work(1:idxs)
rcvbuf => work(idxs+1:idxs+idxr)
else
allocate(sndbuf(idxs),rcvbuf(idxr), stat=info)
if(info.ne.0) then
call psb_errpush(4000,name)
goto 9999
end if
end if
! Case SWAP_MPI
if(swap_mpi) then
! gather elements into sendbuffer for swapping
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
idx_pt = point_to_proc+psb_elem_recv_
rcv_pt = brvidx(proc_to_comm)
call psi_gth(nerv,h_idx(idx_pt:idx_pt+nerv-1),&
& y,rcvbuf(rcv_pt:rcv_pt+nerv-1))
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
! swap elements using mpi_alltoallv
call mpi_alltoallv(rcvbuf,rvsz,brvidx,&
& mpi_double_precision,sndbuf,sdsz,&
& bsdidx,mpi_double_precision,icomm,iret)
if(iret.ne.mpi_success) then
int_err(1) = iret
info=400
call psb_errpush(info,name,i_err=int_err)
goto 9999
end if
! scatter elements from receivebuffer after swapping
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
idx_pt = point_to_proc+nerv+psb_elem_send_
snd_pt = bsdidx(proc_to_comm)
call psi_sct(nesd,h_idx(idx_pt:idx_pt+nesd-1),&
& sndbuf(snd_pt:snd_pt+nesd-1),beta,y)
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
else if (swap_sync) then
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
if (proc_to_comm .lt. myrow) then
! First I send
idx_pt = point_to_proc+psb_elem_recv_
rcv_pt = brvidx(proc_to_comm)
call psi_gth(nerv,h_idx(idx_pt:idx_pt+nerv-1),&
& y,rcvbuf(rcv_pt:rcv_pt+nerv-1))
call dgesd2d(icontxt,nerv,1,rcvbuf(rcv_pt),nerv,proc_to_comm,0)
! Then I receive
snd_pt = brvidx(proc_to_comm)
call dgerv2d(icontxt,nesd,1,sndbuf(snd_pt),nesd,proc_to_comm,0)
else if (proc_to_comm .gt. myrow) then
! First I receive
snd_pt = bsdidx(proc_to_comm)
call dgerv2d(icontxt,nesd,1,sndbuf(snd_pt),nesd,proc_to_comm,0)
! Then I send
idx_pt = point_to_proc+psb_elem_recv_
rcv_pt = brvidx(proc_to_comm)
call psi_gth(nerv,h_idx(idx_pt:idx_pt+nerv-1),&
& y,rcvbuf(rcv_pt:rcv_pt+nerv-1))
call dgesd2d(icontxt,nerv,1,rcvbuf(rcv_pt),nerv,proc_to_comm,0)
else if (proc_to_comm .eq. myrow) then
! I send to myself
idx_pt = point_to_proc+psb_elem_recv_
rcv_pt = bsdidx(proc_to_comm)
call psi_gth(nerv,h_idx(idx_pt:idx_pt+nerv-1),&
& y,rcvbuf(rcv_pt:rcv_pt+nerv-1))
end if
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
if(proc_to_comm.ne.myrow) then
idx_pt = point_to_proc+nerv+psb_elem_send_
snd_pt = bsdidx(proc_to_comm)
call psi_sct(nesd,h_idx(idx_pt:idx_pt+nesd-1),&
& sndbuf(snd_pt:snd_pt+nesd-1),beta,y)
else
idx_pt = point_to_proc+nerv+psb_elem_send_
rcv_pt = brvidx(proc_to_comm)
call psi_sct(nesd,h_idx(idx_pt:idx_pt+nesd-1),&
& rcvbuf(rcv_pt:rcv_pt+nerv-1),beta,y)
end if
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
else if (swap_send .and. swap_recv) then
! First I post all the non blocking receives
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
if(proc_to_comm.ne.myrow) then
p2ptag = krecvid(icontxt,proc_to_comm,myrow)
snd_pt = brvidx(proc_to_comm)
call mpi_irecv(sndbuf(snd_pt),sdsz(proc_to_comm),&
& mpi_double_precision,prcid(proc_to_comm),&
& p2ptag, icomm,rvhd(proc_to_comm),iret)
if(iret.ne.mpi_success) then
int_err(1) = iret
info=400
call psb_errpush(info,name,i_err=int_err)
goto 9999
end if
end if
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
! Then I post all the blocking sends
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
idx_pt = point_to_proc+psb_elem_recv_
rcv_pt = brvidx(proc_to_comm)
call psi_gth(nerv,h_idx(idx_pt:idx_pt+nerv-1),&
& y,rcvbuf(rcv_pt:rcv_pt+nerv-1))
if(proc_to_comm .ne. myrow) then
p2ptag=ksendid(icontxt,proc_to_comm,myrow)
call mpi_send(rcvbuf(rcv_pt),rvsz(proc_to_comm),&
& mpi_double_precision,prcid(proc_to_comm),&
& p2ptag,icomm,iret)
if(iret.ne.mpi_success) then
int_err(1) = iret
info=400
call psb_errpush(info,name,i_err=int_err)
goto 9999
end if
end if
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
do i=1, totxch
call mpi_waitany(nprow,rvhd,ixrec,p2pstat,iret)
if(iret.ne.mpi_success) then
int_err(1) = iret
info=400
call psb_errpush(info,name,i_err=int_err)
goto 9999
end if
if (ixrec .ne. mpi_undefined) then
ixrec=ixrec-1 ! mpi_waitany returns an 1 to nprow index
point_to_proc = ptp(ixrec)
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
idx_pt = point_to_proc+nerv+psb_elem_send_
rcv_pt = bsdidx(proc_to_comm)
call psi_sct(nesd,h_idx(idx_pt:idx_pt+nesd-1),&
& rcvbuf(rcv_pt:rcv_pt+nerv-1),beta,y)
else
int_err(1) = ixrec
info=400
call psb_errpush(info,name,i_err=int_err)
goto 9999
end if
end do
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
if(proc_to_comm .eq. myrow) then
idx_pt = point_to_proc+nerv+psb_elem_send_
rcv_pt = brvidx(proc_to_comm)
call psi_sct(nesd,h_idx(idx_pt:idx_pt+nesd-1),&
& rcvbuf(rcv_pt:rcv_pt+nerv-1),beta,y)
end if
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
else if (swap_send) then
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
idx_pt = point_to_proc+psb_elem_recv_
rcv_pt = brvidx(proc_to_comm)
call psi_gth(nerv,h_idx(idx_pt:idx_pt+nerv-1),&
& y,rcvbuf(rcv_pt:rcv_pt+nerv-1))
call dgesd2d(icontxt,nerv,1,rcvbuf(rcv_pt),nerv,proc_to_comm,0)
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
else if (swap_recv) then
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
if(proc_to_comm.ne.myrow) then
snd_pt = bsdidx(proc_to_comm)
call dgerv2d(icontxt,nesd,1,sndbuf(snd_pt),nesd,proc_to_comm,0)
idx_pt = point_to_proc+psb_elem_recv_
rcv_pt = brvidx(proc_to_comm)
call psi_sct(nesd,h_idx(idx_pt:idx_pt+nesd-1),&
& sndbuf(snd_pt:snd_pt+nesd-1),beta,y)
else
idx_pt = point_to_proc+nerv+psb_elem_send_
rcv_pt = brvidx(proc_to_comm)
call psi_sct(nesd,h_idx(idx_pt:idx_pt+nesd-1),&
& rcvbuf(rcv_pt:rcv_pt+nerv-1),beta,y)
end if
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
end if
call psb_erractionrestore(err_act)
return
9999 continue
call psb_erractionrestore(err_act)
if (err_act.eq.act_abort) then
call psb_error(icontxt)
return
end if
return
end subroutine psi_dswaptranv

@ -0,0 +1,43 @@
INTEGER FUNCTION PSI_EXIST_OVR_ELEM(OVR_ELEM,
+ DIM_LIST,ELEM_SEARCHED)
C PURPOSE:
C =======
C
C If ELEM_SEARCHED exist in the list OVR_ELEM returns its position in
C the list, else returns -1
C
C
C INPUT
C ======
C OVRLAP_ELEMENT_D.: Contains for all overlap points belonging to
C the current process:
C 1. overlap point index
C 2. Number of domains sharing that overlap point
C the end is marked by a -1...............................
C
C DIM_LIST..........: Dimension of list OVRLAP_ELEMENT_D
C
C ELEM_SEARCHED.....:point's Local index identifier to be searched.
IMPLICIT NONE
C ...Array Parameters....
INTEGER OVR_ELEM(*)
C ....Scalars parameters....
INTEGER DIM_LIST,ELEM_SEARCHED
C ...Local Scalars....
INTEGER I
I=1
DO WHILE ((I.LE.DIM_LIST).AND.(OVR_ELEM(I).NE.ELEM_SEARCHED))
I=I+2
ENDDO
IF ((I.LE.DIM_LIST).AND.(OVR_ELEM(I).EQ.ELEM_SEARCHED)) THEN
PSI_EXIST_OVR_ELEM=I
ELSE
PSI_EXIST_OVR_ELEM=-1
ENDIF
END

@ -0,0 +1,266 @@
subroutine psi_extract_dep_list(desc_data,
+ desc_str,dep_list,
+ length_dl,np,dl_lda,mode,info)
c internal routine
c ================
c
c _____called by psi_crea_halo and psi_crea_ovrlap ______
c
c purpose
c =======
c process root (pid=0) extracts for each process "k" the ordered list of process
c to which "k" must communicate. this list with its order is extracted from
c desc_str list
c
c
c input
c =======
c desc_data :integer array
c explanation:
c name explanation
c ------------------ -------------------------------------------------------
c desc_data array of integer that contains some local and global
c information of matrix.
c
c
c now we explain each of the above vectors.
c
c let a be a generic sparse matrix. we denote with matdata_a the matrix_data
c array for matrix a.
c data stored in matrix_data array are:
c
c notation stored in explanation
c --------------- ---------------------- -------------------------------------
c dec_type matdata_a[psb_dec_type_] decomposition type
c m matdata_a[m_] total number of equations
c n matdata_a[n_] total number of variables
c n_row matdata_a[psb_n_row_] number of local equations
c n_col matdata_a[psb_n_col_] number of local variables
c psb_ctxt_a matdata_a[ctxt_] the blacs context handle, indicating
c the global context of the operation
c on the matrix.
c the context itself is global.
c desc_str integer array
c explanation:
c let desc_str_p be the array desc_str for local process.
c this is composed of variable dimension blocks for each process to
c communicate to.
c each block contain indexes of local halo elements to exchange with other
c process.
c let p be the pointer to the first element of a block in desc_str_p.
c this block is stored in desc_str_p as :
c
c notation stored in explanation
c --------------- --------------------------- -----------------------------------
c process_id desc_str_p[p+psb_proc_id_] identifier of process which exchange
c data with.
c n_elements_recv desc_str_p[p+n_elem_recv_] number of elements to receive.
c elements_recv desc_str_p[p+elem_recv_+i] indexes of local elements to
c receive. these are stored in the
c array from location p+elem_recv_ to
c location p+elem_recv_+
c desc_str_p[p+n_elem_recv_]-1.
c if desc_data(psb_dec_type_) == 0
c then also will be:
c n_elements_send desc_str_p[p+n_elem_send_] number of elements to send.
c elements_send desc_str_p[p+elem_send_+i] indexes of local elements to
c send. these are stored in the
c array from location p+elem_send_ to
c location p+elem_send_+
c desc_str_p[p+n_elem_send_]-1.
c list is ended by -1 value
c
c np integer (global input)
c number of grid process.
c
c mode integer (global input)
c if mode =0 then will be inserted also duplicate element in
c a same dependence list
c if mode =1 then not will be inserted duplicate element in
c a same dependence list
c output
c =====
c only for root (pid=0) process:
c dep_list integer array(dl_lda,0:np)
c dependence list dep_list(*,i) is the list of process identifiers to which process i
c must communicate with. this list with its order is extracted from
c desc_str list.
c length_dl integer array(0:np)
c length_dl(i) is the length of dep_list(*,i) list
implicit none
include 'psb_const.fh'
include 'mpif.h'
c ....scalar parameters...
integer np,dl_lda,mode, info
c ....array parameters....
integer desc_str(*),desc_data(*),
+ dep_list(dl_lda,0:np),length_dl(0:np)
integer, pointer :: itmp(:)
c .....local arrays....
integer int_err(5)
double precision real_err(5)
c .....local scalars...
integer i,nprow,npcol,me,mycol,pointer_dep_list,proc,j,err_act
integer icontxt, err, icomm
logical debug
parameter (debug=.false.)
character name*20
name='psi_extrct_dl'
call fcpsb_get_erraction(err_act)
info = 0
icontxt = desc_data(psb_ctxt_)
call blacs_gridinfo(icontxt,nprow,npcol,me,mycol)
do i=0,np
length_dl(i) = 0
enddo
i=1
if (debug) write(0,*) 'extract: info ',info,
+ desc_data(psb_dec_type_)
pointer_dep_list=1
if (desc_data(psb_dec_type_).eq.psb_desc_bld_) then
do while (desc_str(i).ne.-1)
if (debug) write(0,*) me,' extract: looping ',i,
+ desc_str(i),desc_str(i+1),desc_str(i+2)
c ...with different decomposition type we have different
c structure of indices lists............................
if ((desc_str(i+1).ne.0).or.(desc_str(i+2).ne.0)) then
c ..if number of element to be exchanged !=0
proc=desc_str(i)
if ((proc.lt.0).or.(proc.ge.nprow)) then
if (debug) write(0,*) 'extract error ',i,desc_str(i)
info = 3999
goto 998
endif
if (mode.eq.1) then
c ...search if already exist proc
c in dep_list(*,me)...
j=1
do while ((j.lt.pointer_dep_list).and.
+ (dep_list(j,me).ne.proc))
j=j+1
enddo
if (j.eq.pointer_dep_list) then
c ...if not found.....
dep_list(pointer_dep_list,me)=proc
pointer_dep_list=pointer_dep_list+1
endif
else if (mode.eq.0) then
if (pointer_dep_list.gt.dl_lda) then
info = 4000
goto 998
endif
dep_list(pointer_dep_list,me)=proc
pointer_dep_list=pointer_dep_list+1
endif
endif
i=i+desc_str(i+1)+2
enddo
else if (desc_data(psb_dec_type_).eq.psb_desc_upd_) then
do while (desc_str(i).ne.-1)
if (debug) write(0,*) 'extract: looping ',i,desc_str(i)
c ...with different decomposition type we have different
c structure of indices lists............................
if (desc_str(i+1).ne.0) then
proc=desc_str(i)
c ..if number of element to be exchanged !=0
if (mode.eq.1) then
c ...search if already exist proc....
j=1
do while ((j.lt.pointer_dep_list).and.
+ (dep_list(j,me).ne.proc))
j=j+1
enddo
if (j.eq.pointer_dep_list) then
c ...if not found.....
if (pointer_dep_list.gt.dl_lda) then
info = 4000
goto 998
endif
dep_list(pointer_dep_list,me)=proc
pointer_dep_list=pointer_dep_list+1
endif
else if (mode.eq.0) then
if (pointer_dep_list.gt.dl_lda) then
info = 4000
goto 998
endif
dep_list(pointer_dep_list,me)=proc
pointer_dep_list=pointer_dep_list+1
endif
endif
i=i+desc_str(i+1)+2
enddo
else
write(0,*) 'invalid dec_type',desc_data(psb_dec_type_)
info = 2020
endif
length_dl(me)=pointer_dep_list-1
c ... check for errors...
998 continue
if (debug) write(0,*) 'extract: info ',info
err = info
c$$$ call igamx2d(icontxt, all, topdef, ione, ione, err, ione,
c$$$ + i, i, -ione ,-ione,-ione)
if (err.ne.0) goto 9999
if (.true.) then
call igsum2d(icontxt,'all',' ',np+1,1,length_dl,np+1,-1,-1)
call blacs_get(icontxt,10,icomm )
allocate(itmp(dl_lda))
itmp(1:dl_lda) = dep_list(1:dl_lda,me)
call mpi_allgather(itmp,dl_lda,mpi_integer,
+ dep_list,dl_lda,mpi_integer,icomm,info)
deallocate(itmp)
else
if (me.eq.root) then
do proc=0,np-1
if (proc.ne.root) then
if (debug) write(0,*) 'receiving from: ',proc
c ...receive from proc length of its dependence list....
call igerv2d(icontxt,1,1,length_dl(proc),1,
+ proc,mycol)
c ...receive from proc its dependence list....
call igerv2d(icontxt,length_dl(proc),1,
+ dep_list(1,proc),length_dl(proc),proc,mycol)
endif
enddo
else if (me.ne.root) then
c ...send to root dependence list length.....
if (debug) write(0,*) 'sending to: ',me,root
call igesd2d(icontxt,1,1,length_dl(me),1,root,mycol)
if (debug) write(0,*) 'sending to: ',me,root
c ...send to root dependence list....
call igesd2d(icontxt,length_dl(me),1,dep_list(1,me),
+ length_dl(me),root,mycol)
endif
end if
return
9999 continue
call fcpsb_errpush(info,name,int_err)
if(err_act.eq.act_abort) then
call fcpsb_perror(icontxt)
endif
return
end

@ -0,0 +1,202 @@
subroutine psi_dgthm(n,k,idx,x,y)
implicit none
integer :: n, k, idx(:)
real(kind(1.d0)) :: x(:,:), y(:)
! Locals
integer :: i, j, pt
pt=0
do j=1,k
do i=1,n
pt=pt+1
y(pt)=x(idx(i),j)
end do
end do
end subroutine psi_dgthm
subroutine psi_dgthv(n,idx,x,y)
implicit none
integer :: n, idx(:)
real(kind(1.d0)) :: x(:), y(:)
! Locals
integer :: i, j
do i=1,n
y(i)=x(idx(i))
end do
end subroutine psi_dgthv
subroutine psi_dsctm(n,k,idx,x,beta,y)
implicit none
integer :: n, k, idx(:)
real(kind(1.d0)) :: beta, x(:), y(:,:)
! Locals
integer :: i, j, pt
if (beta.eq.0.d0) then
pt=0
do j=1,k
do i=1,n
pt=pt+1
y(idx(i),j) = x(pt)
end do
end do
else if (beta.eq.1.d0) then
pt=0
do j=1,k
do i=1,n
pt=pt+1
y(idx(i),j) = y(idx(i),j)+x(pt)
end do
end do
else
pt=0
do j=1,k
do i=1,n
pt=pt+1
y(idx(i),j) = beta*y(idx(i),j)+x(pt)
end do
end do
end if
end subroutine psi_dsctm
subroutine psi_dsctv(n,idx,x,beta,y)
implicit none
integer :: n, k, idx(:)
real(kind(1.d0)) :: beta, x(:), y(:)
! Locals
integer :: i, j, pt
if (beta.eq.0.d0) then
do i=1,n
y(idx(i)) = x(i)
end do
else if (beta.eq.1.d0) then
do i=1,n
y(idx(i)) = y(idx(i))+x(i)
end do
else
do i=1,n
y(idx(i)) = beta*y(idx(i))+x(i)
end do
end if
end subroutine psi_dsctv
subroutine psi_igthm(n,k,idx,x,y)
implicit none
integer :: n, k, idx(:)
integer :: x(:,:), y(:)
! Locals
integer :: i, j, pt
pt=0
do j=1,k
do i=1,n
pt=pt+1
y(pt)=x(idx(i),j)
end do
end do
end subroutine psi_igthm
subroutine psi_igthv(n,idx,x,y)
implicit none
integer :: n, idx(:)
integer :: x(:), y(:)
! Locals
integer :: i, j
do i=1,n
y(i)=x(idx(i))
end do
end subroutine psi_igthv
subroutine psi_isctm(n,k,idx,x,beta,y)
implicit none
integer :: n, k, idx(:)
integer :: beta, x(:), y(:,:)
! Locals
integer :: i, j, pt
if (beta.eq.0.d0) then
pt=0
do j=1,k
do i=1,n
pt=pt+1
y(idx(i),j) = x(pt)
end do
end do
else if (beta.eq.1.d0) then
pt=0
do j=1,k
do i=1,n
pt=pt+1
y(idx(i),j) = y(idx(i),j)+x(pt)
end do
end do
else
pt=0
do j=1,k
do i=1,n
pt=pt+1
y(idx(i),j) = beta*y(idx(i),j)+x(pt)
end do
end do
end if
end subroutine psi_isctm
subroutine psi_isctv(n,idx,x,beta,y)
implicit none
integer :: n, k, idx(:)
integer :: beta, x(:), y(:)
! Locals
integer :: i, j, pt
if (beta.eq.0.d0) then
do i=1,n
y(idx(i)) = x(i)
end do
else if (beta.eq.1.d0) then
do i=1,n
y(idx(i)) = y(idx(i))+x(i)
end do
else
do i=1,n
y(idx(i)) = beta*y(idx(i))+x(i)
end do
end if
end subroutine psi_isctv

@ -0,0 +1,739 @@
subroutine psi_iswapdatam(flag,n,beta,y,desc_a,work,info)
use psb_error_mod
use psb_descriptor_type
implicit none
include 'mpif.h'
integer, intent(in) :: flag, n
integer, intent(out) :: info
integer :: y(:,:), beta
integer, target ::work(:)
type(psb_desc_type) :: desc_a
! locals
integer :: icontxt, nprow, npcol, myrow,&
& mycol, point_to_proc, nesd, nerv,&
& proc_to_comm, p2ptag, icomm, p2pstat,&
& idxs, idxr, iret, errlen, ifcomm, rank,&
& err_act, totxch, ixrec, i, lw, idx_pt,&
& snd_pt, rcv_pt
integer :: blacs_pnum, krecvid, ksendid
integer, pointer, dimension(:) :: bsdidx, brvidx,&
& sdsz, rvsz, prcid, ptp, rvhd, h_idx
integer :: int_err(5)
logical :: swap_mpi, swap_sync, swap_send, swap_recv
integer, pointer, dimension(:) :: sndbuf, rcvbuf
character(len=20) :: name, ch_err
info = 0
name='psi_iswapdata'
call psb_erractionsave(err_act)
icontxt=desc_a%matrix_data(psb_ctxt_)
call blacs_gridinfo(icontxt,nprow,npcol,myrow,mycol)
if (nprow == -1) then
info = 2010
call psb_errpush(info,name)
goto 9999
else if (npcol /= 1) then
info = 2030
int_err(1) = npcol
call psb_errpush(info,name)
goto 9999
endif
call blacs_get(icontxt,10,icomm)
allocate(sdsz(0:nprow-1), rvsz(0:nprow-1), bsdidx(0:nprow-1),&
& brvidx(0:nprow-1), rvhd(0:nprow-1), prcid(0:nprow-1),&
& ptp(0:nprow-1), stat=info)
if(info.ne.0) then
call psb_errpush(4000,name)
goto 9999
end if
swap_mpi = iand(flag,psb_swap_mpi_).ne.0
swap_sync = iand(flag,psb_swap_sync_).ne.0
swap_send = iand(flag,psb_swap_send_).ne.0
swap_recv = iand(flag,psb_swap_recv_).ne.0
h_idx => desc_a%halo_index
idxs = 0
idxr = 0
totxch = 0
point_to_proc = 1
rvhd(:) = mpi_request_null
! prepare info for communications
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm.ne.-1)
if(proc_to_comm .ne. myrow) totxch = totxch+1
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
prcid(proc_to_comm) = blacs_pnum(icontxt,proc_to_comm,mycol)
ptp(proc_to_comm) = point_to_proc
brvidx(proc_to_comm) = idxr
rvsz(proc_to_comm) = n*nerv
idxr = idxr+rvsz(proc_to_comm)
bsdidx(proc_to_comm) = idxs
sdsz(proc_to_comm) = n*nesd
idxs = idxs+sdsz(proc_to_comm)
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
if((idxr+idxs).lt.size(work)) then
sndbuf => work(1:idxs)
rcvbuf => work(idxs+1:idxs+idxr)
else
allocate(sndbuf(idxs),rcvbuf(idxr), stat=info)
if(info.ne.0) then
call psb_errpush(4000,name)
goto 9999
end if
end if
! Case SWAP_MPI
if(swap_mpi) then
! gather elements into sendbuffer for swapping
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
idx_pt = point_to_proc+nerv+psb_elem_send_
snd_pt = bsdidx(proc_to_comm)
call psi_gth(nesd,n,h_idx(idx_pt:idx_pt+nesd-1),&
& y,sndbuf(snd_pt:snd_pt+nesd*n-1))
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
! swap elements using mpi_alltoallv
call mpi_alltoallv(sndbuf,sdsz,bsdidx,&
& mpi_integer,rcvbuf,rvsz,&
& brvidx,mpi_integer,icomm,iret)
if(iret.ne.mpi_success) then
int_err(1) = iret
info=400
call psb_errpush(info,name,i_err=int_err)
goto 9999
end if
! scatter elements from receivebuffer after swapping
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
idx_pt = point_to_proc+psb_elem_recv_
rcv_pt = brvidx(proc_to_comm)
call psi_sct(nerv,n,h_idx(idx_pt:idx_pt+nerv-1),&
& rcvbuf(rcv_pt:rcv_pt+n*nerv-1),beta,y)
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
else if (swap_sync) then
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
if (proc_to_comm .lt. myrow) then
! First I send
idx_pt = point_to_proc+nerv+psb_elem_send_
snd_pt = bsdidx(proc_to_comm)
call psi_gth(nesd,n,h_idx(idx_pt:idx_pt+nesd-1),&
& y,sndbuf(snd_pt:snd_pt+nesd*n-1))
call igesd2d(icontxt,nesd,n,sndbuf(snd_pt),nesd,proc_to_comm,0)
! Then I receive
rcv_pt = brvidx(proc_to_comm)
call igerv2d(icontxt,nerv,n,rcvbuf(rcv_pt),nerv,proc_to_comm,0)
else if (proc_to_comm .gt. myrow) then
! First I receive
rcv_pt = brvidx(proc_to_comm)
call igerv2d(icontxt,nerv,n,rcvbuf(rcv_pt),nerv,proc_to_comm,0)
! Then I send
idx_pt = point_to_proc+nerv+psb_elem_send_
snd_pt = bsdidx(proc_to_comm)
call psi_gth(nesd,n,h_idx(idx_pt:idx_pt+nesd-1),&
& y,sndbuf(snd_pt:snd_pt+nesd*n-1))
call igesd2d(icontxt,nesd,n,sndbuf(snd_pt),nesd,proc_to_comm,0)
else if (proc_to_comm .eq. myrow) then
! I send to myself
idx_pt = point_to_proc+nerv+psb_elem_send_
snd_pt = bsdidx(proc_to_comm)
call psi_gth(nesd,n,h_idx(idx_pt:idx_pt+nesd-1),&
& y,sndbuf(snd_pt:snd_pt+nesd*n-1))
end if
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
if(proc_to_comm.ne.myrow) then
idx_pt = point_to_proc+psb_elem_recv_
rcv_pt = brvidx(proc_to_comm)
call psi_sct(nerv,n,h_idx(idx_pt:idx_pt+nerv-1),&
& rcvbuf(rcv_pt:rcv_pt+n*nerv-1),beta,y)
else
idx_pt = point_to_proc+psb_elem_recv_
snd_pt = bsdidx(proc_to_comm)
call psi_sct(nerv,n,h_idx(idx_pt:idx_pt+nerv-1),&
& sndbuf(snd_pt:snd_pt+n*nesd-1),beta,y)
end if
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
else if (swap_send .and. swap_recv) then
! First I post all the non blocking receives
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
if(proc_to_comm.ne.myrow) then
p2ptag = krecvid(icontxt,proc_to_comm,myrow)
rcv_pt = brvidx(proc_to_comm)
call mpi_irecv(rcvbuf(rcv_pt),rvsz(proc_to_comm),&
& mpi_integer,prcid(proc_to_comm),&
& p2ptag, icomm,rvhd(proc_to_comm),iret)
if(iret.ne.mpi_success) then
int_err(1) = iret
info=400
call psb_errpush(info,name,i_err=int_err)
goto 9999
end if
end if
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
! Then I post all the blocking sends
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
idx_pt = point_to_proc+nerv+psb_elem_send_
snd_pt = bsdidx(proc_to_comm)
call psi_gth(nesd,n,h_idx(idx_pt:idx_pt+nesd-1),&
& y,sndbuf(snd_pt:snd_pt+nesd*n-1))
if(proc_to_comm .ne. myrow) then
p2ptag=ksendid(icontxt,proc_to_comm,myrow)
call mpi_send(sndbuf(snd_pt),sdsz(proc_to_comm),&
& mpi_integer,prcid(proc_to_comm),&
& p2ptag,icomm,iret)
if(iret.ne.mpi_success) then
int_err(1) = iret
info=400
call psb_errpush(info,name,i_err=int_err)
goto 9999
end if
end if
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
do i=1, totxch
call mpi_waitany(nprow,rvhd,ixrec,p2pstat,iret)
if(iret.ne.mpi_success) then
int_err(1) = iret
info=400
call psb_errpush(info,name,i_err=int_err)
goto 9999
end if
if (ixrec .ne. mpi_undefined) then
ixrec=ixrec-1 ! mpi_waitany returns an 1 to nprow index
point_to_proc = ptp(ixrec)
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
idx_pt = point_to_proc+psb_elem_recv_
rcv_pt = brvidx(proc_to_comm)
call psi_sct(nerv,n,h_idx(idx_pt:idx_pt+nerv-1),&
& rcvbuf(rcv_pt:rcv_pt+n*nerv-1),beta,y)
else
int_err(1) = ixrec
info=400
call psb_errpush(info,name,i_err=int_err)
goto 9999
end if
end do
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
if(proc_to_comm .eq. myrow) then
idx_pt = point_to_proc+psb_elem_recv_
snd_pt = bsdidx(proc_to_comm)
call psi_sct(nerv,n,h_idx(idx_pt:idx_pt+nerv-1),&
& sndbuf(snd_pt:snd_pt+n*nesd-1),beta,y)
end if
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
else if (swap_send) then
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
idx_pt = point_to_proc+nerv+psb_elem_send_
snd_pt = bsdidx(proc_to_comm)
call psi_gth(nesd,n,h_idx(idx_pt:idx_pt+nesd-1),&
& y,sndbuf(snd_pt:snd_pt+nesd*n-1))
call igesd2d(icontxt,nesd,n,sndbuf(snd_pt),nesd,proc_to_comm,0)
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
else if (swap_recv) then
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
if(proc_to_comm.ne.myrow) then
rcv_pt = brvidx(proc_to_comm)
call igerv2d(icontxt,nerv,n,rcvbuf(rcv_pt),nerv,proc_to_comm,0)
idx_pt = point_to_proc+psb_elem_recv_
rcv_pt = brvidx(proc_to_comm)
call psi_sct(nerv,n,h_idx(idx_pt:idx_pt+nerv-1),&
& rcvbuf(rcv_pt:rcv_pt+n*nerv-1),beta,y)
else
idx_pt = point_to_proc+psb_elem_recv_
snd_pt = bsdidx(proc_to_comm)
call psi_sct(nerv,n,h_idx(idx_pt:idx_pt+nerv-1),&
& sndbuf(snd_pt:snd_pt+n*nesd-1),beta,y)
end if
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
end if
call psb_erractionrestore(err_act)
return
9999 continue
call psb_erractionrestore(err_act)
if (err_act.eq.act_abort) then
call psb_error(icontxt)
return
end if
return
end subroutine psi_iswapdatam
subroutine psi_iswapdatav(flag,beta,y,desc_a,work,info)
use psb_error_mod
use psb_descriptor_type
implicit none
include 'mpif.h'
integer, intent(in) :: flag
integer, intent(out) :: info
integer :: y(:), beta
integer, target :: work(:)
type(psb_desc_type) :: desc_a
! locals
integer :: icontxt, nprow, npcol, myrow,&
& mycol, point_to_proc, nesd, nerv,&
& proc_to_comm, p2ptag, icomm, p2pstat,&
& idxs, idxr, iret, errlen, ifcomm, rank,&
& err_act, totxch, ixrec, i, lw, idx_pt,&
& snd_pt, rcv_pt, n
integer, pointer, dimension(:) :: bsdidx, brvidx,&
& sdsz, rvsz, prcid, ptp, rvhd, h_idx
integer :: blacs_pnum, krecvid, ksendid
integer :: int_err(5)
logical :: swap_mpi, swap_sync, swap_send, swap_recv
integer, pointer, dimension(:) :: sndbuf, rcvbuf
character(len=20) :: name, ch_err
info = 0
name='psi_iswapdatav'
call psb_erractionsave(err_act)
icontxt=desc_a%matrix_data(psb_ctxt_)
call blacs_gridinfo(icontxt,nprow,npcol,myrow,mycol)
if (nprow == -1) then
info = 2010
call psb_errpush(info,name)
goto 9999
else if (npcol /= 1) then
info = 2030
int_err(1) = npcol
call psb_errpush(info,name)
goto 9999
endif
call blacs_get(icontxt,10,icomm)
allocate(sdsz(0:nprow-1), rvsz(0:nprow-1), bsdidx(0:nprow-1),&
& brvidx(0:nprow-1), rvhd(0:nprow-1), prcid(0:nprow-1),&
& ptp(0:nprow-1), stat=info)
if(info.ne.0) then
call psb_errpush(4000,name)
goto 9999
end if
swap_mpi = iand(flag,psb_swap_mpi_).ne.0
swap_sync = iand(flag,psb_swap_sync_).ne.0
swap_send = iand(flag,psb_swap_send_).ne.0
swap_recv = iand(flag,psb_swap_recv_).ne.0
h_idx => desc_a%halo_index
idxs = 0
idxr = 0
totxch = 0
point_to_proc = 1
rvhd(:) = mpi_request_null
n=1
! prepare info for communications
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm.ne.-1)
if(proc_to_comm .ne. myrow) totxch = totxch+1
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
prcid(proc_to_comm) = blacs_pnum(icontxt,proc_to_comm,mycol)
ptp(proc_to_comm) = point_to_proc
brvidx(proc_to_comm) = idxr
rvsz(proc_to_comm) = n*nerv
idxr = idxr+rvsz(proc_to_comm)
bsdidx(proc_to_comm) = idxs
sdsz(proc_to_comm) = n*nesd
idxs = idxs+sdsz(proc_to_comm)
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
if((idxr+idxs).lt.size(work)) then
sndbuf => work(1:idxs)
rcvbuf => work(idxs+1:idxs+idxr)
else
allocate(sndbuf(idxs),rcvbuf(idxr), stat=info)
if(info.ne.0) then
call psb_errpush(4000,name)
goto 9999
end if
end if
! Case SWAP_MPI
if(swap_mpi) then
! gather elements into sendbuffer for swapping
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
idx_pt = point_to_proc+nerv+psb_elem_send_
snd_pt = bsdidx(proc_to_comm)
call psi_gth(nesd,h_idx(idx_pt:idx_pt+nesd-1),&
& y,sndbuf(snd_pt:snd_pt+nesd-1))
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
! swap elements using mpi_alltoallv
call mpi_alltoallv(sndbuf,sdsz,bsdidx,&
& mpi_integer,rcvbuf,rvsz,&
& brvidx,mpi_integer,icomm,iret)
if(iret.ne.mpi_success) then
int_err(1) = iret
info=400
call psb_errpush(info,name,i_err=int_err)
goto 9999
end if
! scatter elements from receivebuffer after swapping
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
idx_pt = point_to_proc+psb_elem_recv_
rcv_pt = brvidx(proc_to_comm)
call psi_sct(nerv,h_idx(idx_pt:idx_pt+nerv-1),&
& rcvbuf(rcv_pt:rcv_pt+nerv-1),beta,y)
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
else if (swap_sync) then
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
if (proc_to_comm .lt. myrow) then
! First I send
idx_pt = point_to_proc+nerv+psb_elem_send_
snd_pt = bsdidx(proc_to_comm)
call psi_gth(nesd,h_idx(idx_pt:idx_pt+nesd-1),&
& y,sndbuf(snd_pt:snd_pt+nesd-1))
call igesd2d(icontxt,nesd,1,sndbuf(snd_pt),nesd,proc_to_comm,0)
! Then I receive
rcv_pt = brvidx(proc_to_comm)
call igerv2d(icontxt,nerv,1,rcvbuf(rcv_pt),nerv,proc_to_comm,0)
else if (proc_to_comm .gt. myrow) then
! First I receive
rcv_pt = brvidx(proc_to_comm)
call igerv2d(icontxt,nerv,1,rcvbuf(rcv_pt),nerv,proc_to_comm,0)
! Then I send
idx_pt = point_to_proc+nerv+psb_elem_send_
snd_pt = bsdidx(proc_to_comm)
call psi_gth(nesd,h_idx(idx_pt:idx_pt+nesd-1),&
& y,sndbuf(snd_pt:snd_pt+nesd-1))
call igesd2d(icontxt,nesd,1,sndbuf(snd_pt),nesd,proc_to_comm,0)
else if (proc_to_comm .eq. myrow) then
! I send to myself
idx_pt = point_to_proc+nerv+psb_elem_send_
snd_pt = bsdidx(proc_to_comm)
call psi_gth(nesd,h_idx(idx_pt:idx_pt+nesd-1),&
& y,sndbuf(snd_pt:snd_pt+nesd-1))
end if
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
if(proc_to_comm.ne.myrow) then
idx_pt = point_to_proc+psb_elem_recv_
rcv_pt = brvidx(proc_to_comm)
call psi_sct(nerv,h_idx(idx_pt:idx_pt+nerv-1),&
& rcvbuf(rcv_pt:rcv_pt+nerv-1),beta,y)
else
idx_pt = point_to_proc+psb_elem_recv_
snd_pt = bsdidx(proc_to_comm)
call psi_sct(nerv,h_idx(idx_pt:idx_pt+nerv-1),&
& sndbuf(snd_pt:snd_pt+nesd-1),beta,y)
end if
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
else if (swap_send .and. swap_recv) then
! First I post all the non blocking receives
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
if(proc_to_comm.ne.myrow) then
p2ptag = krecvid(icontxt,proc_to_comm,myrow)
rcv_pt = brvidx(proc_to_comm)
call mpi_irecv(rcvbuf(rcv_pt),rvsz(proc_to_comm),&
& mpi_integer,prcid(proc_to_comm),&
& p2ptag, icomm,rvhd(proc_to_comm),iret)
if(iret.ne.mpi_success) then
int_err(1) = iret
info=400
call psb_errpush(info,name,i_err=int_err)
goto 9999
end if
end if
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
! Then I post all the blocking sends
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
idx_pt = point_to_proc+nerv+psb_elem_send_
snd_pt = bsdidx(proc_to_comm)
call psi_gth(nesd,h_idx(idx_pt:idx_pt+nesd-1),&
& y,sndbuf(snd_pt:snd_pt+nesd-1))
if(proc_to_comm .ne. myrow) then
p2ptag=ksendid(icontxt,proc_to_comm,myrow)
call mpi_send(sndbuf(snd_pt),sdsz(proc_to_comm),&
& mpi_integer,prcid(proc_to_comm),&
& p2ptag,icomm,iret)
if(iret.ne.mpi_success) then
int_err(1) = iret
info=400
call psb_errpush(info,name,i_err=int_err)
goto 9999
end if
end if
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
do i=1, totxch
call mpi_waitany(nprow,rvhd,ixrec,p2pstat,iret)
if(iret.ne.mpi_success) then
int_err(1) = iret
info=400
call psb_errpush(info,name,i_err=int_err)
goto 9999
end if
if (ixrec .ne. mpi_undefined) then
ixrec=ixrec-1 ! mpi_waitany returns an 1 to nprow index
point_to_proc = ptp(ixrec)
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
idx_pt = point_to_proc+psb_elem_recv_
snd_pt = bsdidx(proc_to_comm)
call psi_sct(nerv,h_idx(idx_pt:idx_pt+nerv-1),&
& sndbuf(snd_pt:snd_pt+nesd-1),beta,y)
else
int_err(1) = ixrec
info=400
call psb_errpush(info,name,i_err=int_err)
goto 9999
end if
end do
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
if(proc_to_comm .eq. myrow) then
idx_pt = point_to_proc+psb_elem_recv_
snd_pt = bsdidx(proc_to_comm)
call psi_sct(nerv,h_idx(idx_pt:idx_pt+nerv-1),&
& sndbuf(snd_pt:snd_pt+nesd-1),beta,y)
end if
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
else if (swap_send) then
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
idx_pt = point_to_proc+nerv+psb_elem_send_
snd_pt = bsdidx(proc_to_comm)
call psi_gth(nesd,h_idx(idx_pt:idx_pt+nesd-1),&
& y,sndbuf(snd_pt:snd_pt+nesd-1))
call igesd2d(icontxt,nesd,1,sndbuf(snd_pt),nesd,proc_to_comm,0)
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
else if (swap_recv) then
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
if(proc_to_comm.ne.myrow) then
rcv_pt = brvidx(proc_to_comm)
call igerv2d(icontxt,nerv,1,rcvbuf(rcv_pt),nerv,proc_to_comm,0)
idx_pt = point_to_proc+psb_elem_recv_
rcv_pt = brvidx(proc_to_comm)
call psi_sct(nerv,h_idx(idx_pt:idx_pt+nerv-1),&
& rcvbuf(rcv_pt:rcv_pt+nerv-1),beta,y)
else
idx_pt = point_to_proc+psb_elem_recv_
snd_pt = bsdidx(proc_to_comm)
call psi_sct(nerv,h_idx(idx_pt:idx_pt+nerv-1),&
& sndbuf(snd_pt:snd_pt+nesd-1),beta,y)
end if
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
end if
call psb_erractionrestore(err_act)
return
9999 continue
call psb_erractionrestore(err_act)
if (err_act.eq.act_abort) then
call psb_error(icontxt)
return
end if
return
end subroutine psi_iswapdatav

@ -0,0 +1,735 @@
subroutine psi_iswaptranm(flag,n,beta,y,desc_a,work,info)
use psb_error_mod
use psb_descriptor_type
implicit none
include 'mpif.h'
integer, intent(in) :: flag, n
integer, intent(out) :: info
integer :: y(:,:), beta
integer, target :: work(:)
type(psb_desc_type) :: desc_a
! locals
integer :: icontxt, nprow, npcol, myrow,&
& mycol, point_to_proc, nesd, nerv,&
& proc_to_comm, p2ptag, icomm, p2pstat,&
& idxs, idxr, iret, errlen, ifcomm, rank,&
& err_act, totxch, ixrec, i, lw, idx_pt,&
& snd_pt, rcv_pt
integer, pointer, dimension(:) :: bsdidx, brvidx,&
& sdsz, rvsz, prcid, ptp, rvhd, h_idx
integer :: int_err(5)
integer :: blacs_pnum, krecvid, ksendid
logical :: swap_mpi, swap_sync, swap_send, swap_recv
integer, pointer, dimension(:) :: sndbuf, rcvbuf
character(len=20) :: name, ch_err
info = 0
name='psi_dswaptranm'
call psb_erractionsave(err_act)
icontxt=desc_a%matrix_data(psb_ctxt_)
call blacs_gridinfo(icontxt,nprow,npcol,myrow,mycol)
if (nprow == -1) then
info = 2010
call psb_errpush(info,name)
goto 9999
else if (npcol /= 1) then
info = 2030
int_err(1) = npcol
call psb_errpush(info,name)
goto 9999
endif
call blacs_get(icontxt,10,icomm)
allocate(sdsz(0:nprow-1), rvsz(0:nprow-1), bsdidx(0:nprow-1),&
& brvidx(0:nprow-1), rvhd(0:nprow-1), prcid(0:nprow-1),&
& ptp(0:nprow-1), stat=info)
if(info.ne.0) then
call psb_errpush(4000,name)
goto 9999
end if
swap_mpi = iand(flag,psb_swap_mpi_).ne.0
swap_sync = iand(flag,psb_swap_sync_).ne.0
swap_send = iand(flag,psb_swap_send_).ne.0
swap_recv = iand(flag,psb_swap_recv_).ne.0
h_idx => desc_a%halo_index
idxs = 0
idxr = 0
totxch = 0
point_to_proc = 1
rvhd(:) = mpi_request_null
! prepare info for communications
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm.ne.-1)
if(proc_to_comm .ne. myrow) totxch = totxch+1
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
prcid(proc_to_comm) = blacs_pnum(icontxt,proc_to_comm,mycol)
ptp(proc_to_comm) = point_to_proc
brvidx(proc_to_comm) = idxr
rvsz(proc_to_comm) = n*nerv
idxr = idxr+rvsz(proc_to_comm)
bsdidx(proc_to_comm) = idxs
sdsz(proc_to_comm) = n*nesd
idxs = idxs+sdsz(proc_to_comm)
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
if((idxr+idxs).lt.size(work)) then
sndbuf => work(1:idxs)
rcvbuf => work(idxs+1:idxs+idxr)
else
allocate(sndbuf(idxs),rcvbuf(idxr), stat=info)
if(info.ne.0) then
call psb_errpush(4000,name)
goto 9999
end if
end if
! Case SWAP_MPI
if(swap_mpi) then
! gather elements into sendbuffer for swapping
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
idx_pt = point_to_proc+psb_elem_recv_
rcv_pt = brvidx(proc_to_comm)
call psi_gth(nerv,n,h_idx(idx_pt:idx_pt+nerv-1),&
& y,rcvbuf(rcv_pt:rcv_pt+nerv*n-1))
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
! swap elements using mpi_alltoallv
call mpi_alltoallv(rcvbuf,rvsz,brvidx,&
& mpi_integer,sndbuf,sdsz,&
& bsdidx,mpi_integer,icomm,iret)
if(iret.ne.mpi_success) then
int_err(1) = iret
info=400
call psb_errpush(info,name,i_err=int_err)
goto 9999
end if
! scatter elements from receivebuffer after swapping
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
idx_pt = point_to_proc+nerv+psb_elem_send_
snd_pt = bsdidx(proc_to_comm)
call psi_sct(nesd,n,h_idx(idx_pt:idx_pt+nesd-1),&
& sndbuf(snd_pt:snd_pt+n*nesd-1),beta,y)
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
else if (swap_sync) then
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
if (proc_to_comm .lt. myrow) then
! First I send
idx_pt = point_to_proc+psb_elem_recv_
rcv_pt = brvidx(proc_to_comm)
call psi_gth(nerv,n,h_idx(idx_pt:idx_pt+nerv-1),&
& y,rcvbuf(rcv_pt:rcv_pt+nerv*n-1))
call igesd2d(icontxt,nerv,n,rcvbuf(rcv_pt),nerv,proc_to_comm,0)
! Then I receive
snd_pt = brvidx(proc_to_comm)
call igerv2d(icontxt,nesd,n,sndbuf(snd_pt),nesd,proc_to_comm,0)
else if (proc_to_comm .gt. myrow) then
! First I receive
snd_pt = bsdidx(proc_to_comm)
call igerv2d(icontxt,nesd,n,sndbuf(snd_pt),nesd,proc_to_comm,0)
! Then I send
idx_pt = point_to_proc+psb_elem_recv_
rcv_pt = brvidx(proc_to_comm)
call psi_gth(nerv,n,h_idx(idx_pt:idx_pt+nerv-1),&
& y,rcvbuf(rcv_pt:rcv_pt+nerv*n-1))
call igesd2d(icontxt,nerv,n,rcvbuf(rcv_pt),nerv,proc_to_comm,0)
else if (proc_to_comm .eq. myrow) then
! I send to myself
idx_pt = point_to_proc+psb_elem_recv_
rcv_pt = bsdidx(proc_to_comm)
call psi_gth(nerv,n,h_idx(idx_pt:idx_pt+nerv-1),&
& y,rcvbuf(rcv_pt:rcv_pt+nerv*n-1))
end if
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
if(proc_to_comm.ne.myrow) then
idx_pt = point_to_proc+nerv+psb_elem_send_
snd_pt = bsdidx(proc_to_comm)
call psi_sct(nesd,n,h_idx(idx_pt:idx_pt+nesd-1),&
& sndbuf(snd_pt:snd_pt+n*nesd-1),beta,y)
else
idx_pt = point_to_proc+nerv+psb_elem_send_
rcv_pt = brvidx(proc_to_comm)
call psi_sct(nesd,n,h_idx(idx_pt:idx_pt+nesd-1),&
& rcvbuf(rcv_pt:rcv_pt+n*nerv-1),beta,y)
end if
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
else if (swap_send .and. swap_recv) then
! First I post all the non blocking receives
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
if(proc_to_comm.ne.myrow) then
p2ptag = krecvid(icontxt,proc_to_comm,myrow)
snd_pt = brvidx(proc_to_comm)
call mpi_irecv(sndbuf(rcv_pt),sdsz(proc_to_comm),&
& mpi_integer,prcid(proc_to_comm),&
& p2ptag, icomm,rvhd(proc_to_comm),iret)
if(iret.ne.mpi_success) then
int_err(1) = iret
info=400
call psb_errpush(info,name,i_err=int_err)
goto 9999
end if
end if
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
! Then I post all the blocking sends
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
idx_pt = point_to_proc+psb_elem_recv_
rcv_pt = brvidx(proc_to_comm)
call psi_gth(nerv,n,h_idx(idx_pt:idx_pt+nerv-1),&
& y,rcvbuf(rcv_pt:rcv_pt+nerv*n-1))
if(proc_to_comm .ne. myrow) then
p2ptag=ksendid(icontxt,proc_to_comm,myrow)
call mpi_send(rcvbuf(rcv_pt),rvsz(proc_to_comm),&
& mpi_integer,prcid(proc_to_comm),&
& p2ptag,icomm,iret)
if(iret.ne.mpi_success) then
int_err(1) = iret
info=400
call psb_errpush(info,name,i_err=int_err)
goto 9999
end if
end if
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
do i=1, totxch
call mpi_waitany(nprow,rvhd,ixrec,p2pstat,iret)
if(iret.ne.mpi_success) then
int_err(1) = iret
info=400
call psb_errpush(info,name,i_err=int_err)
goto 9999
end if
if (ixrec .ne. mpi_undefined) then
ixrec=ixrec-1 ! mpi_waitany returns an 1 to nprow index
point_to_proc = ptp(ixrec)
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
idx_pt = point_to_proc+nerv+psb_elem_send_
snd_pt = bsdidx(proc_to_comm)
call psi_sct(nesd,n,h_idx(idx_pt:idx_pt+nesd-1),&
& sndbuf(snd_pt:snd_pt+n*nesd-1),beta,y)
else
int_err(1) = ixrec
info=400
call psb_errpush(info,name,i_err=int_err)
goto 9999
end if
end do
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
if(proc_to_comm .eq. myrow) then
idx_pt = point_to_proc+nerv+psb_elem_send_
rcv_pt = brvidx(proc_to_comm)
call psi_sct(nesd,n,h_idx(idx_pt:idx_pt+nesd-1),&
& rcvbuf(rcv_pt:rcv_pt+n*nerv-1),beta,y)
end if
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
else if (swap_send) then
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
idx_pt = point_to_proc+psb_elem_recv_
rcv_pt = brvidx(proc_to_comm)
call psi_gth(nerv,n,h_idx(idx_pt:idx_pt+nerv-1),&
& y,rcvbuf(rcv_pt:rcv_pt+nerv*n-1))
call igesd2d(icontxt,nerv,n,rcvbuf(rcv_pt),nerv,proc_to_comm,0)
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
else if (swap_recv) then
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
if(proc_to_comm.ne.myrow) then
snd_pt = bsdidx(proc_to_comm)
call igerv2d(icontxt,nesd,n,sndbuf(snd_pt),nesd,proc_to_comm,0)
idx_pt = point_to_proc+nerv+psb_elem_send_
call psi_sct(nesd,n,h_idx(idx_pt:idx_pt+nesd-1),&
& sndbuf(snd_pt:snd_pt+n*nesd-1),beta,y)
else
idx_pt = point_to_proc+nerv+psb_elem_send_
rcv_pt = brvidx(proc_to_comm)
call psi_sct(nesd,n,h_idx(idx_pt:idx_pt+nesd-1),&
& rcvbuf(rcv_pt:rcv_pt+n*nerv-1),beta,y)
end if
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
end if
call psb_erractionrestore(err_act)
return
9999 continue
call psb_erractionrestore(err_act)
if (err_act.eq.act_abort) then
call psb_error(icontxt)
return
end if
return
end subroutine psi_iswaptranm
subroutine psi_iswaptranv(flag,beta,y,desc_a,work,info)
use psb_error_mod
use psb_descriptor_type
implicit none
include 'mpif.h'
integer, intent(in) :: flag
integer, intent(out) :: info
integer :: y(:), beta
integer, target :: work(:)
type(psb_desc_type) :: desc_a
! locals
integer :: icontxt, nprow, npcol, myrow,&
& mycol, point_to_proc, nesd, nerv,&
& proc_to_comm, p2ptag, icomm, p2pstat,&
& idxs, idxr, iret, errlen, ifcomm, rank,&
& err_act, totxch, ixrec, i, lw, idx_pt,&
& snd_pt, rcv_pt, n
integer, pointer, dimension(:) :: bsdidx, brvidx,&
& sdsz, rvsz, prcid, ptp, rvhd, h_idx
integer :: int_err(5)
integer :: blacs_pnum, krecvid, ksendid
logical :: swap_mpi, swap_sync, swap_send, swap_recv
integer, pointer, dimension(:) :: sndbuf, rcvbuf
character(len=20) :: name, ch_err
info = 0
name='psi_dswaptranv'
call psb_erractionsave(err_act)
icontxt=desc_a%matrix_data(psb_ctxt_)
call blacs_gridinfo(icontxt,nprow,npcol,myrow,mycol)
if (nprow == -1) then
info = 2010
call psb_errpush(info,name)
goto 9999
else if (npcol /= 1) then
info = 2030
int_err(1) = npcol
call psb_errpush(info,name)
goto 9999
endif
call blacs_get(icontxt,10,icomm)
allocate(sdsz(0:nprow-1), rvsz(0:nprow-1), bsdidx(0:nprow-1),&
& brvidx(0:nprow-1), rvhd(0:nprow-1), prcid(0:nprow-1),&
& ptp(0:nprow-1), stat=info)
if(info.ne.0) then
call psb_errpush(4000,name)
goto 9999
end if
swap_mpi = iand(flag,psb_swap_mpi_).ne.0
swap_sync = iand(flag,psb_swap_sync_).ne.0
swap_send = iand(flag,psb_swap_send_).ne.0
swap_recv = iand(flag,psb_swap_recv_).ne.0
h_idx => desc_a%halo_index
idxs = 0
idxr = 0
totxch = 0
point_to_proc = 1
rvhd(:) = mpi_request_null
n=1
! prepare info for communications
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm.ne.-1)
if(proc_to_comm .ne. myrow) totxch = totxch+1
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
prcid(proc_to_comm) = blacs_pnum(icontxt,proc_to_comm,mycol)
ptp(proc_to_comm) = point_to_proc
brvidx(proc_to_comm) = idxr
rvsz(proc_to_comm) = nerv
idxr = idxr+rvsz(proc_to_comm)
bsdidx(proc_to_comm) = idxs
sdsz(proc_to_comm) = nesd
idxs = idxs+sdsz(proc_to_comm)
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
if((idxr+idxs).lt.size(work)) then
sndbuf => work(1:idxs)
rcvbuf => work(idxs+1:idxs+idxr)
else
allocate(sndbuf(idxs),rcvbuf(idxr), stat=info)
if(info.ne.0) then
call psb_errpush(4000,name)
goto 9999
end if
end if
! Case SWAP_MPI
if(swap_mpi) then
! gather elements into sendbuffer for swapping
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
idx_pt = point_to_proc+psb_elem_recv_
rcv_pt = brvidx(proc_to_comm)
call psi_gth(nerv,h_idx(idx_pt:idx_pt+nerv-1),&
& y,rcvbuf(rcv_pt:rcv_pt+nerv-1))
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
! swap elements using mpi_alltoallv
call mpi_alltoallv(rcvbuf,rvsz,brvidx,&
& mpi_integer,sndbuf,sdsz,&
& bsdidx,mpi_integer,icomm,iret)
if(iret.ne.mpi_success) then
int_err(1) = iret
info=400
call psb_errpush(info,name,i_err=int_err)
goto 9999
end if
! scatter elements from receivebuffer after swapping
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
idx_pt = point_to_proc+nerv+psb_elem_send_
snd_pt = bsdidx(proc_to_comm)
call psi_sct(nesd,h_idx(idx_pt:idx_pt+nesd-1),&
& sndbuf(snd_pt:snd_pt+nesd-1),beta,y)
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
else if (swap_sync) then
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
if (proc_to_comm .lt. myrow) then
! First I send
idx_pt = point_to_proc+psb_elem_recv_
rcv_pt = brvidx(proc_to_comm)
call psi_gth(nerv,h_idx(idx_pt:idx_pt+nerv-1),&
& y,rcvbuf(rcv_pt:rcv_pt+nerv-1))
call igesd2d(icontxt,nerv,1,rcvbuf(rcv_pt),nerv,proc_to_comm,0)
! Then I receive
snd_pt = brvidx(proc_to_comm)
call igerv2d(icontxt,nesd,1,sndbuf(snd_pt),nesd,proc_to_comm,0)
else if (proc_to_comm .gt. myrow) then
! First I receive
snd_pt = bsdidx(proc_to_comm)
call igerv2d(icontxt,nesd,1,sndbuf(snd_pt),nesd,proc_to_comm,0)
! Then I send
idx_pt = point_to_proc+psb_elem_recv_
rcv_pt = brvidx(proc_to_comm)
call psi_gth(nerv,h_idx(idx_pt:idx_pt+nerv-1),&
& y,rcvbuf(rcv_pt:rcv_pt+nerv-1))
call igesd2d(icontxt,nerv,1,rcvbuf(rcv_pt),nerv,proc_to_comm,0)
else if (proc_to_comm .eq. myrow) then
! I send to myself
idx_pt = point_to_proc+psb_elem_recv_
rcv_pt = bsdidx(proc_to_comm)
call psi_gth(nerv,h_idx(idx_pt:idx_pt+nerv-1),&
& y,rcvbuf(rcv_pt:rcv_pt+nerv-1))
end if
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
if(proc_to_comm.ne.myrow) then
idx_pt = point_to_proc+nerv+psb_elem_send_
snd_pt = bsdidx(proc_to_comm)
call psi_sct(nesd,h_idx(idx_pt:idx_pt+nesd-1),&
& sndbuf(snd_pt:snd_pt+nesd-1),beta,y)
else
idx_pt = point_to_proc+nerv+psb_elem_send_
rcv_pt = brvidx(proc_to_comm)
call psi_sct(nesd,h_idx(idx_pt:idx_pt+nesd-1),&
& rcvbuf(rcv_pt:rcv_pt+nerv-1),beta,y)
end if
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
else if (swap_send .and. swap_recv) then
! First I post all the non blocking receives
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
if(proc_to_comm.ne.myrow) then
p2ptag = krecvid(icontxt,proc_to_comm,myrow)
snd_pt = brvidx(proc_to_comm)
call mpi_irecv(sndbuf(snd_pt),sdsz(proc_to_comm),&
& mpi_integer,prcid(proc_to_comm),&
& p2ptag, icomm,rvhd(proc_to_comm),iret)
if(iret.ne.mpi_success) then
int_err(1) = iret
info=400
call psb_errpush(info,name,i_err=int_err)
goto 9999
end if
end if
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
! Then I post all the blocking sends
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
idx_pt = point_to_proc+psb_elem_recv_
rcv_pt = brvidx(proc_to_comm)
call psi_gth(nerv,h_idx(idx_pt:idx_pt+nerv-1),&
& y,rcvbuf(rcv_pt:rcv_pt+nerv-1))
if(proc_to_comm .ne. myrow) then
p2ptag=ksendid(icontxt,proc_to_comm,myrow)
call mpi_send(rcvbuf(rcv_pt),rvsz(proc_to_comm),&
& mpi_integer,prcid(proc_to_comm),&
& p2ptag,icomm,iret)
if(iret.ne.mpi_success) then
int_err(1) = iret
info=400
call psb_errpush(info,name,i_err=int_err)
goto 9999
end if
end if
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
do i=1, totxch
call mpi_waitany(nprow,rvhd,ixrec,p2pstat,iret)
if(iret.ne.mpi_success) then
int_err(1) = iret
info=400
call psb_errpush(info,name,i_err=int_err)
goto 9999
end if
if (ixrec .ne. mpi_undefined) then
ixrec=ixrec-1 ! mpi_waitany returns an 1 to nprow index
point_to_proc = ptp(ixrec)
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
idx_pt = point_to_proc+nerv+psb_elem_send_
rcv_pt = bsdidx(proc_to_comm)
call psi_sct(nesd,h_idx(idx_pt:idx_pt+nesd-1),&
& rcvbuf(rcv_pt:rcv_pt+nerv-1),beta,y)
else
int_err(1) = ixrec
info=400
call psb_errpush(info,name,i_err=int_err)
goto 9999
end if
end do
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
if(proc_to_comm .eq. myrow) then
idx_pt = point_to_proc+nerv+psb_elem_send_
rcv_pt = brvidx(proc_to_comm)
call psi_sct(nesd,h_idx(idx_pt:idx_pt+nesd-1),&
& rcvbuf(rcv_pt:rcv_pt+nerv-1),beta,y)
end if
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
else if (swap_send) then
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
idx_pt = point_to_proc+psb_elem_recv_
rcv_pt = brvidx(proc_to_comm)
call psi_gth(nerv,h_idx(idx_pt:idx_pt+nerv-1),&
& y,rcvbuf(rcv_pt:rcv_pt+nerv-1))
call igesd2d(icontxt,nerv,1,rcvbuf(rcv_pt),nerv,proc_to_comm,0)
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
else if (swap_recv) then
point_to_proc = 1
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
do while (proc_to_comm .ne. -1)
nerv = h_idx(point_to_proc+psb_n_elem_recv_)
nesd = h_idx(point_to_proc+nerv+psb_n_elem_send_)
if(proc_to_comm.ne.myrow) then
snd_pt = bsdidx(proc_to_comm)
call igerv2d(icontxt,nesd,1,sndbuf(snd_pt),nesd,proc_to_comm,0)
idx_pt = point_to_proc+psb_elem_recv_
rcv_pt = brvidx(proc_to_comm)
call psi_sct(nesd,h_idx(idx_pt:idx_pt+nesd-1),&
& sndbuf(snd_pt:snd_pt+nesd-1),beta,y)
else
idx_pt = point_to_proc+nerv+psb_elem_send_
rcv_pt = brvidx(proc_to_comm)
call psi_sct(nesd,h_idx(idx_pt:idx_pt+nesd-1),&
& rcvbuf(rcv_pt:rcv_pt+nerv-1),beta,y)
end if
point_to_proc = point_to_proc+nerv+nesd+3
proc_to_comm = h_idx(point_to_proc+psb_proc_id_)
end do
end if
call psb_erractionrestore(err_act)
return
9999 continue
call psb_erractionrestore(err_act)
if (err_act.eq.act_abort) then
call psb_error(icontxt)
return
end if
return
end subroutine psi_iswaptranv

@ -0,0 +1,25 @@
INTEGER FUNCTION PSI_LIST_SEARCH(LIST,LENGHT_LIST,ELEM)
C !RETURNS POSITION OF ELEM IN A ARRAY LIST
C !OF LENGHT LENGHT_LIST, IF THIS ELEMENT NOT EXISTS
C !RETURNS -1
INTEGER LIST(*)
INTEGER LENGHT_LIST
INTEGER ELEM
INTEGER I
I=1
DO WHILE ((I.LE.LENGHT_LIST).AND.(LIST(I).NE.ELEM))
I=I+1
ENDDO
IF (I.LE.LENGHT_LIST) THEN
IF (LIST(I).EQ.ELEM) THEN
PSI_LIST_SEARCH=I
ELSE
PSI_LIST_SEARCH=-1
ENDIF
ELSE
PSI_LIST_SEARCH=-1
ENDIF
END

@ -0,0 +1,60 @@
subroutine psi_sort_dl(dep_list,l_dep_list,np,info)
!
! interface between former sort_dep_list subroutine
! and new srtlist
!
use psb_error_mod
implicit none
integer :: np,dep_list(:,:), l_dep_list(:)
integer :: idg, iupd, idgp, iedges, iidx, iich,ndgmx, isz, err_act
integer :: i, info
integer, pointer :: work(:)
logical, parameter :: debug=.false.
character(len=20) :: name, ch_err
name='psi_sort_dl'
info=0
call psb_erractionsave(err_act)
info = 0
ndgmx = 0
do i=1,np
ndgmx = ndgmx + l_dep_list(i)
if (debug) write(0,*) i,l_dep_list(i)
enddo
idg = 1
iupd = idg+np
idgp = iupd+np
iedges = idgp + ndgmx
iidx = iedges + 2*ndgmx
iich = iidx + ndgmx
isz = iich + ndgmx
if (debug)write(0,*) 'psi_sort_dl: ndgmx ',ndgmx,isz
allocate(work(isz))
! call srtlist(dep_list, dl_lda, l_dep_list, np, info)
call srtlist(dep_list,size(dep_list,1),l_dep_list,np,work(idg),&
& work(idgp),work(iupd),work(iedges),work(iidx),work(iich),info)
if (info .ne. 0) then
call psb_errpush(4010,name,a_err='srtlist')
goto 9999
endif
deallocate(work)
call psb_erractionrestore(err_act)
return
9999 continue
call psb_erractionrestore(err_act)
if (err_act.eq.act_abort) then
call psb_error()
return
end if
return
end subroutine psi_sort_dl

@ -0,0 +1,177 @@
***********************************************************************
* *
* The communication step among processors at each *
* matrix-vector product is a variable all-to-all *
* collective communication that we reimplement *
* in terms of point-to-point communications. *
* The data in input is a list of dependencies: *
* for each node a list of all the nodes it has to *
* communicate with. The lists are guaranteed to be *
* symmetric, i.e. for each pair (I,J) there is a *
* pair (J,I). The idea is to organize the ordering *
* so that at each communication step as many *
* processors as possible are communicating at the *
* same time, i.e. a step is defined by the fact *
* that all edges (I,J) in it have no common node. *
* *
* Formulation of the problem is: *
* Given an undirected graph (forest): *
* Find the shortest series of steps to cancel all *
* graph edges, where at each step all edges belonging *
* to a matching in the graph are canceled. *
* *
* An obvious lower bound to the optimum number of steps *
* is the largest degree of any node in the graph. *
* *
* The algorithm proceeds as follows: *
* 1. Build a list of all edges, e.g. copy the *
* dependencies lists keeping only (I,J) with I<J *
* 2. Compute an auxiliary vector with the degree of *
* each node of the graph. *
* 3. While there are edges in the graph do: *
* 4. Weight the edges with the sum of the degrees *
* of their nodes and sort them into descending order *
* 5. Scan the list of edges; if neither node of the *
* edge has been marked yet, cancel the edge and mark *
* the two nodes *
* 6. If no edge was chosen but the graph is nonempty *
* raise an error condition *
* 7. Queue the edges in the matchin to the output *
* sequence; *
* 8. Decrease by 1 the degree of all marked nodes, *
* then clear all marks *
* 9. Cycle to 3. *
* 10. For each node: scan the edge sequence; if an *
* edge has the node as an endpoint, queue the other *
* node in the dependency list for the current one *
* *
***********************************************************************
SUBROUTINE SRTLIST(DEP_LIST,DL_LDA,LDL,NP,dg,dgp,upd,
+ edges,idx,ich,INFO)
IMPLICIT NONE
INTEGER NP, DL_LDA, INFO
INTEGER DEP_LIST(DL_LDA,*), LDL(*),DG(*), DGP(*), IDX(*),
+ UPD(*),EDGES(2,*),ICH(*)
INTEGER I,J, NEDGES,IP1,IP2,NCH,IP,IEDGE,I1,IX,IST,ISWAP(2)
INTEGER NO_COMM
PARAMETER (NO_COMM=-1)
double precision mpi_wtime, t1, t2
external mpi_wtime
IF (NP .LT. 0) THEN
INFO = 1
RETURN
ENDIF
C
C dg contains number of communications
C
DO I=1, NP
DG(I)=LDL(I)
ENDDO
NEDGES = 0
DO I=1, NP
DO J=1, DG(I)
IP = DEP_LIST(J,I) + 1
c$$$ write(0,*) 'SRTLIST Input :',i,ip
IF (IP.GT.I)
+ NEDGES = NEDGES + 1
ENDDO
ENDDO
IEDGE = 0
DO I=1, NP
DO J=1, DG(I)
IP = DEP_LIST(J,I) + 1
IF (IP.GT.I) THEN
IEDGE = IEDGE + 1
EDGES(1,IEDGE) = I
EDGES(2,IEDGE) = IP
ENDIF
ENDDO
ENDDO
IST = 1
DO WHILE (IST.LE.NEDGES)
DO I=1, NP
UPD(I) = 0
ENDDO
DO I=IST, NEDGES
DGP(I) = -(DG(EDGES(1,I))+DG(EDGES(2,I)))
ENDDO
CALL ISRX(NEDGES-IST+1,DGP(IST),IDX(IST))
I1 = IST
NCH = 0
DO I = IST, NEDGES
IX = IDX(I)+IST-1
IP1 = EDGES(1,IX)
IP2 = EDGES(2,IX)
IF ((UPD(IP1).eq.0).AND.(UPD(IP2).eq.0)) THEN
UPD(IP1) = -1
UPD(IP2) = -1
NCH = NCH + 1
ICH(NCH) = IX
ENDIF
ENDDO
IF (NCH.eq.0) THEN
write(0,*) 'SRTLIST ?????? Impossible error !!!!!?????',
+ nedges,ist
do i=ist, nedges
IX = IDX(I)+IST-1
write(0,*) 'SRTLIST: Edge:',ix,edges(1,ix),
+ edges(2,ix),dgp(ix)
enddo
info = 30
return
ENDIF
CALL ISR(NCH,ICH)
DO I=1, NCH
ISWAP(1) = EDGES(1,IST)
ISWAP(2) = EDGES(2,IST)
EDGES(1,IST) = EDGES(1,ICH(I))
EDGES(2,IST) = EDGES(2,ICH(I))
EDGES(1,ICH(I)) = ISWAP(1)
EDGES(2,ICH(I)) = ISWAP(2)
IST = IST + 1
ENDDO
DO I=1, NP
DG(I) = DG(I) + UPD(I)
ENDDO
ENDDO
DO I=1, NP
IF (DG(I).NE.0) THEN
WRITE(0,*) 'SRTLIST Error on exit:',i,dg(i)
ENDIF
DG(I) = 0
ENDDO
DO J=1,NEDGES
I = EDGES(1,J)
DG(I) = DG(I)+1
DEP_LIST(DG(I),I) = EDGES(2,J)-1
I = EDGES(2,J)
DG(I) = DG(I)+1
DEP_LIST(DG(I),I) = EDGES(1,J)-1
ENDDO
DO I=1, NP
IF (DG(I).NE.LDL(I)) THEN
WRITE(0,*) 'SRTLIST Mismatch on output',i,dg(i),ldl(i)
ENDIF
ENDDO
c$$$ t2 = mpi_wtime()
c$$$ WRITE(0,*) 'Output communication:',t2-t1
c$$$ do i=1,np
c$$$ do j=1,ldl(i)
c$$$ write(0,*)'SRTLIST', i,dep_list(j,i)+1
c$$$ enddo
c$$$ enddo
RETURN
END

@ -0,0 +1,24 @@
include ../../Make.inc
LIBDIR=../../lib/
LIBNAME=$(LIBDIR)/$(F90LIB)
HERE=.
F90OBJS= f90_dcgstab.o f90_dcg.o f90_dcgs.o \
f90_dbicg.o f90_dcgstabl.o f90_zcgstab.o f90_dgmresr.o
INCDIRS=-I. -I.. -I$(LIBDIR)
lib: $(F90OBJS)
ar -cur $(LIBNAME) $(F90OBJS)
ranlib $(LIBNAME)
#$(F90OBJS): $(MODS)
veryclean: clean
/bin/rm -f $(LIBNAME)
clean:
/bin/rm -f $(F90OBJS) $(LOCAL_MODS)

@ -0,0 +1,340 @@
!!$ CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
!!$ C C
!!$ C References: C
!!$ C [1] Duff, I., Marrone, M., Radicati, G., and Vittoli, C. C
!!$ C Level 3 basic linear algebra subprograms for sparse C
!!$ C matrices: a user level interface C
!!$ C ACM Trans. Math. Softw., 23(3), 379-401, 1997. C
!!$ C C
!!$ C C
!!$ C [2] S. Filippone, M. Colajanni C
!!$ C PSBLAS: A library for parallel linear algebra C
!!$ C computation on sparse matrices C
!!$ C ACM Trans. on Math. Softw., 26(4), 527-550, Dec. 2000. C
!!$ C C
!!$ C [3] M. Arioli, I. Duff, M. Ruiz C
!!$ C Stopping criteria for iterative solvers C
!!$ C SIAM J. Matrix Anal. Appl., Vol. 13, pp. 138-144, 1992 C
!!$ C C
!!$ C C
!!$ C [4] R. Barrett et al C
!!$ C Templates for the solution of linear systems C
!!$ C SIAM, 1993 C
!!$ C C
!!$ C C
!!$ CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
!
! File: psb_dbicg.f90
!
! Subroutine: psb_dbicg
! This subroutine implements the BiCG method.
!
! Parameters:
! a - type(<psb_dspmat_type>). The sparse matrix containing A.
! prec - type(<psb_prec_type>). The data structure containing the preconditioner.
! b - real,dimension(:). The right hand side.
! x - real,dimension(:). The vector of unknowns.
! eps - real. The error tolerance.
! desc_a - type(<psb_desc_type>). The communication descriptor.
! info - integer. Eventually returns an error code.
! itmax - integer(optional). The maximum number of iterations.
! iter - integer(optional). The number of iterations performed.
! err - real(optional). The error on return.
! itrace - integer(optional). The unit to write messages onto.
! istop - integer(optional). The stopping criterium.
!
subroutine psb_dbicg(a,prec,b,x,eps,desc_a,info,&
&itmax,iter,err, itrace,istop)
use psb_serial_mod
use psb_descriptor_type
use psb_prec_type
use psb_psblas_mod
use psb_tools_mod
use psb_const_mod
use psb_prec_mod
use psb_error_mod
implicit none
!!$ parameters
type(psb_dspmat_type), intent(in) :: a
type(psb_dprec_type), intent(in) :: prec
type(psb_desc_type), intent(in) :: desc_a
real(kind(1.d0)), intent(in) :: b(:)
real(kind(1.d0)), intent(inout) :: x(:)
real(kind(1.d0)), intent(in) :: eps
integer, intent(out) :: info
integer, optional, intent(in) :: itmax, itrace, istop
integer, optional, intent(out) :: iter
real(kind(1.d0)), optional, intent(out) :: err
!!$ local data
real(kind(1.d0)), pointer :: aux(:),wwrk(:,:)
real(kind(1.d0)), pointer :: ww(:), q(:),&
& r(:), p(:), zt(:), pt(:), z(:), rt(:),qt(:)
integer, pointer :: iperm(:), ipnull(:), ipsave(:), int_err(:)
real(kind(1.d0)) ::rerr
integer ::litmax, liter, naux, m, mglob, it, itrac,&
& nprows,npcols,me,mecol, n_row, n_col, listop, err_act
character ::diagl, diagu
logical, parameter :: debug = .false.
logical, parameter :: exchange=.true., noexchange=.false.
integer, parameter :: ione=1
integer, parameter :: irmax = 8
integer :: itx, i, isvch, ich, icontxt
logical :: do_renum_left
real(kind(1.d0)), parameter :: one=1.d0, zero=0.d0, epstol=1.d-35
real(kind(1.d0)) :: alpha, beta, rho, rho_old, rni, xni, bni, ani,&
& sigma, omega, tau,bn2
character(len=20) :: name,ch_err
info = 0
name = 'psb_dbicg'
call psb_erractionsave(err_act)
if (debug) write(*,*) 'entering psb_dbicg'
icontxt = desc_a%matrix_data(psb_ctxt_)
call blacs_gridinfo(icontxt,nprows,npcols,me,mecol)
if (debug) write(*,*) 'psb_dbicg: from gridinfo',nprows,npcols,me
mglob = desc_a%matrix_data(m_)
n_row = desc_a%matrix_data(psb_n_row_)
n_col = desc_a%matrix_data(psb_n_col_)
! ensure global coherence for convergence checks.
call blacs_get(icontxt,16,isvch)
ich = 1
call blacs_set(icontxt,16,ich)
if (present(istop)) then
listop = istop
else
listop = 1
endif
!
! listop = 1: normwise backward error, infinity norm
! listop = 2: ||r||/||b|| norm 2
!
!!$
!!$ if ((prec%prec < min_prec_).or.(prec%prec > max_prec_) ) then
!!$ write(0,*) 'f90_bicg: invalid iprec',prec%prec
!!$ if (present(ierr)) ierr=-1
!!$ return
!!$ endif
if ((listop < 1 ).or.(listop > 2 ) ) then
write(0,*) 'psb_bicg: invalid istop',listop
info=5001
int_err=listop
err=info
call psb_errpush(info,name,i_err=int_err)
goto 9999
endif
naux=4*n_col
allocate(aux(naux),stat=info)
call psb_dalloc(mglob,9,wwrk,desc_a,info)
call psb_asb(wwrk,desc_a,info)
if(info.ne.0) then
info=4011
ch_err='psb_asb'
err=info
call psb_errpush(info,name,a_err=ch_err)
goto 9999
end if
q => wwrk(:,1)
qt => wwrk(:,2)
r => wwrk(:,3)
rt => wwrk(:,4)
p => wwrk(:,5)
pt => wwrk(:,6)
z => wwrk(:,7)
zt => wwrk(:,8)
ww => wwrk(:,9)
if (present(itmax)) then
litmax = itmax
else
litmax = 1000
endif
if (present(itrace)) then
itrac = itrace
else
itrac = -1
end if
diagl = 'u'
diagu = 'u'
itx = 0
if (listop == 1) then
ani = psb_nrmi(a,desc_a,info)
bni = psb_amax(b,desc_a,info)
else if (listop == 2) then
bn2 = psb_nrm2(b,desc_a,info)
endif
if(info.ne.0) then
info=4011
err=info
call psb_errpush(info,name,a_err=ch_err)
goto 9999
end if
restart: do
!!$
!!$ r0 = b-ax0
!!$
if (itx.ge.itmax) exit restart
it = 0
call psb_axpby(one,b,zero,r,desc_a,info)
call psb_spmm(-one,a,x,one,r,desc_a,info,work=aux)
call psb_axpby(one,r,zero,rt,desc_a,info)
if(info.ne.0) then
info=4011
call psb_errpush(info,name)
goto 9999
end if
rho = zero
if (debug) write(*,*) 'on entry to amax: b: ',size(b)
if (listop == 1) then
rni = psb_amax(r,desc_a,info)
xni = psb_amax(x,desc_a,info)
else if (listop == 2) then
rni = psb_nrm2(r,desc_a,info)
endif
if(info.ne.0) then
info=4011
call psb_errpush(info,name)
goto 9999
end if
if (listop == 1) then
xni = psb_amax(x,desc_a,info)
rerr = rni/(ani*xni+bni)
if (itrac /= -1) then
if (me.eq.0) write(itrac,'(a,i4,5(2x,es10.4))') 'bicg: ',itx,rerr,rni,bni,&
&xni,ani
endif
else if (listop == 2) then
rerr = rni/bn2
if (itrac /= -1) then
if (me.eq.0) write(itrac,'(a,i4,3(2x,es10.4))') 'bicg: ',itx,rerr,rni,bn2
endif
endif
if(info.ne.0) then
info=4011
call psb_errpush(info,name)
goto 9999
end if
if (rerr<=eps) then
exit restart
end if
iteration: do
it = it + 1
itx = itx + 1
if (debug) write(*,*) 'iteration: ',itx
call psb_prcaply(prec,r,z,desc_a,info,work=aux)
call psb_prcaply(prec,rt,zt,desc_a,info,trans='t',work=aux)
rho_old = rho
rho = psb_dot(rt,z,desc_a,info)
if (rho==zero) then
if (debug) write(0,*) 'bicg itxation breakdown r',rho
exit iteration
endif
if (it==1) then
call psb_axpby(one,z,zero,p,desc_a,info)
call psb_axpby(one,zt,zero,pt,desc_a,info)
else
beta = (rho/rho_old)
call psb_axpby(one,z,beta,p,desc_a,info)
call psb_axpby(one,zt,beta,pt,desc_a,info)
end if
call psb_spmm(one,a,p,zero,q,desc_a,info,&
& work=aux)
call psb_spmm(one,a,pt,zero,qt,desc_a,info,&
& work=aux,trans='t')
sigma = psb_dot(pt,q,desc_a,info)
if (sigma==zero) then
if (debug) write(0,*) 'cgs iteration breakdown s1', sigma
exit iteration
endif
alpha = rho/sigma
call psb_axpby(alpha,p,one,x,desc_a,info)
call psb_axpby(-alpha,q,one,r,desc_a,info)
call psb_axpby(-alpha,qt,one,rt,desc_a,info)
if (listop == 1) then
rni = psb_amax(r,desc_a,info)
xni = psb_amax(x,desc_a,info)
else if (listop == 2) then
rni = psb_nrm2(r,desc_a,info)
endif
if (listop == 1) then
xni = psb_amax(x,desc_a,info)
rerr = rni/(ani*xni+bni)
if (itrac /= -1) then
if (me.eq.0) write(itrac,'(a,i4,5(2x,es10.4))') 'bicg: ',itx,rerr,rni,bni,&
&xni,ani
endif
else if (listop == 2) then
rerr = rni/bn2
if (itrac /= -1) then
if (me.eq.0) write(itrac,'(a,i4,3(2x,es10.4))') 'bicg: ',itx,rerr,rni,bn2
endif
endif
if (rerr<=eps) then
exit restart
end if
if (itx.ge.itmax) exit restart
end do iteration
end do restart
if (present(err)) err=rerr
if (present(iter)) iter = itx
if (rerr>eps) then
write(0,*) 'bicg failed to converge to ',eps,&
& ' in ',itx,' iterations '
end if
deallocate(aux)
call psb_free(wwrk,desc_a,info)
! restore external global coherence behaviour
call blacs_set(icontxt,16,isvch)
if(info/=0) then
call psb_errpush(info,name)
goto 9999
end if
call psb_erractionrestore(err_act)
return
9999 continue
call psb_erractionrestore(err_act)
if (err_act.eq.act_abort) then
call psb_error()
return
end if
return
end subroutine psb_dbicg

@ -0,0 +1,283 @@
! File: psb_dcg.f90
!!$ CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
!!$ C C
!!$ C References: C
!!$ C [1] Duff, I., Marrone, M., Radicati, G., and Vittoli, C. C
!!$ C Level 3 basic linear algebra subprograms for sparse C
!!$ C matrices: a user level interface C
!!$ C ACM Trans. Math. Softw., 23(3), 379-401, 1997. C
!!$ C C
!!$ C C
!!$ C [2] S. Filippone, M. Colajanni C
!!$ C PSBLAS: A library for parallel linear algebra C
!!$ C computation on sparse matrices C
!!$ C ACM Trans. on Math. Softw., 26(4), 527-550, Dec. 2000. C
!!$ C C
!!$ C [3] M. Arioli, I. Duff, M. Ruiz C
!!$ C Stopping criteria for iterative solvers C
!!$ C SIAM J. Matrix Anal. Appl., Vol. 13, pp. 138-144, 1992 C
!!$ C C
!!$ C C
!!$ C [4] R. Barrett et al C
!!$ C Templates for the solution of linear systems C
!!$ C SIAM, 1993
!!$ C C
!!$ C C
!!$ CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
! File: psb_dcg.f90
!
! Subroutine: psb_dcg
! This subroutine implements the Conjugate Gradient method.
!
! Parameters:
! a - type(<psb_dspmat_type>). The sparse matrix containing A.
! prec - type(<psb_prec_type>). The data structure containing the preconditioner.
! b - real,dimension(:). The right hand side.
! x - real,dimension(:). The vector of unknowns.
! eps - real. The error tolerance.
! desc_a - type(<psb_desc_type>). The communication descriptor.
! info - integer. Eventually returns an error code.
! itmax - integer(optional). The maximum number of iterations.
! iter - integer(optional). The number of iterations performed.
! err - real(optional). The error on return.
! itrace - integer(optional). The unit to write messages onto.
! istop - integer(optional). The stopping criterium.
!
Subroutine psb_dcg(a,prec,b,x,eps,desc_a,info,&
&itmax,iter,err, itrace, istop)
use psb_serial_mod
use psb_descriptor_type
use psb_prec_type
use psb_psblas_mod
use psb_tools_mod
use psb_const_mod
use psb_prec_mod
use psb_error_mod
implicit none
!!$ Parameters
Type(psb_dspmat_type), Intent(in) :: a
Type(psb_dprec_type), Intent(in) :: prec
Type(psb_desc_type), Intent(in) :: desc_a
Real(Kind(1.d0)), Intent(in) :: b(:)
Real(Kind(1.d0)), Intent(inout) :: x(:)
Real(Kind(1.d0)), Intent(in) :: eps
integer, intent(out) :: info
Integer, Optional, Intent(in) :: itmax, itrace, istop
Integer, Optional, Intent(out) :: iter
Real(Kind(1.d0)), Optional, Intent(out) :: err
!!$ Local data
real(kind(1.d0)), pointer :: aux(:), q(:), p(:),&
& r(:), z(:), w(:), wwrk(:,:)
real(kind(1.d0)) ::rerr
real(kind(1.d0)) ::alpha, beta, rho, rho_old, rni, xni, bni, ani,bn2,&
& sigma
integer :: litmax, liter, listop, naux, m, mglob, it, itrac,&
& nprows,npcols,me,mecol, n_col, isvch, ich, icontxt, n_row,err_act, int_err(5)
character ::diagl, diagu
logical, parameter :: exchange=.true., noexchange=.false.
integer, parameter :: ione=1
real(kind(1.d0)), parameter :: one=1.d0, zero=0.d0, epstol=1.d-35
character(len=20) :: name,ch_err
info = 0
name = 'psb_dcg'
call psb_erractionsave(err_act)
icontxt = desc_a%matrix_data(psb_ctxt_)
call blacs_gridinfo(icontxt,nprows,npcols,me,mecol)
mglob = desc_a%matrix_data(m_)
n_row = desc_a%matrix_data(psb_n_row_)
n_col = desc_a%matrix_data(psb_n_col_)
if (present(istop)) then
listop = istop
else
listop = 1
endif
!
! LISTOP = 1: Normwise backward error, infinity norm
! LISTOP = 2: ||r||/||b|| norm 2
!
!!$ If ((prec%prec < min_prec_).Or.(prec%prec > max_prec_) ) Then
!!$ Write(0,*) 'F90_CG: Invalid IPREC',prec%prec
!!$ If (Present(ierr)) ierr=-1
!!$ Return
!!$ Endif
if ((listop < 1 ).or.(listop > 2 ) ) then
write(0,*) 'psb_cg: invalid istop',listop
info=5001
int_err(1)=listop
err=info
call psb_errpush(info,name,i_err=int_err)
goto 9999
endif
naux=4*n_col
allocate(aux(naux), stat=info)
call psb_dalloc(mglob,5,wwrk,desc_a,info)
call psb_asb(wwrk,desc_a,info)
if (info.ne.0) then
info=4011
call psb_errpush(info,name)
goto 9999
end if
p => wwrk(:,1)
q => wwrk(:,2)
r => wwrk(:,3)
z => wwrk(:,4)
w => wwrk(:,5)
if (present(itmax)) then
litmax = itmax
else
litmax = 1000
endif
if (present(itrace)) then
itrac = itrace
else
itrac = -1
end if
!!$ DIAGL = 'U'
!!$ DIAGU = 'R'
! Ensure global coherence for convergence checks.
call blacs_get(icontxt,16,isvch)
ich = 1
call blacs_set(icontxt,16,ich)
!!$
!!$ r0 = b-Ax0
!!$
call psb_axpby(one,b,zero,r,desc_a,info)
call psb_spmm(-one,a,x,one,r,desc_a,info,work=aux)
if (info.ne.0) then
info=4011
call psb_errpush(info,name)
goto 9999
end if
rho = zero
if (listop == 1) then
ani = psb_nrmi(a,desc_a,info)
bni = psb_amax(b,desc_a,info)
else if (listop == 2) then
bn2 = psb_nrm2(b,desc_a,info)
endif
if (info.ne.0) then
info=4011
call psb_errpush(info,name)
goto 9999
end if
iteration: do it = 1, itmax
!!$
!!$ solve mz = r
!!$ Note: the overlapped preconditioner (if overlap is non empty)
!!$ is non-symmetric: M^{-1} = \Lambda P^T K^{-1} P
!!$ For CG we use instead
!!$ M^{-1} = \sqrt{\Lambda} P^T K^{-1} P \sqrt{\Lambda}
!!$ Keep track of the old symmetrized stuf, might come in useful.
!!$ CALL F90_PSAXPBY(ONE,R,ZERO,Z,DECOMP_DATA)
!!$ CALL F90_PSOVRL(Z,DECOMP_DATA,&
!!$ & UPDATE_TYPE=SQUARE_ROOT_,CHOICE=NOEXCHANGE)
!!$ CALL F90_PSSPSM(ONE,L,Z,ZERO,W,DECOMP_DATA,&
!!$ & TRANS='N',UNIT=DIAGL,CHOICE=NONE_,WORK=AUX)
!!$ CALL F90_PSSPSM(ONE,U,W,ZERO,Z,DECOMP_DATA,&
!!$ & TRANS='N',UNIT=DIAGU,CHOICE=NONE_,DIAG=VDIAG,WORK=AUX)
!!$ CALL F90_PSOVRL(Z,DECOMP_DATA,&
!!$ & UPDATE_TYPE=SQUARE_ROOT_)
!!$ CALL F90_PSHALO(Z,DECOMP_DATA)
Call psb_prcaply(prec,r,z,desc_a,info,work=aux)
rho_old = rho
rho = f90_psdot(r,z,desc_a,info)
if (it==1) then
call psb_axpby(one,z,zero,p,desc_a,info)
else
if (rho_old==zero) then
write(0,*) 'CG Iteration breakdown'
exit iteration
endif
beta = rho/rho_old
call psb_axpby(one,z,beta,p,desc_a,info)
end if
call psb_spmm(one,a,p,zero,q,desc_a,info,work=aux)
sigma = psb_dot(p,q,desc_a,info)
if (sigma==zero) then
write(0,*) 'CG Iteration breakdown'
exit iteration
endif
alpha = rho/sigma
call psb_axpby(alpha,p,one,x,desc_a,info)
call psb_axpby(-alpha,q,one,r,desc_a,info)
if (listop == 1) Then
rni = psb_amax(r,desc_a,info)
xni = psb_amax(x,desc_a,info)
rerr = rni/(ani*xni+bni)
If (itrac /= -1) Then
If (me.Eq.0) Write(itrac,'(a,i4,5(2x,es10.4))') 'cg: ',it,rerr,rni,bni,&
&xni,ani
Endif
Else If (listop == 2) Then
rni = psb_nrm2(r,desc_a,info)
rerr = rni/bn2
If (itrac /= -1) Then
If (me.Eq.0) Write(itrac,'(a,i4,3(2x,es10.4)))') 'cg: ',it,rerr,rni,bn2
Endif
Endif
if (rerr<=eps) then
exit iteration
end if
end do iteration
if (present(err)) err=rerr
if (present(iter)) iter = it
if (rerr>eps) then
write(0,*) 'CG Failed to converge to ',eps,&
& ' in ',litmax,' iterations '
info=it
end if
deallocate(aux)
call psb_free(wwrk,desc_a,info)
! restore external global coherence behaviour
call blacs_set(icontxt,16,isvch)
if (info.ne.0) then
call psb_errpush(info,name)
goto 9999
end if
call psb_erractionrestore(err_act)
return
9999 continue
call psb_erractionrestore(err_act)
if (err_act.eq.act_abort) then
call psb_error()
return
end if
return
end subroutine psb_dcg

@ -0,0 +1,332 @@
!!$ CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
!!$ C C
!!$ C References: C
!!$ C [1] Duff, I., Marrone, M., Radicati, G., and Vittoli, C. C
!!$ C Level 3 basic linear algebra subprograms for sparse C
!!$ C matrices: a user level interface C
!!$ C ACM Trans. Math. Softw., 23(3), 379-401, 1997. C
!!$ C C
!!$ C C
!!$ C [2] S. Filippone, M. Colajanni C
!!$ C PSBLAS: A library for parallel linear algebra C
!!$ C computation on sparse matrices C
!!$ C ACM Trans. on Math. Softw., 26(4), 527-550, Dec. 2000. C
!!$ C C
!!$ C [3] M. Arioli, I. Duff, M. Ruiz C
!!$ C Stopping criteria for iterative solvers C
!!$ C SIAM J. Matrix Anal. Appl., Vol. 13, pp. 138-144, 1992 C
!!$ C C
!!$ C C
!!$ C [4] R. Barrett et al C
!!$ C Templates for the solution of linear systems C
!!$ C SIAM, 1993
!!$ C C
!!$ C C
!!$ CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
! File: psb_dcgs.f90
!
! Subroutine: psb_dcgs
!
! Parameters:
! a - type(<psb_dspmat_type>). The sparse matrix containing A.
! prec - type(<psb_prec_type>). The data structure containing the preconditioner.
! b - real,dimension(:). The right hand side.
! x - real,dimension(:). The vector of unknowns.
! eps - real. The error tolerance.
! desc_a - type(<psb_desc_type>). The communication descriptor.
! info - integer. Eventually returns an error code.
! itmax - integer(optional). The maximum number of iterations.
! iter - integer(optional). The number of iterations performed.
! err - real(optional). The error on return.
! itrace - integer(optional). The unit to write messages onto.
! istop - integer(optional). The stopping criterium.
!
Subroutine psb_dcgs(a,prec,b,x,eps,desc_a,info,&
&itmax,iter,err,itrace,istop)
use psb_serial_mod
use psb_descriptor_type
use psb_prec_type
use psb_psblas_mod
use psb_tools_mod
use psb_const_mod
use psb_prec_mod
use psb_error_mod
implicit none
!!$ parameters
Type(psb_dspmat_type), Intent(in) :: a
Type(psb_desc_type), Intent(in) :: desc_a
Type(psb_dprec_type), Intent(in) :: prec
Real(Kind(1.d0)), Intent(in) :: b(:)
Real(Kind(1.d0)), Intent(inout) :: x(:)
Real(Kind(1.d0)), Intent(in) :: eps
integer, intent(out) :: info
Integer, Optional, Intent(in) :: itmax, itrace,istop
Integer, Optional, Intent(out) :: iter
Real(Kind(1.d0)), Optional, Intent(out) :: err
!!$ local data
Real(Kind(1.d0)), Pointer :: aux(:),wwrk(:,:)
Real(Kind(1.d0)), Pointer :: ww(:), q(:),&
& r(:), p(:), v(:), s(:), t(:), z(:), f(:), rt(:),qt(:),uv(:)
Integer, Pointer :: iperm(:), ipnull(:), ipsave(:)
Real(Kind(1.d0)) ::rerr
Integer ::litmax, liter, naux, m, mglob, it, itrac,int_err(5),&
& nprows,npcols,me,mecol, n_row, n_col,listop, err_act
Character ::diagl, diagu
Logical, Parameter :: exchange=.True., noexchange=.False.
Integer, Parameter :: ione=1
Integer, Parameter :: irmax = 8
Integer :: itx, i, isvch, ich, icontxt
Logical :: do_renum_left
Logical, Parameter :: debug = .false.
Real(Kind(1.d0)), Parameter :: one=1.d0, zero=0.d0, epstol=1.d-35
Real(Kind(1.d0)) :: alpha, beta, rho, rho_old, rni, xni, bni, ani,bn2,&
& sigma, omega, tau
character(len=20) :: name,ch_err
info = 0
name = 'psb_dcgs'
call psb_erractionsave(err_act)
If (debug) Write(*,*) 'entering psb_dcgs'
icontxt = desc_a%matrix_data(psb_ctxt_)
Call blacs_gridinfo(icontxt,nprows,npcols,me,mecol)
If (debug) Write(*,*) 'psb_dcgs: from gridinfo',nprows,npcols,me
mglob = desc_a%matrix_data(m_)
n_row = desc_a%matrix_data(psb_n_row_)
n_col = desc_a%matrix_data(psb_n_col_)
If (Present(istop)) Then
listop = istop
Else
listop = 1
Endif
!
! listop = 1: normwise backward error, infinity norm
! listop = 2: ||r||/||b|| norm 2
!
!!$
!!$ If ((prec%prec < 0).Or.(prec%prec > 6) ) Then
!!$ Write(0,*) 'f90_cgstab: invalid iprec',prec%prec
!!$ If (Present(ierr)) ierr=-1
!!$ Return
!!$ Endif
if ((listop < 1 ).or.(listop > 2 ) ) then
write(0,*) 'psb_cgs: invalid istop',listop
info=5001
int_err=listop
err=info
call psb_errpush(info,name,i_err=int_err)
goto 9999
endif
naux=4*n_col
Allocate(aux(naux),stat=info)
Call psb_alloc(mglob,11,wwrk,desc_a,info)
Call psb_asb(wwrk,desc_a,info)
if (info.ne.0) Then
info=4011
call psb_errpush(info,name)
goto 9999
End If
q => wwrk(:,1)
qt => wwrk(:,2)
r => wwrk(:,3)
rt => wwrk(:,4)
p => wwrk(:,5)
v => wwrk(:,6)
uv => wwrk(:,7)
z => wwrk(:,8)
f => wwrk(:,9)
s => wwrk(:,10)
ww => wwrk(:,11)
If (Present(itmax)) Then
litmax = itmax
Else
litmax = 1000
Endif
If (Present(itrace)) Then
itrac = itrace
Else
itrac = -1
End If
! ensure global coherence for convergence checks.
Call blacs_get(icontxt,16,isvch)
ich = 1
Call blacs_set(icontxt,16,ich)
diagl = 'u'
diagu = 'u'
itx = 0
if (listop == 1) then
ani = psb_nrmi(a,desc_a,info)
bni = psb_amax(b,desc_a,info)
else if (listop == 2) then
bn2 = psb_nrm2(b,desc_a,info)
endif
if(info/=0)then
info=4011
call psb_errpush(info,name)
goto 9999
end if
restart: Do
!!$
!!$ r0 = b-ax0
!!$
If (itx.Ge.itmax) Exit restart
it = 0
Call psb_axpby(one,b,zero,r,desc_a,info)
Call psb_spmm(-one,a,x,one,r,desc_a,info,work=aux)
Call psb_axpby(one,r,zero,rt,desc_a,info)
if(info/=0)then
info=4011
call psb_errpush(info,name)
goto 9999
end if
rho = zero
If (debug) Write(*,*) 'on entry to amax: b: ',Size(b)
if (listop == 1) then
rni = psb_amax(r,desc_a,info)
xni = psb_amax(x,desc_a,info)
rerr = rni/(ani*xni+bni)
if (itrac /= -1) then
If (me == 0) Write(itrac,'(a,i4,5(2x,es10.4))') 'cgs: ',&
& itx,rerr,rni,bni,xni,ani
endif
else if (listop == 2) then
rni = psb_nrm2(r,desc_a,info)
rerr = rni/bn2
if (itrac /= -1) then
If (me == 0) Write(itrac,'(a,i4,3(2x,es10.4))') 'cgs: ',itx,rerr,rni,bn2
endif
endif
if(info/=0)then
info=4011
call psb_errpush(info,name)
goto 9999
end if
If (rerr<=eps) Then
Exit restart
End If
iteration: Do
it = it + 1
itx = itx + 1
If (debug) Write(*,*) 'iteration: ',itx
rho_old = rho
rho = psb_dot(rt,r,desc_a,info)
If (rho==zero) Then
If (debug) Write(0,*) 'cgs iteration breakdown r',rho
Exit iteration
Endif
If (it==1) Then
Call psb_axpby(one,r,zero,uv,desc_a,info)
Call psb_axpby(one,r,zero,p,desc_a,info)
Else
beta = (rho/rho_old)
Call psb_axpby(one,r,zero,uv,desc_a,info)
Call psb_axpby(beta,q,one,uv,desc_a,info)
Call psb_axpby(one,q,beta,p,desc_a,info)
Call psb_axpby(one,uv,beta,p,desc_a,info)
End If
Call psb_prcaply(prec,p,f,desc_a,info,work=aux)
Call psb_spmm(one,a,f,zero,v,desc_a,info,&
& work=aux)
sigma = psb_dot(rt,v,desc_a,info)
If (sigma==zero) Then
If (debug) Write(0,*) 'cgs iteration breakdown s1', sigma
Exit iteration
Endif
alpha = rho/sigma
Call psb_axpby(one,uv,zero,q,desc_a,info)
Call psb_axpby(-alpha,v,one,q,desc_a,info)
Call psb_axpby(one,uv,zero,s,desc_a,info)
Call psb_axpby(one,q,one,s,desc_a,info)
Call psb_prcaply(prec,s,z,desc_a,info,work=aux)
Call psb_axpby(alpha,z,one,x,desc_a,info)
Call psb_spmm(one,a,z,zero,qt,desc_a,info,&
& work=aux)
Call psb_axpby(-alpha,qt,one,r,desc_a,info)
if (listop == 1) then
rni = psb_amax(r,desc_a,info)
xni = psb_amax(x,desc_a,info)
rerr = rni/(ani*xni+bni)
if (itrac /= -1) then
If (me == 0) Write(itrac,'(a,i4,5(2x,es10.4))') 'cgs: ',&
& itx,rerr,rni,bni,xni,ani
endif
else if (listop == 2) then
rni = psb_nrm2(r,desc_a,info)
rerr = rni/bn2
if (itrac /= -1) then
If (me == 0) Write(itrac,'(a,i4,3(2x,es10.4))') 'cgs: ',&
& itx,rerr,rni,bn2
endif
endif
If (rerr<=eps) Then
Exit restart
End If
If (itx.Ge.itmax) Exit restart
End Do iteration
End Do restart
If (Present(err)) err=rerr
If (Present(iter)) iter = itx
If (rerr>eps) Then
Write(0,*) 'cgs failed to converge to ',eps,&
& ' in ',itx,' iterations '
End If
Deallocate(aux)
Call psb_dsfree(wwrk,desc_a,info)
! restore external global coherence behaviour
Call blacs_set(icontxt,16,isvch)
if(info/=0) then
call psb_errpush(info,name)
goto 9999
end if
call psb_erractionrestore(err_act)
return
9999 continue
call psb_erractionrestore(err_act)
if (err_act.eq.act_abort) then
call psb_error()
return
end if
return
End Subroutine psb_dcgs

@ -0,0 +1,365 @@
!!$ CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
!!$ C C
!!$ C References: C
!!$ C [1] Duff, I., Marrone, M., Radicati, G., and Vittoli, C. C
!!$ C Level 3 basic linear algebra subprograms for sparse C
!!$ C matrices: a user level interface C
!!$ C ACM Trans. Math. Softw., 23(3), 379-401, 1997. C
!!$ C C
!!$ C C
!!$ C [2] S. Filippone, M. Colajanni C
!!$ C PSBLAS: A library for parallel linear algebra C
!!$ C computation on sparse matrices C
!!$ C ACM Trans. on Math. Softw., 26(4), 527-550, Dec. 2000. C
!!$ C C
!!$ C [3] M. Arioli, I. Duff, M. Ruiz C
!!$ C Stopping criteria for iterative solvers C
!!$ C SIAM J. Matrix Anal. Appl., Vol. 13, pp. 138-144, 1992 C
!!$ C C
!!$ C C
!!$ C [4] R. Barrett et al C
!!$ C Templates for the solution of linear systems C
!!$ C SIAM, 1993
!!$ C C
!!$ C C
!!$ CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
! File: psb_dcgstab.f90
!
! Subroutine: psb_dcgstab
! This subroutine implements the CG Stabilized method.
!
! Parameters:
! a - type(<psb_dspmat_type>). The sparse matrix containing A.
! prec - type(<psb_prec_type>). The data structure containing the preconditioner.
! b - real,dimension(:). The right hand side.
! x - real,dimension(:). The vector of unknowns.
! eps - real. The error tolerance.
! desc_a - type(<psb_desc_type>). The communication descriptor.
! info - integer. Eventually returns an error code.
! itmax - integer(optional). The maximum number of iterations.
! iter - integer(optional). The number of iterations performed.
! err - real(optional). The error on return.
! itrace - integer(optional). The unit to write messages onto.
! istop - integer(optional). The stopping criterium.
!
Subroutine psb_dcgstab(a,prec,b,x,eps,desc_a,info,&
&itmax,iter,err,itrace, istop)
use psb_serial_mod
use psb_descriptor_type
use psb_prec_type
use psb_psblas_mod
use psb_tools_mod
use psb_const_mod
use psb_prec_mod
use psb_error_mod
Implicit None
!!$ parameters
Type(psb_dspmat_type), Intent(in) :: a
Type(psb_dprec_type), Intent(in) :: prec
Type(psb_desc_type), Intent(in) :: desc_a
Real(Kind(1.d0)), Intent(in) :: b(:)
Real(Kind(1.d0)), Intent(inout) :: x(:)
Real(Kind(1.d0)), Intent(in) :: eps
integer, intent(out) :: info
Integer, Optional, Intent(in) :: itmax, itrace, istop
Integer, Optional, Intent(out) :: iter
Real(Kind(1.d0)), Optional, Intent(out) :: err
!!$ Local data
Real(Kind(1.d0)), Pointer :: aux(:),wwrk(:,:)
Real(Kind(1.d0)), Pointer :: q(:),&
& r(:), p(:), v(:), s(:), t(:), z(:), f(:)
Integer, Pointer :: iperm(:), ipnull(:), ipsave(:)
Real(Kind(1.d0)) ::rerr
Integer ::litmax, liter, naux, m, mglob, it,itrac,&
& nprows,npcols,me,mecol, n_row, n_col
Character ::diagl, diagu
Logical, Parameter :: debug = .false.
Logical, Parameter :: exchange=.True., noexchange=.False., debug1 = .False.
Integer, Parameter :: ione=1
Integer, Parameter :: irmax = 8
Integer :: itx, i, isvch, ich, icontxt, err_act, int_err(5)
Integer :: listop
Logical :: do_renum_left
Real(Kind(1.d0)), Parameter :: one=1.d0, zero=0.d0, epstol=1.d-35
Real(Kind(1.d0)) :: alpha, beta, rho, rho_old, rni, xni, bni, ani,&
& sigma, omega, tau, rn0, bn2
!!$ Integer istpb, istpe, ifctb, ifcte, imerr, irank, icomm,immb,imme
!!$ Integer mpe_log_get_event_number,mpe_Describe_state,mpe_log_event
character(len=20) :: name,ch_err
info = 0
name = 'psb_dcgstab'
call psb_erractionsave(err_act)
If (debug) Write(*,*) 'Entering PSB_DCGSTAB',present(istop)
icontxt = desc_a%MATRIX_DATA(CTXT_)
CALL BLACS_GRIDINFO(icontxt,NPROWS,NPCOLS,ME,MECOL)
if (debug) write(*,*) 'PSB_DCGSTAB: From GRIDINFO',nprows,npcols,me
mglob = desc_a%matrix_data(m_)
n_row = desc_a%matrix_data(psb_n_row_)
n_col = desc_a%matrix_data(psb_n_col_)
If (Present(istop)) Then
listop = istop
Else
listop = 1
Endif
!
! LISTOP = 1: Normwise backward error, infinity norm
! LISTOP = 2: ||r||/||b|| norm 2
!
If ((prec%prec < min_prec_).Or.(prec%prec > max_prec_) ) Then
Write(0,*) 'PSB_CGSTAB: Invalid IPREC',prec%prec
info=5002
int_err(1)=prec%prec
err=info
call psb_errpush(info,name,i_err=int_err)
goto 9999
Endif
if ((listop < 1 ).or.(listop > 2 ) ) then
write(0,*) 'psb_bicgstab: invalid istop',listop
info=5001
int_err(1)=listop
err=info
call psb_errpush(info,name,i_err=int_err)
goto 9999
endif
naux=6*n_col
allocate(aux(naux),stat=info)
call psb_alloc(mglob,8,wwrk,desc_a,info)
call psb_asb(wwrk,desc_a,info)
if (info /= 0) then
info=4011
call psb_errpush(info,name)
goto 9999
End If
Q => WWRK(:,1)
R => WWRK(:,2)
P => WWRK(:,3)
V => WWRK(:,4)
F => WWRK(:,5)
S => WWRK(:,6)
T => WWRK(:,7)
Z => WWRK(:,8)
If (Present(itmax)) Then
litmax = itmax
Else
litmax = 1000
Endif
If (Present(itrace)) Then
itrac = itrace
Else
itrac = -1
End If
diagl = 'U'
diagu = 'U'
! Ensure global coherence for convergence checks.
Call blacs_get(icontxt,16,isvch)
ich = 1
Call blacs_set(icontxt,16,ich)
itx = 0
If (listop == 1) Then
ani = psb_nrmi(a,desc_a,info)
bni = psb_amax(b,desc_a,info)
Else If (listop == 2) Then
bn2 = psb_nrm2(b,desc_a,info)
Endif
if (info /= 0) Then
info=4011
call psb_errpush(info,name)
goto 9999
End If
restart: Do
!!$
!!$ r0 = b-Ax0
!!$
If (itx >= itmax) Exit restart
it = 0
Call psb_axpby(one,b,zero,r,desc_a,info)
!!$ imerr = MPE_Log_event( immb, 0, "st SPMM" )
Call psb_spmm(-one,a,x,one,r,desc_a,info,work=aux)
!!$ imerr = MPE_Log_event( imme, 0, "ed SPMM" )
Call psb_axpby(one,r,zero,q,desc_a,info)
if (info /= 0) Then
info=4011
call psb_errpush(info,name)
goto 9999
End If
rho = zero
If (debug) Write(*,*) 'On entry to AMAX: B: ',Size(b)
!
! Must always provide norm of R into RNI below for first check on
! residual
!
If (listop == 1) Then
rni = psb_amax(r,desc_a,info)
xni = psb_amax(x,desc_a,info)
Else If (listop == 2) Then
rni = psb_nrm2(r,desc_a,info)
Endif
if (info /= 0) Then
info=4011
call psb_errpush(info,name)
goto 9999
End If
If (itx == 0) Then
rn0 = rni
End If
If (rn0 == 0.d0 ) Then
If (itrac /= -1) Then
If (me == 0) Write(itrac,*) 'BiCGSTAB: ',itx,rn0
Endif
Exit restart
End If
If (listop == 1) Then
xni = psb_amax(x,desc_a,info)
rerr = rni/(ani*xni+bni)
If (itrac /= -1) Then
If (me == 0) Write(itrac,'(a,i4,5(2x,es10.4))') 'bicgstab: ',itx,rerr,rni,bni,&
&xni,ani
Endif
Else If (listop == 2) Then
rerr = rni/bn2
If (itrac /= -1) Then
If (me == 0) Write(itrac,'(a,i4,3(2x,es10.4))') 'bicgstab: ',itx,rerr,rni,bn2
Endif
Endif
if (info /= 0) Then
info=4011
call psb_errpush(info,name)
goto 9999
End If
If (rerr<=eps) Then
Exit restart
End If
iteration: Do
it = it + 1
itx = itx + 1
If (debug) Write(*,*) 'Iteration: ',itx
rho_old = rho
rho = psb_dot(q,r,desc_a,info)
If (rho==zero) Then
If (debug) Write(0,*) 'Bi-CGSTAB Itxation breakdown R',rho
Exit iteration
Endif
If (it==1) Then
Call psb_axpby(one,r,zero,p,desc_a,info)
Else
beta = (rho/rho_old)*(alpha/omega)
Call psb_axpby(-omega,v,one,p,desc_a,info)
Call psb_axpby(one,r,beta,p,desc_a,info)
End If
Call psb_prcaply(prec,p,f,desc_a,info,work=aux)
Call psb_spmm(one,a,f,zero,v,desc_a,info,&
& work=aux)
sigma = psb_dot(q,v,desc_a,info)
If (sigma==zero) Then
If (debug) Write(0,*) 'Bi-CGSTAB Iteration breakdown S1', sigma
Exit iteration
Endif
alpha = rho/sigma
Call psb_axpby(one,r,zero,s,desc_a,info)
Call psb_axpby(-alpha,v,one,s,desc_a,info)
Call psb_prcaply(prec,s,z,desc_a,info,work=aux)
Call psb_spmm(one,a,z,zero,t,desc_a,info,&
& work=aux)
sigma = psb_dot(t,t,desc_a,info)
If (sigma==zero) Then
If (debug) Write(0,*) 'BI-CGSTAB ITERATION BREAKDOWN S2', sigma
Exit iteration
Endif
tau = psb_dot(t,s,desc_a,info)
omega = tau/sigma
If (omega==zero) Then
If (debug) Write(0,*) 'BI-CGSTAB ITERATION BREAKDOWN O',omega
Exit iteration
Endif
Call psb_axpby(alpha,f,one,x,desc_a,info)
Call psb_axpby(omega,z,one,x,desc_a,info)
Call psb_axpby(one,s,zero,r,desc_a,info)
Call psb_axpby(-omega,t,one,r,desc_a,info)
If (listop == 1) Then
rni = psb_amax(r,desc_a,info)
xni = psb_amax(x,desc_a,info)
rerr = rni/(ani*xni+bni)
If (itrac /= -1) Then
If (me == 0) Write(itrac,'(a,i4,5(2x,es10.4))') 'bicgstab: ',itx,rerr,rni,bni,&
&xni,ani
Endif
Else If (listop == 2) Then
rni = psb_nrm2(r,desc_a,info)
rerr = rni/bn2
If (itrac /= -1) Then
If (me == 0) Write(itrac,'(a,i4,3(2x,es10.4)))') 'bicgstab: ',itx,rerr,rni,bn2
Endif
Endif
If (rerr<=eps) Then
Exit restart
End If
If (itx.Ge.itmax) Exit restart
End Do iteration
End Do restart
If (Present(err)) err=rerr
If (Present(iter)) iter = itx
If (rerr>eps) Then
Write(0,*) 'BI-CGSTAB FAILED TO CONVERGE TO ',EPS,&
& ' IN ',ITX,' ITERATIONS '
End If
Deallocate(aux)
Call psb_free(wwrk,desc_a,info)
! restore external global coherence behaviour
Call blacs_set(icontxt,16,isvch)
!!$ imerr = MPE_Log_event( istpe, 0, "ed CGSTAB" )
if(info/=0) then
call psb_errpush(info,name)
goto 9999
end if
call psb_erractionrestore(err_act)
return
9999 continue
call psb_erractionrestore(err_act)
if (err_act.eq.act_abort) then
call psb_error()
return
end if
return
End Subroutine psb_dcgstab

@ -0,0 +1,396 @@
!!$ CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
!!$ C C
!!$ C References: C
!!$ C [1] Duff, I., Marrone, M., Radicati, G., and Vittoli, C. C
!!$ C Level 3 basic linear algebra subprograms for sparse C
!!$ C matrices: a user level interface C
!!$ C ACM Trans. Math. Softw., 23(3), 379-401, 1997. C
!!$ C C
!!$ C C
!!$ C [2] S. Filippone, M. Colajanni C
!!$ C PSBLAS: A library for parallel linear algebra C
!!$ C computation on sparse matrices C
!!$ C ACM Trans. on Math. Softw., 26(4), 527-550, Dec. 2000. C
!!$ C C
!!$ C [3] M. Arioli, I. Duff, M. Ruiz C
!!$ C Stopping criteria for iterative solvers C
!!$ C SIAM J. Matrix Anal. Appl., Vol. 13, pp. 138-144, 1992 C
!!$ C C
!!$ C C
!!$ C [4] R. Barrett et al C
!!$ C Templates for the solution of linear systems C
!!$ C SIAM, 1993 C
!!$ C C
!!$ C C
!!$ C [5] G. Sleijpen, D. Fokkema C
!!$ C BICGSTAB(L) for linear equations involving unsymmetric C
!!$ C matrices with complex spectrum C
!!$ C Electronic Trans. on Numer. Analysis, Vol. 1, pp. 11-32, C
!!$ C Sep. 1993 C
!!$ C C
!!$ C C
!!$ CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
! File: psb_dcgstabl.f90
!
! Subroutine: psb_dcgstabl
!
! Parameters:
! a - type(<psb_dspmat_type>). The sparse matrix containing A.
! prec - type(<psb_prec_type>). The data structure containing the preconditioner.
! b - real,dimension(:). The right hand side.
! x - real,dimension(:). The vector of unknowns.
! eps - real. The error tolerance.
! desc_a - type(<psb_desc_type>). The communication descriptor.
! info - integer. Eventually returns an error code.
! itmax - integer(optional). The maximum number of iterations.
! iter - integer(optional). The number of iterations performed.
! err - real(optional). The error on return.
! itrace - integer(optional). The unit to write messages onto.
! istop - integer(optional). The stopping criterium.
!
Subroutine psb_dcgstabl(a,prec,b,x,eps,desc_a,info,&
&itmax,iter,err,itrace,irst,istop)
use psb_serial_mod
use psb_descriptor_type
use psb_prec_type
use psb_psblas_mod
use psb_tools_mod
use psb_const_mod
use psb_prec_mod
use psb_error_mod
implicit none
!!$ parameters
Type(psb_dspmat_type), Intent(in) :: a
Type(psb_dprec_type), Intent(in) :: prec
Type(psb_desc_type), Intent(in) :: desc_a
Real(Kind(1.d0)), Intent(in) :: b(:)
Real(Kind(1.d0)), Intent(inout) :: x(:)
Real(Kind(1.d0)), Intent(in) :: eps
integer, intent(out) :: info
Integer, Optional, Intent(in) :: itmax, itrace, irst,istop
Integer, Optional, Intent(out) :: iter
Real(Kind(1.d0)), Optional, Intent(out) :: err
!!$ local data
Real(Kind(1.d0)), Pointer :: aux(:),wwrk(:,:)
Real(Kind(1.d0)), Pointer :: ww(:), q(:), r(:), rt0(:), p(:), v(:), &
& s(:), t(:), z(:), f(:), uh(:,:), rh(:,:), &
& gamma(:), gamma1(:), gamma2(:), taum(:,:), sigma(:),&
&pv1(:), pv2(:), pm1(:,:), pm2(:,:)
Integer, Pointer :: iperm(:), ipnull(:), ipsave(:)
Real(Kind(1.d0)) ::rerr
Integer ::litmax, liter, naux, m, mglob, it, itrac,&
& nprows,npcols,me,mecol, n_row, n_col, nl, err_act
Character ::diagl, diagu
Logical, Parameter :: exchange=.True., noexchange=.False.
Integer, Parameter :: ione=1
Integer, Parameter :: irmax = 8
Integer :: itx, i, isvch, ich, icontxt,listop,j, int_err(5)
Logical :: do_renum_left
Real(Kind(1.d0)), Parameter :: one=1.d0, zero=0.d0, epstol=1.d-35
Logical, Parameter :: debug = .False.
Real(Kind(1.d0)) :: alpha, beta, rho, rho_old, rni, xni, bni, ani,bn2,&
& omega, tau
character(len=20) :: name,ch_err
info = 0
name = 'psb_dcgstabl'
call psb_erractionsave(err_act)
If (debug) Write(0,*) 'entering psb_dbicgstabl'
icontxt = desc_a%matrix_data(psb_ctxt_)
Call blacs_gridinfo(icontxt,nprows,npcols,me,mecol)
If (debug) Write(0,*) 'psb_dbicgstabl: from gridinfo',nprows,npcols,me
mglob = desc_a%matrix_data(m_)
n_row = desc_a%matrix_data(psb_n_row_)
n_col = desc_a%matrix_data(psb_n_col_)
if (present(istop)) then
listop = istop
else
listop = 1
endif
!
! LISTOP = 1: Normwise backward error, infinity norm
! LISTOP = 2: ||r||/||b|| norm 2
!
if ((listop < 1 ).or.(listop > 2 ) ) then
write(0,*) 'psb_bicgstabl: invalid istop',listop
info=5001
int_err=listop
err=info
call psb_errpush(info,name,i_err=int_err)
goto 9999
endif
If (Present(itmax)) Then
litmax = itmax
Else
litmax = 1000
Endif
If (Present(itrace)) Then
itrac = itrace
Else
itrac = -1
End If
If (Present(irst)) Then
nl = irst
If (debug) Write(0,*) 'present: irst: ',irst,nl
Else
nl = 1
If (debug) Write(0,*) 'not present: irst: ',irst,nl
Endif
naux=4*n_col
Allocate(aux(naux),gamma(0:nl),gamma1(nl),&
&gamma2(nl),taum(nl,nl),sigma(nl), stat=info)
If (info.Ne.0) Then
info=4000
call psb_errpush(info,name)
goto 9999
End If
Call psb_alloc(mglob,10,wwrk,desc_a,info)
Call psb_alloc(mglob,nl+1,uh,desc_a,info,js=0)
Call psb_alloc(mglob,nl+1,rh,desc_a,info,js=0)
Call psb_asb(wwrk,desc_a,info)
Call psb_asb(uh,desc_a,info)
Call psb_asb(rh,desc_a,info)
if (info.ne.0) Then
info=4011
call psb_errpush(info,name)
goto 9999
End If
q => wwrk(:,1)
r => wwrk(:,2)
p => wwrk(:,3)
v => wwrk(:,4)
f => wwrk(:,5)
s => wwrk(:,6)
t => wwrk(:,7)
z => wwrk(:,8)
ww => wwrk(:,9)
rt0 => wwrk(:,10)
! ensure global coherence for convergence checks.
Call blacs_get(icontxt,16,isvch)
ich = 1
Call blacs_set(icontxt,16,ich)
if (listop == 1) then
ani = psb_nrmi(a,desc_a,info)
bni = psb_amax(b,desc_a,info)
else if (listop == 2) then
bn2 = psb_nrm2(b,desc_a,info)
endif
if (info.ne.0) Then
info=4011
call psb_errpush(info,name)
goto 9999
End If
diagl = 'u'
diagu = 'u'
itx = 0
restart: Do
!!$
!!$ r0 = b-ax0
!!$
If (debug) Write(0,*) 'restart: ',itx,it
If (itx.Ge.itmax) Exit restart
it = 0
Call psb_axpby(one,b,zero,r,desc_a,info)
Call psb_spmm(-one,a,x,one,r,desc_a,info,work=aux)
call psb_prcaply(prec,r,desc_a,info)
Call psb_axpby(one,r,zero,rt0,desc_a,info)
Call psb_axpby(one,r,zero,rh(:,0),desc_a,info)
Call psb_axpby(zero,r,zero,uh(:,0),desc_a,info)
if (info.ne.0) Then
info=4011
call psb_errpush(info,name)
goto 9999
End If
rho = one
alpha = zero
omega = one
If (debug) Write(0,*) 'on entry to amax: b: ',Size(b)
if (listop == 1) then
rni = psb_amax(r,desc_a,info)
xni = psb_amax(x,desc_a,info)
rerr = rni/(ani*xni+bni)
if (itrac /= -1) then
If (me == 0) Write(itrac,'(a,i4,5(2x,es10.4))') 'bicgstab(l): ',&
& itx,rerr,rni,bni,xni,ani
endif
else if (listop == 2) then
rni = psb_nrm2(r,desc_a,info)
rerr = rni/bn2
if (itrac /= -1) then
If (me == 0) Write(itrac,'(a,i4,3(2x,es10.4))') 'bicgstab(l): ',&
& itx,rerr,rni,bn2
endif
endif
if (info.ne.0) Then
info=4011
call psb_errpush(info,name)
goto 9999
End If
If (rerr<=eps) Then
Exit restart
End If
iteration: Do
it = it + nl
itx = itx + nl
rho = -omega*rho
If (debug) Write(0,*) 'iteration: ',itx, rho,rh(1,0)
Do j = 0, nl -1
If (debug) Write(0,*) 'bicg part: ',j, nl
rho_old = rho
rho = psb_dot(rh(:,j),rt0,desc_a,info)
If (rho==zero) Then
If (debug) Write(0,*) 'bi-cgstab iteration breakdown r',rho
Exit iteration
Endif
beta = alpha*rho/rho_old
If (debug) Write(0,*) 'bicg part: ',alpha,beta,rho,rho_old
rho_old = rho
Call psb_axpby(one,rh(:,0:j),-beta,uh(:,0:j),desc_a,info)
If (debug) Write(0,*) 'bicg part: ',rh(1,0),beta
Call psb_spmm(one,a,uh(:,j),zero,uh(:,j+1),desc_a,info,work=aux)
call psb_prcaply(prec,uh(:,j+1),desc_a,info)
gamma(j) = psb_dot(uh(:,j+1),rt0,desc_a,info)
If (gamma(j)==zero) Then
If (debug) Write(0,*) 'bi-cgstab iteration breakdown s2',gamma(j)
Exit iteration
Endif
alpha = rho/gamma(j)
If (debug) Write(0,*) 'bicg part: alpha=r/g ',alpha,rho,gamma(j)
Call psb_axpby(-alpha,uh(:,1:j+1),one,rh(:,0:j),desc_a,info)
Call psb_axpby(alpha,uh(:,0),one,x,desc_a,info)
Call psb_spmm(one,a,rh(:,j),zero,rh(:,j+1),desc_a,info,work=aux)
call psb_prcaply(prec,rh(:,j+1),desc_a,info)
Enddo
Do j=1, nl
If (debug) Write(0,*) 'mod g-s part: ',j, nl,rh(1,0)
Do i=1, j-1
taum(i,j) = psb_dot(rh(:,i),rh(:,j),desc_a,info)
taum(i,j) = taum(i,j)/sigma(i)
Call psb_axpby(-taum(i,j),rh(:,i),one,rh(:,j),desc_a,info)
Enddo
If (debug) Write(0,*) 'mod g-s part: dot prod '
sigma(j) = psb_dot(rh(:,j),rh(:,j),desc_a,info)
gamma1(j) = psb_dot(rh(:,0),rh(:,j),desc_a,info)
If (debug) Write(0,*) 'mod g-s part: gamma1 ', &
&gamma1(j), sigma(j)
gamma1(j) = gamma1(j)/sigma(j)
Enddo
gamma(nl) = gamma1(nl)
omega = gamma(nl)
Do j=nl-1,1,-1
gamma(j) = gamma1(j)
Do i=j+1,nl
gamma(j) = gamma(j) - taum(j,i) * gamma(i)
Enddo
Enddo
If (debug) Write(0,*) 'first solve: ', gamma(:)
Do j=1,nl-1
gamma2(j) = gamma(j+1)
Do i=j+1,nl-1
gamma2(j) = gamma2(j) + taum(j,i) * gamma(i+1)
Enddo
Enddo
If (debug) Write(0,*) 'second solve: ', gamma(:)
Call psb_axpby(gamma(1),rh(:,0),one,x,desc_a,info)
Call psb_axpby(-gamma1(nl),rh(:,nl),one,rh(:,0),desc_a,info)
Call psb_axpby(-gamma(nl),uh(:,nl),one,uh(:,0),desc_a,info)
Do j=1, nl-1
Call psb_axpby(-gamma(j),uh(:,j),one,uh(:,0),desc_a,info)
Call psb_axpby(gamma2(j),rh(:,j),one,x,desc_a,info)
Call psb_axpby(-gamma1(j),rh(:,j),one,rh(:,0),desc_a,info)
Enddo
if (listop == 1) then
rni = psb_amax(rh(:,0),desc_a,info)
xni = psb_amax(x,desc_a,info)
rerr = rni/(ani*xni+bni)
if (itrac /= -1) then
If (me == 0) Write(itrac,'(a,i4,5(2x,es10.4))') 'bicgstab(l): ',&
& itx,rerr,rni,bni,xni,ani
endif
else if (listop == 2) then
rni = psb_nrm2(rh(:,0),desc_a,info)
rerr = rni/bn2
if (itrac /= -1) then
If (me == 0) Write(itrac,'(a,i4,3(2x,es10.4))') 'bicgstab(l): ',&
& itx,rerr,rni,bn2
endif
endif
If (rerr<=eps) Then
Exit restart
End If
If (itx.Ge.itmax) Exit restart
End Do iteration
End Do restart
If (Present(err)) err=rerr
If (Present(iter)) iter = itx
If (rerr>eps) Then
Write(0,*) 'bi-cgstabl failed to converge to ',eps,&
& ' in ',itx,' iterations '
End If
Deallocate(aux)
Call psb_free(wwrk,desc_a,info)
Call psb_free(uh,desc_a,info)
Call psb_free(rh,desc_a,info)
! restore external global coherence behaviour
Call blacs_set(icontxt,16,isvch)
if(info/=0) then
call psb_errpush(info,name)
goto 9999
end if
call psb_erractionrestore(err_act)
return
9999 continue
call psb_erractionrestore(err_act)
if (err_act.eq.act_abort) then
call psb_error()
return
end if
return
End Subroutine psb_dcgstabl

@ -0,0 +1,356 @@
!!$ CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
!!$ C C
!!$ C References: C
!!$ C [1] Duff, I., Marrone, M., Radicati, G., and Vittoli, C. C
!!$ C Level 3 basic linear algebra subprograms for sparse C
!!$ C matrices: a user level interface C
!!$ C ACM Trans. Math. Softw., 23(3), 379-401, 1997. C
!!$ C C
!!$ C C
!!$ C [2] S. Filippone, M. Colajanni C
!!$ C PSBLAS: A library for parallel linear algebra C
!!$ C computation on sparse matrices C
!!$ C ACM Trans. on Math. Softw., 26(4), 527-550, Dec. 2000. C
!!$ C C
!!$ C [3] M. Arioli, I. Duff, M. Ruiz C
!!$ C Stopping criteria for iterative solvers C
!!$ C SIAM J. Matrix Anal. Appl., Vol. 13, pp. 138-144, 1992 C
!!$ C C
!!$ C C
!!$ C [4] R. Barrett et al C
!!$ C Templates for the solution of linear systems C
!!$ C SIAM, 1993 C
!!$ C C
!!$ C C
!!$ C [5] G. Sleijpen, D. Fokkema C
!!$ C BICGSTAB(L) for linear equations involving unsymmetric C
!!$ C matrices with complex spectrum C
!!$ C Electronic Trans. on Numer. Analysis, Vol. 1, pp. 11-32, C
!!$ C Sep. 1993 C
!!$ C C
!!$ C C
!!$ CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
! File: psb_dgmresr.f90
!
! Subroutine: psb_dgmres
! This subroutine implements the restarted GMRES method.
!
! Parameters:
! a - type(<psb_dspmat_type>). The sparse matrix containing A.
! prec - type(<psb_prec_type>). The data structure containing the preconditioner.
! b - real,dimension(:). The right hand side.
! x - real,dimension(:). The vector of unknowns.
! eps - real. The error tolerance.
! desc_a - type(<psb_desc_type>). The communication descriptor.
! info - integer. Eventually returns an error code.
! itmax - integer(optional). The maximum number of iterations.
! iter - integer(optional). The number of iterations performed.
! err - real(optional). The error on return.
! itrace - integer(optional). The unit to write messages onto.
! irst - integer(optional). The restart value.
! istop - integer(optional). The stopping criterium.
!
Subroutine psb_dgmresr(a,prec,b,x,eps,desc_a,info,&
&itmax,iter,err,itrace,irst,istop)
use psb_serial_mod
use psb_descriptor_type
use psb_prec_type
use psb_psblas_mod
use psb_tools_mod
use psb_const_mod
use psb_prec_mod
use psb_error_mod
implicit none
!!$ Parameters
Type(psb_dspmat_type), Intent(in) :: a
Type(psb_dprec_type), Intent(in) :: prec
Type(psb_desc_type), Intent(in) :: desc_a
Real(Kind(1.d0)), Intent(in) :: b(:)
Real(Kind(1.d0)), Intent(inout) :: x(:)
Real(Kind(1.d0)), Intent(in) :: eps
integer, intent(out) :: info
Integer, Optional, Intent(in) :: itmax, itrace, irst,istop
Integer, Optional, Intent(out) :: iter
Real(Kind(1.d0)), Optional, Intent(out) :: err
!!$ local data
Real(Kind(1.d0)), Pointer :: aux(:),wwrk(:,:)
Real(Kind(1.d0)), Pointer :: w(:), q(:), r(:), rt0(:), p(:), v(:,:), &
& c(:),s(:), t(:), z(:), f(:), uh(:,:), h(:,:), rs(:),&
& gamma(:), gamma1(:), gamma2(:), taum(:,:), sigma(:),&
&pv1(:), pv2(:), pm1(:,:), rr(:,:)
Integer, Pointer :: iperm(:), ipnull(:), ipsave(:), ierrv(:)
Real(Kind(1.d0)) :: rerr, scal, gm
Integer ::litmax, liter, naux, m, mglob, it,k, itrac,&
& nprows,npcols,me,mecol, n_row, n_col, nl, int_err(5)
Character ::diagl, diagu
Logical, Parameter :: exchange=.True., noexchange=.False.
Integer, Parameter :: ione=1
Integer, Parameter :: irmax = 8
Integer :: itx, i, isvch, ich, icontxt,listop, err_act
Logical :: do_renum_left,inner_stop
Real(Kind(1.d0)), Parameter :: one=1.d0, zero=0.d0, epstol=1.d-35
Logical, Parameter :: debug = .false.
Real(Kind(1.d0)) :: alpha, beta, rho, rho_old, rni, xni, bni, ani,bn2,&
& omega, tau
real(kind(1.d0)), external :: dnrm2
character(len=20) :: name,ch_err
info = 0
name = 'psb_dgmres'
call psb_erractionsave(err_act)
If (debug) Write(0,*) 'entering psb_dgmres'
icontxt = desc_a%matrix_data(psb_ctxt_)
Call blacs_gridinfo(icontxt,nprows,npcols,me,mecol)
If (debug) Write(0,*) 'psb_dgmres: from gridinfo',nprows,npcols,me
mglob = desc_a%matrix_data(m_)
n_row = desc_a%matrix_data(psb_n_row_)
n_col = desc_a%matrix_data(psb_n_col_)
if (present(istop)) then
listop = istop
else
listop = 1
endif
!
! LISTOP = 1: Normwise backward error, infinity norm
! LISTOP = 2: ||r||/||b|| norm 2
!
if ((listop < 1 ).or.(listop > 2 ) ) then
write(0,*) 'psb_dgmres: invalid istop',listop
info=5001
int_err(1)=listop
err=info
call psb_errpush(info,name,i_err=int_err)
goto 9999
endif
If (Present(itmax)) Then
litmax = itmax
Else
litmax = 1000
Endif
If (Present(itrace)) Then
itrac = itrace
Else
itrac = -1
End If
If (Present(irst)) Then
nl = irst
If (debug) Write(0,*) 'present: irst: ',irst,nl
Else
nl = 10
If (debug) Write(0,*) 'not present: irst: ',irst,nl
Endif
naux=4*n_col
Allocate(aux(naux),h(nl+1,nl+1),rr(nl+1,nl+1),&
&c(nl+1),s(nl+1),rs(nl+1), stat=info)
If (info.Ne.0) Then
info = 4000
call psb_errpush(info,name)
goto 9999
End If
Call psb_dsall(mglob,nl+1,v,desc_a,info)
Call psb_dsall(mglob,w,desc_a,info)
Call psb_dsasb(v,desc_a,info)
Call psb_dsasb(w,desc_a,info)
if (info.ne.0) Then
info=4011
call psb_errpush(info,name)
goto 9999
End If
! ensure global coherence for convergence checks.
Call blacs_get(icontxt,16,isvch)
ich = 1
Call blacs_set(icontxt,16,ich)
if (listop == 1) then
ani = psb_nrmi(a,desc_a,info)
bni = psb_amax(b,desc_a,info)
else if (listop == 2) then
bn2 = psb_nrm2(b,desc_a,info)
endif
if (info.ne.0) Then
info=4011
call psb_errpush(info,name)
goto 9999
End If
diagl = 'u'
diagu = 'u'
itx = 0
restart: Do
!!$
!!$ r0 = b-ax0
!!$
If (debug) Write(0,*) 'restart: ',itx,it
it = 0
Call psb_axpby(one,b,zero,v(:,1),desc_a,info)
Call psb_spmm(-one,a,x,one,v(:,1),desc_a,info,work=aux)
call psb_prcaply(prec,v(:,1),desc_a,info)
rs(1) = psb_nrm2(v(:,1),desc_a,info)
if (info.ne.0) Then
info=4011
call psb_errpush(info,name)
goto 9999
End If
scal=one/rs(1)
If (debug) Write(0,*) 'on entry to amax: b: ',Size(b),rs(1),scal
if (listop == 1) then
rni = psb_amax(v(:,1),desc_a,info)
xni = psb_amax(x,desc_a,info)
rerr = rni/(ani*xni+bni)
if (itrac /= -1) then
If (me == 0) Write(itrac,'(a,i4,5(2x,es10.4))') 'gmresr(l): ',&
& itx,rerr,rni,bni,xni,ani
endif
else if (listop == 2) then
rni = psb_nrm2(v(:,1),desc_a,info)
rerr = rni/bn2
if (itrac /= -1) then
If (me == 0) Write(itrac,'(a,i4,3(2x,es10.4))') 'gmresr(l): ',&
& itx,rerr,rni,bn2
endif
endif
if (info.ne.0) Then
info=4011
call psb_errpush(info,name)
goto 9999
End If
If (rerr<=eps) Then
Exit restart
End If
If (itx.Ge.itmax) Exit restart
v(:,1) = v(:,1) * scal
inner: Do i=1,nl
itx = itx + 1
Call psb_spmm(one,a,v(:,i),zero,w,desc_a,info,work=aux)
call psb_prcaply(prec,w,desc_a,info)
do k = 1, i
h(k,i) = psb_dot(v(:,k),w,desc_a,info)
call psb_axpby(-h(k,i),v(:,k),one,w,desc_a,info)
end do
h(i+1,i) = psb_nrm2(w,desc_a,info)
scal=one/h(i+1,i)
call psb_axpby(scal,w,zero,v(:,i+1),desc_a,info)
do k=2,i
rr(k-1,i) = c(k-1)*h(k-1,i) + s(k-1)*h(k,i)
rr(k,i) = -s(k-1)*h(k-1,i) + c(k-1)*h(k,i)
enddo
gm = safe_dn2(h(i,i),h(i+1,i))
if (debug) write(0,*) 'GM : ',gm
gm = max(gm,epstol)
c(i) = h(i,i)/gm
s(i) = h(i+1,i)/gm
rs(i+1) = -s(i)*rs(i)
rs(i) = c(i)*rs(i)
rr(i,i) = c(i)*h(i,i)+s(i)*h(i+1,i)
if (listop == 1) then
rni = abs(rs(i+1))
xni = psb_amax(x,desc_a,info)
rerr = rni/(ani*xni+bni)
if (itrac /= -1) then
If (me == 0) Write(itrac,'(a,i4,5(2x,es10.4))') 'gmresr(l): ',&
& itx,rerr,rni,bni,xni,ani
endif
else if (listop == 2) then
rni = abs(rs(i+1))
rerr = rni/bn2
if (itrac /= -1) then
If (me == 0) Write(itrac,'(a,i4,3(2x,es10.4))') 'gmresr(l): ',&
& itx,rerr,rni,bn2
endif
endif
if (rerr < eps ) then
call dtrsm('l','u','n','n',i,1,one,rr,size(rr,1),rs,nl)
if (debug) write(0,*) 'Rebuild x-> RS:',rs(21:nl)
do k=1, i
call psb_axpby(rs(k),v(:,k),one,x,desc_a,info)
end do
exit restart
end if
end Do inner
if (debug) write(0,*) 'Before DTRSM :',rs(1:nl)
call dtrsm('l','u','n','n',nl,1,one,rr,size(rr,1),rs,nl)
if (debug) write(0,*) 'Rebuild x-> RS:',rs(21:nl)
do k=1, nl
call psb_axpby(rs(k),v(:,k),one,x,desc_a,info)
end do
End Do restart
If (Present(err)) err=rerr
If (Present(iter)) iter = itx
If (rerr>eps) Then
Write(0,*) 'gmresr(l) failed to converge to ',eps,&
& ' in ',itx,' iterations '
End If
Deallocate(aux,h,c,s,rs,rr, stat=info)
Call psb_free(v,desc_a,info)
Call psb_free(w,desc_a,info)
! restore external global coherence behaviour
Call blacs_set(icontxt,16,isvch)
if (info /= 0) then
info=4011
call psb_errpush(info,name)
goto 9999
end if
call psb_erractionrestore(err_act)
return
9999 continue
call psb_erractionrestore(err_act)
if (err_act.eq.act_abort) then
call psb_error()
return
end if
return
contains
function safe_dn2(a,b)
real(kind(1.d0)), intent(in) :: a, b
real(kind(1.d0)) :: safe_dn2
real(kind(1.d0)) :: t
t = max(abs(a),abs(b))
if (t==0.d0) then
safe_dn2 = 0.d0
else
safe_dn2 = t * sqrt(abs(a/t)**2 + abs(b/t)**2)
endif
return
end function safe_dn2
End Subroutine psb_dgmresr

@ -0,0 +1,22 @@
include ../../Make.inc
MODULES = psb_realloc_mod.o psb_string_mod.o psb_spmat_type.o \
psb_desc_type.o \
psb_blacs_mod.o psb_serial_mod.o psb_tools_mod.o \
psb_prec_type.o psb_error_mod.o psb_prec_mod.o \
psb_methd_mod.o psb_const_mod.o \
psb_comm_mod.o psb_psblas_mod.o psi_mod.o
OBJS = error.o parts.o
INCDIRS = ../../lib
psb_realloc_mod.o : psb_error_mod.o
psb_spmat_type.o : psb_realloc_mod.o psb_const_mod.o
lib: $(MODULES) $(OBJS)
cp *$(.mod) ./psb_const.fh ../../lib
clean:
/bin/rm -f $(MODULES) $(OBJS) *$(.mod)

@ -0,0 +1,3 @@
1- psb_methd_mod: sistemare tutto
2- psb_prec_mod : sistemare tutto
3- psb_prec_type: sistemare tutto

@ -0,0 +1,130 @@
!
! Wrapper subroutines to provide error tools to F77 and C code
!
subroutine FCpsb_errcomm(icontxt, err)
use psb_error_mod
integer, intent(in) :: icontxt
integer, intent(inout):: err
call psb_errcomm(icontxt, err)
end subroutine FCpsb_errcomm
subroutine FCpsb_errpush(err_c, r_name, i_err)
use psb_error_mod
implicit none
integer, intent(in) :: err_c
character(len=20), intent(in) :: r_name
integer :: i_err(5)
call psb_errpush(err_c, r_name, i_err)
end subroutine FCpsb_errpush
subroutine FCpsb_serror()
use psb_error_mod
implicit none
call psb_error()
end subroutine FCpsb_serror
subroutine FCpsb_perror(icontxt)
use psb_error_mod
implicit none
integer, intent(in) :: icontxt
call psb_error(icontxt)
end subroutine FCpsb_perror
subroutine FCpsb_get_errstatus(s)
use psb_error_mod
implicit none
integer, intent(out) :: s
call psb_get_errstatus(s)
end subroutine FCpsb_get_errstatus
subroutine FCpsb_get_errverbosity(v)
use psb_error_mod
implicit none
integer, intent(out) :: v
call psb_get_errverbosity(v)
end subroutine FCpsb_get_errverbosity
subroutine FCpsb_set_errverbosity(v)
use psb_error_mod
implicit none
integer, intent(inout) :: v
call psb_set_errverbosity(v)
end subroutine FCpsb_set_errverbosity
subroutine FCpsb_erractionsave(err_act)
use psb_error_mod
implicit none
integer, intent(out) :: err_act
call psb_erractionsave(err_act)
end subroutine FCpsb_erractionsave
subroutine FCpsb_get_erraction(err_act)
use psb_error_mod
implicit none
integer, intent(out) :: err_act
call psb_get_erraction(err_act)
end subroutine FCpsb_get_erraction
subroutine FCpsb_erractionrestore(err_act)
use psb_error_mod
implicit none
integer, intent(in) :: err_act
call psb_erractionrestore(err_act)
end subroutine FCpsb_erractionrestore

@ -0,0 +1,8 @@
module psb_parts_mod
interface
subroutine psb_parts(glob_index,nrow,np,pv,nv)
integer, intent (in) :: glob_index,np,nrow
integer, intent (out) :: nv, pv(*)
end subroutine psb_parts
end interface
end module psb_parts_mod

File diff suppressed because it is too large Load Diff

@ -0,0 +1,110 @@
module psb_comm_mod
interface psb_ovrl
subroutine psb_dovrlm(x,desc_a,info,jx,ik,work,choice,update_type)
use psb_descriptor_type
real(kind(1.d0)), intent(inout) :: x(:,:)
type(psb_desc_type), intent(in) :: desc_a
integer, intent(out) :: info
real(kind(1.d0)), intent(inout), optional :: work(:)
logical, intent(in), optional :: choice
integer, intent(in), optional :: update_type,jx,ik
end subroutine psb_dovrlm
subroutine psb_dovrlv(x,desc_a,info,work,choice,update_type)
use psb_descriptor_type
real(kind(1.d0)), intent(inout) :: x(:)
type(psb_desc_type), intent(in) :: desc_a
integer, intent(out) :: info
real(kind(1.d0)), intent(inout), optional :: work(:)
logical, intent(in), optional :: choice
integer, intent(in), optional :: update_type
end subroutine psb_dovrlv
end interface
interface psb_halo
subroutine psb_dhalom(x,desc_a,info,alpha,jx,ik,work,tran,mode)
use psb_descriptor_type
real(kind(1.d0)), intent(inout) :: x(:,:)
type(psb_desc_type), intent(in) :: desc_a
integer, intent(out) :: info
real(kind(1.d0)), intent(in), optional :: alpha
real(kind(1.d0)), intent(inout), optional :: work(:)
integer, intent(in), optional :: mode,jx,ik
character, intent(in), optional :: tran
end subroutine psb_dhalom
subroutine psb_dhalov(x,desc_a,info,alpha,work,tran,mode)
use psb_descriptor_type
real(kind(1.d0)), intent(inout) :: x(:)
type(psb_desc_type), intent(in) :: desc_a
integer, intent(out) :: info
real(kind(1.d0)), intent(in), optional :: alpha
real(kind(1.d0)), intent(inout), optional :: work(:)
integer, intent(in), optional :: mode
character, intent(in), optional :: tran
end subroutine psb_dhalov
subroutine psb_ihalom(x,desc_a,info,alpha,jx,ik,work,tran,mode)
use psb_descriptor_type
integer, intent(inout) :: x(:,:)
type(psb_desc_type), intent(in) :: desc_a
integer, intent(out) :: info
real(kind(1.d0)), intent(in), optional :: alpha
integer, intent(inout), optional :: work(:)
integer, intent(in), optional :: mode,jx,ik
character, intent(in), optional :: tran
end subroutine psb_ihalom
subroutine psb_ihalov(x,desc_a,info,alpha,work,tran,mode)
use psb_descriptor_type
integer, intent(inout) :: x(:)
type(psb_desc_type), intent(in) :: desc_a
integer, intent(out) :: info
real(kind(1.d0)), intent(in), optional :: alpha
integer, intent(inout), optional :: work(:)
integer, intent(in), optional :: mode
character, intent(in), optional :: tran
end subroutine psb_ihalov
end interface
interface psb_dscatter
subroutine psb_dscatterm(globx, locx, desc_a, info, iroot,&
& iiglobx, ijglobx, iilocx,ijlocx,ik)
use psb_descriptor_type
real(kind(1.d0)), intent(out) :: locx(:,:)
real(kind(1.d0)), intent(in) :: globx(:,:)
type(psb_desc_type), intent(in) :: desc_a
integer, intent(out) :: info
integer, intent(in), optional :: iroot,iiglobx,&
& ijglobx,iilocx,ijlocx,ik
end subroutine psb_dscatterm
subroutine psb_dscatterv(globx, locx, desc_a, info, iroot)
use psb_descriptor_type
real(kind(1.d0)), intent(out) :: locx(:)
real(kind(1.d0)), intent(in) :: globx(:)
type(psb_desc_type), intent(in) :: desc_a
integer, intent(out) :: info
integer, intent(in), optional :: iroot
end subroutine psb_dscatterv
end interface
interface psb_dgather
subroutine psb_dgatherm(globx, locx, desc_a, info, iroot,&
& iiglobx, ijglobx, iilocx,ijlocx,ik)
use psb_descriptor_type
real(kind(1.d0)), intent(in) :: locx(:,:)
real(kind(1.d0)), intent(out) :: globx(:,:)
type(psb_desc_type), intent(in) :: desc_a
integer, intent(out) :: info
integer, intent(in), optional :: iroot, iiglobx, ijglobx, iilocx, ijlocx, ik
end subroutine psb_dgatherm
subroutine psb_dgatherv(globx, locx, desc_a, info, iroot,&
& iiglobx, iilocx)
use psb_descriptor_type
real(kind(1.d0)), intent(in) :: locx(:,:)
real(kind(1.d0)), intent(out) :: globx(:,:)
type(psb_desc_type), intent(in) :: desc_a
integer, intent(out) :: info
integer, intent(in), optional :: iroot, iiglobx, iilocx
end subroutine psb_dgatherv
end interface
end module psb_comm_mod

@ -0,0 +1,38 @@
integer, parameter :: psb_nohalo_=0, psb_halo_=4
integer, parameter :: psb_none_=0,psb_sum_=1
integer, parameter :: psb_avg_=2,psb_square_root_=3
integer, parameter :: psb_swap_send_=1,psb_swap_recv_=2
integer, parameter :: psb_swap_sync_=4,psb_swap_mpi_=8
integer, parameter :: psb_deadlock_check_=0
integer, parameter :: psb_local_mtrx_check_=1
integer, parameter :: psb_local_comm_check_=2
integer, parameter :: psb_consistency_check_=3
integer, parameter :: psb_global_check_=4
integer, parameter :: psb_order_communication_=5
integer, parameter :: psb_change_represent_=6
integer, parameter :: psb_loc_to_glob_check_=7
integer, parameter :: psb_convert_halo_=1
integer, parameter :: psb_convert_ovrlap_=2
integer, parameter :: psb_act_ret_=0
integer, parameter :: psb_act_abort_=1, no_err_=0
integer, parameter :: psb_dec_type_=1,psb_m_=2,psb_n_=3
integer, parameter :: psb_n_row_=4,psb_n_col_=5,psb_ctxt_=6
integer, parameter :: psb_loc_to_glob_=7
integer, parameter :: psb_mpi_c_=9,psb_mdata_size_=10
integer, parameter :: psb_desc_asb_=3099
integer, parameter :: psb_desc_bld_=psb_desc_asb_+1
integer, parameter :: psb_desc_upd_=psb_desc_bld_+1
integer, parameter :: psb_desc_upd_asb_=psb_desc_upd_+1
integer, parameter :: psb_upd_glb_=998,psb_upd_loc_=997
integer, parameter :: psb_proc_id_=0,psb_n_elem_recv_=1
integer, parameter :: psb_elem_recv_=2,psb_n_elem_send_=2
integer, parameter :: psb_elem_send_=3,psb_n_ovrlp_elem_=1
integer, parameter :: psb_ovrlp_elem_to_=2,psb_ovrlp_elem_=0
integer, parameter :: psb_nnz_=1, psb_n_dom_ovr_=1
integer, parameter :: psb_no_comm_=-1, psb_nzsizereq_=3
integer, parameter :: ione=1, done=1.d0,izero=0, dzero=0.d0
integer, parameter :: itwo=2, ithree=3,root=0, act_abort=1
integer, parameter :: psb_nztotreq_=1,psb_nzrowreq_=2
character, parameter :: psb_all_='A',psb_topdef_=' '

@ -0,0 +1,36 @@
module psb_const_mod
integer, parameter :: psb_nohalo_=0, psb_halo_=4
integer, parameter :: psb_none_=0, psb_sum_=1
integer, parameter :: psb_avg_=2, psb_square_root_=3
integer, parameter :: psb_swap_send_=1, psb_swap_recv_=2
integer, parameter :: psb_swap_sync_=4, psb_swap_mpi_=8
integer, parameter :: psb_deadlock_check_=0, psb_local_mtrx_check_=1
integer, parameter :: psb_local_comm_check_=2, psb_consistency_check_=3
integer, parameter :: psb_global_check_=4, psb_order_communication_=5
integer, parameter :: psb_change_represent_=6, psb_loc_to_glob_check_=7
integer, parameter :: psb_convert_halo_=1, psb_convert_ovrlap_=2
integer, parameter :: psb_act_ret_=0, psb_act_abort_=1, no_err_=0
integer, parameter :: psb_dec_type_=1, psb_m_=2,psb_n_=3
integer, parameter :: psb_n_row_=4, psb_n_col_=5,psb_ctxt_=6
integer, parameter :: psb_loc_to_glob_=7, psb_mpi_c_=9,psb_mdata_size_=10
integer, parameter :: psb_desc_asb_=3099, psb_desc_bld_=psb_desc_asb_+1
integer, parameter :: psb_desc_upd_=psb_desc_bld_+1, psb_desc_upd_asb_=psb_desc_upd_+1
integer, parameter :: psb_upd_glb_=998, psb_upd_loc_=997
integer, parameter :: psb_proc_id_=0, psb_n_elem_recv_=1
integer, parameter :: psb_elem_recv_=2, psb_n_elem_send_=2
integer, parameter :: psb_elem_send_=3, psb_n_ovrlp_elem_=1
integer, parameter :: psb_ovrlp_elem_to_=2, psb_ovrlp_elem_=0, psb_n_dom_ovr_=1
integer, parameter :: psb_nnz_=1
integer, parameter :: psb_no_comm_=-1
integer, parameter :: ione=1, done=1.d0, izero=0, dzero=0.d0
integer, parameter :: itwo=2, ithree=3, root=0
integer, parameter :: psb_nztotreq_=1, psb_nzrowreq_=2, psb_nzsizereq_=3
real(kind(1.d0)), parameter :: psb_colrow_=0.33
character, parameter :: psb_all_='A', psb_topdef_=' '
end module psb_const_mod

@ -0,0 +1,82 @@
!
! Module to define desc_a,
! structure for coomunications.
!
! Typedef: psb_desc_type
! Defines a communication descriptor
module psb_descriptor_type
use psb_const_mod
! desc_type contains data for communications.
type psb_desc_type
! contain decomposition informations
integer, pointer :: matrix_data(:)=>null()
! contain index of halo elements to send/receive
integer, pointer :: halo_index(:)=>null()
! contain indices of boundary elements
integer, pointer :: bnd_elem(:)=>null()
! contain index of overlap elements to send/receive
integer, pointer :: ovrlap_elem(:)=>null()
! contain for each local overlap element, the number of times
! that is duplicated
integer, pointer :: ovrlap_index(:)=>null()
! contain for each local element the corresponding global index
integer, pointer :: loc_to_glob(:)=>null()
! contain for each global element the corresponding local index,
! if exist.
integer, pointer :: glob_to_loc (:)=>null()
! local renumbering induced by sparse matrix storage.
integer, pointer :: lprm(:)=>null()
! index space in case it is not just the contiguous range 1:n
integer, pointer :: idx_space(:)=>null()
end type psb_desc_type
contains
subroutine psb_nullify_desc(desc)
type(psb_desc_type), intent(inout) :: desc
nullify(desc%matrix_data,desc%loc_to_glob,desc%glob_to_loc,&
&desc%halo_index,desc%bnd_elem,desc%ovrlap_elem,&
&desc%ovrlap_index, desc%lprm, desc%idx_space)
end subroutine psb_nullify_desc
logical function psb_is_ok_dec(dectype)
integer :: dectype
psb_is_ok_dec = ((dectype == desc_asb).or.(dectype == desc_bld).or.&
&(dectype == desc_upd).or.(dectype== desc_upd_asb))
end function psb_is_ok_dec
logical function psb_is_bld_dec(dectype)
integer :: dectype
psb_is_bld_dec = (dectype == desc_bld)
end function psb_is_bld_dec
logical function psb_is_upd_dec(dectype)
integer :: dectype
psb_is_upd_dec = (dectype == desc_upd)
end function psb_is_upd_dec
logical function psb_is_asb_upd_dec(dectype)
integer :: dectype
psb_is_asb_upd_dec = (dectype == desc_upd_asb)
end function psb_is_asb_upd_dec
logical function psb_is_asb_dec(dectype)
integer :: dectype
psb_is_asb_dec = (dectype == desc_asb)
end function psb_is_asb_dec
end module psb_descriptor_type

@ -0,0 +1,428 @@
module psb_error_mod
public psb_errpush, psb_error, psb_get_errstatus,&
& psb_get_errverbosity, psb_set_errverbosity,psb_errcomm, &
& psb_erractionsave, psb_erractionrestore, act_ret, act_abort, &
& no_err, psb_get_erraction, psb_set_erraction
interface psb_error
module procedure psb_serror
module procedure psb_perror
end interface
integer, parameter :: act_ret=0, act_abort=1, no_err=0
private
type psb_errstack_node
integer :: err_code=0 ! the error code
character(len=20) :: routine='' ! the name of the routine generating the error
integer,dimension(5) :: i_err_data=0 ! array of integer data to complete the error msg
! real(kind(1.d0))(dim=10) :: r_err_data=0.d0 ! array of real data to complete the error msg
! complex(dim=10) :: c_err_data=0.c0 ! array of complex data to complete the error msg
character(len=20) :: a_err_data='' ! array of character data to complete the error msg
type(psb_errstack_node), pointer :: next ! pointer to the next element in the stack
end type psb_errstack_node
type psb_errstack
type(psb_errstack_node), pointer :: top => null() ! pointer to the top element of the stack
integer :: n_elems=0 ! number of entries in the stack
end type psb_errstack
type(psb_errstack) :: error_stack ! the PSBLAS-2.0 error stack
integer :: error_status=0 ! the error status (maybe not here)
integer :: verbosity_level=1 ! the verbosity level (maybe not here)
integer :: err_action=1
contains
! saves action to support error traceback
! also changes error action to "return"
subroutine psb_erractionsave(err_act)
integer, intent(out) :: err_act
err_act=err_action
err_action=act_ret
end subroutine psb_erractionsave
! return the action to take upon error occurrence
subroutine psb_get_erraction(err_act)
integer, intent(out) :: err_act
err_act=err_action
end subroutine psb_get_erraction
! sets the action to take upon error occurrence
subroutine psb_set_erraction(err_act)
integer, intent(in) :: err_act
err_action=err_act
end subroutine psb_set_erraction
! restores error action previously saved with psb_erractionsave
subroutine psb_erractionrestore(err_act)
integer, intent(in) :: err_act
err_action=err_act
end subroutine psb_erractionrestore
! checks wether an error has occurred on one of the porecesses in the execution pool
subroutine psb_errcomm(icontxt, err)
integer, intent(in) :: icontxt
integer, intent(inout):: err
integer :: temp(2)
integer, parameter :: ione=1
call igamx2d(icontxt, 'A', ' ', ione, ione, err, ione,&
&temp ,temp,-ione ,-ione,-ione)
end subroutine psb_errcomm
! sets verbosity of the error message
subroutine psb_set_errverbosity(v)
integer, intent(in) :: v
verbosity_level=v
end subroutine psb_set_errverbosity
! returns verbosity of the error message
subroutine psb_get_errverbosity(v)
integer, intent(out) :: v
v=verbosity_level
end subroutine psb_get_errverbosity
! checks the status of the error condition
subroutine psb_get_errstatus(s)
integer, intent(out) :: s
s=error_status
end subroutine psb_get_errstatus
! pushes an error on the error stack
subroutine psb_errpush(err_c, r_name, i_err, a_err)
integer, intent(in) :: err_c
character(len=20), intent(in) :: r_name
character(len=20), optional :: a_err
integer, optional :: i_err(5)
type(psb_errstack_node), pointer :: new_node
allocate(new_node)
new_node%err_code = err_c
new_node%routine = r_name
if(present(i_err)) new_node%i_err_data = i_err
if(present(a_err)) new_node%a_err_data = a_err
new_node%next => error_stack%top
error_stack%top => new_node
error_stack%n_elems = error_stack%n_elems+1
if(error_status.eq.0) error_status=1
nullify(new_node)
end subroutine psb_errpush
! pops an error from the error stack
subroutine psb_errpop(err_c, r_name, i_e_d, a_e_d)
integer, intent(out) :: err_c
character(len=20), intent(out) :: r_name, a_e_d
integer, intent(out) :: i_e_d(5)
type(psb_errstack_node), pointer :: old_node
err_c = error_stack%top%err_code
r_name = error_stack%top%routine
i_e_d = error_stack%top%i_err_data
a_e_d = error_stack%top%a_err_data
old_node => error_stack%top
error_stack%top => old_node%next
error_stack%n_elems = error_stack%n_elems - 1
if(error_stack%n_elems.eq.0) error_status=0
deallocate(old_node)
end subroutine psb_errpop
! handles the occurence of an error in a parallel routine
subroutine psb_perror(icontxt)
integer, intent(in) :: icontxt
integer :: err_c
character(len=20) :: r_name, a_e_d
integer :: i_e_d(5)
integer :: nprow, npcol, me, mypcol, temp(2)
integer, parameter :: ione=1, izero=0
call blacs_gridinfo(icontxt, nprow, npcol, me, mypcol)
if(error_status.gt.0) then
if(verbosity_level.gt.1) then
do while (error_stack%n_elems.gt.izero)
write(0,'(50("="))')
call psb_errpop(err_c, r_name, i_e_d, a_e_d)
call psb_errmsg(err_c, r_name, i_e_d, a_e_d,me)
! write(0,'(50("="))')
end do
call blacs_abort(icontxt,-1)
else
call psb_errpop(err_c, r_name, i_e_d, a_e_d)
call psb_errmsg(err_c, r_name, i_e_d, a_e_d,me)
do while (error_stack%n_elems.gt.0)
call psb_errpop(err_c, r_name, i_e_d, a_e_d)
end do
call blacs_abort(icontxt,-1)
end if
end if
if(error_status.gt.izero) then
call blacs_abort(icontxt,err_c)
end if
end subroutine psb_perror
! handles the occurence of an error in a serial routine
subroutine psb_serror()
integer :: err_c
character(len=20) :: r_name, a_e_d
integer :: i_e_d(5)
integer :: nprow, npcol, me, mypcol, temp(2)
integer, parameter :: ione=1, izero=0
if(error_status.gt.0) then
if(verbosity_level.gt.1) then
do while (error_stack%n_elems.gt.izero)
write(0,'(50("="))')
call psb_errpop(err_c, r_name, i_e_d, a_e_d)
call psb_errmsg(err_c, r_name, i_e_d, a_e_d)
! write(0,'(50("="))')
end do
else
call psb_errpop(err_c, r_name, i_e_d, a_e_d)
call psb_errmsg(err_c, r_name, i_e_d, a_e_d)
do while (error_stack%n_elems.gt.0)
call psb_errpop(err_c, r_name, i_e_d, a_e_d)
end do
end if
end if
end subroutine psb_serror
! prints the error msg associated to a specific error code
subroutine psb_errmsg(err_c, r_name, i_e_d, a_e_d,me)
integer, intent(in) :: err_c
character(len=20), intent(in) :: r_name, a_e_d
integer, intent(in) :: i_e_d(5)
integer, optional :: me
if(present(me)) then
write(0,'("Process: ",i0,". PSBLAS Error (",i0,") in subroutine: ",a20)')me,err_c,r_name
else
write(0,'("PSBLAS Error (",i0,") in subroutine: ",a20)')err_c,r_name
end if
select case (err_c)
case(:0)
write (0,'("error on calling sperror. err_c must be greater than 0")')
case(2)
write (0,'("pivot too small")')
case(3)
write (0,'("Invalid number of ovr:",i0)')i_e_d(1)
case(5)
write (0,'("Invalid input")')
case(10)
write (0,'("input argument n. ",i0," cannot be less than 0")')i_e_d(1)
write (0,'("current value is ",i0)')i_e_d(2)
case(20)
write (0,'("input argument n. ",i0," cannot be greater than 0")')i_e_d(1)
write (0,'("current value is ",i0)')i_e_d(2)
case(30)
write (0,'("input argument n. ",i0," has an invalid value")')i_e_d(1)
write (0,'("current value is ",i0)')i_e_d(2)
case(35)
write (0,'("Size of input array argument n. ",i0," is invalid.")')i_e_d(1)
write (0,'("Current value is ",i0)')i_e_d(2)
case(40)
write (0,'("input argument n. ",i0," has an invalid value")')i_e_d(1)
write (0,'("current value is ",a)')a_e_d(2:2)
case(50)
write (0,'("input argument n. ",i0," must be equal or greater than input argument n. ",i0)') i_e_d(1), i_e_d(2)
write (0,'("current values are ",i0," < ",i0)') i_e_d(3),i_e_d(4)
case(60)
write (0,'("input argument n. ",i0," must be equal or greater than ",i0)')i_e_d(1),i_e_d(2)
write (0,'("current value is ",i0," < ",i0)')i_e_d(3), i_e_d(2)
case(70)
write (0,'("input argument n. ",i0," in entry # ",i0," has an invalid value")')i_e_d(1:2)
write (0,'("current value is ",a)')a_e_d
case(71)
write (0,'("Impossible error in ASB: nrow>ncol,")')
write (0,'("Actual values are ",i0," > ",i0)')i_e_d(1:2)
! ... csr format error ...
case(80)
write (0,'("input argument ia2(1) is less than 0")')
write (0,'("current value is ",i0)')i_e_d(1)
! ... csr format error ...
case(90)
write (0,'("indices in ia2 array are not in increasing order")')
case(91)
write (0,'("indices in ia1 array are not in increasing order")')
! ... csr format error ...
case(100)
write (0,'("indices in ia1 array are not within problem dimension")')
write (0,'("problem dimension is ",i0)')i_e_d(1)
case(110)
write (0,'("invalid combination of input arguments")')
case(115)
write (0,'("Invalid process identifier in input array argument n. ",i0,".")')i_e_d(1)
write (0,'("Current value is ",i0)')i_e_d(2)
case(120)
write (0,'("input argument n. ",i0," must be greater than input argument n. ",i0)')i_e_d(1:2)
write (0,'("current values are ",i0," < ",i0)') i_e_d(3:4)
! ... coo format error ...
case(130)
write (0,'("there are duplicated elements in coo format")')
write (0,'("please set repflag flag to 2 or 3")')
case(134)
write (0,'("Invalid input format ",a3)')a_e_d(1:3)
case(135)
write (0,'("Format ",a3," not yet supported here")')a_e_d(1:3)
case(136)
write (0,'("Format ",a3," is unknown")')a_e_d(1:3)
case(140)
write (0,'("indices in input array are not within problem dimension ",2(i0,2x))')i_e_d(1:2)
case(150)
write (0,'("indices in input array are not belonging to the calling process ",i0)')i_e_d(1)
case(290)
write (0,'("Is not possible to call this routine without calling before psdalloc on the same matrix")')
case(295)
write (0,'("Is not possible to call this routine without calling before psdspalloc on the same matrix")')
case(300)
write (0,'("Input argument n. ",i0," must be equal to entry n. ",i0," in array input argument n.",i0)') &
& i_e_d(1),i_e_d(4),i_e_d(3)
write (0,'("Current values are ",i0," != ",i0)')i_e_d(2), i_e_d(5)
case(400)
write (0,'("MPI error:",i0)')i_e_d(1)
case(550)
write (0,'("Parameter n. ",i0," must be equal on all BLACS processes. ",i0)')i_e_d(1)
case(570)
write (0,'("partition function passed as input argument n. ",i0," returns number of processes")')i_e_d(1)
write (0,'("greater than No of grid s processes on global point ",i0,". Actual number of grid s ")')i_e_d(4)
write (0,'("processes is ",i0,", number returned is ",i0)')i_e_d(2),i_e_d(3)
case(575)
write (0,'("partition function passed as input argument n. ",i0," returns number of processes")')i_e_d(1)
write (0,'("less or equal to 0 on global point ",i0,". Number returned is ",i0)')i_e_d(3),i_e_d(2)
case(580)
write (0,'("partition function passed as input argument n. ",i0," returns wrong processes identifier")')i_e_d(1)
write (0,'("on global point ",i0,". Current value returned is : ",i0)')i_e_d(3),i_e_d(2)
case(600)
write (0,'("Sparse Matrix and decsriptors are in an invalid state for this subroutine call: ",i0)')i_e_d(1)
case (1122)
write (0,'("Invalid state for DESC_A")')
case (1123)
write (0,'("Invalid combined state for A and DESC_A")')
case(1124:1999)
write (0,'("computational error. code: ",i0)')err_c
case(2010)
write (0,'("BLACS error. Number of processes=-1")')
case(2025)
write (0,'("Cannot allocate ",i0," bytes")')i_e_d(1)
case(2030)
write (0,'("BLACS ERROR: Number of grid columns must be equal to 1\nCurrent value is ",i4," != 1.")')i_e_d(1)
case(2231)
write (0,'("Invalid input state for matrix.")')
case(2232)
write (0,'("Input state for matrix is not adequate for regeneration.")')
case (2233:2999)
write(0,'("resource error. code: ",i0)')err_c
case(3000:3009)
write (0,'("sparse matrix representation ",a3," not yet implemented")')a_e_d(1:3)
case(3010)
write (0,'("Case lld not equal matrix_data[N_COL_] is not yet implemented.")')
case(3015)
write (0,'("transpose option for sparse matrix representation ",a3," not implemented")')a_e_d(1:3)
case(3020)
write (0,'("Case trans = C is not yet implemented.")')
case(3021)
write (0,'("Case trans /= N is not yet implemented.")')
case(3022)
write (0,'("Only unit diagonal so far for triangular matrices. ")')
case(3023)
write (0,'("Cases DESCRA(1:1)=S DESCRA(1:1)=T not yet implemented. ")')
case(3024)
write (0,'("Cases DESCRA(1:1)=G not yet implemented. ")')
case(3030)
write (0,'("Case ja/=ix or ia/=iy is not yet implemented.")')
case(3040)
write (0,'("Case ix /= 1 or iy /= 1 is not yet implemented.")')
case(3050)
write (0,'("Case ix /= iy is not yet implemented.")')
case(3060)
write (0,'("Case ix /= 1 is not yet implemented.")')
case(3070)
write (0,'("This operation is only implemented with no overlap.")')
case(3080)
write (0,'("Decompostion type ",i0," not yet supported.")')i_e_d(1)
case(3090)
write (0,'("Insert matrix mode not yet implemented.")')
case(3100)
write (0,'("Error on index. Element has not been inserted")')
write (0,'("local index is: ",i0," and global index is:",i0)')i_e_d(1:2)
case(3110)
write (0,'("Before you call this routine, you must assembly sparse matrix")')
case(3111:3999)
write(0,'("miscellaneus error. code: ",i0)')err_c
case(4000)
write(0,'("Allocation/deallocation error")')
case(4010)
write (0,'("Error from call to subroutine ",a)')a_e_d
case(4011)
write (0,'("Error from call to a subroutine ")')
case(4012)
write (0,'("Error ",i0," from call to a subroutine ")')i_e_d(1)
case (5001)
write (0,'("Invalid ISTOP: ",i0)')i_e_d(1)
case (5002)
write (0,'("Invalid PREC: ",i0)')i_e_d(1)
case (5003)
write (0,'("Invalid PREC: ",a3)')a_e_d(1:3)
case default
write(0,'("unknown error (",i0,") in subroutine ",a)')err_c,r_name
write(0,'(5(i0,2x))') i_e_d
write(0,'(a)') a_e_d
end select
end subroutine psb_errmsg
end module psb_error_mod

@ -0,0 +1,123 @@
Module psb_methd_mod
interface psb_cg
subroutine psb_dcg(a,prec,b,x,eps,&
& desc_a,info,itmax,iter,err,itrace,istop)
use psb_serial_mod
use psb_descriptor_type
use psb_prec_type
type(psb_dspmat_type), intent(in) :: a
type(psb_desc_type), intent(in) :: desc_a
real(kind(1.d0)), intent(in) :: b(:)
real(kind(1.d0)), intent(inout) :: x(:)
real(kind(1.d0)), intent(in) :: eps
type(psb_dprec_type), intent(in) :: prec
integer, intent(out) :: info
integer, optional, intent(in) :: itmax, itrace,istop
integer, optional, intent(out) :: iter
real(kind(1.d0)), optional, intent(out) :: err
end subroutine psb_dcg
end interface
interface spb_bicg
subroutine psb_dbicg(a,prec,b,x,eps,&
& desc_a,info,itmax,iter,err,itrace,istop)
use psb_serial_mod
use psb_descriptor_type
use psb_prec_type
type(psb_dspmat_type), intent(in) :: a
type(psb_desc_type), intent(in) :: desc_a
real(kind(1.d0)), intent(in) :: b(:)
real(kind(1.d0)), intent(inout) :: x(:)
real(kind(1.d0)), intent(in) :: eps
type(psb_dprec_type), intent(in) :: prec
integer, intent(out) :: info
integer, optional, intent(in) :: itmax, itrace,istop
integer, optional, intent(out) :: iter
real(kind(1.d0)), optional, intent(out) :: err
end subroutine psb_dbicg
end interface
interface ppsb_bicgstab
subroutine psb_dcgstab(a,prec,b,x,eps,&
& desc_a,info,itmax,iter,err,itrace,istop)
use psb_serial_mod
use psb_descriptor_type
use psb_prec_type
type(psb_dspmat_type), intent(in) :: a
type(psb_desc_type), intent(in) :: desc_a
real(kind(1.d0)), intent(in) :: b(:)
real(kind(1.d0)), intent(inout) :: x(:)
real(kind(1.d0)), intent(in) :: eps
type(psb_dprec_type), intent(in) :: prec
integer, intent(out) :: info
integer, optional, intent(in) :: itmax, itrace,istop
integer, optional, intent(out) :: iter
real(kind(1.d0)), optional, intent(out) :: err
end subroutine psb_dcgstab
end interface
interface psb_bicgstabl
Subroutine psb_dcgstabl(a,prec,b,x,eps,desc_a,info,&
&itmax,iter,err, itrace,irst,istop)
use psb_serial_mod
use psb_descriptor_type
Use psb_prec_type
!!$ parameters
Type(psb_dspmat_type), Intent(in) :: a
Type(psb_desc_type), Intent(in) :: desc_a
type(psb_dprec_type), intent(in) :: prec
Real(Kind(1.d0)), Intent(in) :: b(:)
Real(Kind(1.d0)), Intent(inout) :: x(:)
Real(Kind(1.d0)), Intent(in) :: eps
integer, intent(out) :: info
Integer, Optional, Intent(in) :: itmax, itrace, irst,istop
Integer, Optional, Intent(out) :: iter
Real(Kind(1.d0)), Optional, Intent(out) :: err
end subroutine psb_dcgstabl
end interface
interface psb_rgmres
Subroutine psb_dgmresr(a,prec,b,x,eps,desc_a,info,&
&itmax,iter,err,itrace,irst,istop)
use psb_serial_mod
use psb_descriptor_type
Use psb_prec_type
!!$ parameters
Type(psb_dspmat_type), Intent(in) :: a
Type(psb_desc_type), Intent(in) :: desc_a
type(psb_dprec_type), intent(in) :: prec
Real(Kind(1.d0)), Intent(in) :: b(:)
Real(Kind(1.d0)), Intent(inout) :: x(:)
Real(Kind(1.d0)), Intent(in) :: eps
integer, intent(out) :: info
Integer, Optional, Intent(in) :: itmax, itrace, irst,istop
Integer, Optional, Intent(out) :: iter
Real(Kind(1.d0)), Optional, Intent(out) :: err
end subroutine psb_dgmresr
end interface
interface psb_cgs
subroutine psb_dcgs(a,prec,b,x,eps,desc_a,info,&
&itmax,iter,err,itrace,istop)
use psb_serial_mod
use psb_descriptor_type
use psb_prec_type
!!$ parameters
type(psb_dspmat_type), intent(in) :: a
type(psb_desc_type), intent(in) :: desc_a
type(psb_dprec_type), intent(in) :: prec
real(kind(1.d0)), intent(in) :: b(:)
real(kind(1.d0)), intent(inout) :: x(:)
real(kind(1.d0)), intent(in) :: eps
integer, intent(out) :: info
integer, optional, intent(in) :: itmax, itrace,istop
integer, optional, intent(out) :: iter
real(kind(1.d0)), optional, intent(out) :: err
end subroutine psb_dcgs
end interface
end module psb_methd_mod

@ -0,0 +1,124 @@
module psb_prec_mod
use psb_prec_type
interface psb_bldaggrmat
subroutine psb_dbldaggrmat(a,desc_a,p,info)
use psb_prec_type
use psb_descriptor_type
use psb_spmat_type
type(psb_dspmat_type), intent(in), target :: a
type(psb_dbase_prec), intent(inout) :: p
type(psb_desc_type), intent(in) :: desc_a
integer, intent(out) :: info
end subroutine psb_dbldaggrmat
end interface
interface psb_genaggrmap
subroutine psb_dgenaggrmap(aggr_type,a,desc_a,nlaggr,ilaggr,info)
use psb_spmat_type
use psb_descriptor_type
implicit none
integer, intent(in) :: aggr_type
type(psb_dspmat_type), intent(in) :: a
type(psb_desc_type), intent(in) :: desc_a
integer, pointer :: ilaggr(:),nlaggr(:)
integer, intent(out) :: info
end subroutine psb_dgenaggrmap
end interface
interface psb_precbld
subroutine psb_dprecbld(a,prec,desc_a,ierr,upd)
use psb_descriptor_type
use psb_prec_type
implicit none
integer, intent(out) :: ierr
type(psb_dspmat_type), intent(in), target :: a
type(psb_dprec_type), intent(inout) :: prec
type(psb_desc_type), intent(in) :: desc_a
character, intent(in),optional :: upd
end subroutine psb_dprecbld
end interface
interface psb_precset
subroutine psb_dprecset(prec,ptype,iv,rs,rv,ierr)
use psb_serial_mod
use psb_descriptor_type
use psb_prec_type
implicit none
type(psb_dprec_type), intent(inout) :: prec
character(len=10), intent(in) :: ptype
integer, optional, intent(in) :: iv(:)
real(kind(1.d0)), optional, intent(in) :: rs
real(kind(1.d0)), optional, intent(in) :: rv(:)
integer, optional, intent(out) :: ierr
end subroutine psb_dprecset
end interface
interface psb_precfree
subroutine psb_dprecfree(p,info)
use psb_descriptor_type
use psb_serial_mod
use psb_const_mod
use psb_prec_type
type(psb_dprec_type), intent(inout) :: p
integer, intent(out) :: info
end subroutine psb_dprecfree
end interface
interface psb_cslu
subroutine psb_dcslu(a,desc_data,p,upd,info)
use psb_serial_mod
use psb_descriptor_type
use psb_prec_type
integer, intent(out) :: info
type(psb_dspmat_type), intent(in), target :: a
type(psb_desc_type),intent(in) :: desc_data
type(psb_dbase_prec), intent(inout) :: p
character, intent(in) :: upd
end subroutine psb_dcslu
end interface
interface psb_csrsetup
Subroutine psb_dcsrsetup(ptype,novr,a,blk,desc_data,upd,desc_p,info,outfmt)
use psb_serial_mod
Use psb_descriptor_type
Use psb_prec_type
integer, intent(in) :: ptype,novr
Type(psb_dspmat_type), Intent(in) :: a
Type(psb_dspmat_type), Intent(inout) :: blk
Type(psb_desc_type), Intent(inout) :: desc_p
Type(psb_desc_type), Intent(in) :: desc_data
Character, Intent(in) :: upd
integer, intent(out) :: info
character(len=5), optional :: outfmt
end Subroutine psb_dcsrsetup
end interface
interface psb_prcaply
subroutine psb_dprecaply(prec,x,y,desc_data,info,trans,work)
use psb_serial_mod
use psb_descriptor_type
use psb_prec_type
type(psb_desc_type),intent(in) :: desc_data
type(psb_dprec_type), intent(in) :: prec
real(kind(0.d0)),intent(inout) :: x(:), y(:)
integer, intent(out) :: info
character(len=1), optional :: trans
real(kind(0.d0)),intent(inout), optional, target :: work(:)
end subroutine psb_dprecaply
subroutine psb_dprecaply1(prec,x,desc_data,info,trans)
use psb_serial_mod
use psb_descriptor_type
use psb_prec_type
type(psb_desc_type),intent(in) :: desc_data
type(psb_dprec_type), intent(in) :: prec
real(kind(0.d0)),intent(inout) :: x(:)
integer, intent(out) :: info
character(len=1), optional :: trans
end subroutine psb_dprecaply1
end interface
end module psb_prec_mod

@ -0,0 +1,373 @@
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!! Module to define PREC_DATA, !!
!! structure for preconditioning. !!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
module psb_prec_type
use psb_spmat_type
use psb_descriptor_type
integer, parameter :: min_prec_=0, noprec_=0, diagsc_=1, bja_=2,&
& asm_=3, ras_=5, ash_=4, rash_=6, ras2lv_=7, ras2lvm_=8,&
& lv2mras_=9, lv2smth_=10, lv2lsm_=11, sl2sm_=12, superlu_=13,&
& new_loc_smth_=14, new_glb_smth_=15, max_prec_=15
! Multilevel stuff.
integer, parameter :: no_ml_=0, add_ml_prec_=1, mult_ml_prec_=2
integer, parameter :: new_ml_prec_=3, max_ml_=new_ml_prec_
integer, parameter :: pre_smooth_=1, post_smooth_=2, smooth_both_=3,&
& max_smooth_=smooth_both_
integer, parameter :: loc_aggr_=0, glb_aggr_=1, new_loc_aggr_=2
integer, parameter :: new_glb_aggr_=3, max_aggr_=new_glb_aggr_
integer, parameter :: no_smth_=0, smth_omg_=1, smth_biz_=2
integer, parameter :: lib_choice_=0, user_choice_=1
integer, parameter :: mat_distr_=0, mat_repl_=1
! Entries in iprcparm: preconditioner type, factorization type,
! prolongation type, restriction type, renumbering algorithm,
! number of overlap layers, pointer to SuperLU factors,
! levels of fill in for ILU(N),
integer, parameter :: p_type_=1, f_type_=2, restr_=3, prol_=4
integer, parameter :: iren_=5, n_ovr_=6, slu_ptr_=7
integer, parameter :: ilu_fill_in_=8, jac_sweeps_=9, ml_type_=10
integer, parameter :: smth_pos_=11, aggr_alg_=12, smth_kind_=13
integer, parameter :: om_choice_=14, glb_smth_=15, coarse_mat_=16
integer, parameter :: ifpsz=20
! Entries in dprcparm: ILU(E) epsilon, smoother omega
integer, parameter :: fact_eps_=1, smooth_omega_=2
integer, parameter :: dfpsz=4
! Factorization types: none, ILU(N), ILU(E), SuperLU
integer, parameter :: f_none_=0,f_ilu_n_=1, f_ilu_e_=2,f_slu_=3
! Fields for sparse matrices ensembles:
integer, parameter :: l_pr_=1, u_pr_=2, bp_ilu_avsz=2
integer, parameter :: ap_nd_=3, ac_=4, sm_pr_t_=5, sm_pr_=6
integer, parameter :: smth_avsz=6
type psb_dbase_prec
type(psb_dspmat_type), pointer :: av(:) => null()
real(kind(1.d0)), pointer :: d(:) => null()
type(psb_desc_type), pointer :: desc_data => null()
integer, pointer :: iprcparm(:) => null()
real(kind(1.d0)), pointer :: dprcparm(:) => null()
integer, pointer :: perm(:) => null(), invperm(:) => null()
integer, pointer :: mlia(:) => null(), nlaggr(:) => null()
type(psb_dspmat_type), pointer :: aorig => null()
real(kind(1.d0)), pointer :: dorig(:) => null()
end type psb_dbase_prec
type psb_dprec_type
type(psb_dbase_prec), pointer :: baseprecv(:) => null()
! contain type of preconditioning to be performed
integer :: prec, base_prec
end type psb_dprec_type
character(len=15), parameter, private :: &
& smooth_names(1:2)=(/'Pre-smoothing ','Post-smoothing'/)
character(len=15), parameter, private :: &
& smooth_kinds(0:2)=(/'No smoother ','Omega smoother',&
& 'Bizr. smoother'/)
character(len=15), parameter, private :: &
& matrix_names(0:1)=(/'Distributed ','Replicated '/)
character(len=18), parameter, private :: &
& aggr_names(0:3)=(/'Local aggregation ','Global aggregation',&
& 'New local aggr. ','New global aggr. '/)
character(len=6), parameter, private :: &
& restrict_names(0:4)=(/'None ',' ',' ',' ','Halo '/)
character(len=12), parameter, private :: &
& prolong_names(0:3)=(/'None ','Sum ','Average ','Square root'/)
character(len=15), parameter, private :: &
& ml_names(0:3)=(/'None ','Additive ','Multiplicative',&
& 'New ML '/)
character(len=15), parameter, private :: &
& fact_names(0:3)=(/'None ','ILU(n) ',&
& 'ILU(eps) ','Sparse LU '/)
interface psb_base_precfree
module procedure psb_dbase_precfree
end interface
interface psb_check_def
module procedure psb_icheck_def, psb_dcheck_def
end interface
interface psb_prec_descr
module procedure psb_file_prec_descr
end interface
contains
subroutine psb_file_prec_descr(iout,p)
integer, intent(in) :: iout
type(psb_dprec_type), intent(in) :: p
write(iout,*) 'Preconditioner description'
if (associated(p%baseprecv)) then
if (size(p%baseprecv)>=1) then
write(iout,*) 'Base preconditioner'
select case(p%baseprecv(1)%iprcparm(p_type_))
case(noprec_)
write(iout,*) 'No preconditioning'
case(diagsc_)
write(iout,*) 'Diagonal scaling'
case(bja_)
write(iout,*) 'Block Jacobi with: ',&
& fact_names(p%baseprecv(1)%iprcparm(f_type_))
case(asm_,ras_,ash_,rash_)
write(iout,*) 'Additive Schwarz with: ',&
& fact_names(p%baseprecv(1)%iprcparm(f_type_))
write(iout,*) 'Overlap:',&
& p%baseprecv(1)%iprcparm(n_ovr_)
write(iout,*) 'Restriction: ',&
& restrict_names(p%baseprecv(1)%iprcparm(restr_))
write(iout,*) 'Prolongation: ',&
& prolong_names(p%baseprecv(1)%iprcparm(prol_))
end select
end if
if (size(p%baseprecv)>=2) then
if (.not.associated(p%baseprecv(2)%iprcparm)) then
write(iout,*) 'Inconsistent MLPREC part!'
return
endif
write(iout,*) 'Multilevel: ',ml_names(p%baseprecv(2)%iprcparm(ml_type_))
if (p%baseprecv(2)%iprcparm(ml_type_)>no_ml_) then
write(iout,*) 'Multilevel aggregation: ', &
& aggr_names(p%baseprecv(2)%iprcparm(aggr_alg_))
write(iout,*) 'Smoother: ', &
& smooth_kinds(p%baseprecv(2)%iprcparm(smth_kind_))
write(iout,*) 'Smoothing omega: ', p%baseprecv(2)%dprcparm(smooth_omega_)
write(iout,*) 'Smoothing position: ',&
& smooth_names(p%baseprecv(2)%iprcparm(smth_pos_))
write(iout,*) 'Coarse matrix: ',&
& matrix_names(p%baseprecv(2)%iprcparm(coarse_mat_))
write(iout,*) 'Factorization type: ',&
& fact_names(p%baseprecv(2)%iprcparm(f_type_))
select case(p%baseprecv(2)%iprcparm(f_type_))
case(f_ilu_n_)
write(iout,*) 'Fill level :',p%baseprecv(2)%iprcparm(ilu_fill_in_)
case(f_ilu_e_)
write(iout,*) 'Fill threshold :',p%baseprecv(2)%dprcparm(fact_eps_)
case(f_slu_)
case default
write(iout,*) 'Should never get here!'
end select
write(iout,*) 'Number of Jacobi sweeps: ', &
& (p%baseprecv(2)%iprcparm(jac_sweeps_))
end if
end if
else
write(iout,*) 'No Base preconditioner available, something is wrong!'
return
endif
end subroutine psb_file_prec_descr
function is_legal_base_prec(ip)
integer, intent(in) :: ip
logical :: is_legal_base_prec
is_legal_base_prec = ((ip>=noprec_).and.(ip<=rash_))
return
end function is_legal_base_prec
function is_legal_n_ovr(ip)
integer, intent(in) :: ip
logical :: is_legal_n_ovr
is_legal_n_ovr = (ip >=0)
return
end function is_legal_n_ovr
function is_legal_jac_sweeps(ip)
integer, intent(in) :: ip
logical :: is_legal_jac_sweeps
is_legal_jac_sweeps = (ip >= 1)
return
end function is_legal_jac_sweeps
function is_legal_prolong(ip)
integer, intent(in) :: ip
logical :: is_legal_prolong
is_legal_prolong = ((ip>=none_).and.(ip<=square_root_))
return
end function is_legal_prolong
function is_legal_restrict(ip)
integer, intent(in) :: ip
logical :: is_legal_restrict
is_legal_restrict = ((ip==nohalo_).or.(ip==halo_))
return
end function is_legal_restrict
function is_legal_ml_type(ip)
integer, intent(in) :: ip
logical :: is_legal_ml_type
is_legal_ml_type = ((ip>=no_ml_).and.(ip<=max_ml_))
return
end function is_legal_ml_type
function is_legal_ml_aggr_kind(ip)
integer, intent(in) :: ip
logical :: is_legal_ml_aggr_kind
is_legal_ml_aggr_kind = ((ip>=loc_aggr_).and.(ip<=max_aggr_))
return
end function is_legal_ml_aggr_kind
function is_legal_ml_smooth_pos(ip)
integer, intent(in) :: ip
logical :: is_legal_ml_smooth_pos
is_legal_ml_smooth_pos = ((ip>=pre_smooth_).and.(ip<=max_smooth_))
return
end function is_legal_ml_smooth_pos
function is_legal_ml_smth_kind(ip)
integer, intent(in) :: ip
logical :: is_legal_ml_smth_kind
is_legal_ml_smth_kind = ((ip>=no_smth_).and.(ip<=smth_biz_))
return
end function is_legal_ml_smth_kind
function is_legal_ml_coarse_mat(ip)
integer, intent(in) :: ip
logical :: is_legal_ml_coarse_mat
is_legal_ml_coarse_mat = ((ip>=mat_distr_).and.(ip<=mat_repl_))
return
end function is_legal_ml_coarse_mat
function is_legal_ml_fact(ip)
integer, intent(in) :: ip
logical :: is_legal_ml_fact
is_legal_ml_fact = ((ip>=f_ilu_n_).and.(ip<=f_slu_))
return
end function is_legal_ml_fact
function is_legal_ml_lev(ip)
integer, intent(in) :: ip
logical :: is_legal_ml_lev
is_legal_ml_lev = (ip>=0)
return
end function is_legal_ml_lev
function is_legal_omega(ip)
real(kind(1.d0)), intent(in) :: ip
logical :: is_legal_omega
is_legal_omega = ((ip>=0.0d0).and.(ip<=2.0d0))
return
end function is_legal_omega
function is_legal_ml_eps(ip)
real(kind(1.d0)), intent(in) :: ip
logical :: is_legal_ml_eps
is_legal_ml_eps = (ip>=0.0d0)
return
end function is_legal_ml_eps
subroutine psb_icheck_def(ip,name,id,is_legal)
integer, intent(inout) :: ip
integer, intent(in) :: id
character(len=*), intent(in) :: name
interface
function is_legal(i)
integer, intent(in) :: i
logical :: is_legal
end function is_legal
end interface
if (.not.is_legal(ip)) then
write(0,*) 'Illegal value for ',name,' :',ip, '. defaulting to ',id
ip = id
end if
end subroutine psb_icheck_def
subroutine psb_dcheck_def(ip,name,id,is_legal)
real(kind(1.d0)), intent(inout) :: ip
real(kind(1.d0)), intent(in) :: id
character(len=*), intent(in) :: name
interface
function is_legal(i)
real(kind(1.d0)), intent(in) :: i
logical :: is_legal
end function is_legal
end interface
if (.not.is_legal(ip)) then
write(0,*) 'Illegal value for ',name,' :',ip, '. defaulting to ',id
ip = id
end if
end subroutine psb_dcheck_def
subroutine psb_dbase_precfree(p,info)
use psb_serial_mod
use psb_descriptor_type
use psb_tools_mod
type(psb_dbase_prec), intent(inout) :: p
integer, intent(out) :: info
integer :: i
info = 0
if (associated(p%d)) then
deallocate(p%d,stat=info)
end if
if (associated(p%av)) then
do i=1,size(p%av)
call psb_spfree(p%av(i),info)
if (info /= 0) then
! Actually, we don't care here about this.
! Just let it go.
! return
end if
enddo
deallocate(p%av,stat=info)
p%av => null()
end if
if (associated(p%desc_data)) then
if (associated(p%desc_data%matrix_data)) then
call psb_dscfree(p%desc_data,info)
end if
deallocate(p%desc_data)
endif
if (associated(p%dprcparm)) then
deallocate(p%dprcparm,stat=info)
end if
if (associated(p%aorig)) then
! This is a pointer to something else, must not free it here.
nullify(p%aorig)
endif
if (associated(p%dorig)) then
deallocate(p%dorig,stat=info)
nullify(p%dorig)
endif
if (associated(p%mlia)) then
deallocate(p%mlia,stat=info)
endif
if (associated(p%nlaggr)) then
deallocate(p%nlaggr,stat=info)
endif
if (associated(p%iprcparm)) then
if (p%iprcparm(f_type_)==f_slu_) then
call fort_slu_free(p%iprcparm(slu_ptr_),info)
end if
deallocate(p%iprcparm,stat=info)
end if
call psb_nullify_baseprec(p)
end subroutine psb_dbase_precfree
subroutine psb_nullify_baseprec(p)
use psb_descriptor_type
type(psb_dbase_prec), intent(inout) :: p
nullify(p%av,p%d,p%iprcparm,p%dprcparm,p%perm,p%invperm,p%mlia,&
& p%nlaggr,p%aorig,p%dorig,p%desc_data)
end subroutine psb_nullify_baseprec
end module psb_prec_type

@ -0,0 +1,238 @@
module psb_psblas_mod
use psb_comm_mod
interface psb_dot
function psb_ddotv(x, y, desc_a,info)
use psb_descriptor_type
real(kind(1.d0)) :: psb_ddotv
real(kind(1.d0)), intent(in) :: x(:), y(:)
type(psb_desc_type), intent(in) :: desc_a
integer, intent(out) :: info
end function psb_ddotv
function psb_ddot(x, y, desc_a, info, jx, jy)
use psb_descriptor_type
real(kind(1.d0)) :: psb_ddot
real(kind(1.d0)), intent(in) :: x(:,:), y(:,:)
type(psb_desc_type), intent(in) :: desc_a
integer, optional, intent(in) :: jx, jy
integer, intent(out) :: info
end function psb_ddot
end interface
interface psb_dots
subroutine psb_ddotvs(res,x, y, desc_a, info)
use psb_descriptor_type
real(kind(1.d0)), intent(out) :: res
real(kind(1.d0)), intent(in) :: x(:), y(:)
type(psb_desc_type), intent(in) :: desc_a
integer, intent(out) :: info
end subroutine psb_ddotvs
subroutine psb_dmdots(res,x, y, desc_a,info)
use psb_descriptor_type
real(kind(1.d0)), intent(out) :: res(:)
real(kind(1.d0)), intent(in) :: x(:,:), y(:,:)
type(psb_desc_type), intent(in) :: desc_a
integer, intent(out) :: info
end subroutine psb_dmdots
end interface
interface psb_axpby
subroutine psb_daxpbyv(alpha, x, beta, y,&
& desc_a, info)
use psb_descriptor_type
real(kind(1.d0)), intent (in) :: x(:)
real(kind(1.d0)), intent (inout) :: y(:)
real(kind(1.d0)), intent (in) :: alpha, beta
type(psb_desc_type), intent (in) :: desc_a
integer, intent(out) :: info
end subroutine psb_daxpbyv
subroutine psb_daxpby(alpha, x, beta, y,&
& desc_a, info, n, jx, jy)
use psb_descriptor_type
real(kind(1.d0)), intent (in) :: x(:,:)
real(kind(1.d0)), intent (inout) :: y(:,:)
real(kind(1.d0)), intent (in) :: alpha, beta
type(psb_desc_type), intent (in) :: desc_a
integer, optional :: n, jx, jy
integer, intent(out) :: info
end subroutine psb_daxpby
end interface
interface psb_amax
function psb_damax(x, desc_a, info, jx)
use psb_descriptor_type
real(kind(1.d0)) psb_damax
real(kind(1.d0)), intent (in) :: x(:,:)
type(psb_desc_type), intent (in) :: desc_a
integer, optional, intent (in) :: jx
integer, intent(out) :: info
end function psb_damax
function psb_damaxv(x, desc_a,info)
use psb_descriptor_type
real(kind(1.d0)) psb_damaxv
real(kind(1.d0)), intent (in) :: x(:)
type(psb_desc_type), intent (in) :: desc_a
integer, intent(out) :: info
end function psb_damaxv
end interface
interface psb_amaxs
subroutine psb_damaxvs(res,x,desc_a,info)
use psb_descriptor_type
real(kind(1.d0)), intent (out) :: res
real(kind(1.d0)), intent (in) :: x(:)
type(psb_desc_type), intent (in) :: desc_a
integer, intent(out) :: info
end subroutine psb_damaxvs
subroutine psb_dmamax(res,x,desc_a,info)
use psb_descriptor_type
real(kind(1.d0)), intent (out) :: res(:)
real(kind(1.d0)), intent (in) :: x(:,:)
type(psb_desc_type), intent (in) :: desc_a
integer, intent(out) :: info
end subroutine psb_dmamax
end interface
interface psb_asum
function psb_dasum(x, desc_a, info, jx)
use psb_descriptor_type
real(kind(1.d0)) psb_dasum
real(kind(1.d0)), intent (in) :: x(:,:)
type(psb_desc_type), intent (in) :: desc_a
integer, optional, intent (in) :: jx
integer, intent(out) :: info
end function psb_dasum
function psb_dasumv(x, desc_a, info)
use psb_descriptor_type
real(kind(1.d0)) psb_dasumv
real(kind(1.d0)), intent (in) :: x(:)
type(psb_desc_type), intent (in) :: desc_a
integer, intent(out) :: info
end function psb_dasumv
end interface
interface psb_asums
subroutine psb_dasumvs(res,x,desc_a,info)
use psb_descriptor_type
real(kind(1.d0)), intent (out) :: res
real(kind(1.d0)), intent (in) :: x(:)
type(psb_desc_type), intent (in) :: desc_a
integer, intent(out) :: info
end subroutine psb_dasumvs
subroutine psb_dmasum(res,x,desc_a,info)
use psb_descriptor_type
real(kind(1.d0)), intent (out) :: res(:)
real(kind(1.d0)), intent (in) :: x(:,:)
type(psb_desc_type), intent (in) :: desc_a
integer, intent(out) :: info
end subroutine psb_dmasum
end interface
interface psb_nrm2
function psb_dnrm2(x, desc_a, info, jx)
use psb_descriptor_type
real(kind(1.d0)) psb_dnrm2
real(kind(1.d0)), intent (in) :: x(:,:)
type(psb_desc_type), intent (in) :: desc_a
integer, optional, intent (in) :: jx
integer, intent(out) :: info
end function psb_dnrm2
function psb_dnrm2v(x, desc_a, info)
use psb_descriptor_type
real(kind(1.d0)) psb_dnrm2v
real(kind(1.d0)), intent (in) :: x(:)
type(psb_desc_type), intent (in) :: desc_a
integer, intent(out) :: info
end function psb_dnrm2v
end interface
interface psb_nrm2s
subroutine psb_dnrm2vs(res,x,desc_a,info)
use psb_descriptor_type
real(kind(1.d0)), intent (out) :: res
real(kind(1.d0)), intent (in) :: x(:)
type(psb_desc_type), intent (in) :: desc_a
integer, intent(out) :: info
end subroutine psb_dnrm2vs
end interface
interface psb_nrmi
function psb_dnrmi(a, desc_a,info)
use psb_serial_mod
use psb_descriptor_type
real(kind(1.d0)) :: psb_dnrmi
type (psb_dspmat_type), intent (in) :: a
type (psb_desc_type), intent (in) :: desc_a
integer, intent(out) :: info
end function psb_dnrmi
end interface
interface psb_spmm
subroutine psb_dspmm(alpha, a, x, beta, y, desc_a, info,&
&trans, k, jx, jy,work,doswap)
use psb_serial_mod
use psb_descriptor_type
type (psb_dspmat_type), intent(in) :: a
real(kind(1.d0)), intent(inout) :: x(:,:)
real(kind(1.d0)), intent(inout) :: y(:,:)
real(kind(1.d0)), intent(in) :: alpha, beta
type(psb_desc_type), intent(in) :: desc_a
character, optional, intent(in) :: trans
real(kind(1.d0)), optional, intent(inout) :: work(:)
integer, optional, intent(in) :: k, jx, jy,doswap
integer, intent(out) :: info
end subroutine psb_dspmm
subroutine psb_dspmv(alpha, a, x, beta, y,&
& desc_a, info, trans, work,doswap)
use psb_serial_mod
use psb_descriptor_type
type (psb_dspmat_type), intent(in) :: a
real(kind(1.d0)), intent(inout) :: x(:)
real(kind(1.d0)), intent(inout) :: y(:)
real(kind(1.d0)), intent(in) :: alpha, beta
type(psb_desc_type), intent(in) :: desc_a
character, optional, intent(in) :: trans
real(kind(1.d0)), optional, intent(inout) :: work(:)
integer, optional, intent(in) :: doswap
integer, intent(out) :: info
end subroutine psb_dspmv
end interface
interface psb_spsm
subroutine psb_dspsm(alpha, t, x, beta, y,&
& desc_a, info, trans, unit, choice,&
& diag, n, jx, jy, work)
use psb_serial_mod
use psb_descriptor_type
type (psb_dspmat_type), intent(in) :: t
real(kind(1.d0)), intent(in) :: x(:,:)
real(kind(1.d0)), intent(inout) :: y(:,:)
real(kind(1.d0)), intent(in) :: alpha, beta
type(psb_desc_type), intent(in) :: desc_a
character, optional, intent(in) :: trans, unit
integer, optional, intent(in) :: n, jx, jy
integer, optional, intent(in) :: choice
real(kind(1.d0)), optional, intent(in) :: work(:), diag(:)
integer, intent(out) :: info
end subroutine psb_dspsm
subroutine psb_dspsv(alpha, t, x, beta, y,&
& desc_a, info, trans, unit, choice,&
& diag, work)
use psb_serial_mod
use psb_descriptor_type
type (psb_dspmat_type), intent(in) :: t
real(kind(1.d0)), intent(in) :: x(:)
real(kind(1.d0)), intent(inout) :: y(:)
real(kind(1.d0)), intent(in) :: alpha, beta
type(psb_desc_type), intent(in) :: desc_a
character, optional, intent(in) :: trans, unit
integer, optional, intent(in) :: choice
real(kind(1.d0)), optional, intent(in) :: work(:), diag(:)
integer, intent(out) :: info
end subroutine psb_dspsv
end interface
end module psb_psblas_mod

@ -0,0 +1,398 @@
module psb_realloc_mod
implicit none
Interface psb_realloc
module procedure psb_dreallocate1i
module procedure psb_dreallocate2i
module procedure psb_dreallocate2i1d
module procedure psb_dreallocate1d
module procedure psb_dreallocated2
end Interface
Interface psb_realloc1it
module procedure psb_dreallocate1it
end Interface
Contains
Subroutine psb_dreallocate1i(len,rrax,info,pad)
use psb_error_mod
! ...Subroutine Arguments
Integer,Intent(in) :: len
Integer,pointer :: rrax(:)
integer :: info
integer, optional, intent(in) :: pad
! ...Local Variables
Integer,Pointer :: tmp(:)
Integer :: dim, err_act, err,i
character(len=20) :: name
name='psb_dreallocate1i'
call psb_erractionsave(err_act)
info=0
if (associated(rrax)) then
dim=size(rrax)
If (dim /= len) Then
Allocate(tmp(len),stat=info)
if (info /= 0) then
err=4000
call psb_errpush(err,name)
goto 9999
end if
!!$ write(0,*) 'IA: copying ',len,dim
if (.true.) then
do i=1, min(len,dim)
tmp(i)=rrax(i)
end do
else
tmp(1:min(len,dim))=rrax(1:min(len,dim))
end if
!!$ write(0,*) 'IA: copying done'
Deallocate(rrax,stat=info)
if (info /= 0) then
err=4000
call psb_errpush(err,name)
goto 9999
end if
rrax=>tmp
End If
else
!!$ write(0,*) 'IA: allocating ',len
allocate(rrax(len),stat=info)
if (info /= 0) then
err=4000
call psb_errpush(err,name)
goto 9999
end if
endif
if (present(pad)) then
!!$ write(0,*) 'IA: padding'
rrax(dim+1:len) = pad
endif
call psb_erractionrestore(err_act)
return
9999 continue
call psb_erractionrestore(err_act)
if (err_act.eq.act_ret) then
return
else
call psb_error()
end if
return
End Subroutine psb_dreallocate1i
Subroutine psb_dreallocate1d(len,rrax,info,pad)
use psb_error_mod
! ...Subroutine Arguments
Integer,Intent(in) :: len
Real(kind(1.d0)),pointer :: rrax(:)
integer :: info
real(kind(1.d0)), optional, intent(in) :: pad
! ...Local Variables
Real(kind(1.d0)),Pointer :: tmp(:)
Integer :: dim,err_act,err,i, m
character(len=20) :: name
name='psb_dreallocate1d'
call psb_erractionsave(err_act)
if (associated(rrax)) then
dim=size(rrax)
If (dim /= len) Then
Allocate(tmp(len),stat=info)
if (info /= 0) then
err=4000
call psb_errpush(err,name)
goto 9999
end if
m = min(dim,len)
!!$ write(0,*) 'DA: copying ',min(len,dim)
if (.true.) then
do i=1,m
tmp(i) = rrax(i)
end do
else
tmp(1:m) = rrax(1:m)
end if
!!$ write(0,*) 'DA: copying done ',m
Deallocate(rrax,stat=info)
if (info /= 0) then
err=4000
call psb_errpush(err,name)
goto 9999
end if
rrax=>tmp
End If
else
dim = 0
Allocate(rrax(len),stat=info)
if (info /= 0) then
err=4000
call psb_errpush(err,name)
goto 9999
end if
endif
if (present(pad)) then
rrax(dim+1:len) = pad
endif
call psb_erractionrestore(err_act)
return
9999 continue
call psb_erractionrestore(err_act)
if (err_act.eq.act_ret) then
return
else
call psb_error()
end if
return
End Subroutine psb_dreallocate1d
Subroutine psb_dreallocated2(len1,len2,rrax,info,pad)
use psb_error_mod
! ...Subroutine Arguments
Integer,Intent(in) :: len1,len2
Real(kind(1.d0)),pointer :: rrax(:,:)
integer :: info
real(kind(1.d0)), optional, intent(in) :: pad
! ...Local Variables
Real(kind(1.d0)),Pointer :: tmp(:,:)
Integer :: dim,err_act,err,i, m
character(len=20) :: name
name='psb_dreallocated2'
call psb_erractionsave(err_act)
if (associated(rrax)) then
dim=size(rrax,1)
If (dim /= len1) Then
Allocate(tmp(len1,len2),stat=info)
if (info /= 0) then
err=4000
call psb_errpush(err,name)
goto 9999
end if
m = min(dim,len1)
!!$ write(0,*) 'DA: copying ',min(len,dim)
if (.true.) then
do i=1,m
tmp(i,:) = rrax(i,:)
end do
else
tmp(1:m,:) = rrax(1:m,:)
end if
!!$ write(0,*) 'DA: copying done ',m
Deallocate(rrax,stat=info)
if (info /= 0) then
err=4000
call psb_errpush(err,name)
goto 9999
end if
rrax=>tmp
End If
else
dim = 0
Allocate(rrax(len1,len2),stat=info)
if (info /= 0) then
err=4000
call psb_errpush(err,name)
goto 9999
end if
endif
if (present(pad)) then
rrax(dim+1:len1,:) = pad
endif
call psb_erractionrestore(err_act)
return
9999 continue
call psb_erractionrestore(err_act)
if (err_act.eq.act_ret) then
return
else
call psb_error()
end if
return
End Subroutine psb_dreallocated2
Subroutine psb_dreallocate2i(len,rrax,y,info,pad)
use psb_error_mod
! ...Subroutine Arguments
Integer,Intent(in) :: len
Integer,pointer :: rrax(:),y(:)
integer :: info
integer, optional, intent(in) :: pad
character(len=20) :: name
integer :: err_act, err
name='psb_dreallocate2i'
call psb_erractionsave(err_act)
info=0
call psb_dreallocate1i(len,rrax,info,pad=pad)
if (info /= 0) then
err=4000
call psb_errpush(err,name)
goto 9999
end if
call psb_dreallocate1i(len,y,info,pad=pad)
if (info /= 0) then
err=4000
call psb_errpush(err,name)
goto 9999
end if
call psb_erractionrestore(err_act)
return
9999 continue
call psb_erractionrestore(err_act)
if (err_act.eq.act_ret) then
return
else
call psb_error()
end if
return
End Subroutine psb_dreallocate2i
Subroutine psb_dreallocate2i1d(len,rrax,y,z,info)
use psb_error_mod
! ...Subroutine Arguments
Integer,Intent(in) :: len
Integer,pointer :: rrax(:),y(:)
Real(Kind(1.d0)),pointer :: z(:)
integer :: info
character(len=20) :: name
integer :: err_act, err
name='psb_dreallocate2i1d'
call psb_erractionsave(err_act)
info = 0
call psb_dreallocate1i(len,rrax,info)
if (info /= 0) then
err=4000
call psb_errpush(err,name)
goto 9999
end if
call psb_dreallocate1i(len,y,info)
if (info /= 0) then
err=4000
call psb_errpush(err,name)
goto 9999
end if
call psb_dreallocate1d(len,z,info)
if (info /= 0) then
err=4000
call psb_errpush(err,name)
goto 9999
end if
call psb_erractionrestore(err_act)
return
9999 continue
call psb_erractionrestore(err_act)
if (err_act.eq.act_ret) then
return
else
call psb_error()
end if
return
End Subroutine psb_dreallocate2i1d
Subroutine psb_dreallocate1it(len,rrax,info,pad)
use psb_error_mod
! ...Subroutine Arguments
Integer,Intent(in) :: len
Integer,pointer :: rrax(:)
integer :: info
integer, optional, intent(in) :: pad
! ...Local Variables
Integer,Pointer :: tmp(:)
Integer :: dim,err_act,err
character(len=20) :: name
name='psb_dreallocate1it'
call psb_erractionsave(err_act)
info=0
if (associated(rrax)) then
dim=size(rrax)
If (dim /= len) Then
Allocate(tmp(len),stat=info)
if (info /= 0) then
err=4000
call psb_errpush(err,name)
goto 9999
end if
!!$ write(0,*) 'IA: copying ',min(len,dim)
tmp(1:min(len,dim))=rrax(1:min(len,dim))
!!$ write(0,*) 'IA: copying done'
Deallocate(rrax,stat=info)
if (info /= 0) then
err=4000
call psb_errpush(err,name)
goto 9999
end if
rrax=>tmp
End If
else
allocate(rrax(len),stat=info)
if (info /= 0) then
err=4000
call psb_errpush(err,name)
goto 9999
end if
endif
if (present(pad)) then
!!$ write(0,*) 'IA: padding'
rrax(dim+1:len) = pad
endif
call psb_erractionrestore(err_act)
return
9999 continue
call psb_erractionrestore(err_act)
if (err_act.eq.act_ret) then
return
else
call psb_error()
end if
return
End Subroutine psb_dreallocate1it
end module psb_realloc_mod

@ -0,0 +1,211 @@
module psb_serial_mod
use psb_spmat_type
use psb_string_mod
interface psb_csdp
subroutine psb_dcsdp(a, b,info,ifc,check,trans,unitd)
use psb_spmat_type
type(psb_dspmat_type), intent(in) :: a
type(psb_dspmat_type), intent(inout) :: b
integer, intent(out) :: info
integer, intent(in), optional :: ifc
character, intent(in), optional :: check,trans,unitd
end subroutine psb_dcsdp
end interface
interface psb_csrws
subroutine psb_dcsrws(rw,a,info,trans)
use psb_spmat_type
type(psb_dspmat_type) :: a
real(kind(1.d0)), pointer :: rw(:)
integer :: info
character, optional :: trans
end subroutine psb_dcsrws
end interface
interface psb_cssm
subroutine psb_dcssm(alpha,t,b,beta,c,info,trans,unitd,d)
use psb_spmat_type
type(psb_dspmat_type) :: t
real(kind(1.d0)) :: alpha, beta, b(:,:), c(:,:)
integer :: info
character, optional :: trans, unitd
real(kind(1.d0)), optional, target :: d(:)
end subroutine psb_dcssm
subroutine psb_dcssv(alpha,t,b,beta,c,info,trans,unitd,d)
use psb_spmat_type
type(psb_dspmat_type) :: t
real(kind(1.d0)) :: alpha, beta, b(:), c(:)
integer :: info
character, optional :: trans, unitd
real(kind(1.d0)), optional, target :: d(:)
end subroutine psb_dcssv
end interface
interface psb_csmm
subroutine psb_dcsmv(alpha,a,b,beta,c,info,trans)
use psb_spmat_type
type(psb_dspmat_type) :: a
real(kind(1.d0)) :: alpha, beta, b(:), c(:)
integer :: info
character, optional :: trans
end subroutine psb_dcsmv
subroutine psb_dcsmm(alpha,a,b,beta,c,info,trans)
use psb_spmat_type
type(psb_dspmat_type) :: a
real(kind(1.d0)) :: alpha, beta, b(:,:), c(:,:)
integer :: info
character, optional :: trans
end subroutine psb_dcsmm
end interface
interface psb_fixcoo
subroutine psb_dfixcoo(a,info)
use psb_spmat_type
type(psb_dspmat_type), intent(inout) :: a
integer, intent(out) :: info
end subroutine psb_dfixcoo
end interface
interface psb_ipcoo2csr
subroutine psb_dipcoo2csr(a,info,rwshr)
use psb_spmat_type
type(psb_dspmat_type), intent(inout) :: a
integer, intent(out) :: info
logical, optional :: rwshr
end subroutine psb_dipcoo2csr
end interface
interface psb_ipcsr2coo
subroutine psb_dipcsr2coo(a,info)
use psb_spmat_type
type(psb_dspmat_type), intent(inout) :: a
integer, intent(out) :: info
end subroutine psb_dipcsr2coo
end interface
interface psb_csprt
subroutine psb_dcsprt(iout,a,iv,irs,ics,head,ivr,ivc)
use psb_spmat_type
integer, intent(in) :: iout
type(psb_dspmat_type), intent(in) :: a
integer, intent(in), optional :: iv(:)
integer, intent(in), optional :: irs,ics
character(len=*), optional :: head
integer, intent(in), optional :: ivr(:),ivc(:)
end subroutine psb_dcsprt
end interface
interface psb_spgtdiag
subroutine psb_dspgtdiag(a,d,info)
use psb_spmat_type
type(psb_dspmat_type), intent(in) :: a
real(kind(1.d0)), intent(inout) :: d(:)
integer, intent(out) :: info
end subroutine psb_dspgtdiag
end interface
interface psb_spscal
subroutine psb_dspscal(a,d,info)
use psb_spmat_type
type(psb_dspmat_type), intent(inout) :: a
real(kind(1.d0)), intent(in) :: d(:)
integer, intent(out) :: info
end subroutine psb_dspscal
end interface
interface psb_spinfo
subroutine psb_dspinfo(ireq,a,ires,info,iaux)
use psb_spmat_type
type(psb_dspmat_type), intent(in) :: a
integer, intent(in) :: ireq
integer, intent(out) :: ires
integer, intent(out) :: info
integer, intent(in), optional :: iaux
end subroutine psb_dspinfo
end interface
interface psb_spgtrow
subroutine psb_dspgtrow(irw,a,b,info,append,iren,lrw)
use psb_spmat_type
type(psb_dspmat_type), intent(in) :: a
integer, intent(in) :: irw
type(psb_dspmat_type), intent(inout) :: b
logical, intent(in), optional :: append
integer, intent(in), target, optional :: iren(:)
integer, intent(in), optional :: lrw
integer, intent(out) :: info
end subroutine psb_dspgtrow
end interface
interface psb_neigh
subroutine psb_dneigh(a,idx,neigh,n,info,lev)
use psb_spmat_type
type(psb_dspmat_type), intent(in) :: a
integer, intent(in) :: idx
integer, intent(out) :: n
integer, pointer :: neigh(:)
integer, intent(out) :: info
integer, optional, intent(in) :: lev
end subroutine psb_dneigh
end interface
interface psb_coins
subroutine psb_dcoins(nz,ia,ja,val,a,gtl,imin,imax,jmin,jmax,info)
use psb_spmat_type
integer, intent(in) :: nz, imin,imax,jmin,jmax
integer, intent(in) :: ia(:),ja(:),gtl(:)
real(kind(1.d0)), intent(in) :: val(:)
type(psb_dspmat_type), intent(inout) :: a
integer, intent(out) :: info
end subroutine psb_dcoins
end interface
interface psb_symbmm
subroutine psb_dsymbmm(a,b,c)
use psb_spmat_type
type(psb_dspmat_type) :: a,b,c
end subroutine psb_dsymbmm
end interface
interface psb_numbmm
subroutine psb_dnumbmm(a,b,c)
use psb_spmat_type
type(psb_dspmat_type) :: a,b,c
end subroutine psb_dnumbmm
end interface
interface psb_transp
subroutine psb_dtransp(a,b,c,fmt)
use psb_spmat_type
type(psb_dspmat_type) :: a,b
integer, optional :: c
character(len=*), optional :: fmt
end subroutine psb_dtransp
end interface
interface psb_rwextd
subroutine psb_drwextd(nr,a,info,b)
use psb_spmat_type
integer, intent(in) :: nr
type(psb_dspmat_type), intent(inout) :: a
integer, intent(out) :: info
type(psb_dspmat_type), intent(in), optional :: b
end subroutine psb_drwextd
end interface
interface psb_csnmi
real(kind(1.d0)) function psb_dcsnmi(a,info,trans)
use psb_spmat_type
type(psb_dspmat_type), intent(in) :: a
integer, intent(out) :: info
character, optional :: trans
end function psb_dcsnmi
end interface
end module psb_serial_mod

@ -0,0 +1,16 @@
module psb_sparse_mod
use psb_typedesc
use psb_typeprec
use psb_serial_mod
use psb_tools_mod
use psb_psblas_mod
use psb_prec_mod
use psb_methd_mod
use psb_error_mod
use psb_string
end module psb_sparse_mod

@ -0,0 +1,341 @@
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!! Module to define D_SPMAT, structure !!
!! for sparse matrix. !!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
module psb_spmat_type
use psb_error_mod
use psb_realloc_mod
use psb_const_mod
! Typedef: psb_dspmat_type
! Contains a sparse matrix
type psb_dspmat_type
! Rows & columns
integer :: m, k
! Identify the representation method. Es: CSR, JAD, ...
character(len=5) :: fida
! describe some chacteristics of sparse matrix
character(len=11) :: descra
! Contains some additional informations on sparse matrix
integer :: infoa(10)
! Contains sparse matrix coefficients
real(kind(1.d0)), pointer :: aspk(:)=>null()
! Contains indeces that describes sparse matrix structure
integer, pointer :: ia1(:)=>null(), ia2(:)=>null()
! Permutations matrix
integer, pointer :: pl(:)=>null(), pr(:)=>null()
end type psb_dspmat_type
interface psb_nullify_sp
module procedure psb_nullify_dsp
end interface
interface psb_spclone
module procedure psb_dspclone
end interface
interface psb_spreall
module procedure psb_dspreallocate, psb_dspreall3
end interface
interface psb_spall
module procedure psb_dspallocate, psb_dspall3, psb_dspallmk, psb_dspallmknz
end interface
interface psb_spfree
module procedure psb_dspfree
end interface
interface psb_spreinit
module procedure psb_dspreinit
end interface
contains
subroutine psb_nullify_dsp(mat)
implicit none
type(psb_dspmat_type), intent(inout) :: mat
nullify(mat%aspk,mat%ia1,mat%ia2,mat%pl,mat%pr)
mat%m=0
mat%k=0
mat%fida=''
mat%descra=''
end subroutine psb_nullify_dsp
Subroutine psb_dspreinit(a)
implicit none
!....Parameters...
Type(psb_dspmat_type), intent(inout) :: A
!locals
logical, parameter :: debug=.false.
if (debug) write(0,*) 'spreinit init ',a%fida,a%infoa(psb_nnz_)
if (a%fida=='COO') a%infoa(psb_nnz_) = 0
if (associated(a%aspk)) a%aspk(:) = 0.d0
if (debug) write(0,*) 'spreinit end ',a%fida,a%infoa(psb_nnz_)
end Subroutine psb_dspreinit
Subroutine psb_dspallocate(a, nnz,info)
implicit none
!....Parameters...
Type(psb_dspmat_type), intent(inout) :: A
Integer, intent(in) :: nnz
integer, intent(out) :: info
!locals
logical, parameter :: debug=.false.
info = 0
if (nnz.lt.0) then
info=45
return
Endif
if (debug) write(0,*) 'SPALL : NNZ ',nnz,a%m,a%k
call psb_spreall(a,nnz,info)
a%pl(1)=0
a%pr(1)=0
! set INFOA fields
a%fida = 'COO'
a%descra = 'GUN'
a%infoa(:) = 0
a%m = 0
a%k = 0
if (debug) write(0,*) 'SPALL : end'
Return
End Subroutine psb_dspallocate
Subroutine psb_dspallmk(m,k,a,info)
implicit none
!....Parameters...
Type(psb_dspmat_type), intent(inout) :: A
Integer, intent(in) :: m,k
Integer, intent(out) :: info
!locals
logical, parameter :: debug=.false.
integer :: nnz
INFO = 0
nnz = 2*max(1,m,k)
if (debug) write(0,*) 'SPALL : NNZ ',nnz,a%m,a%k
a%m=max(0,m)
a%k=max(0,k)
call psb_spreall(a,nnz,info)
a%pl(1)=0
a%pr(1)=0
! set INFOA fields
a%fida = 'COO'
a%descra = 'GUN'
a%infoa(:) = 0
if (debug) write(0,*) 'SPALL : end'
Return
end subroutine psb_dspallmk
Subroutine psb_dspallmknz(m,k,a, nnz,info)
implicit none
!....parameters...
type(psb_dspmat_type), intent(inout) :: a
integer, intent(in) :: m,k,nnz
integer, intent(out) :: info
!locals
logical, parameter :: debug=.false.
info = 0
if (nnz.lt.0) then
info=45
return
endif
if (debug) write(0,*) 'spall : nnz ',nnz,a%m,a%k
a%m=max(0,m)
a%k=max(0,k)
call psb_spreall(a,nnz,info)
a%pl(1)=0
a%pr(1)=0
! set infoa fields
a%fida = 'COO'
a%descra = 'GUN'
a%infoa(:) = 0
if (debug) write(0,*) 'spall : end'
return
end subroutine psb_dspallmknz
subroutine psb_dspall3(a, ni1,ni2,nd,info)
implicit none
!....Parameters...
Type(psb_dspmat_type), intent(inout) :: A
Integer, intent(in) :: ni1,ni2,nd
Integer, intent(out) :: info
!locals
logical, parameter :: debug=.false.
info = 0
call psb_spreall(a, ni1,ni2,nd,info)
a%pl(1)=0
a%pr(1)=0
! set INFOA fields
a%fida = 'COO'
a%descra = 'GUN'
a%infoa(:) = 0
a%m = 0
a%k = 0
if (debug) write(0,*) 'SPALL : end'
Return
End Subroutine psb_dspall3
subroutine psb_dspreallocate(a, nnz,info,ifc)
implicit none
!....Parameters...
Type(psb_dspmat_type), intent(inout) :: A
Integer, intent(in) :: NNZ
Integer, intent(out) :: info
!
! ifc is used here to allocate space in IA1 for smart
! regeneration. This probably ought to be changed,
! by adding a new component to d_spmat, or by making
! infoa a pointer.
!
Integer, intent(in), optional :: ifc
integer :: ifc_
!locals
logical, parameter :: debug=.false.
info = 0
if (nnz.lt.0) then
info=45
return
endif
if (present(ifc)) then
ifc_ = max(1,ifc)
else
ifc_ = 1
endif
if (ifc_ == 1) then
call psrealloc(nnz,a%ia1,a%ia2,a%aspk,info)
else
call psrealloc(nnz,a%aspk,info)
if (info /= 0) return
call psrealloc(nnz,a%ia2,info)
if (info /= 0) return
call psrealloc(ifc*nnz+200,a%ia1,info)
if (info /= 0) return
end if
if (info /= 0) return
call psrealloc(max(1,a%m),a%pl,info)
if (info /= 0) return
call psrealloc(max(1,a%k),a%pr,info)
if (info /= 0) return
Return
End Subroutine psb_dspreallocate
subroutine psb_dspreall3(a, ni1,ni2,nd,info)
implicit none
!....Parameters...
Type(psb_dspmat_type), intent(inout) :: A
Integer, intent(in) :: ni1,ni2,nd
Integer, intent(inout) :: info
!locals
logical, parameter :: debug=.false.
info = 0
call psrealloc(nd,a%aspk,info)
if (info /= 0) return
call psrealloc(ni2,a%ia2,info)
if (info /= 0) return
call psrealloc(ni1,a%ia1,info)
if (info /= 0) return
call psrealloc(max(1,a%m),a%pl,info)
if (info /= 0) return
call psrealloc(max(1,a%k),a%pr,info)
if (info /= 0) return
Return
End Subroutine psb_dspreall3
subroutine psb_dspclone(a, b,info)
implicit none
!....Parameters...
Type(psb_dspmat_type), intent(in) :: A
Type(psb_dspmat_type), intent(out) :: B
Integer, intent(out) :: info
!locals
Integer :: nza,nz1, nz2, nzl, nzr
logical, parameter :: debug=.false.
INFO = 0
nza = size(a%aspk)
nz1 = size(a%ia1)
nz2 = size(a%ia2)
nzl = size(a%pl)
nzr = size(a%pr)
allocate(b%aspk(nza),b%ia1(nz1),b%ia2(nz2),&
& b%pl(nzl),b%pr(nzr),stat=info)
if (info /= 0) then
info=2023
return
Endif
b%aspk(:) = a%aspk(:)
b%ia1(:) = a%ia1(:)
b%ia2(:) = a%ia2(:)
b%pl(:) = a%pl(:)
b%pr(:) = a%pr(:)
b%infoa(:) = a%infoa(:)
b%fida = a%fida
b%descra = a%descra
b%m = a%m
b%k = a%k
Return
End Subroutine psb_dspclone
subroutine psb_dspfree(a,info)
implicit none
!....Parameters...
Type(psb_dspmat_type), intent(inout) :: A
Integer, intent(out) :: info
!locals
logical, parameter :: debug=.false.
INFO = 0
deallocate(a%aspk,a%ia1,a%ia2,a%pr,a%pl,STAT=INFO)
call psb_nullify_sp(a)
Return
End Subroutine psb_dspfree
end module psb_spmat_type

@ -0,0 +1,20 @@
module psb_string_mod
interface tolower
function tolowerc(string)
character(len=*), intent(in) :: string
character(len=len(string)) :: tolowerc
end function tolowerc
end interface
interface toupper
function toupperc(string)
character(len=*), intent(in) :: string
character(len=len(string)) :: toupperc
end function toupperc
end interface
interface touppers
subroutine sub_toupperc(string,strout)
character(len=*), intent(in) :: string
character(len=*), intent(out) :: strout
end subroutine sub_toupperc
end interface
end module psb_string_mod

@ -0,0 +1,504 @@
Module psb_tools_mod
use psb_const_mod
interface psb_alloc
! 2-D double precision version
subroutine psb_dalloc(m, n, x, desc_a, info, js)
use psb_descriptor_type
implicit none
integer, intent(in) :: m,n
real(kind(1.d0)), pointer :: x(:,:)
type(psb_desc_type), intent(inout) :: desc_a
integer :: info
integer, optional, intent(in) :: js
end subroutine psb_dalloc
! 1-D double precision version
subroutine psb_dallocv(m, x, desc_a,info)
use psb_descriptor_type
integer, intent(in) :: m
real(kind(1.d0)), pointer :: x(:)
type(psb_desc_type), intent(in):: desc_a
integer :: info
end subroutine psb_dallocv
! 2-D integer version
subroutine psb_ialloc(m, n, x, desc_a, info,js)
use psb_descriptor_type
integer, intent(in) :: m,n
integer, pointer :: x(:,:)
type(psb_desc_type), intent(inout) :: desc_a
integer, intent(out) :: info
integer, optional, intent(in) :: js
end subroutine psb_ialloc
subroutine psb_iallocv(m, x, desc_a,info)
use psb_descriptor_type
integer, intent(in) :: m
integer, pointer :: x(:)
type(psb_desc_type), intent(in):: desc_a
integer :: info
end subroutine psb_iallocv
end interface
interface psb_asb
! 2-D double precision version
subroutine psb_dasb(x, desc_a, info)
use psb_descriptor_type
type(psb_desc_type), intent(in) :: desc_a
real(kind(1.d0)), pointer :: x(:,:)
integer, intent(out) :: info
end subroutine psb_dasb
! 1-D double precision version
subroutine psb_dasbv(x, desc_a, info)
use psb_descriptor_type
type(psb_desc_type), intent(in) :: desc_a
real(kind(1.d0)), pointer :: x(:)
integer, intent(out) :: info
end subroutine psb_dasbv
! 2-D integer version
subroutine psb_iasb(x, desc_a, info)
use psb_descriptor_type
type(psb_desc_type), intent(in) :: desc_a
integer, pointer :: x(:,:)
integer, intent(out) :: info
end subroutine psb_iasb
! 1-D integer version
subroutine psb_iasbv(x, desc_a, info)
use psb_descriptor_type
type(psb_desc_type), intent(in) :: desc_a
integer, pointer :: x(:)
integer, intent(out) :: info
end subroutine psb_iasbv
end interface
interface psb_csrovr
Subroutine psb_dcsrovr(a,desc_a,blk,info,rwcnv,clcnv,outfmt)
use psb_descriptor_type
use psb_spmat_type
Type(psb_dspmat_type),Intent(in) :: a
Type(psb_dspmat_type),Intent(inout) :: blk
Type(psb_desc_type),Intent(in) :: desc_a
integer, intent(out) :: info
logical, optional, intent(in) :: rwcnv,clcnv
character(len=5), optional :: outfmt
end Subroutine psb_dcsrovr
end interface
interface psb_csrp
subroutine psb_dcsrp(trans,iperm,a, desc_a, info)
use psb_descriptor_type
use psb_spmat_type
type(psb_dspmat_type), intent(inout) :: a
type(psb_desc_type), intent(in) :: desc_a
integer, intent(inout) :: iperm(:), info
character, intent(in) :: trans
end subroutine psb_dcsrp
end interface
interface psb_descasb
Subroutine psb_descasb(n_ovr,desc_p,desc_a,a,&
& l_tmp_halo,l_tmp_ovr_idx,lworks,lworkr,info)
use psb_descriptor_type
use psb_spmat_type
type(psb_dspmat_type),intent(in) :: a
type(psb_desc_type),intent(in) :: desc_a
type(psb_desc_type),intent(inout) :: desc_p
integer,intent(in) :: n_ovr
Integer, Intent(in) :: l_tmp_halo,l_tmp_ovr_idx
Integer, Intent(inout) :: lworks, lworkr
integer, intent(out) :: info
end Subroutine psb_descasb
end interface
interface psb_descprt
subroutine psb_descprt(iout,desc_p,glob,short)
use psb_const_mod
use psb_descriptor_type
implicit none
type(psb_desc_type), intent(in) :: desc_p
integer, intent(in) :: iout
logical, intent(in), optional :: glob,short
end subroutine psb_descprt
end interface
interface psb_free
! 2-D double precision version
subroutine psb_dfree(x, desc_a, info)
use psb_descriptor_type
real(kind(1.d0)),pointer :: x(:,:)
type(psb_desc_type), intent(in) :: desc_a
integer :: info
end subroutine psb_dfree
! 1-D double precision version
subroutine psb_dfreev(x, desc_a, info)
use psb_descriptor_type
real(kind(1.d0)),pointer :: x(:)
type(psb_desc_type), intent(in) :: desc_a
integer :: info
end subroutine psb_dfreev
! 2-D integer version
subroutine psb_ifree(x, desc_a, info)
use psb_descriptor_type
integer,pointer :: x(:,:)
type(psb_desc_type), intent(in) :: desc_a
integer :: info
end subroutine psb_ifree
! 1-D integer version
subroutine psb_ifreev(x, desc_a, info)
use psb_descriptor_type
integer, pointer :: x(:)
type(psb_desc_type), intent(in) :: desc_a
integer :: info
end subroutine psb_ifreev
end interface
interface psb_gelp
! 2-D version
subroutine psb_dgelp(trans,iperm,x,desc_a,info)
use psb_descriptor_type
type(psb_desc_type), intent(in) :: desc_a
real(kind(1.d0)), intent(inout) :: x(:,:)
integer, intent(inout) :: iperm(:),info
character, intent(in) :: trans
end subroutine psb_dgelp
! 1-D version
subroutine psb_dgelpv(trans,iperm,x,desc_a,info)
use psb_descriptor_type
type(psb_desc_type), intent(in) :: desc_a
real(kind(1.d0)), intent(inout) :: x(:)
integer, intent(inout) :: iperm(:), info
character, intent(in) :: trans
end subroutine psb_dgelpv
end interface
interface psb_ins
! 2-D double precision version
subroutine psb_dins(m, n, x, ix, jx, blck, desc_a, info,&
& iblck, jblck)
use psb_descriptor_type
integer, intent(in) :: m,n
type(psb_desc_type), intent(in) :: desc_a
real(kind(1.d0)),pointer :: x(:,:)
integer, intent(in) :: ix,jx
real(kind(1.d0)), intent(in) :: blck(:,:)
integer,intent(out) :: info
integer, optional, intent(in) :: iblck,jblck
end subroutine psb_dins
! 2-D double precision square version
subroutine psb_dinsvm(m, x, ix, jx, blck, desc_a,info,&
& iblck)
use psb_descriptor_type
integer, intent(in) :: m
type(psb_desc_type), intent(in) :: desc_a
real(kind(1.d0)),pointer :: x(:,:)
integer, intent(in) :: ix,jx
real(kind(1.d0)), intent(in) :: blck(:)
integer, intent(out) :: info
integer, optional, intent(in) :: iblck
end subroutine psb_dinsvm
! 1-D double precision version
subroutine psb_dinsvv(m, x, ix, blck, desc_a, info,&
& iblck,insflag)
use psb_descriptor_type
integer, intent(in) :: m
type(psb_desc_type), intent(in) :: desc_a
real(kind(1.d0)),pointer :: x(:)
integer, intent(in) :: ix
real(kind(1.d0)), intent(in) :: blck(:)
integer, intent(out) :: info
integer, optional, intent(in) :: iblck
integer, optional, intent(in) :: insflag
end subroutine psb_dinsvv
! 2-D integer version
subroutine psb_iins(m, n, x, ix, jx, blck, desc_a, info,&
& iblck, jblck)
use psb_descriptor_type
integer, intent(in) :: m,n
type(psb_desc_type), intent(in) :: desc_a
integer,pointer :: x(:,:)
integer, intent(in) :: ix,jx
integer, intent(in) :: blck(:,:)
integer,intent(out) :: info
integer, optional, intent(in) :: iblck,jblck
end subroutine psb_iins
! 2-D integer square version
subroutine psb_iinsvm(m, x, ix, jx, blck, desc_a,info,&
& iblck)
use psb_descriptor_type
integer, intent(in) :: m
type(psb_desc_type), intent(in) :: desc_a
integer, pointer :: x(:,:)
integer, intent(in) :: ix,jx
integer, intent(in) :: blck(:)
integer, intent(out) :: info
integer, optional, intent(in) :: iblck
end subroutine psb_iinsvm
! 1-D integer version
subroutine psb_iinsvv(m, x, ix, blck, desc_a, info,&
& iblck,insflag)
use psb_descriptor_type
integer, intent(in) :: m
type(psb_desc_type), intent(in) :: desc_a
integer, pointer :: x(:)
integer, intent(in) :: ix
integer, intent(in) :: blck(:)
integer, intent(out) :: info
integer, optional, intent(in) :: iblck
integer, optional, intent(in) :: insflag
end subroutine psb_iinsvv
end interface
interface psb_ptins
subroutine psb_dptins(ia,ja,blck,desc_a,info)
use psb_descriptor_type
use psb_spmat_type
implicit none
type(psb_desc_type), intent(inout) :: desc_a
integer, intent(in) :: ia,ja
type(psb_dspmat_type), intent(in) :: blck
integer,intent(out) :: info
end subroutine psb_dptins
end interface
interface psb_dscall
subroutine psb_dscall(m, n, parts, icontxt, desc_a, info)
use psb_descriptor_type
Integer, intent(in) :: M,N,ICONTXT
Type(psb_desc_type), intent(out) :: desc_a
integer, intent(out) :: info
end subroutine psb_dscall
end interface
interface psb_scalv
subroutine psb_dscalv(m, v, icontxt, desc_a, info, flag)
use psb_descriptor_type
Integer, intent(in) :: m,icontxt, v(:)
integer, intent(in), optional :: flag
integer, intent(out) :: info
Type(psb_desc_type), intent(out) :: desc_a
end subroutine psb_dscalv
end interface
interface psb_dscasb
subroutine psb_dscasb(desc_a,info)
use psb_descriptor_type
Type(psb_desc_type), intent(inout) :: desc_a
integer, intent(out) :: info
end subroutine psb_dscasb
end interface
interface psb_dsccpy
subroutine psb_dsccpy(desc_out, desc_a, info)
use psb_descriptor_type
type(psb_desc_type), intent(out) :: desc_out
type(psb_desc_type), intent(in) :: desc_a
integer, intent(out) :: info
end subroutine psb_dsccpy
end interface
interface psb_dscfree
subroutine psb_dscfree(desc_a,info)
use psb_descriptor_type
type(psb_desc_type), intent(inout) :: desc_a
integer, intent(out) :: info
end subroutine psb_dscfree
end interface
interface psb_dscins
subroutine psb_dscins(nz,ia,ja,desc_a,info,is,js)
use psb_descriptor_type
type(psb_desc_type), intent(inout) :: desc_a
Integer, intent(in) :: nz,IA(:),JA(:)
integer, intent(out) :: info
integer, intent(in), optional :: is,js
end subroutine psb_dscins
end interface
interface psb_dscov
Subroutine psb_dscov(a,desc_a,novr,desc_ov,info)
use psb_descriptor_type
Use psb_spmat_type
integer, intent(in) :: novr
Type(psb_dspmat_type), Intent(in) :: a
Type(psb_desc_type), Intent(in) :: desc_a
Type(psb_desc_type), Intent(inout) :: desc_ov
integer, intent(out) :: info
end Subroutine psb_dscov
end interface
interface psb_dscren
subroutine psb_dscren(trans,iperm,desc_a,info)
use psb_descriptor_type
type(psb_desc_type), intent(inout) :: desc_a
integer, intent(inout) :: iperm(:)
character, intent(in) :: trans
integer, intent(out) :: info
end subroutine psb_dscren
end interface
interface psb_spalloc
subroutine psb_dspalloc(a, desc_a, info, nnz)
use psb_descriptor_type
use psb_spmat_type
type(psb_desc_type), intent(inout) :: desc_a
type(psb_dspmat_type), intent(out) :: a
integer, intent(out) :: info
integer, optional, intent(in) :: nnz
end subroutine psb_dspalloc
end interface
interface psb_spasb
subroutine psb_dspasb(a,desc_a, info, afmt, up, dup)
use psb_descriptor_type
use psb_spmat_type
type(psb_dspmat_type), intent (inout) :: a
type(psb_desc_type), intent(in) :: desc_a
integer, intent(out) :: info
integer,optional, intent(in) :: dup
character, optional, intent(in) :: afmt*5, up
end subroutine psb_dspasb
end interface
interface psb_spcnv
subroutine psb_dspcnv(a,b,desc_a,info)
use psb_descriptor_type
use psb_spmat_type
type(psb_dspmat_type), intent(in) :: a
type(psb_dspmat_type), intent(out) :: b
type(psb_desc_type), intent(in) :: desc_a
integer, intent(out) :: info
end subroutine psb_dspcnv
end interface
interface psb_spfree
subroutine psb_dspfree(a, desc_a,info)
use psb_descriptor_type
use psb_spmat_type
type(psb_desc_type), intent(in) :: desc_a
type(psb_dspmat_type), intent(inout) ::a
integer, intent(out) :: info
end subroutine psb_dspfree
end interface
interface psb_spins
subroutine psb_dspins(nz,ia,ja,val,a,desc_a,info,is,js)
use psb_descriptor_type
use psb_spmat_type
type(psb_desc_type), intent(inout) :: desc_a
type(psb_dspmat_type), intent(inout) :: a
integer, intent(in) :: nz,ia(:),ja(:)
real(kind(1.d0)), intent(in) :: val(:)
integer, intent(out) :: info
integer, intent(in), optional :: is,js
end subroutine psb_dspins
end interface
interface psb_sprn
subroutine psb_dsprn(a, desc_a,info)
use psb_descriptor_type
use psb_spmat_type
type(psb_desc_type), intent(in) :: desc_a
type(psb_dspmat_type), intent(inout) :: a
integer, intent(out) :: info
end subroutine psb_dsprn
end interface
interface psb_spupdate
subroutine psb_dspupdate(a, ia, ja, blck, desc_a,info,ix,jx,updflag)
use psb_descriptor_type
use psb_spmat_type
type(psb_desc_type), intent(in) :: desc_a
type(psb_dspmat_type), intent(inout) :: a
integer, intent(in) :: ia,ja
type(psb_dspmat_type), intent(in) :: blck
integer, intent(out) :: info
integer, optional, intent(in) :: ix,jx
integer, optional, intent(in) :: updflag
end subroutine psb_dspupdate
end interface
interface psb_glob_to_loc
subroutine psb_glob_to_loc2(x,y,desc_a,info,iact)
use psb_descriptor_type
type(psb_desc_type), intent(in) :: desc_a
integer,intent(in) :: x(:)
integer,intent(out) :: y(:)
integer, intent(out) :: info
character, intent(in), optional :: iact
end subroutine psb_glob_to_loc2
subroutine psb_glob_to_loc(x,desc_a,info,iact)
use psb_descriptor_type
type(psb_desc_type), intent(in) :: desc_a
integer,intent(inout) :: x(:)
integer, intent(out) :: info
character, intent(in), optional :: iact
end subroutine psb_glob_to_loc
end interface
interface psb_loc_to_glob
subroutine psb_loc_to_glob2(x,y,desc_a,info,iact)
use psb_descriptor_type
type(psb_desc_type), intent(in) :: desc_a
integer,intent(in) :: x(:)
integer,intent(out) :: y(:)
integer, intent(out) :: info
character, intent(in), optional :: iact
end subroutine psb_loc_to_glob2
subroutine psb_loc_to_glob(x,desc_a,info,iact)
use psb_descriptor_type
type(psb_desc_type), intent(in) :: desc_a
integer,intent(inout) :: x(:)
integer, intent(out) :: info
character, intent(in), optional :: iact
end subroutine psb_loc_to_glob
end interface
interface psb_ptasb
subroutine psb_ptasb(desc_a,info)
use psb_descriptor_type
type(psb_desc_type), intent(inout) :: desc_a
integer,intent(out) :: info
end subroutine psb_ptasb
end interface
interface psb_dscrep
subroutine psb_dscrep(m, icontxt, desc_a,info)
use psb_descriptor_type
Integer, intent(in) :: m,icontxt
Type(psb_desc_type), intent(out) :: desc_a
integer, intent(out) :: info
end subroutine psb_dscrep
end interface
interface psb_dscdec
subroutine psb_dscdec(nloc, icontxt, desc_a,info)
use psb_descriptor_type
Integer, intent(in) :: nloc,icontxt
Type(psb_desc_type), intent(out) :: desc_a
integer, intent(out) :: info
end subroutine psb_dscdec
end interface
end module psb_tools_mod

@ -0,0 +1,161 @@
! Module containing interfaces for subroutine in SRC/F90/INTERNALS
module psi_mod
use psb_descriptor_type
interface
subroutine psi_compute_size(desc_data,&
& index_in, dl_lda, info)
integer :: info, dl_lda
integer :: desc_data(:), index_in(:)
end subroutine psi_compute_size
end interface
interface
subroutine psi_crea_bnd_elem(desc_a,info)
use psb_descriptor_type
type(psb_desc_type) :: desc_a
integer, intent(out) :: info
end subroutine psi_crea_bnd_elem
end interface
interface
subroutine psi_crea_index(desc_a,index_in,index_out,glob_idx,info)
use psb_descriptor_type
type(psb_desc_type), intent(in) :: desc_a
integer, intent(out) :: info
integer, intent(in) :: index_in(:)
integer, intent(out) :: index_out(:)
logical :: glob_idx
end subroutine psi_crea_index
end interface
interface
subroutine psi_crea_ovr_elem(desc_overlap,ovr_elem)
integer :: desc_overlap(:)
integer, pointer :: ovr_elem(:)
end subroutine psi_crea_ovr_elem
end interface
interface
subroutine psi_desc_index(desc_data,index_in,dep_list,&
& length_dl,loc_to_glob,glob_to_loc,desc_index,&
& isglob_in,info)
integer :: desc_data(:),index_in(:),dep_list(:)
integer :: loc_to_glob(:),glob_to_loc(:)
integer,pointer :: desc_index(:)
integer :: length_dl, info
logical :: isglob_in
end subroutine psi_desc_index
end interface
interface
subroutine psi_sort_dl(dep_list,l_dep_list,np,info)
integer :: np,dep_list(:,:), l_dep_list(:), info
end subroutine psi_sort_dl
end interface
interface psi_swapdata
subroutine psi_dswapdatam(flag,n,beta,y,desc_a,work,info)
use psb_descriptor_type
integer, intent(in) :: flag, n
integer, intent(out) :: info
real(kind(1.d0)) :: y(:,:), beta, work(:)
type(psb_desc_type) :: desc_a
end subroutine psi_dswapdatam
subroutine psi_dswapdatav(flag,beta,y,desc_a,work,info)
use psb_descriptor_type
integer, intent(in) :: flag
integer, intent(out) :: info
real(kind(1.d0)) :: y(:), beta, work(:)
type(psb_desc_type) :: desc_a
end subroutine psi_dswapdatav
subroutine psi_iswapdatam(flag,n,beta,y,desc_a,work,info)
use psb_descriptor_type
integer, intent(in) :: flag, n
integer, intent(out) :: info
integer :: y(:,:), beta, work(:)
type(psb_desc_type) :: desc_a
end subroutine psi_iswapdatam
subroutine psi_iswapdatav(flag,beta,y,desc_a,work,info)
use psb_descriptor_type
integer, intent(in) :: flag
integer, intent(out) :: info
integer :: y(:), beta, work(:)
type(psb_desc_type) :: desc_a
end subroutine psi_iswapdatav
end interface
interface psi_swaptran
subroutine psi_dswaptranm(flag,n,beta,y,desc_a,work,info)
use psb_descriptor_type
integer, intent(in) :: flag, n
integer, intent(out) :: info
real(kind(1.d0)) :: y(:,:), beta, work(:)
type(psb_desc_type) :: desc_a
end subroutine psi_dswaptranm
subroutine psi_dswaptranv(flag,beta,y,desc_a,work,info)
use psb_descriptor_type
integer, intent(in) :: flag
integer, intent(out) :: info
real(kind(1.d0)) :: y(:), beta, work(:)
type(psb_desc_type) :: desc_a
end subroutine psi_dswaptranv
subroutine psi_iswaptranm(flag,n,beta,y,desc_a,work,info)
use psb_descriptor_type
integer, intent(in) :: flag, n
integer, intent(out) :: info
integer :: y(:,:), beta, work(:)
type(psb_desc_type) :: desc_a
end subroutine psi_iswaptranm
subroutine psi_iswaptranv(flag,beta,y,desc_a,work,info)
use psb_descriptor_type
integer, intent(in) :: flag
integer, intent(out) :: info
integer :: y(:), beta, work(:)
type(psb_desc_type) :: desc_a
end subroutine psi_iswaptranv
end interface
interface psi_gth
subroutine psi_dgthm(n,k,idx,x,y)
integer :: n, k, idx(:)
real(kind(1.d0)) :: x(:,:), y(:)
end subroutine psi_dgthm
subroutine psi_dgthv(n,idx,x,y)
integer :: n, idx(:)
real(kind(1.d0)) :: x(:), y(:)
end subroutine psi_dgthv
subroutine psi_igthm(n,k,idx,x,y)
integer :: n, k, idx(:)
integer :: x(:,:), y(:)
end subroutine psi_igthm
subroutine psi_igthv(n,idx,x,y)
integer :: n, idx(:)
integer :: x(:), y(:)
end subroutine psi_igthv
end interface
interface psi_sct
subroutine psi_dsctm(n,k,idx,x,beta,y)
integer :: n, k, idx(:)
real(kind(1.d0)) :: beta, x(:), y(:,:)
end subroutine psi_dsctm
subroutine psi_dsctv(n,idx,x,beta,y)
integer :: n, idx(:)
real(kind(1.d0)) :: beta, x(:), y(:)
end subroutine psi_dsctv
subroutine psi_isctm(n,k,idx,x,beta,y)
integer :: n, k, idx(:)
integer :: beta, x(:), y(:,:)
end subroutine psi_isctm
subroutine psi_isctv(n,idx,x,beta,y)
integer :: n, idx(:)
integer :: beta, x(:), y(:)
end subroutine psi_isctv
end interface
end module psi_mod

@ -0,0 +1,26 @@
INTEGER MINJDROWS, MAXJDROWS
PARAMETER (MINJDROWS=4, MAXJDROWS=8)
DOUBLE PRECISION PERCENT
INTEGER DBLEINT_
INTEGER DCMPLXINT_
C ... This parameter represent sizeof(DOUBLE)/sizeof(INTEGER) ...
PARAMETER (PERCENT=0.7,DBLEINT_=2)
PARAMETER (DCMPLXINT_ = 4)
character fidef*5
parameter (fidef='CSR')
integer, parameter :: nnz_=1
integer, parameter :: del_bnd_=6, srtd_=7
integer, parameter :: state_=8, upd_=9
integer, parameter :: upd_pnt_=10, ifasize_=10
integer, parameter :: spmat_null=0, spmat_bld=1
integer, parameter :: spmat_asb=2, spmat_upd=4
integer perm_update
parameter (perm_update=98765)
integer isrtdcoo
parameter (isrtdcoo=98764)
integer ireg_flgs
parameter (ireg_flgs=10)
integer ip2_, iflag_, ipc_, ichk_, nnzt_, zero_
parameter (ip2_=0, iflag_=2, ichk_=3)
parameter ( nnzt_=4, zero_=5,ipc_=6)

@ -0,0 +1,31 @@
include ../../Make.inc
LIBDIR=../../lib/
LIBNAME=$(LIBDIR)/$(F90LIB)
HERE=.
MPFOBJS=dcslu.o psbdbldaggrmat.o
F90OBJS= dcsrsetup.o dcsrlu.o f90_psdprec.o \
dprecbld.o zprecbld.o gps.o psdprecfree.o dprecset.o \
psbdgenaggrmap.o $(MPFOBJS)
#dcoocp.o dcoocpadd.o dcoofact.o dcoolu.o dcooluadd.o\
COBJS=fort_slu_impl.o
INCDIRS=-I. -I.. -I$(LIBDIR)
OBJS=$(F90OBJS) $(COBJS)
lib: mpobjs $(OBJS)
ar -cur $(LIBNAME) $(OBJS)
ranlib $(LIBNAME)
#$(F90OBJS): $(MODS)
mpobjs:
(make $(MPFOBJS) F90="$(MPF90)" F90COPT="$(F90COPT)")
veryclean: clean
/bin/rm -f $(LIBNAME)
clean:
/bin/rm -f $(OBJS) $(LOCAL_MODS)

Some files were not shown because too many files have changed in this diff Show More

Loading…
Cancel
Save