62 Commits (34a2a7ddbc9c9d3339265bc602aaa5144200d09f)

Author SHA1 Message Date
gabrielequatrana cfa0a785c5 Fixed multiple bugs 2 years ago
gabrielequatrana 448533effb Merge remote-tracking branch 'origin/cuda-multivect' into psblas-bgmres 2 years ago
gabrielequatrana efccdc88cd Fixed another typo 2 years ago
gabrielequatrana fabf53a225 Fixed some typos 2 years ago
gabrielequatrana c08431d71e Merge remote-tracking branch 'origin/cuda-multivect' into psblas-bgmres 2 years ago
gabrielequatrana 4ff0f112a9 SpMM HDIAG working 2 years ago
gabrielequatrana bdd04a6911 Adding support to HDIAG SpMM 2 years ago
gabrielequatrana 0490dd77db ELG SpMM now working 2 years ago
gabrielequatrana dc6e5bb942 ELG SpMM (compiling but not working) 2 years ago
gabrielequatrana 6b8199f84b ELG SpMM (not compiling) 2 years ago
gabrielequatrana 9daa04c3dc Updated HLG SpMM (s,d,c,z) 2 years ago
Salvatore Filippone ccb4f73dca Choose version of HLG-multivect kernel 2 years ago
sfilippone 148a5e5e14 Check for shmemsize 2 years ago
Salvatore Filippone 378c126055 Use MMBSZ=8, prepare to check shmem size 2 years ago
Salvatore Filippone 897cfb4028 Multicolumn HLG product 2 years ago
gabrielequatrana cf315660e1 Updated tests 2 years ago
Salvatore Filippone 5b95f1920c Regenerate configure, fix typo in hlg_vect_mv 2 years ago
gabrielequatrana 08984619dc Fixed SpMM for HLG 2 years ago
gabrielequatrana beb418e00b Fixed SpMM for ELG (AXPBY GPU not working) 2 years ago
gabrielequatrana c807d88c57 SpMM using Cusparse dedicated routine (CSRG) 2 years ago
Salvatore Filippone 399818a482 Never do arithmetic on a (void *) 2 years ago
Salvatore Filippone 1ecf36c9c3 Merge branch 'psblas-bgmres' of github.com:sfilippone/psblas3 into psblas-bgmres 2 years ago
Salvatore Filippone 7d74ebf5c4 Make multivectors work 2 years ago
gabrielequatrana 82715fec9b Fixed SpMM for CSRG 2 years ago
gabrielequatrana 409b51e609 Try to fix SpMM 2 years ago
gabrielequatrana ee140bc8dd Read/Write multivect fixed (SpMM bug) 2 years ago
gabrielequatrana a624b7098b Cuda multivect methods implementation 2 years ago
sfilippone 0760e4d553 Fix C function declarations for compilation with LLVM/clang in CUDA 2 years ago
sfilippone 3a25d7b04a Fixes for LLVM compilation 2 years ago
sfilippone e0a4d362fa Define flag TRACK_CUDA_MALLOC 2 years ago
Salvatore Filippone b5f1442ac8 Merge branch 'nond-rep' into repackage 2 years ago
sfilippone 48455190ec Add GPU version of XYZW 2 years ago
sfilippone a11f328e62 Added CUDA version of XYZW 2 years ago
sfilippone b5d5f97661 Improve cuda%zero() 2 years ago
sfilippone 0e269ed641 typo in Cabgdxyz 2 years ago
Salvatore Filippone d95077ffd6 Fix typo in vectordev_mod 2 years ago
Salvatore Filippone 2d3773df98 CUDA kernels for ABGDXYZ 2 years ago
Salvatore Filippone 2a75d677d0 ABGDXYZ in vectordev_mod 2 years ago
sfilippone 2391f64df6 X_cuda_vect%abgdxyz 2 years ago
sfilippone 93c71c4316 Fix %ZERO() on cuda 2 years ago
sfilippone 0568a83734 Fix ifdef and old code 2 years ago
Salvatore Filippone 35d68aa4e3 Reuse calls to getDeviceProperties done at init time 2 years ago
Salvatore Filippone 1ba8dfc7b7 Switch FOR and IF in AXPBY 2 years ago
Salvatore Filippone f9677bc892 Enabled new CUDA version of ABGDXYZ 2 years ago
Salvatore Filippone 4681767ef8 New implementation for ABGDXYZ in CUDA 2 years ago
Salvatore Filippone 105aa3c570 Intermediate impl of ABGDXYZ 2 years ago
Salvatore Filippone 864872ecac Intermediate implementation of abgdxyz on cuda 2 years ago
Salvatore Filippone a41b209144 Better AXPBY implementation in CUDA. 2 years ago
Salvatore Filippone ebc7c6b3b4 Fix call to base%abgdxyz 2 years ago
Salvatore Filippone 14c4ff0f32 Added new methd for two combined axpbys 2 years ago