I'm not very familiar with Clojure, but are the numerics done with unboxed numbers? In the end, what every language that wants to do scientific programming right needs to do is expose some sort of C-array-like data structure that you can pass on to C/Fortran code.
Do not worry. Neanderthal works with bare primitives, and uses Intel MKL, cuBLAS, CLBlast, and custom kernels at the low level. This is practically as fast as you can get in a general library.
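For a rough idea of what that looks like in practice, here is a minimal REPL sketch using what I believe are Neanderthal's native, MKL-backed constructors (`dge`, `dv`) and core operations (`mv`, `dot`); treat the exact names and namespaces as approximate rather than authoritative:

```clojure
(require '[uncomplicate.neanderthal.core :refer [mv dot]]
         '[uncomplicate.neanderthal.native :refer [dge dv]])

;; A column-major 2x3 matrix of primitive doubles and a primitive
;; double vector -- no boxed java.lang.Double anywhere, and (as I
;; understand it) the data lives in off-heap memory the native BLAS
;; can read directly.
(def a (dge 2 3 [1 2 3 4 5 6]))
(def x (dv [1 2 3]))

(mv a x)    ;; matrix-vector product, dispatched to the native backend
(dot x x)   ;; returns a primitive double
```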
How do you use MKL and friends on objects in the Java heap? It seems like you'd have to copy to the native heap, do your linear algebra operation, and then copy the result back from the native heap.
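For what it's worth, the usual way around that on the JVM is to keep the numbers off the garbage-collected heap entirely. A minimal sketch of the idea (not a claim about Neanderthal's actual internals), using plain `java.nio` direct buffers:

```clojure
(import '[java.nio ByteBuffer ByteOrder])

;; Allocate space for n doubles outside the Java heap. Native code can
;; read and write this memory in place; nothing is copied back and forth.
(def n 4)
(def buf (-> (ByteBuffer/allocateDirect (* 8 n))   ; 8 bytes per double
             (.order (ByteOrder/nativeOrder))
             (.asDoubleBuffer)))

;; Fill it with 1.0 .. 4.0 from Clojure.
(dotimes [i n]
  (.put buf i (double (inc i))))
```

A JNI wrapper can then obtain the native address of the buffer with `GetDirectBufferAddress` and hand it straight to MKL; only a pointer crosses the boundary, not the data.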
I do not understand what the form of that matrix has to do with the Java or native heap. That seems to me like a completely orthogonal issue. As for the triangular form, it is supported in Neanderthal, as are the rest of the special structured and sparse shapes.
> I do not understand what the form of that matrix has to do with the Java or native heap. That seems to me like a completely orthogonal issue. As for the triangular form, it is supported in Neanderthal, as are the rest of the special structured and sparse shapes.
You do support triangular matrices. However, a linear system of the form I gave can be solved in linear time, whereas forming the corresponding triangular matrix and doing a triangular solve takes quadratic space and time.
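To make the complexity point concrete (I'm assuming a shape here, since the original system isn't shown): take a lower bidiagonal system, described by its diagonal `d` and subdiagonal `e`. Forward substitution solves it in O(n) time and O(n) space, without ever building an n×n triangular matrix:

```clojure
;; Solve L x = b where L is lower bidiagonal, given only its diagonal d
;; (length n) and subdiagonal e (length n-1), all plain Clojure vectors.
;; Each x_i depends only on x_{i-1}: x_i = (b_i - e_{i-1} * x_{i-1}) / d_i.
(defn bidiagonal-solve [d e b]
  (reduce (fn [xs i]
            (let [coupling (if (pos? i) (* (e (dec i)) (peek xs)) 0.0)]
              (conj xs (/ (- (b i) coupling) (d i)))))
          []
          (range (count d))))

;; (bidiagonal-solve [2.0 2.0 2.0] [1.0 1.0] [2.0 4.0 6.0])
;; => [1.0 1.5 2.25]
```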
Not all special matrices are supported by BLAS/LAPACK. Other common examples are block Toeplitz/Hankel matrices, for which fast multiplication and fast solvers are available. To support operations on special matrices (ones not covered by BLAS/LAPACK), you'd want no-extra-copying access to the underlying vectors from Java or Clojure, so that you can write the right algorithm by hand, as you would in C or Fortran.
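As a sketch of the kind of hand-written kernel I mean (illustrative only, not tied to any library's API): a Toeplitz matrix is fully described by its first column and first row, so you can apply it to a vector from O(n) storage, working directly on primitive double arrays. This naive version is still O(n²) time; the genuinely fast variants embed the matrix in a circulant and use an FFT, but they need the same kind of raw, copy-free access to the data.

```clojure
;; y = T x for a Toeplitz T given by its first column c and first row r
;; (with (aget c 0) == (aget r 0)).
;; T[i][j] = c[i-j] when i >= j, and r[j-i] when j > i.
;; Works directly on primitive double arrays; no n x n matrix is built.
(defn toeplitz-mv [^doubles c ^doubles r ^doubles x]
  (let [n (alength c)
        y (double-array n)]
    (dotimes [i n]
      (loop [j 0 acc 0.0]
        (if (< j n)
          (recur (inc j)
                 (+ acc (* (aget x j)
                           (if (>= i j)
                             (aget c (- i j))
                             (aget r (- j i))))))
          (aset y i acc))))
    y))

;; Example: first column [1 2 3], first row [1 4 5], x = [1 1 1]:
;; (vec (toeplitz-mv (double-array [1 2 3]) (double-array [1 4 5])
;;                   (double-array [1 1 1])))
;; => [10.0 7.0 6.0]
```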
Sure! If you need to implement that efficiently yourself, you can use OpenCL to write the kernels on the CPU, or OpenCL or CUDA on the GPU (if that makes sense). Check out ClojureCUDA and ClojureCL.