
While I did not succeed in making the matmul code from https://github.com/Mozilla-Ocho/llamafile/blob/main/llamafil... work in isolation, I compared Eigen, OpenBLAS, and MKL: https://gist.github.com/Dobiasd/e664c681c4a7933ef5d2df7caa87...

In this (very primitive!) benchmark, MKL was a bit faster than Eigen (~10%) on my machine (i5-6600).

Since the article https://justine.lol/matmul/ compared the new kernels with MKL, we can (by transitivity) roughly compare the new kernels with Eigen this way, at least for this one use case.




Here's a complete working example for POSIX systems showing how to reproduce my llamafile tinyBLAS vs. MKL benchmarks: https://gist.github.com/jart/640231a627dfbd02fb03e23e8b01e59... This new generalized kernel does even better than what's described in the blog post, and it works well on oddly shaped matrices. It does, however, need a good malloc function, which I've included in the gist, since having a good memory allocator is what makes the simple implementation possible.



