Hacker News | Dobiasd's comments

Are there any benchmarks on the performance of these new matrix multiplication kernels compared to the Eigen library (ideally for float32)?


While I did not succeed in making the matmul code from https://github.com/Mozilla-Ocho/llamafile/blob/main/llamafil... work in isolation, I compared Eigen, OpenBLAS, and MKL: https://gist.github.com/Dobiasd/e664c681c4a7933ef5d2df7caa87...

In this (very primitive!) benchmark, MKL was a bit better than Eigen (~10%) on my machine (i5-6600).

Since the article https://justine.lol/matmul/ compared the new kernels with MKL, we can (by transitivity) compare the new kernels with Eigen this way, at least very roughly for this one use case.
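
For readers who want to see what such a "very primitive" comparison boils down to, here is a minimal sketch in Python. It is not the code from the linked gist: the naive `matmul` is only a stand-in workload, and the names `matmul`, `best_time`, and `relative_gain` are hypothetical. It also assumes that "~10% better" means the relative difference between the best measured run times, which may not be how the figure above was computed.

```python
import time


def matmul(a, b):
    """Naive matrix product of two matrices given as lists of rows (stand-in workload)."""
    cols_b = list(zip(*b))  # columns of b, so each output cell is a row-column dot product
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def best_time(fn, repeats=5):
    """Return the fastest wall-clock time (in seconds) over `repeats` calls of fn()."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def relative_gain(t_baseline, t_candidate):
    """Fraction of the baseline time saved by the candidate (0.1 means 10% faster)."""
    return (t_baseline - t_candidate) / t_baseline
```

Taking the best of several runs reduces noise from other processes. To compare two libraries, one would time the same matrix sizes with each and pass both times to `relative_gain`.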


Here's a complete working example for POSIX systems on how to reproduce my llamafile tinyBLAS vs. MKL benchmarks: https://gist.github.com/jart/640231a627dfbd02fb03e23e8b01e59... This new generalized kernel does even better than what's described in the blog post. It works well on oddly shaped matrices. It does, however, need a good malloc function, which I've included in the gist, since having a good memory allocator is what makes the simple implementation possible.


Thanks for the feedback! I just extended the comment to touch on the "why" part. :) https://github.com/Dobiasd/articles/commit/53f360259ad8e64ca...


Don't leave me hanging, too specific about what?



That is fantastic. Much appreciated on the quick response. You're hired. =)


I'm not sure if you're actually suggesting that the author expound further on the concept of pseudocode, or if this is just humor.


Mostly humor, but also a suggestion. Why? Because if it was real code I'd want the 'why' to be as specific as possible. It doesn't have to be a novel, but it also shouldn't be left open for interpretation.

Too many times have I gotten some ancient piece of code and cursed the author (sometimes this is even myself) because it was poorly documented, had a bug, or whatnot. "OMG, what were they thinking when they wrote this code?!" I never want to be in the position where someone is cursing me.


Thanks for the feedback! My code snippets in the article don't use any real/existing language. C#, for example, is quite explicit about the transformation of generators to state machines, but also does not provide such methods, as far as I know. I've just added a comment explaining this choice: https://github.com/Dobiasd/articles/commit/f44b897f2a4d20aa9...
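
To illustrate the kind of transformation to a state machine mentioned above, here is a small hand-written sketch in Python. It is neither the pseudocode from the article nor what the C# compiler actually emits. The names `countdown` and `CountdownStateMachine` are made up for this example. The point is only that the local variable becomes a field and the position of the `yield` becomes an explicit state.

```python
def countdown(n):
    """Generator version: yields n, n-1, ..., 1."""
    while n > 0:
        yield n
        n -= 1


class CountdownStateMachine:
    """Hand-written equivalent of countdown() as an explicit state machine."""

    START, SUSPENDED, DONE = range(3)

    def __init__(self, n):
        self.n = n               # the generator's local variable, now a field
        self.state = self.START  # remembers where execution should resume

    def __iter__(self):
        return self

    def __next__(self):
        if self.state == self.DONE:
            raise StopIteration
        if self.state == self.SUSPENDED:
            self.n -= 1          # resume: run the code that follows the yield
        if self.n > 0:
            self.state = self.SUSPENDED
            return self.n        # corresponds to "yield n"
        self.state = self.DONE   # the loop condition failed, so the generator has finished
        raise StopIteration
```

Both versions can be used in a `for` loop and produce the same sequence.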


Thanks! Yeah, that was one of the intentions. :)


The interview for my current job started out mediocre, but by talking about frugally-deep (a side project of mine) I was able to excite my (now) employer. :-)

https://github.com/Dobiasd/frugally-deep


Hoogle is really amazing!

Inspired by it, I implemented something similar for FunctionalPlus (a functional-programming library for C++): https://www.editgym.com/fplus-api-search/

I'd love to see more projects taking this path too. :)


> I spent some time over those 30 years looking at new IDEs, trying them out, configuring them, and each and every time this was time wasted, because the IDE was discontinued, abandoned, or otherwise became useless.

"wasted" sounds a bit too harsh to me. If you learn and use some tool productively for some years, even if this stops, the years were not wasted. Ephemeral things can also have value.


Yeah, this one does something much less insane, i.e., it converts the paths to the tree outputs into their corresponding DNF (disjunctive normal form) and represents each term as a node (side by side in the same layer) in the NN, as described by Arunava Banerjee in "Initializing Neural Networks using Decision Trees" [1]. The resulting NN architecture is much more reasonable than the one that treebomination produces.

[1]: https://www.cise.ufl.edu/~arunava/papers/clnl94.pdf
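
A rough sketch of that idea in Python, under several assumptions: a binary classification tree encoded as nested tuples `(feature_index, threshold, left, right)` with integer leaves 0 or 1, and hard step activations instead of trainable ones. This is not the exact construction from [1] and not the code of either project. All function names here are hypothetical.

```python
def paths_to_dnf(tree, path=()):
    """Return one conjunction per root-to-leaf path that ends in a leaf labeled 1.

    Each conjunction is a tuple of literals (feature_index, threshold, goes_right),
    where goes_right=True stands for x[feature_index] > threshold.
    """
    if not isinstance(tree, tuple):  # leaf: class label 0 or 1
        return [path] if tree == 1 else []
    feature, threshold, left, right = tree
    return (paths_to_dnf(left, path + ((feature, threshold, False),))
            + paths_to_dnf(right, path + ((feature, threshold, True),)))


def step(z):
    """Hard threshold activation."""
    return 1 if z > 0 else 0


def dnf_network(terms):
    """Build a predictor with one AND unit per DNF term and a single OR output unit."""
    def predict(x):
        hidden = []
        for term in terms:
            # Each literal contributes 1 if the input satisfies it, else 0.
            satisfied = sum(
                step(x[f] - t) if goes_right else 1 - step(x[f] - t)
                for f, t, goes_right in term
            )
            # AND unit: fires only if every literal of the term is satisfied.
            hidden.append(step(satisfied - len(term) + 0.5))
        # OR unit: fires if at least one term fired.
        return step(sum(hidden) - 0.5)
    return predict


def tree_predict(tree, x):
    """Reference prediction by walking the tree directly."""
    while isinstance(tree, tuple):
        feature, threshold, left, right = tree
        tree = right if x[feature] > threshold else left
    return tree
```

The AND units sit side by side in one hidden layer, one per term, so the width of that layer equals the number of positive leaves rather than mirroring the depth of the tree.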


Thanks! Yes, in contrast to treebomination, using TF-DF can actually make sense. ;)


Thanks! This looks interesting. Some of the main differences I can spot so far are:
- Hummingbird does not construct a NN with an architecture isomorphic to the source decision tree but instead cleverly compiles it into other (more sane) tensor computations.
- Hummingbird is actually useful. ;)

