Dobiasd · on April 3, 2024

Are there any benchmarks on the performance of these new matrix multiplication kernels compared to the Eigen library (ideally for float32)?

Dobiasd · on April 4, 2024

While I did not succeed in getting the matmul code from https://github.com/Mozilla-Ocho/llamafile/blob/main/llamafil... to work in isolation, I compared Eigen, OpenBLAS, and MKL: https://gist.github.com/Dobiasd/e664c681c4a7933ef5d2df7caa87...

In this (very primitive!) benchmark, MKL was a bit faster than Eigen (~10%) on my machine (i5-6600).

Since the article https://justine.lol/matmul/ compared the new kernels with MKL, we can (by transitivity) compare the new kernels with Eigen this way, at least very roughly for this one use case.
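
For readers who want to see what such a primitive benchmark boils down to, here is a minimal sketch in plain Python. It is not the code from the gist above; the function names (`matmul`, `gflops`, `best_time`) are made up for illustration. The only fixed fact it relies on is that multiplying an m×k matrix by a k×n matrix costs roughly 2·m·n·k floating-point operations.

```python
import time


def matmul(a, b):
    """Naive triple-loop product of two row-major nested lists."""
    m, k, n = len(a), len(b), len(b[0])
    c = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for p in range(k):
            a_ip = a[i][p]
            for j in range(n):
                c[i][j] += a_ip * b[p][j]
    return c


def gflops(m, n, k, seconds):
    """Throughput of an (m x k) @ (k x n) product: one multiply and one add per term."""
    return 2.0 * m * n * k / seconds / 1e9


def best_time(fn, repeats=5):
    """Smallest wall-clock time over several runs, to reduce noise."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    size = 64  # hypothetical size, kept small because pure Python is slow
    a = [[1.0] * size for _ in range(size)]
    b = [[2.0] * size for _ in range(size)]
    seconds = best_time(lambda: matmul(a, b))
    print(f"{gflops(size, size, size, seconds):.4f} GFLOP/s")
```

Running the same harness against two libraries gives one throughput number per library, and the ratio of those numbers is what a statement like "~10% better" refers to.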

jart · on April 7, 2024

Here's a complete working example for POSIX systems on how to reproduce my llamafile tinyBLAS vs. MKL benchmarks: https://gist.github.com/jart/640231a627dfbd02fb03e23e8b01e59... This new generalized kernel does even better than what's described in the blog post. It works well on oddly shaped matrices. It does, however, need a good malloc function, which I've included in the gist, since having a good memory allocator is what makes the simple implementation possible.

Dobiasd · on March 20, 2024

Thanks for the feedback! I just extended the comment to touch on the "why" part. :) https://github.com/Dobiasd/articles/commit/53f360259ad8e64ca...

latchkey · on March 20, 2024

Don't leave me hanging, too specific about what?

Dobiasd · on March 20, 2024

https://github.com/Dobiasd/articles/commit/b5342dd5072480800...

Better? :)

latchkey · on March 20, 2024

That is fantastic. Much appreciated on the quick response. You're hired. =)

happytoexplain · on March 20, 2024

I'm not sure if you're actually suggesting that the author expound further on the concept of pseudocode, or if this is just humor.

latchkey · on March 20, 2024

Mostly humor, but also a suggestion. Why? Because if it was real code I'd want the 'why' to be as specific as possible. It doesn't have to be a novel, but it also shouldn't be left open for interpretation.

Too many times have I gotten some ancient piece of code and cursed the author (sometimes this is even myself) because it was poorly documented, had a bug, or whatnot. "OMG, what were they thinking when they wrote this code?!" I never want to be in the position where someone is cursing me.

Dobiasd · on March 20, 2024

Thanks for the feedback! The code snippets in my article don't use any real/existing language. C#, for example, is quite explicit about the transformation of generators into state machines, but it also does not provide such methods, as far as I know. I've just added a comment explaining this choice: https://github.com/Dobiasd/articles/commit/f44b897f2a4d20aa9...
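
To illustrate what "transforming a generator into a state machine" means, here is a small sketch in Python. It is neither the pseudocode from the article nor what the C# compiler actually emits; the class `CountdownStateMachine` and its state numbering are invented for this example.

```python
def countdown(n):
    """Generator version: the language keeps track of where to resume."""
    while n > 0:
        yield n
        n -= 1


class CountdownStateMachine:
    """Hand-written equivalent: the resume point is stored explicitly."""

    NOT_STARTED, SUSPENDED_AT_YIELD, FINISHED = 0, 1, 2

    def __init__(self, n):
        self.n = n
        self.state = self.NOT_STARTED

    def __iter__(self):
        return self

    def __next__(self):
        if self.state == self.SUSPENDED_AT_YIELD:
            self.n -= 1  # the statement that follows `yield` in the generator
        if self.state == self.FINISHED or self.n <= 0:
            self.state = self.FINISHED
            raise StopIteration
        self.state = self.SUSPENDED_AT_YIELD
        return self.n


# Both produce the same sequence.
assert list(countdown(3)) == list(CountdownStateMachine(3))
```

The local variable `n` becomes a field, and the position inside the loop becomes the `state` field, which is the general shape of such a rewrite.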

Dobiasd · on March 20, 2024

Thanks! Yeah, that was one of the intentions. :)

Dobiasd · on Dec 4, 2023

The interview for my current job started out mediocre, but by talking about frugally-deep (a side project of mine) I was able to excite my (now) employer. :-)

https://github.com/Dobiasd/frugally-deep

Dobiasd · on Aug 25, 2023

Hoogle is really amazing!

Inspired by it, I implemented something similar for FunctionalPlus (a functional-programming library for C++): https://www.editgym.com/fplus-api-search/

I'd love to see more projects taking this path too. :)

Dobiasd · on Aug 2, 2023

> I spent some time over those 30 years looking at new IDEs, trying them out, configuring them, and each and every time this was time wasted, because the IDE was discontinued, abandoned, or otherwise became useless.

"Wasted" sounds a bit too harsh to me. If you learn some tool and use it productively for some years, those years were not wasted, even if the tool goes away afterwards. Ephemeral things can also have value.

Dobiasd · on June 12, 2023

Yeah, this one does something much less insane, i.e., it converts the paths to the tree outputs into their corresponding DNF (disjunctive normal form) and represents each term as a node (side by side in the same layer) in the NN, as described by Arunava Banerjee in "Initializing Neural Networks using Decision Trees" [1]. The resulting NN architecture is much more reasonable than the one that treebomination produces.

[1]: https://www.cise.ufl.edu/~arunava/papers/clnl94.pdf
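
A rough sketch of the idea in Python, for a binary classification tree: every root-to-leaf path that ends in the positive class becomes one conjunction (term), each term becomes one AND unit in a hidden layer, and a single OR unit combines them. This is a simplification with hard step activations under my own assumptions, not the exact construction from the paper; all names (`Leaf`, `Split`, `tree_to_dnf`, `dnf_network`) are made up for illustration.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    label: int  # 0 or 1


@dataclass
class Split:
    feature: int
    threshold: float
    left: "Node"   # taken when x[feature] <= threshold
    right: "Node"  # taken otherwise


Node = Union[Leaf, Split]


def tree_to_dnf(node, path=()):
    """Return one term per path ending in label 1.

    A term is a list of literals (feature, threshold, goes_left).
    """
    if isinstance(node, Leaf):
        return [list(path)] if node.label == 1 else []
    go_left = (node.feature, node.threshold, True)
    go_right = (node.feature, node.threshold, False)
    return (tree_to_dnf(node.left, path + (go_left,))
            + tree_to_dnf(node.right, path + (go_right,)))


def step(z):
    return 1 if z > 0 else 0


def literal_value(literal, x):
    feature, threshold, goes_left = literal
    return 1 if (x[feature] <= threshold) == goes_left else 0


def dnf_network(terms, x):
    """Hidden layer: one AND unit per term. Output layer: one OR unit."""
    hidden = [
        # Unit weights, bias -(n - 0.5): fires only if all n literals hold.
        step(sum(literal_value(lit, x) for lit in term) - (len(term) - 0.5))
        for term in terms
    ]
    # Unit weights, bias -0.5: fires if at least one term fires.
    return step(sum(hidden) - 0.5)


def tree_predict(node, x):
    while isinstance(node, Split):
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.label


# Hypothetical toy tree: the network and the tree agree on every input.
tree = Split(0, 0.5, Leaf(0), Split(1, 2.0, Leaf(1), Leaf(0)))
terms = tree_to_dnf(tree)
for x in ([0.0, 1.0], [1.0, 1.0], [1.0, 3.0]):
    assert dnf_network(terms, x) == tree_predict(tree, x)
```

In a trainable version, the step functions would be replaced by smooth activations, so the network starts out mimicking the tree and can then be fine-tuned.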

Dobiasd · on June 12, 2023

Thanks! Yes, in contrast to treebomination, using TF-DF can actually make sense. ;)

Dobiasd · on June 12, 2023

Thanks! This looks interesting. Some of the main differences I can spot so far are:

- Hummingbird does not construct a NN with an architecture isomorphic to the source decision tree but instead cleverly compiles it into other (more sane) tensor computations.
- Hummingbird is actually useful. ;)