I could see myself using some of the source code in the classroom to explain
how transformers "really" work; code is more concrete/detailed than all those
pictures of attention heads etc.
Two points of minor criticism/suggestions for improvement:
- Libraries should not print to stdout, as that output may destroy application output (imagine I want to use the library in a text editor to offer style checking). So it would be best to write to a string buffer owned by a logger instance associated with an lm.rs object.
- Is it possible to do all this without "unsafe", and without twisting one's arm? I see there are uses of "unsafe", e.g. to force data alignment in the model reader.
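To make both suggestions concrete, here is a rough sketch of what I have in mind. None of these names come from lm.rs: `LogBuffer`, `Model`, and `from_bytes` are made up for illustration, and I am assuming the weights are stored as plain little-endian f32 values, which may not match the real file format.

```rust
// Hypothetical sketch: these types are NOT taken from lm.rs.

/// Collects diagnostic messages in memory instead of printing to stdout.
#[derive(Default)]
pub struct LogBuffer {
    buf: String,
}

impl LogBuffer {
    /// Appends one message, terminated by a newline.
    pub fn log(&mut self, msg: &str) {
        self.buf.push_str(msg);
        self.buf.push('\n');
    }

    /// Lets the host application decide what to do with the messages.
    pub fn contents(&self) -> &str {
        &self.buf
    }
}

/// Stand-in for a library object that owns its own log.
pub struct Model {
    pub weights: Vec<f32>,
    pub log: LogBuffer,
}

impl Model {
    /// Decodes little-endian f32 weights without `unsafe`.
    /// Assumption: the input is a flat array of f32 values.
    /// This copies the data, so it gives up zero-copy loading.
    pub fn from_bytes(bytes: &[u8]) -> Result<Model, String> {
        if bytes.len() % 4 != 0 {
            return Err(format!("length {} is not a multiple of 4", bytes.len()));
        }
        let weights: Vec<f32> = bytes
            .chunks_exact(4)
            .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
            .collect();

        // Record progress in the buffer rather than calling println!.
        let mut log = LogBuffer::default();
        log.log(&format!("loaded {} weights", weights.len()));
        Ok(Model { weights, log })
    }
}
```

The trade-off is that the safe decoding copies every value, so for large memory-mapped models the "unsafe" cast may well be the pragmatic choice. The point is only that the safe version is not much extra code.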
In fairness, it's already not really “zero dependency”, since it uses rayon (for easy multithreading) and wide (for easy SIMD). Using log would make total sense, I think (I'm not the main author, just a contributor).
Again, thanks and very impressive!