> usually if you want to do that sort of thing quickly then you'd use ndarray which can use BLAS
In C++ I’d usually use Eigen, because these expression templates are saving memory allocations and bandwidth storing/reloading temporary matrices. Sometimes much faster than BLAS libraries with C API. I’m not sure Rust has an equivalent.
> indices into arrays instead, which can be folded into the addressing mode on most architectures
For some applications of graphs and trees it’s useful to have nodes polymorphic. An example is a visual tree in GUI: different nodes are instances of different classes. Array elements are of the same type.
On AMD64 that’s only true when the size of the elements being addressed is 1/2/4/8 bytes, the SIB bytes only have 2 bits for the scale. For any other use case, addressing these arrays requires to multiply (or if you’re lucky at least left shift) these integers
Even when the elements are 8 bytes so the indexing can be merged, you need to either spend a register for the base address, or load it from memory with another instruction.
It’s relatively expensive to split or merge linked lists/trees/graphs stored that way. If the tree/graph is long lived, mutable, and changes a lot, eventually you might need to compact or even garbage collect these arrays.
> In C++ I’d usually use Eigen, because these expression templates are saving memory allocations and bandwidth storing/reloading temporary matrices. Sometimes much faster than BLAS libraries with C API. I’m not sure Rust has an equivalent.
That equivalent would be ndarray.
> For some applications of graphs and trees it’s useful to have nodes polymorphic. An example is a visual tree in GUI: different nodes are instances of different classes. Array elements are of the same type.
And in that case you can use Box (or Rc/Arc).
> Even when the elements are 8 bytes so the indexing can be merged, you need to either spend a register for the base address, or load it from memory with another instruction.
I've never seen this be a performance problem in practice; the cost of doing a shift and add is incredibly low compared to the cost of actually fetching the memory.
> It’s relatively expensive to split or merge linked lists/trees/graphs stored that way. If the tree/graph is long lived, mutable, and changes a lot, eventually you might need to compact or even garbage collect these arrays.
Which is the same thing modern thread-caching mallocs also have to do, except that compacting and garbage collecting is actually possible with the arena approach (not that I think it's terribly important either way).
It seems ndarray is conceptually similar to C libraries, it doesn’t have expression templates.
When you write r = a * b * c with ndarray, you allocate, store and then load a temporary array with a * b. When you write r = a * b * c with Eigen, depending on the types it sometimes skips the temporary, and instead computes the complete expression in one shot. For some use cases, the tactic causes substantial performance win.
> Box (or Rc/Arc)
An array of boxes will cause another pointer chase: first to load the pointer, another one to reach the payload.
I wonder have they fixed the performance? https://www.reddit.com/r/rust/comments/bto10h/update_a_scali...
> usually if you want to do that sort of thing quickly then you'd use ndarray which can use BLAS
In C++ I’d usually use Eigen, because these expression templates are saving memory allocations and bandwidth storing/reloading temporary matrices. Sometimes much faster than BLAS libraries with C API. I’m not sure Rust has an equivalent.
> indices into arrays instead, which can be folded into the addressing mode on most architectures
For some applications of graphs and trees it’s useful to have nodes polymorphic. An example is a visual tree in GUI: different nodes are instances of different classes. Array elements are of the same type.
On AMD64 that’s only true when the size of the elements being addressed is 1/2/4/8 bytes, the SIB bytes only have 2 bits for the scale. For any other use case, addressing these arrays requires to multiply (or if you’re lucky at least left shift) these integers
Even when the elements are 8 bytes so the indexing can be merged, you need to either spend a register for the base address, or load it from memory with another instruction.
It’s relatively expensive to split or merge linked lists/trees/graphs stored that way. If the tree/graph is long lived, mutable, and changes a lot, eventually you might need to compact or even garbage collect these arrays.