- Create the list in some weird order. You know you're going to traverse it a bunch, so you sort it (see the sketch after this list).
- The list is in creation order, and creation is in allocation order. Once in a while you go back and insert or delete a small handful of nodes, hence the linked list.
- There is a natural sort order that you can make contiguous, but you support relatively rare alternative orderings and optimize for the natural order.
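To make the first case concrete, here's a minimal C++ sketch (names and structure are mine, just to illustrate the layout): all the nodes live in one contiguous allocation, and after sorting the nodes in place, nearly every `next` pointer is just `this + 1`, which is what makes the traversal predictable.

```cpp
#include <algorithm>
#include <cstdio>
#include <cstdlib>
#include <vector>

struct Node {
    int   value;
    Node* next;
};

int main() {
    // All nodes in one contiguous allocation: the precondition that
    // makes "next is usually this + 1" possible at all.
    std::vector<Node> nodes(8);
    for (auto& n : nodes) n.value = std::rand() % 100;

    // Sort the nodes themselves (not just the links), then relink them
    // sequentially: list order now matches memory order.
    std::sort(nodes.begin(), nodes.end(),
              [](const Node& a, const Node& b) { return a.value < b.value; });
    for (std::size_t i = 0; i + 1 < nodes.size(); ++i)
        nodes[i].next = &nodes[i + 1];
    nodes.back().next = nullptr;

    // Traversal: each next pointer is the physically adjacent node, so
    // the address of the next load is trivially predictable.
    for (Node* p = nodes.data(); p != nullptr; p = p->next)
        std::printf("%d ", p->value);
    std::printf("\n");
}
```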
Then again, I pretty much agree with you. I think it's a clever trick, but I can't come up with a time when I would use it. That's largely because if the memory is contiguous most of the time, you should probably just be using a vector instead. You can insert/remove by shifting stuff around to handle the rare cases that would otherwise call for the linked list. If performance cliffs are a problem, you can mitigate with a segmented array.
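The "shift stuff around" part is just what std::vector's insert/erase already do; a tiny sketch of the rare-op cost I mean:

```cpp
#include <cstdio>
#include <vector>

int main() {
    std::vector<int> v = {1, 2, 4, 5};

    // Rare middle insert: elements after the position shift right by one.
    v.insert(v.begin() + 2, 3);   // v is now {1, 2, 3, 4, 5}

    // Rare erase: elements after the position shift left by one.
    v.erase(v.begin());           // v is now {2, 3, 4, 5}

    for (int x : v) std::printf("%d ", x);
    std::printf("\n");            // both ops are O(n), but fine if they're rare
}
```

And for the segmented-array option, std::deque is the off-the-shelf version: it's typically implemented as fixed-size blocks behind an index, so it avoids the big reallocation cliff at the cost of one extra indirection.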
I do think there are examples out there for this value prediction trick that could make sense, which is why I was a bit frustrated by the example they chose.
Someone mentioned sparse vectors, for example, where you are guessing 0 rather than pointer++. Any time a value is somewhat predictable, this trick could come in handy.
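The comment is about hardware value prediction, but the closest everyday software analog I can offer is branch prediction on the common zero case; a rough sketch of my own (not from the thread):

```cpp
#include <cstddef>
#include <vector>

// Dot product where lhs is mostly zeros. The branch is almost never
// taken, so the branch predictor effectively "guesses 0" for each
// element: a software-visible cousin of the value-prediction idea.
double sparse_dot(const std::vector<double>& lhs,
                  const std::vector<double>& rhs) {
    double sum = 0.0;
    for (std::size_t i = 0; i < lhs.size(); ++i) {
        if (lhs[i] != 0.0)            // rare case: actually do the work
            sum += lhs[i] * rhs[i];
    }
    return sum;
}

int main() {
    std::vector<double> a(1000, 0.0), b(1000, 1.0);
    a[3] = 2.0; a[750] = 4.0;         // only a couple of nonzeros
    return sparse_dot(a, b) == 6.0 ? 0 : 1;
}
```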
Yeah, this layout of a linked list within one big allocation seems like a niche thing at best.
* I tend to default to a simple vector because it's denser, more CPU cache friendly, less pointer chasy.
* If I really wanted a linked list to avoid occasional expensive ops, I probably would want to be able to grow it without reallocation (much less touching up all the old pointers), so they might all be in separate allocations, and most general-purpose memory allocators won't give me the nice sequential-ness.
* And then I'd probably end up with an unrolled linked list (probably the same thing you're describing as a "segmented array"?), which would reduce the pointer chasing and also make better use of memory bandwidth through better density, so it'd outperform this trick. (There's a sketch after this list.)
* I guess if I really wanted to be able to reorder stuff within a vector cheaply but was still comfortable with the amortized constant time for growth, I might represent the links as indices within the vector (or in a parallel vector). It'd be useful then. Maybe something like a linked hash map too. But if those ops are common, the 100% accurate prediction of this benchmark is way too optimistic. You could sort it sometimes, as you mentioned, but that seems even more niche and fussy.
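Here's a minimal sketch of the unrolled list I mean (all names are mine): each node carries a small array of elements, so you chase one pointer per chunk instead of one per element, and within a chunk the accesses are plain sequential array reads.

```cpp
#include <cstdio>
#include <memory>

// Unrolled linked list sketch: each Chunk holds up to kCapacity
// elements, so traversal dereferences one pointer per 16 elements.
struct Chunk {
    static constexpr int kCapacity = 16;
    int values[kCapacity];
    int count = 0;
    std::unique_ptr<Chunk> next;
};

struct UnrolledList {
    std::unique_ptr<Chunk> head;
    Chunk* tail = nullptr;

    void push_back(int v) {
        if (!tail || tail->count == Chunk::kCapacity) {
            // Current chunk is full (or the list is empty): grow by one
            // chunk without reallocating or touching existing nodes.
            auto c = std::make_unique<Chunk>();
            Chunk* raw = c.get();
            if (tail) tail->next = std::move(c);
            else      head = std::move(c);
            tail = raw;
        }
        tail->values[tail->count++] = v;
    }
};

int main() {
    UnrolledList list;
    for (int i = 0; i < 40; ++i) list.push_back(i);

    // One pointer chase per chunk; dense sequential reads inside it.
    for (Chunk* c = list.head.get(); c; c = c->next.get())
        for (int i = 0; i < c->count; ++i) std::printf("%d ", c->values[i]);
    std::printf("\n");
}
```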
There might be other cases of this trick that I'm more likely to use than this data structure, but I'm scratching my head to think of them.