Indeed, if you want an immediate error on every out-of-bounds read, this won't b...

jart · 2024-07-30T18:43:46 1722365026

What's wrong with assembly? What's wrong with aligning a pointer and turning the sanitizer off if need be? If you're making machine specific assumptions then you should be programming against the machine rather than the language.

dzaima · 2024-07-30T20:59:51 1722373191

Assembly is a good fallback, but it requires individual programmers to special-case individual architectures (so each likely ends up with an incomplete list, and a definitely incomplete one as soon as it stops being updated) instead of having a standard thing that everyone can rely on, that the compiler and sanitizers can actually understand, and that automatically extends to more architectures.

Not using sanitizers is, of course, an option, but obviously far from optimal. Having a single "questionable" operation in one place does not mean that the programmer intends for the entire function or program to be "programmed against the machine"; the rest of the code, and, to an extent, even the fancy operation in question, could still very much benefit from regular language tooling.

Such operations needn't even be machine-specific.

"try loading N bytes, accepting garbage out-of-bounds; return None if that cannot be done without potentially faulting" says and requires nothing about the architecture, and is implementable everywhere (even if only by always returning None).

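To make the "try load" idea concrete, here is a rough sketch of what a user-level version might look like today, assuming memory protection is page-granular and that 4096 divides the real page size. The names `span_stays_in_page`, `try_load16`, and `ASSUMED_PAGE_SIZE` are made up for illustration. The overreading `memcpy` is still undefined behavior in ISO C and would be flagged by a sanitizer, which is exactly the gap a proper builtin would close.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: 4096 divides the real page size; real code would query it. */
#define ASSUMED_PAGE_SIZE 4096u

/* Hypothetical helper: true if the n bytes starting at p all lie in the
   same (assumed) page as p, so reading them cannot fault if *p is readable. */
bool span_stays_in_page(const void *p, size_t n) {
    uintptr_t offset = (uintptr_t)p & (ASSUMED_PAGE_SIZE - 1);
    return n <= ASSUMED_PAGE_SIZE - offset;
}

/* Hypothetical "try load": copies 16 bytes from p into out and returns true,
   or returns false (the "None" case) when the read could cross into a page
   that may be unmapped. Bytes past the end of the object are garbage. */
bool try_load16(const void *p, unsigned char out[16]) {
    if (!span_stays_in_page(p, 16)) {
        return false;
    }
    /* May read past the object: UB in ISO C, and ASan would report it. */
    memcpy(out, p, 16);
    return true;
}
```

A portable fallback is simply `return false;`, which is what "always returning None" amounts to.
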
Aligning pointers is another option, but is fundamentally in no way different from the memory-protection-boundary-based version in how much it relies on hardware specifics, and compiler/language builtins could still be made that allow for sanitizer-friendly usage. It might be more or less efficient depending on the use case; both are useful to have.
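
For comparison, a rough sketch of the pointer-aligning variant, assuming an aligned 8-byte word never straddles a protection boundary. The helper names `align_down8` and `load_aligned_word` are hypothetical. Like the page-based version, it can read bytes outside the object (here, before it as well as after), so it is equally UB in ISO C and equally in need of sanitizer suppression.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: rounds p down to an 8-byte boundary. */
const unsigned char *align_down8(const void *p) {
    return (const unsigned char *)((uintptr_t)p & ~(uintptr_t)7);
}

/* Hypothetical helper: loads the aligned 8-byte word that contains *p.
   *skip receives how many leading bytes of the word precede p; those bytes,
   and any past the end of the object, are garbage the caller must mask. */
uint64_t load_aligned_word(const void *p, size_t *skip) {
    uint64_t word;
    *skip = (size_t)((uintptr_t)p & 7);
    /* May touch bytes outside the object: UB in ISO C, flagged by ASan. */
    memcpy(&word, align_down8(p), sizeof word);
    return word;
}
```

The caller pays for the masking and for possibly needing one extra load, which is why neither variant is strictly better.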

Of course, the best option would be for malloc, the linker, etc. to work together to guarantee at least N bytes of addressable memory past all user-accessible pointers, at which point architecture specifics completely stop mattering. This needn't change any behavior around sanitizers or regular loads; all it'd mean is that the "load N bytes with trailing garbage" operation can always succeed. Sanitizers could error on said op reading outside of the guaranteed readability size, and regular loads would of course continue erroring on any out-of-bounds read. Compilers could even use this guarantee themselves to emit unmasked loads for loop tails.
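
A user-level approximation of that guarantee, as a minimal sketch: a wrapper that over-allocates by a fixed slack so a 16-byte load starting at any in-bounds byte stays inside the allocation. `padded_malloc`, `load16_with_tail`, and `TAIL_PAD` are hypothetical names, and this only covers heap memory obtained through the wrapper, not stack or static data, which is why it would need malloc and linker cooperation to work in general.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical guaranteed readable slack past every allocation. */
#define TAIL_PAD 16

/* Allocates n usable bytes plus TAIL_PAD readable bytes after them.
   The slack is zeroed only to make the demo deterministic; the scheme
   itself treats it as garbage. Release with free(). */
void *padded_malloc(size_t n) {
    if (n > SIZE_MAX - TAIL_PAD) {
        return NULL; /* size computation would overflow */
    }
    unsigned char *p = malloc(n + TAIL_PAD);
    if (p != NULL) {
        memset(p + n, 0, TAIL_PAD);
    }
    return p;
}

/* "Load 16 bytes with trailing garbage": fully defined as long as p points
   at a usable byte of a padded_malloc allocation. */
void load16_with_tail(const unsigned char *p, unsigned char out[16]) {
    memcpy(out, p, 16);
}
```

Here the overread never leaves the allocation, so there is no UB and nothing for a sanitizer to complain about; the cost is 16 extra bytes per allocation.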