This technique is mostly a garbage collector, as I see it. Postponing memory destruction until the stack is empty is a special case of deferred reference counting [1], where sweep can only happen with an empty stack. If the "soft pointers" are implemented with reference counting, that's also a type of GC.
On the other hand, the tagged pointer implementation strategy for "soft pointers" isn't really garbage collection, but it does have much of the same overhead. Pointer reads must check the tag ID and throw, which is like a read barrier [2]. Writes through a pointer must do the same, similar to a write barrier [3]. And that's not getting into the overhead of multithreading; I see no reasonable way to implement this scheme in a multithreaded world. I expect that a fast GC without read barriers will significantly outperform this scheme. As much as everyone complains about the speed of GC, garbage collection is hard to beat!
> Pointer reads must check the tag ID and throw, which is like a read barrier
A "read barrier" is usually understood as a multithreading concept, and the OP has nothing to do with MT. In other words, no "read fence" is necessary, simply because the scheme lives in a perfect single-threaded world. From this POV, it is extremely difficult to beat this scheme with any popular multithreaded GC. As a side note, the proposed scheme DOES allow 'naked' pointers, so the relatively expensive conversion from 'soft' to 'naked' (~4 CPU cycles, which is not much to begin with) has to be done only _very_ occasionally; after the conversion, we're working with good old plain pointers, which just happen to be safe due to the way they're used.
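To make the cost being argued about concrete, here is a rough single-threaded sketch of a tag-checked 'soft' pointer. It is only an illustration, not the OP's actual implementation: `Pool`, `SoftPtr`, `make`, `lock`, and `destroy` are hypothetical names, and storing objects in a slot table with a per-slot tag is an assumption made for brevity. The point is that the tag comparison happens once, at the soft-to-naked conversion, and everything after that is an ordinary pointer access.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <stdexcept>
#include <vector>

// Hypothetical sketch: all names here are illustrative assumptions.
struct Slot {
    std::uint32_t tag = 0;  // bumped on every destroy; a real scheme must handle wraparound
    int value = 0;
};

struct SoftPtr {
    std::size_t index;   // which slot the pointer refers to
    std::uint32_t tag;   // tag the slot had when the pointer was created
};

class Pool {
public:
    SoftPtr make(int v) {
        std::size_t i;
        if (!free_.empty()) {
            i = free_.back();          // reuse a freed slot (its tag was already bumped)
            free_.pop_back();
        } else {
            i = slots_.size();
            slots_.emplace_back();     // deque growth at the end keeps element addresses stable
        }
        slots_[i].value = v;
        return SoftPtr{i, slots_[i].tag};
    }

    // Soft -> naked conversion: one compare, throws if the object is gone.
    int* lock(SoftPtr p) {
        Slot& s = slots_.at(p.index);
        if (s.tag != p.tag) throw std::runtime_error("stale soft pointer");
        return &s.value;
    }

    void destroy(SoftPtr p) {
        lock(p);                       // reject double-destroy through a stale pointer
        ++slots_[p.index].tag;         // invalidates every outstanding SoftPtr to this slot
        free_.push_back(p.index);
    }

private:
    std::deque<Slot> slots_;
    std::vector<std::size_t> free_;
};

// Usage sketch: returns true when the stale access is caught.
bool stale_access_is_caught() {
    Pool pool;
    SoftPtr a = pool.make(41);
    int* naked = pool.lock(a);         // one tag check...
    *naked += 1;                       // ...then plain pointer reads and writes
    assert(*pool.lock(a) == 42);
    pool.destroy(a);
    SoftPtr b = pool.make(7);          // reuses the slot under a new tag
    assert(b.index == a.index && b.tag != a.tag);
    try {
        pool.lock(a);                  // stale: tag no longer matches
    } catch (const std::runtime_error&) {
        return true;
    }
    return false;
}
```

Note that the sketch does nothing to stop a naked pointer from outliving its object; as said above, that safety has to come from how naked pointers are allowed to be used.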
[1]: http://www.memorymanagement.org/glossary/d.html#term-deferre...
[2]: http://www.memorymanagement.org/glossary/r.html#term-read-ba...
[3]: http://www.memorymanagement.org/glossary/w.html#term-write-b...