You don't need kernel assistance to make user-land RCU-like structures work well enough. I've done it twice. The price you pay for not having kernel assistance is that some threads sometimes have to do extra work: signal waiting writers, loop over a very short sequence (read the pointer, write it to a hazard pointer, read it again) until the value read is stable, or do a garbage collection. That price is not too heavy.
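To make the read / write-to-hazard-pointer / read-again loop concrete, here is a minimal sketch in C++. It is not the code from either of my implementations; every name in it (`Config`, `g_current`, `g_hazard`, `acquire`, `release`, `publish`, `collect`) is made up for illustration. It assumes one reader thread (so a single hazard slot), one writer thread (so the retired list needs no locking), and default sequentially consistent atomics.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical payload type, for illustration only.
struct Config { int value; };

std::atomic<Config*> g_current{nullptr};  // pointer that writers publish
std::atomic<Config*> g_hazard{nullptr};   // single hazard slot (one reader assumed)
std::vector<Config*> g_retired;           // touched only by the one writer (assumed)

// Reader side: read, write to the hazard pointer, read again,
// and repeat until the value read is stable.
Config* acquire() {
    Config* p = g_current.load();
    for (;;) {
        g_hazard.store(p);                // announce what we are about to use
        Config* q = g_current.load();     // re-read to check it did not change
        if (q == p) return p;             // stable: p is now protected
        p = q;                            // a writer got in between; retry
    }
}

// Reader side: drop the protection when done with the object.
void release() { g_hazard.store(nullptr); }

// Writer side: swap in a new object and retire the old one instead of freeing it.
void publish(Config* fresh) {
    assert(fresh != nullptr);
    Config* old = g_current.exchange(fresh);
    if (old != nullptr) g_retired.push_back(old);
}

// Writer side: the occasional garbage collection. Frees every retired object
// that the hazard slot does not protect; returns how many were freed.
std::size_t collect() {
    Config* guarded = g_hazard.load();
    std::vector<Config*> keep;
    std::size_t freed = 0;
    for (Config* p : g_retired) {
        if (p == guarded) {
            keep.push_back(p);            // still in use by the reader
        } else {
            delete p;
            ++freed;
        }
    }
    g_retired.swap(keep);
    return freed;
}
```

With more readers you would need one hazard slot per thread and `collect` would scan all of them. Signalling waiting writers is not shown here; in this sketch the writer never blocks, it just defers the free to a later `collect`.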