Would it be possible to write the check so that the null test overlaps with the rest of the instructions? If the test is assumed to pass anyway, you should only get a 1-instruction overhead, right?
1 instruction != 1 clock cycle. In particular, a mandatory check would utterly kill the usage of unique_ptr on things like small embedded processors or microcontrollers that either lack speculative execution entirely or do not have the branch prediction and speculative execution capabilities of a high-end x86 or ARMv8 CPU.
FWIW: in general, simpler controllers (especially the in-order ones) tend to be much more friendly to branching (exactly because they're not out-of-order). NB: I am not arguing whether unique_ptr<> should check or not: if I want a checked version, I will write my own wrapper; it is not rocket science.
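For what it's worth, such a wrapper could look roughly like the sketch below. This is only an illustration under stated assumptions: the name `checked_ptr` is hypothetical (it is not a standard or library type), throwing `std::logic_error` is one possible policy among several (an assert or a call to `std::terminate` would be equally reasonable), and C++14 is assumed for `std::make_unique` in the usage note.

```cpp
#include <memory>
#include <stdexcept>
#include <utility>

// Hypothetical wrapper for illustration; not part of the standard library.
// It owns a std::unique_ptr<T> and tests for null on every dereference.
template <typename T>
class checked_ptr {
public:
    checked_ptr() = default;  // starts out empty (null)

    explicit checked_ptr(std::unique_ptr<T> p) noexcept
        : ptr_(std::move(p)) {}

    // Checked access: throws instead of invoking undefined behaviour.
    T& operator*() const { return *checked(); }
    T* operator->() const { return checked(); }

    // Unchecked query, same meaning as for unique_ptr.
    explicit operator bool() const noexcept {
        return static_cast<bool>(ptr_);
    }

private:
    // The single place where the null test lives.
    // Assumed policy: throw std::logic_error on a null dereference.
    T* checked() const {
        if (!ptr_) {
            throw std::logic_error("null checked_ptr dereferenced");
        }
        return ptr_.get();
    }

    std::unique_ptr<T> ptr_;
};
```

Usage would be along the lines of `checked_ptr<int> p(std::make_unique<int>(42));`, after which `*p` yields 42, while dereferencing a default-constructed `checked_ptr<int>` throws. Keeping the check in a separate opt-in type leaves plain `unique_ptr` free of the extra branch on targets where it matters.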
(That'll only work for null.)