The block is 'published' when the store to the write pointer is performed. That store has release semantics, which implies a barrier before the store: none of the earlier writes can be reordered after it.
Okay yeah, you're definitely right. I didn't realize that the stdatomic loads and stores are seq_cst by default. The libraries I'm familiar with have macros for standalone atomic reads and writes, but you have to place any barriers manually. If you were using a library like that, you'd place the barrier between the memcpy and the pointer update, but here you get that for free.
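For anyone following along, here's a rough sketch of the two styles side by side. The `struct ring` layout, `RING_SIZE`, and the function names are made up for illustration and aren't from the code under discussion; it also assumes a single producer and skips wraparound and bounds checks entirely.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define RING_SIZE 64u /* hypothetical capacity, for illustration only */

struct ring {
    unsigned char data[RING_SIZE];
    _Atomic size_t write_pos; /* the "write pointer" that publishes data */
};

/* Style 1: plain atomic_store() is memory_order_seq_cst by default,
 * which includes release ordering, so the memcpy cannot be reordered
 * after the pointer update. */
void publish_default(struct ring *r, const void *src, size_t len)
{
    /* Only the producer writes write_pos, so a relaxed load is enough here. */
    size_t pos = atomic_load_explicit(&r->write_pos, memory_order_relaxed);
    memcpy(&r->data[pos], src, len); /* caller guarantees pos + len fits */
    atomic_store(&r->write_pos, pos + len);
}

/* Style 2: what you'd write with "bare" atomic accessors, placing the
 * barrier by hand between the memcpy and the pointer update. */
void publish_manual(struct ring *r, const void *src, size_t len)
{
    size_t pos = atomic_load_explicit(&r->write_pos, memory_order_relaxed);
    memcpy(&r->data[pos], src, len);
    atomic_thread_fence(memory_order_release); /* the manual barrier */
    atomic_store_explicit(&r->write_pos, pos + len, memory_order_relaxed);
}

/* Consumer side: the acquire load pairs with the producer's release,
 * so bytes below the returned position are safe to read. */
size_t readable(struct ring *r)
{
    return atomic_load_explicit(&r->write_pos, memory_order_acquire);
}
```

If seq_cst turns out to be more than you need, `atomic_store_explicit(..., memory_order_release)` should give the same publish guarantee for this pattern without the explicit fence.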