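Two interesting posts on the order in which reads and writes happen and become visible on shared-memory multiprocessors. x86 has a strong memory model: if X = Y = 0 and processor core 1 writes 1 to X and then to Y, then once processor core 2 reads Y and sees 1, its subsequent read of X will find 1, not 0. (The reverse does not follow: seeing X = 1 says nothing about Y, because core 1 may not have written Y yet.) This “publication safety” issue is different on other multiprocessors with weaker memory models, where the two writes can become visible out of order unless fences are used, and the guarantee must be hell to implement in x86 hardware.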
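To make the publication pattern concrete, here is a minimal sketch in Rust. It is not taken from either linked post, and the name `publish_once` and the use of `AtomicU32` are my own assumptions for illustration. The writer stores to X and then to Y; the reader waits until it sees Y = 1 and only then reads X. The `Release`/`Acquire` pair on Y is what makes the result portable: on x86 the hardware already keeps the two stores in order, while on weaker machines the compiler has to emit the necessary fences or ordered instructions.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::Arc;
use std::thread;

/// Runs one writer/reader round and returns the value of X that the
/// reader observed after it had already seen Y == 1.
/// (Hypothetical helper, written only to illustrate the pattern.)
fn publish_once() -> u32 {
    let x = Arc::new(AtomicU32::new(0));
    let y = Arc::new(AtomicU32::new(0));

    let (wx, wy) = (Arc::clone(&x), Arc::clone(&y));
    let writer = thread::spawn(move || {
        // "Core 1": write X first, then publish by writing Y.
        wx.store(1, Ordering::Relaxed);
        wy.store(1, Ordering::Release); // earlier writes may not move below this store
    });

    let reader = thread::spawn(move || {
        // "Core 2": spin until the write to Y is visible...
        while y.load(Ordering::Acquire) == 0 {
            std::hint::spin_loop();
        }
        // ...then read X, which must already be visible as 1.
        x.load(Ordering::Relaxed)
    });

    writer.join().unwrap();
    reader.join().unwrap()
}

fn main() {
    for _ in 0..100 {
        assert_eq!(publish_once(), 1);
    }
    println!("reader never saw Y == 1 with X == 0");
}
```

If both accesses to Y were `Ordering::Relaxed` instead, the language would no longer promise that the reader sees X = 1. Such a program might still appear to work on x86 because of its strong hardware ordering, but it could fail on a weaker architecture or after compiler reordering.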

Fences 

Multicores and publication safety 

Relaxed memory order