What is the fundamental performance-limiting factor that has dominated the last 30 years of computer architecture? The obvious answer is the disparity between processor speed and memory/storage speed. We connect processors to cache, to more cache, to even more cache, to some bus, to memory, to disk/network, suffering costs at each step as we try to keep the damn processor busy. My guess is that memory locality is totally different now than it was when virtual memory systems were first designed, and much less pronounced. My short review of the literature shows little research since the 1980s on, for example, the memory locality of databases or operating systems. (Is that wrong?)

But we do know enough to understand why the entire concept of distributed shared memory is wrong, although there needs to be a catchy name for this category of error. The error is to see that some feature of computer hardware makes some type of software engineering difficult, and then to provide a software mechanism that makes that type of engineering impossible. The complexity of the memory hierarchy and the complexity of making locality work pose enormous challenges for software designers. The purpose of distributed shared memory is to make it impossible for software designers to control the cache performance of their programs by hiding caching behavior under a non-deterministic, opaque layer.
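To make the locality point concrete, here is a minimal C sketch (my illustration, not drawn from any paper mentioned above; the array size is an arbitrary choice and the timings are machine-dependent). The same summation, traversed in two orders, can differ in running time by an order of magnitude purely because of how well each cache line is used. This is exactly the kind of behavior a designer can reason about, and exploit, when the memory hierarchy is visible, and cannot when it is buried under an opaque layer.

```c
/* Sketch: the cost of ignoring locality. Both functions compute the
 * same sum; only the traversal order differs. */
#include <stdio.h>
#include <time.h>

#define N 4096  /* arbitrary: a 128 MB array, large enough to defeat cache */

static double a[N][N];

static double sum_row_major(void) {
    double s = 0.0;
    /* Consecutive accesses touch adjacent bytes: each cache line
     * fetched from memory is fully used before it is evicted. */
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            s += a[i][j];
    return s;
}

static double sum_col_major(void) {
    double s = 0.0;
    /* Consecutive accesses are N * sizeof(double) bytes apart: nearly
     * every access misses, and each fetched line yields one double. */
    for (int j = 0; j < N; j++)
        for (int i = 0; i < N; i++)
            s += a[i][j];
    return s;
}

static double time_it(double (*f)(void)) {
    clock_t t0 = clock();
    volatile double s = f();  /* volatile: keep the compiler honest */
    (void)s;
    return (double)(clock() - t0) / CLOCKS_PER_SEC;
}

int main(void) {
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            a[i][j] = 1.0;
    printf("row-major: %.2fs\n", time_it(sum_row_major));
    printf("col-major: %.2fs\n", time_it(sum_col_major));
    return 0;
}
```

On a single machine the fix is trivial: swap the loops. Under distributed shared memory the analogous access-pattern decision still exists, but the layer that decides which node holds which page is non-deterministic and invisible to the programmer, so there is no loop to swap.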
(see comment by John Regehr)