Tim Bradshaw <tfb+google@tfeb.org> wrote:
+---------------
| My argument is that physically, these machines actually are distributed
| memory systems, but their programming model is that of a shared memory
| system. And this illusion is maintained by a combination of hardware
| (route requests to non-local memory over the interconnect, deal with
| cache-coherency etc) and system-level software (arrange life so that
| memory is local to the threads which are using it where that is
| possible etc).
|
| Of course these machines typically are not MPP systems, and are also
| typically not HPC-oriented. Though I think SGI made NUMA systems with
| really quite large numbers of processors, and a Sun E25K can have 144
| cores (72 2-core processors), though I think it would be quite unusual
| to run a configuration like that as a single domain.
+---------------
SGI *still* makes large ccNUMA systems, the Altix 4700 series, which
offer a very large global main memory, up to 128 TB(!), with global
cache coherency (sequential consistency, to be specific) and with
up to 512 Itanium CPUs standard (up to 1024 by special order) in a
*single* domain, that is, a single instance of Linux, see:
http://www.sgi.com/products/servers/altix/4000/
Two things make this scale well:
1. A directory-based cache-coherency system, which keeps cache-line
   ownership information with the memory subsystem that the cache
   line lives in.
2. Compared to other large ccNUMA or NUMA systems, a really low
   ratio of remote to local memory access times, varying between
   3:1 and 4:1 for large to very-large systems. (See the sketch
   below for the usual trick that keeps most accesses on the local
   side of that ratio.)
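To make that locality point concrete, here's a minimal sketch -- my
illustration, not anything SGI ships -- of the standard first-touch
idiom SMP codes use on Linux ccNUMA boxes: each thread initializes
the slice of data it will later work on, so the kernel places those
pages on that thread's local node and the 3:1-to-4:1 remote penalty
is mostly avoided. The OpenMP pragmas, the array size, and the
daxpy-style loop are all just for illustration:

    /* Sketch only: first-touch page placement on a ccNUMA Linux box.
       The array size and the daxpy-style loop are arbitrary. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <omp.h>

    #define N (1L << 26)            /* ~512 MB per array; pick to taste */

    int main(void)
    {
        double *x = malloc(N * sizeof *x);
        double *y = malloc(N * sizeof *y);
        if (!x || !y) return 1;

        /* First touch: each thread initializes the pages it will later
           use, so the kernel allocates them on that thread's local node. */
        #pragma omp parallel for schedule(static)
        for (long i = 0; i < N; i++) {
            x[i] = 1.0;
            y[i] = 2.0;
        }

        /* The "real" work, with the same static schedule, now runs
           almost entirely out of local memory. */
        #pragma omp parallel for schedule(static)
        for (long i = 0; i < N; i++)
            y[i] += 3.0 * x[i];

        printf("y[0] = %g\n", y[0]);
        free(x); free(y);
        return 0;
    }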
And, yes, there are quite a few HPC customers who run systems that
large as single images for SMP-style codes which don't convert to
MPI style very well.
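For flavor, here's the sort of pattern I mean -- again just a sketch
of mine, with arbitrary sizes and a toy RNG: scattered, data-dependent
updates into one big shared table. On a coherent single image an
atomic increment does the job; recasting it as MPI means partitioning
the table and turning every update into a message (or one-sided RMA
plus its own synchronization), which is where these codes get painful:

    /* Sketch only: the kind of access pattern that is easy on one
       coherent shared-memory image but awkward to recast as MPI.
       Table size, update count, and the toy RNG are arbitrary. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <omp.h>

    #define TABLE   (1L << 24)      /* 16M counters, shared by all threads */
    #define UPDATES (1L << 22)      /* scattered increments to perform     */

    int main(void)
    {
        long *table = calloc(TABLE, sizeof *table);
        if (!table) return 1;

        /* Any thread may hit any slot; with global cache coherency an
           atomic increment is all it takes. */
        #pragma omp parallel
        {
            unsigned seed = 12345u + (unsigned)omp_get_thread_num();
            #pragma omp for
            for (long i = 0; i < UPDATES; i++) {
                seed = seed * 1664525u + 1013904223u;   /* toy LCG */
                long slot = (long)(seed % (unsigned long)TABLE);
                #pragma omp atomic
                table[slot]++;
            }
        }

        long total = 0;
        for (long i = 0; i < TABLE; i++) total += table[i];
        printf("total updates counted: %ld\n", total);
        free(table);
        return 0;
    }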
-Rob
-----
Rob Warnock <rpw3@rpw3.org>
627 26th Avenue <URL:http://rpw3.org/>
San Mateo, CA 94403 (650)572-2607