SPARC Enterprise M-class Servers The Secrets of Olympus

Tuesday May 08, 2007

I have been asked a couple of times by "Trainers" and OPL training writers
about OPL cache coherency, but I don't have any info myself.

Here is a summary of the emails on the question:


------------------------------------------------------


In the Serengeti, it's the Sunfire Snoopy coherency.
In the Starcat, it's memory management between
the CPU memory and the L2. In the Starcat, it's
scalable shared memory where cache coherency is
maintained within the board set and referenced out to
the L3 when needed.

What's the cache coherency model for OPL?


------------------------------------------------------

Does anyone have info to share?

8-)

Comments:

Here's some information that we do know about cache coherency:

The Jupiter bus supports either 64 byte transactions or 256 byte transactions. The 64 byte transactions are used for block stores and for filling cache misses when the data is in the cache of another processor. The 256 byte transactions are used when filling cache misses or prefetches when the data is in memory.

There is also a system-wide limit of 1 transaction per SC clock cycle: 960 MHz times 256 bytes, or about 245 Gbytes/second. Since we've measured 224 Gbytes/second using the Stream benchmark, we know the system can really deliver data at close to that rate.
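The arithmetic behind those figures can be checked with a short sketch. The clock rate, transaction size, and measured Stream result are the numbers quoted above; the variable and function names are only illustrative.

```python
# Figures quoted in the comment above; names are illustrative, not from any Fujitsu/Sun spec.
SC_CLOCK_HZ = 960_000_000        # SC clock: 960 MHz
BYTES_PER_TRANSACTION = 256      # large Jupiter bus transaction
MEASURED_STREAM_BYTES_PER_S = 224_000_000_000  # 224 Gbytes/second (decimal)


def peak_bandwidth(clock_hz: int, bytes_per_txn: int) -> int:
    """Peak bytes/second assuming one transaction per SC clock cycle."""
    return clock_hz * bytes_per_txn


peak = peak_bandwidth(SC_CLOCK_HZ, BYTES_PER_TRANSACTION)
efficiency = MEASURED_STREAM_BYTES_PER_S / peak

print(f"peak      = {peak / 1e9:.2f} Gbytes/second")
print(f"measured  = {MEASURED_STREAM_BYTES_PER_S / 1e9:.2f} Gbytes/second")
print(f"fraction of peak = {efficiency:.1%}")
```

On these numbers the measured Stream rate is roughly 91% of the one-transaction-per-cycle limit.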

We also know that minimum memory latency is higher for larger systems, even if the memory is local to the same system board as the requesting processor. That suggests resolving a cache miss requires communication with all SC chips. Larger systems have more SC chips and more layers of communication, so it takes longer for the farthest ones to respond.

Putting these facts together leads me to speculate that each SC chip maintains some sort of directory information about the cache lines that its processors are using, and that when a cache miss occurs, a message goes to all SC chips asking whether the cache line is in use by any processor. If no SC chip claims the cache line is in use, then the local memory controller is allowed to supply the data. I repeat: that is speculation. As far as I know, Fujitsu has not published the cache coherency specification for the Jupiter bus. Perhaps now that the product is shipping, they will be willing to publish that information.
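To make the speculated flow concrete, here is a minimal sketch of it. Everything in it is hypothetical: the SystemController class, its directory dictionary, and resolve_miss are invented for illustration and do not describe the real Jupiter bus protocol, which is unpublished. Only the 64-byte and 256-byte transfer sizes come from the description above.

```python
from dataclasses import dataclass, field

# Transfer sizes taken from the description above.
CACHE_TO_CACHE_BYTES = 64   # miss filled from another processor's cache
MEMORY_FILL_BYTES = 256     # miss filled from memory


@dataclass
class SystemController:
    """Hypothetical model of one SC chip: a directory of lines its CPUs hold."""
    name: str
    directory: dict = field(default_factory=dict)  # line address -> owning CPU

    def lookup(self, line_addr):
        """Return the CPU holding the line, or None if no local CPU has it."""
        return self.directory.get(line_addr)


def resolve_miss(line_addr, all_scs):
    """Ask every SC about the line; return (data source, transfer size).

    Real hardware would presumably query the SCs in parallel. The loop only
    models the point that every SC must answer before memory may supply data.
    """
    for sc in all_scs:
        owner = sc.lookup(line_addr)
        if owner is not None:
            return (owner, CACHE_TO_CACHE_BYTES)
    return ("memory", MEMORY_FILL_BYTES)


sc0 = SystemController("SC0")
sc1 = SystemController("SC1", {0x1000: "cpu3"})
print(resolve_miss(0x1000, [sc0, sc1]))  # line held by a CPU behind SC1
print(resolve_miss(0x2000, [sc0, sc1]))  # no SC claims the line
```

In this model a miss cannot complete until the slowest SC has replied, which would be consistent with minimum latency growing with system size even for local memory.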

- Patrick McGehearty

Posted by Patrick McGehearty on May 10, 2007 at 08:08 PM EDT #
