Darryl Gove's blog

Wednesday May 07, 2008

When to use membars

membar instructions are SPARC assembly language instructions that enforce memory ordering. They tell the processor to ensure that memory operations are completed before it continues execution. However, the basic rule is that the instructions are usually only necessary in "unusual" circumstances - which fortunately will mean that most people don't encounter them.

The UltraSPARC Architecture manual documents the situation very well in section 9.5. It gives these rules which cover the default behaviour:

  • Each load instruction behaves as if it were followed by a MEMBAR #LoadLoad and #LoadStore.
  • Each store instruction behaves as if it were followed by a MEMBAR #StoreStore.
  • Each atomic load-store behaves as if it were followed by a MEMBAR #LoadLoad, #LoadStore, and #StoreStore.

There's a table in section 9.5.3 which covers when membars are necessary. Basically, membars are necessary for ordering of block loads and stores, and for ordering non-cacheable loads and stores. There is an interesting note where it indicates that a membar is necessary to order a store followed by a load to a different addresses; if the address is the same the load will get the correct data. This at first glance seems odd - why worry about whether the store is complete if the load is of independent data. However, I can imagine this being useful in situations where the same physical memory is mapped using different virtual address ranges - not something that happens often, but could happen in the kernel.

As a footnote, the equivalent x86 instruction is the mfence. There's a good discussion of memory ordering in section 7.2 of the Intel Systems Programming Guide.

There's some more discussion of this topic on Dave Dice's weblog.

Comments:

Post a Comment:
  • HTML Syntax: NOT allowed

Calendar

Search this blog

About

Solaris Application Programming

Book resources

Recent entries

Custom search

Tag cloud

ats bit book c++ cmt communityone compiler cooltools cpu2006 developers dtrace gccfss hpc multithreading openmp opensparc parallelisation parallelization performance performanceanalyzer secondlife solaris solarisapplicationprogramming sparc spot sunstudio t2 ultrasparc ultrasparct2 x86

Links

Webcasts

Articles

Presentations

Navigation

Referers

Feeds