Darryl Gove's blog

Friday Mar 27, 2009

When to add a membar (an example)

I was recently having a discussion on one of the OpenSolaris lists on the topic of when to use the volatile keyword, and when it was necessary to use membars.

So volatile is a clue to the compiler to load the variable from memory and immediately store it back to memory. What it does not do is to tell the hardware anything. So the application can perform the store, but that store may not be immediately visible to the rest of the system. Most of the time this is not a problem - so long as the store is visible to the processor on which the thread is executing it's fine. Variability of when the store is visible to other processors may also be fine. There is one clear situation where the ordering of store operations could be a problem - and that's unlocking mutexes.

The problem here is best illustrated by the following scenario. I lock some data structure, then store new values into it, then unlock the structure. Immediately another thread comes along and uses the values in that structure. Not an uncommon situation. Unlocking a mutex is often just a case of storing a value (of zero) into the mutex structure. And here's the potential problem. In some weaker ordering architectures there is no guarantee that other processors see the stores in the same order that they are performed. So if you have Store A followed by Store B it may be possible for other processors to observe the change in the value of B before they see the change in the value of A. In the case of mutex unlock, the store of B would be the action that unlocked the mutex, enabling other threads to access the variable A... and there could be problems if they see the old value of A.

The solution to this is to put a membar in before the store that unlocks the mutex. You can see this happening in the OpenSolaris code:

     41 /*
     42  * lock_clear(lp)
     43  *	- clear lock.
     44  */
     45 	ENTRY(_lock_clear)
     46 	membar	#LoadStore|#StoreStore
     47 	retl
     48 	  clrb	[%o0]
     49 	SET_SIZE(_lock_clear)

The membar ensures that all the pending stores are visible to other processors before the store that releases the lock becomes visible to them.

Comments:

Why is clrb used instead of sub %o0, %o0?

I always thought that sub is faster?

Posted by UX-admin on March 27, 2009 at 10:44 AM PDT #

The clrb is actually stb %g0,[%o0]. It's clearing the byte at that address, not clearing the register. That would be mov %g0,%o0, or as you suggest sub %o0,%o0. Generally on SPARC, there's some instructions (like clr) which are basically alternative names for particular versions of more general instructions.

Regards,

Darryl.

Posted by Darryl Gove on March 27, 2009 at 10:52 AM PDT #

Post a Comment:
Comments are closed for this entry.

Calendar

Search this blog

About

Solaris Application Programming

Book resources

The Developer's Edge

Book resources

OpenSPARC Internals

Book resources

Recent entries

Custom search

Tag cloud

book cmt communityone compiler cooltools cpu2006 dtrace gcc libraries linker openmp opensolaris opensparc optimisation optimization parallelisation parallelization performance performanceanalyzer programming solaris solarisapplicationprogramming sparc spec spot sunstudio t2 ultrasparc ultrasparct2 x86

Links

Webcasts

Articles

Presentations

Interesting docs

Navigation

Referers

Feeds