Jonathan Adams's Weblog
macros and powers of two
Unlike many userland applications, there are large portions of the kernel that need to do bit-level manipulations. These generally fall into a few distinct areas:
- Drivers communicating with hardware
- Hash functions, bitmasks, and other high-performance algorithms
- Memory allocators, page manipulators, etc.
The main low-level allocators in Solaris are kmem and vmem; kmem handles your standard page-backed memory allocation, while vmem handles the more abstract concept of allocating from spaces of integers. kmem actually layers on top of vmem, but the details are beyond the scope of this blog entry.
While implementing vmem, Jeff Bonwick introduced a set of handy macros for dealing with things relative to powers of two. The macros live in uts/common/sys/sysmacros.h, and are defined thusly:
/* * Macros for various sorts of alignment and rounding when the alignment * is known to be a power of 2. */ #define P2ALIGN(x, align) ((x) & -(align)) #define P2PHASE(x, align) ((x) & ((align) - 1)) #define P2NPHASE(x, align) (-(x) & ((align) - 1)) #define P2ROUNDUP(x, align) (-(-(x) & -(align))) #define P2END(x, align) (-(~(x) & -(align))) #define P2PHASEUP(x, align, phase) ((phase) - (((phase) - (x)) & -(align))) #define P2CROSS(x, y, align) (((x) ^ (y)) > (align) - 1) /* * Determine whether two numbers have the same high-order bit. */ #define P2SAMEHIGHBIT(x, y) (((x) ^ (y)) < ((x) & (y)))The purpose of this blog entry is to talk about what they do and how they work; if you don't want the challenge spoiled, you should stop reading now.
Alright, now that they've all left, we can dive in. First, remember the basic equation of two's complement arithmetic:
-x = ~x + 1That is, to negate a number, complement all of its bits and add one. If I assume that the number has a lowest set bit,[1] and capital Xs represent complements of the corresponding xs, here are some relations for any non-zero binary number:
x = ...xxxxx10000...
x-1 = ...xxxxx01111...
~x = ...XXXXX01111...
-x = ~x+1 = ...XXXXX10000...
Now, for powers of two, there is only one set bit, so all of the
xs are 0. This means that (x-1) is a mask for the offset
modulo x, and -x is a mask which clears the offset. In addition,
all numbers are unsigned, align is a power-of-2, and
offset is less than align. These relations in mind,
we can look at what each of the above macros do:
| macro | formula | action |
|---|---|---|
| P2ALIGN | x & -align | clears all bits below align, rounding x down to the next lower multiple of align |
| P2PHASE | x & (align - 1) | clears all bits above or equal to align, getting (x % align), the phase of x with regards to align |
| P2NPHASE | (-x) & (align - 1) | This is the first tricky formula. We negate x, and get its phase. The result is the same as (align - P2PHASE(x, align)) % align, but is much faster to compute. |
| P2ROUNDUP | -((-x) & -align) | This can be re-written as -P2ALIGN(-x, align), which shows how the operation works; "negate, round down, negate" is the same as "round up". So we round x up to the next align boundary. |
| P2END | -((~x) & -align) | This can be re-written as -((-x-1) & -align) = P2ROUNDUP(x+1, align); round up (x+1) to the next align boundary. This gets the end address of the current align block. |
| P2PHASEUP |
(phase - ((phase - x) & -align)) |
This can be re-written as (phase + P2ROUNDUP(x - phase, align)). So what we've done is rounded up x to have a specific phase with regards to align. (If you've got some linear algebra under your belt, this is similar to transforming f(y) = P2ROUNDUP(y, align) into a coordinate system with x = y + phase) |
| P2CROSS | (x ^ y) > (align-1) | (x ^ y) has bits set wherever x and y differ. This just tests to see if they differ in any position above the alignment, which will only be the case if [x, y] crosses an align boundary. |
| P2SAMEHIGHBIT | (x ^ y) < (x & y) | Similar to P2CROSS; here, we're checking that the first bit that differs is less than the highest set bit the two have in common. |