back to the future: "timewarp zones"
You've just finished setting up some new time-based system. Now you need to wait until it triggers tomorrow, next week, next month, etc. and see if it explodes. Or do you? Maybe you could just install a second copy on another machine, turn the clock forward, and debug it now. Even better: what if you didn't need a new machine, because you could just use a Solaris zone operating on its own time settings? Meet timewarp zones.
The idea:
Give each zone its own sense of time by applying a zone-specific offset in time-retrieving calls. Allow this offset to be set either as a zone property (think zoneadm) or through time-setting functions in the zone itself. Note that zones don't currently have permission to change the system time, so this would alter existing behavior: instead of failing with EPERM on time-setting calls, the zone-local time offset would be changed.
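To make the idea concrete, here is a minimal userland sketch of the offset model in C. Everything in it is hypothetical: tw_zone_t, tw_zone_time and tw_zone_stime are names invented for illustration and are not part of the Solaris zones code, and times are plain seconds rather than real kernel types.

```c
#include <stdint.h>

/* Hypothetical stand-in for per-zone state; not the real Solaris zone_t. */
typedef struct tw_zone {
    int64_t tz_offset_sec;  /* zone-local offset from system time, in seconds */
} tw_zone_t;

/* Time as the zone sees it: system time plus the zone's offset. */
int64_t tw_zone_time(const tw_zone_t *z, int64_t system_now)
{
    return system_now + z->tz_offset_sec;
}

/*
 * Time-setting call made from inside the zone. Instead of failing with
 * EPERM, record the difference between the requested time and the real
 * system time as the zone's offset. The system clock is never touched.
 */
int tw_zone_stime(tw_zone_t *z, int64_t system_now, int64_t requested)
{
    z->tz_offset_sec = requested - system_now;
    return 0;
}
```

With a zero offset the zone sees system time unchanged. After calling tw_zone_stime with a requested time one day ahead, every later read through tw_zone_time comes back 86400 seconds ahead of the system clock, while other zones are unaffected.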
The implementation:
Though it seems like system calls would be a logical place to checkpoint (we really only want to change observable times), there's a slight snag: a few time calls are fast-trap syscalls, meaning that they circumvent the normal syscall path and skip straight to machine code. Fast-traps save the overhead of full-blown kernel mode because they are certain not to block. For these, the checkpoint has to be at a deeper level, which unfortunately is also called by the non-fast-trap syscalls (so checkpointing at both levels would introduce double-offsetting). Also, at this level, time functions diverge into their architecture-specific implementations, meaning that there is no single "time" checkpoint. It will likely be simple enough to stick a common call into all of these to offset the time as necessary, but ultimately it might be nice to insert an extra common layer which unites all of them (still need to think about that one a little).
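The double-offsetting hazard and the "extra common layer" idea can be sketched in plain C. This is an illustration only: the function names are made up, the raw reader returns a fixed value so the result is predictable, and none of it reflects how the kernel's trap handlers are actually laid out.

```c
#include <stdint.h>

/* Hypothetical: signature of an architecture-specific raw time reader. */
typedef int64_t (*tw_raw_reader_t)(void);

/* Stand-in raw reader returning a fixed value so the sketch is deterministic. */
int64_t tw_fake_raw_reader(void)
{
    return 1000;
}

/* Common layer: the only place where the zone offset is applied. */
int64_t tw_common_time(tw_raw_reader_t raw, int64_t zone_offset)
{
    return raw() + zone_offset;
}

/* Fast-trap style path: goes straight to the common layer. */
int64_t tw_fast_path_time(tw_raw_reader_t raw, int64_t zone_offset)
{
    return tw_common_time(raw, zone_offset);
}

/* Regular syscall style path: reuses the common layer, adds no offset itself. */
int64_t tw_syscall_path_time(tw_raw_reader_t raw, int64_t zone_offset)
{
    return tw_common_time(raw, zone_offset);
}

/* The bug to avoid: offsetting again on top of an already-offset value. */
int64_t tw_double_offset_bug(tw_raw_reader_t raw, int64_t zone_offset)
{
    return tw_common_time(raw, zone_offset) + zone_offset;
}
```

With a raw time of 1000 and a zone offset of 60, both correct paths return 1060, while the buggy variant returns 1120. Keeping the offset in exactly one layer means the architecture-specific readers can stay offset-free.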
There are two classes of time-accessors to look out for: time of day (TOD) and high-resolution (hrestime). TOD is the information which is persistent from boot to boot--this is the kind that's backed by batteries. hrestime carries nanosecond precision and is derived from the system's high-resolution timer (though not directly). When you call stime, your kernel is setting both. When you call time, your kernel is actually only checking hrestime. A third class of time-accessor, hrtime, does exist, but it is used only for relative timing purposes and doesn't have any correlation to world time (it is, however, guaranteed to be monotonically increasing). Since nobody checks it for datetime and it shouldn't be compared to any other types of time, it will be exempt from the zone offset.
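The split between wall-clock time and relative time can be illustrated from userland. The sketch below uses the POSIX clock_gettime clocks CLOCK_REALTIME and CLOCK_MONOTONIC as rough analogues of hrestime and hrtime; that mapping is an assumption made for illustration, and the tw_ names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* One reading of both clocks, in nanoseconds. */
typedef struct tw_snapshot {
    int64_t wall_ns;  /* wall-clock time (analogue of hrestime) */
    int64_t mono_ns;  /* monotonic time (analogue of hrtime) */
} tw_snapshot_t;

static int64_t tw_ts_to_ns(struct timespec ts)
{
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Read the wall clock and the monotonic clock. */
tw_snapshot_t tw_read_clocks(void)
{
    struct timespec wall, mono;
    tw_snapshot_t s;

    clock_gettime(CLOCK_REALTIME, &wall);
    clock_gettime(CLOCK_MONOTONIC, &mono);
    s.wall_ns = tw_ts_to_ns(wall);
    s.mono_ns = tw_ts_to_ns(mono);
    return s;
}

/* Apply a zone offset: wall time shifts, monotonic time is exempt. */
tw_snapshot_t tw_apply_zone_offset(tw_snapshot_t s, int64_t offset_sec)
{
    s.wall_ns += offset_sec * 1000000000LL;
    return s;
}
```

Applying a one-hour offset moves wall_ns forward by 3600 seconds and leaves mono_ns alone, so interval measurements taken inside a warped zone would still come out right.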
I'm just getting into implementation, so we'll see how this goes. If there's something I'm missing in my dissection of the time subsystem, feel free to let me know--I just started looking at it about 8 hours ago.
Posted by Tim Foster on June 16, 2007 at 07:20 AM PDT #