Tuesday November 18, 2008

Have you tried x86 Fast Reboot?

I was doing some non-scientific testing with the recent x86 Fast Reboot option which was added to reboot, thanks to Sherry Moore.

On one of my test systems I'm seeing about 32 seconds for reboot -f (1M), as opposed to 1 minute and 20 seconds for a typical reboot (1M). Not every system is supported yet, since every driver on the system needs quiesce (9E) support before Fast Reboot is allowed.

If the only drivers without quiesce() support on the system are display and/or audio, you can set force_fastreboot=1 in /etc/system. If the NIC (wired or wireless) driver doesn't have quiesce(), you can unplumb the interface first and then do a fast reboot.
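As a rough sketch of how you might check whether that tunable is already set, here is a small Python helper that scans the text of an /etc/system file. The function name is made up for illustration, and the parsing assumes only the simple "set name=value" form, with comment lines starting with an asterisk; it is not a full parser for the file format.

```python
def force_fastreboot_enabled(system_text):
    """Return True if the last active 'set force_fastreboot=...' line sets it to 1.

    Hypothetical helper: handles only the plain 'set name=value' form.
    """
    enabled = False
    for raw in system_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines (assumed to start with '*').
        if not line or line.startswith("*"):
            continue
        parts = line.split(None, 1)
        if len(parts) != 2 or parts[0] != "set":
            continue
        name, sep, value = parts[1].partition("=")
        if sep and name.strip() == "force_fastreboot":
            # A later line overrides an earlier one in this sketch.
            enabled = value.strip() == "1"
    return enabled


sample = """* allow Fast Reboot even though display/audio lack quiesce()
set force_fastreboot=1
"""
print(force_fastreboot_enabled(sample))
```

On a real system you would pass in the contents of /etc/system instead of the sample string.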

That is about 40% of the time a full reboot through the BIOS takes on the machine I'm testing on (32 seconds versus 80). This system is not running the desktop GUI, so your mileage may vary.
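For a quick back-of-the-envelope check, the snippet below takes the two approximate timings quoted earlier (32 seconds and 1 minute 20 seconds) at face value. The function name is my own, purely for illustration, and the result is only as good as those non-scientific measurements.

```python
def reboot_time_ratio(fast_seconds, full_seconds):
    """Fraction of the full reboot time that the fast reboot takes."""
    return fast_seconds / full_seconds


fast = 32            # reboot -f, in seconds (approximate)
full = 1 * 60 + 20   # typical reboot, in seconds (approximate)

ratio = reboot_time_ratio(fast, full)
print(f"fast reboot takes {ratio:.0%} of a full reboot, saving {full - fast} seconds")
```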

This is very useful! The better we are able to work with devices, the more efficient the system will be. Kudos to Sherry for this effort!

( Nov 18 2008, 02:39:20 PM PST )
Comments:

But what good is this feature, when reboot(1M) doesn't shut services down?

Just imagine what would happen on a system where for example Oracle, or even better Sun Cluster + RAC were running, and one used reboot(1M)!

I don't get it. What good is this feature? I asked Sherry Moore the same question, and didn't get an answer. Seems that nobody really gave any thought to the fact that reboot(1M) doesn't shut services down, but everybody trumpets how great it is that the system can be rebooted fast, very fast!

For me as a system engineer, these things are frustrating to no end. It's like Menlo Park is its own kindergarten, in its own world, with no idea what is going on in the outside world!

Posted by UX-admin on November 19, 2008 at 02:06 AM PST #

Everyone is not running RAC though, and one can run a lot of solutions using MySQL, rather than being held hostage by someone's ego.

I think it's very useful to have such a feature, and I can see how it could be useful for managing VMs, for instance, where almost all the devices might be virtual.

The good thing is that for someone like yourself, there's no harm as you don't need to execute it, AFAICT.

If you don't care to use the technology, there is no need to beat up on the engineers that do work on the system. What have you done lately that is so great?

Posted by Alan Duboff on November 19, 2008 at 01:07 PM PST #

All I wanted to know is why is this feature so great? Why? Nobody seems to know, yet everyone who's written about it raves how great it is!

And it's not about ego, look what you wrote up there: do you think MySQL won't break and bust if you just do a reboot(1M) (no matter how fast it is!), that MySQL doesn't need to be shut down properly?

As for myself, I've done many things lately that have caused dropped jaws, but I can't write about any of them, because my livelihood depends on keeping my mouth shut.

What I can write is that I'm just finishing the foundations of a self-administering, self-healing network (!!!), and that I've written a ton of code to get to this point.

So just because I don't blog and brag about it, don't you think for a second that I don't crank out code -- and not just crank out code, but *design* things as well!

Posted by UX-admin on November 20, 2008 at 12:20 AM PST #

UX-admin: I am sure you are familiar with the fact that "init 6" calls uadmin(2), which means enabling Fast Reboot for "init 6" is a one line change (see 6768468 "init 6" should invoke fast reboot on x86). However, we have decided that such invocation should be SMF driven, i.e., to provide SMF methods that enable or disable Fast Reboot for commands such as "reboot" or "init 6", so that users like yourself who don't care for it don't have to use it.

Posted by Sherry Moore on November 20, 2008 at 08:59 AM PST #

UX-admin,

I am not sure why you feel nobody seems to know why this is so great. I stated in this blog post that the better we are able to work with devices, the better we can manage the system.

Consider where OpenSolaris is today.

We have power management working on a small set of laptops. I know that might not sound like a very big accomplishment to you, but it is. We are moving along with power management and fast reboot is yet another piece to the puzzle that could be very useful.

If we compare where OpenSolaris is today vs. 4 years ago, we have much better device support than ever, including wifi drivers. Much of this work is possible through the engagement of our partners and vendors. This is key for OpenSolaris, IMO, because the better we understand and know how to work with devices, the better off we are as a whole.

From my view you owe the engineers an apology for slapping them upside the face with your comment about Menlo Park, that was uncalled for. Most of the engineers have thick skin, so no worries, but some have pretty good memories and will remember this type of comment for a long time to come...

Posted by Alan Duboff on November 20, 2008 at 10:06 AM PST #

@Sherry: I still don't quite get it, but thank you for finally answering my question; I see that 6768468 is fresh, fresher than when you first blogged about it. Believe me when I write, *I want* to understand! I'll take my questions to your blog directly, because there is no sense in going over Alan's.

@Alan: regarding long memories & thick skin, I stand behind everything I wrote (and I'm convinced that my day of reckoning will come!)

Sun engineers are numero uno, but even they sometimes need to be reminded that there is a real world out there, and that Silicon Valley shouldn't be a kindergarten: just look at Java and RBAC, the two greatest botches that came out of Sun, and perhaps it'll be clear what I mean (although apropos RBAC, technically speaking, Darren isn't in Menlo Park, but it's still SUNW.) So not everything is rosy and great.

And you'll have to admit, although they're the best, Sun engineers have a long history of suffering from the "not invented here" syndrome. Need I remind you of your days with me on the newsgroups? If my memory serves me correctly, you used to get frustrated about that at times too. Now that you are on the other side of the fence, you don't look at it the same way. How quickly you forgot! Or shall we give your post about /bin/sudo in OpenSolaris a second look for a reminder?

When it is my time to get slapped up and down, I will not cringe, or whimper, or beg for mercy!
In fact, I'm quite grateful when my peers look at my work and criticize it. I botch stuff too, and sometimes I need to be slapped up and down in order for me to move from the "2nd system effect" to a 3rd generation system, or in order to get back to reality. That's what peers are for!

I accept that I might be obnoxious, but just for the record, I mean well.

And finally, all I did was ask a simple question. I wrote several times that *I don't get it*. Now, which part of *I don't get it* was unclear?

Posted by UX-admin on November 20, 2008 at 12:25 PM PST #

UX-admin,

Yep, I know as well as anyone else of some of the problems that have plagued Sun in the past and some that still do plague them. I am doing what I can to try and change some of those problems, not just spouting off about it.

There is no question that Sun is one of the best engineering departments to exist. Some of the industry's brightest folks are here. But we are also going through some massive changes with our company.

Slapping the Solaris engineers (i.e., your reference to MPK) in the face and slashing them with being Kindergarten is just unacceptable.

By all counts, the reading and writing levels within Solaris Engineering are clearly at the 2nd grade level, if you exclude me, and even I'm at the 1st grade level! :)

Posted by Alan Duboff on November 20, 2008 at 01:08 PM PST #

"Slapping the Solaris engineers (i.e., your reference to MPK) in the face and slashing them with being Kindergarten is just unacceptable."

If I am wrong, I am prepared to answer for what I wrote, and suffer the consequences, such as they might be.

If however I am correct... you've got another thing coming.

Most engineers don't engineer for engineering's sake, but to solve customers' problems. If a customer isn't satisfied with the solution, then either

a) you try to explain why the solution is good for them

or

b) you change the solution for your customer.

Like I wrote previously, the explanation from Sherry still left questions in my mind. By all accounts, what I've read so far means that UNIX, and the way init works, can be, and has been, changed in a fundamental way, or ways.

I recognize that change is inevitable, and most of the time, change is good... but some changes are fatal.

Posted by UX-admin on November 20, 2008 at 01:28 PM PST #
