Derrick's Security Weblog
Why did that security patch take so long? part 2
Speed vs. Quality
I think this is one of the biggest decision points in producing patches. Do you put something out as quickly as possible, or do you test the heck out of it and make sure the chance of a bad patch is as low as you can possibly make it?
Long ago Sun made a decision that quality was the most important feature of our patches. As a result we have a pretty extensive and thorough process that code changes must go through before they ultimately wind up in a nice little patch for customers to put on their systems. As business decisions go, I think this is a pretty wise thing to do. Few things can aggravate a customer more than rolling a patch onto hundreds or thousands of systems only to find out that it broke something and has to be backed out of every one of them. Plus, when you consider that many customers test patches anyway, or only have limited maintenance windows during which to install patches, adding an extra week or two before it rolls out of Sun’s doors really doesn’t make all that much of a difference.
The problem is, making changes to operating system code can be dangerous. Sure, once you’ve identified a bug the fix for that particular issue may seem simple, but an OS is a big, complicated thing. There are many interdependencies between libraries, protocols, the kernel, and so on. Add to that Sun’s commitment to full backward compatibility and support for two hardware platforms (SPARC and x86) and it just takes a while to test patches.
But, and this isn’t the only time I’ve had to say this, security is different. A customer who runs a normal patch through a three-month testing cycle will put a security patch on today if it fixes a nasty vulnerability. Plus, many security vulnerabilities are really not all that difficult to fix, once you find them. Yes, it might take some extremely elegant coding or aligning of planets to make an attack work correctly, but in most cases (the recent hyperthreading issue is definitely an exception) once you’ve found the problem, fixing it is relatively straightforward, and most of the time there is little danger of the security fix breaking anything else. So, does it make sense to run a security patch through the same rigor as a normal patch?
Add to that the fact that many security fixes are in common code (sendmail, Kerberos, BIND, etc.) that many of the Unix/Linux flavors use. It’s pretty hard to explain why Sun would take a month to fix something the Linux vendors would fix in a matter of hours. Sure, there were cases when the hour-to-develop fixes had problems, sometimes significant ones, but that alone wasn’t enough to stop the rising criticism directed at our long security turnaround times.
Enter ISRs (Interim Security Relief). When one of our software engineers makes a code change, they can bundle that change up in a tidy little package called an IDR (Interim Diagnostic and Relief). An IDR installs like a patch and is trackable through showrev, but it must be removed before the files it modifies are modified again. It’s a really handy way of distributing either a diagnostic binary for gathering extra data on a system or a potential fix that we want to test in a customer’s environment. What a perfect middle ground for security fixes. Now, when a security issue comes up suddenly (that is, it is already public), our software engineers can fix it on the spot and produce a security version of an IDR: an ISR. We can even make them available on "sunsolve":http://sunsolve.sun.com/pub-cgi/show.pl?target=security/tpatches&nav=tpatches .
There are a bunch of things to realize about these ISRs. They are only lightly tested. They aren’t an official patch, nor are they even a guarantee that a patch will come out to replace them. They have to be manually removed before you install a real patch (or even another ISR/IDR) over the same files. But we can put one of these things out within hours of discovering a problem. In other words, an ISR is no different in quality and testing than what you’d get from an hour-to-develop fix for a Linux distribution.
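To make the "remove it first" rule a bit more concrete, here is a minimal sketch in Python of the check an admin might script before installing a patch. It is only an illustration: the `InterimFix` record, the `fixes_to_remove` helper, the fix names, and the file paths are all made up for this example, and none of it reflects how Sun's patch tools actually track ISRs or IDRs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InterimFix:
    """Hypothetical record of an installed ISR/IDR and the files it modifies."""
    name: str
    files: frozenset


def fixes_to_remove(installed, incoming_files):
    """Return names of interim fixes that touch any file the incoming patch touches.

    These are the ones that would have to be removed by hand first.
    """
    incoming = frozenset(incoming_files)
    return [fix.name for fix in installed if fix.files & incoming]


# Made-up inventory of interim fixes currently on a system.
installed = [
    InterimFix("ISR-example-1", frozenset({"/usr/lib/sendmail"})),
    InterimFix("IDR-example-2", frozenset({"/usr/sbin/in.named"})),
]

# Made-up file list for the real patch we want to install.
blocking = fixes_to_remove(
    installed, ["/usr/lib/sendmail", "/usr/lib/libexample.so.1"]
)
print(blocking)
```

The point is simply that the overlap is per file: an interim fix that touches unrelated files can stay where it is, while anything sharing a file with the new patch has to come off first.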
What we’ve done is give customers a choice in the speed vs. quality decision for security fixes. If you want to wait for a fully tested patch (and yes, we’ll be working hard to get that out as quickly as possible) you can, but if you have to have a fast, lightly tested fix on your machine immediately, you can do that as well.
I haven’t heard from a single customer who thinks ISRs are a bad thing.
Posted at 09:27AM Jun 13, 2005 by drscholl in Security Coordination
Monday Jun 13, 2005