Just Throw Storage At The Problem
Most people are surprised to learn how low Google's infrastructure costs are (a 2003 paper estimated a 3-to-1 advantage over their competitors). They do this by building a system out of uniform, general-purpose boxes that are simple and inexpensive to acquire. Where they really save money, however, is in the administration of those systems. Their hard drives fail just like everybody else's, and their boxes die just as often, but they don't have to run around getting them back up, replacing components, and dealing with downtime for their applications. They have implemented a solution that treats failure as the steady state rather than trying to detect it, fight it, and spend money to prevent it. They are fairly secretive about this system and how they manage it, but details have leaked out in the papers they have presented at conferences.
You don't need to reinvent the Google wheel, however, to realize the benefits of the Google approach in enterprise computing. They have a highly vertically integrated operation that has them building their own systems and creating their own infrastructure software (including operating systems). You just need a systems vendor that will sell you boxes that work the same way, so you can capture the savings of Google's approach.
Suppose that instead of having to run out and fix problem disk drives every time they fail (why? because in a RAID5 or RAID1 setup that has already lost a drive, one more failed drive loses data), you just buy more drives than you are actually going to need, with the expectation that the steady state is periodic drive failures. Now suppose that you bought enough extra drives that you wouldn't need to go out and replace any drives for a whole year - how much money would that save you? After all, you are likely provisioning more storage than you need anyway. Who keeps their allocations so tight that they need to constantly expand volumes and filesystems? That's expensive. This paradigm is really just an expansion of that money-saving concept, applied to the system as a whole. Allocate more disks to the system than you will need, so that you can let some of them fail while you spend your time proactively improving your customers' operating environment.
Of course this would not work if every system had only 4 drives in it. You would need a lot more for the law of averages to help you out. How about 48 drives? The Sun Fire X4500 Server is just such a system. The drives can be replaced while the system remains online, but the point is that you don't need to do so very often. ZFS is the secret sauce we use to make sure the system keeps running and does not lose your data as the drives fail.
Capacity? Well, you should think of capacity as a sawtooth wave: it decreases over time as drives fail, until you make a drive swap "run" on your systems and it spikes back up to the max, then decreases again until the next run. The planned capacity that you use for allocating storage is, on average, the low point of this sawtooth curve. Because performance is also a function of how many spindles you have working on a given problem, it follows the same sawtooth. Are you comfortable with this notion? Many people likely would not be, were it not for the Google proof that has already demonstrated significant cost savings.
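To put rough numbers on the over-provisioning idea, here is a minimal sketch that estimates how many drives in a 48-drive box you might plan to lose between swap runs, and what that leaves as the sawtooth's low point. Everything numeric in it is an assumption for illustration, not a figure from Sun or Google: the 3% annual failure rate, the 0.5 TB drive size, and the 99% confidence target are placeholders, and the helper names are hypothetical. It also treats failures as independent and ignores whatever space ZFS redundancy (mirrors or parity) consumes.

```python
import math

def spares_needed(drives, annual_failure_rate, confidence):
    """Smallest k such that P(failures in one year <= k) >= confidence.

    Assumes each drive fails independently with the same probability,
    so the yearly failure count follows a binomial distribution.
    """
    cumulative = 0.0
    for k in range(drives + 1):
        cumulative += (
            math.comb(drives, k)
            * annual_failure_rate ** k
            * (1 - annual_failure_rate) ** (drives - k)
        )
        if cumulative >= confidence:
            return k
    return drives

def planned_capacity_tb(drives, spares, drive_tb):
    """Raw capacity at the low point of the sawtooth, before redundancy overhead."""
    return (drives - spares) * drive_tb

if __name__ == "__main__":
    DRIVES = 48        # drive count of the box discussed above
    AFR = 0.03         # assumed annual failure rate (illustrative only)
    DRIVE_TB = 0.5     # assumed drive size in TB (illustrative only)
    CONFIDENCE = 0.99  # assumed target: 99% chance of no swap run for a year

    extra = spares_needed(DRIVES, AFR, CONFIDENCE)
    low_point = planned_capacity_tb(DRIVES, extra, DRIVE_TB)
    print(f"Plan to lose up to {extra} of {DRIVES} drives per year")
    print(f"Allocate against about {low_point} TB raw, not {DRIVES * DRIVE_TB} TB")
```

The same calculation with only 4 drives shows why small boxes don't benefit: setting aside even one drive costs a quarter of the capacity, whereas a handful of drives out of 48 is a modest fraction.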

