A colleague of mine recently pointed me at a blog entry on the subject of automated flexing of service infrastructure in response to a rapid increase in demand. In it, he cites evidence that demand can spike nearly an order of magnitude in less than a minute. He, as well as his sources, concludes that the ability of clouds to respond to such a rapid and massive increase in demand is "unrealistic". Food for thought. Are they right? Is all lost? Or perhaps, is their angst the result of a myopic view that provisioning a new server has to mean deploying a new Xen container with the requisite gigabytes of OS and application images?
A few weeks ago, I penned an entry on the details of how we flex the CWS web tier. This morning I did some manual testing of our new ReSTful control document interface (due out later this month) and found that, when a command to start a new server instance is issued from off-grid, it takes between 10 and 20 seconds for the web server instance to start running. A number of things are working in our favor here. Beyond the general speed of platform provisioning provided through the pC zone model, CWS instances share a common application image, so there is no need for any significant data motion when provisioning a new web server instance. Starting a new web server instance is pretty much just that: create a new process registration, replumb the load balancer, and tell the system to run the server, effectively java -jar carolinewebserver.jar. Voilà!
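To make the three steps concrete, here is a minimal sketch that models them with in-memory lists. The class, fields, and instance name are hypothetical stand-ins and not the actual pC or CWS interfaces; only the command line comes from the description above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory model; none of these names come from the real pC or CWS APIs.
class WebTierSketch {
    // Every instance runs the same shared application image, so the command never changes.
    static final List<String> SERVER_COMMAND = List.of("java", "-jar", "carolinewebserver.jar");

    final List<String> registrations = new ArrayList<>(); // process registrations
    final List<String> balancerPool = new ArrayList<>();  // instances the load balancer routes to
    final List<String> running = new ArrayList<>();       // instances told to run

    /** Provisions one instance and returns the command that would be started for it. */
    List<String> startInstance(String name) {
        registrations.add(name); // 1. create a new process registration
        balancerPool.add(name);  // 2. replumb the load balancer
        running.add(name);       // 3. tell the system to run the server (only recorded here)
        return SERVER_COMMAND;
    }

    public static void main(String[] args) {
        WebTierSketch tier = new WebTierSketch();
        List<String> cmd = tier.startInstance("cws-2");
        System.out.println(String.join(" ", cmd) + " -> pool " + tier.balancerPool);
    }
}
```

Note that no step copies an image or any other bulk data, which is the point of sharing a common application image.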
There are a couple of things standing in the way of the current version of CWS responding to an order of magnitude demand spike. The first impediment is that the CWS agent server currently allocates new servers sequentially. Here, however, it's a simple matter to parallelize the allocation, and in fact I had already been planning to do so in the near future. This change would likely need to be paired with a few tweaks to our flexing algorithm as suggested in my previous article, i.e. cutting the polling interval down a few ticks and implementing a weighting scheme for the sampling that is more geared toward anticipating spikes in demand. With these changes, we can likely make a credible attempt at handling a tenfold increase in traffic and provision the necessary resources in well under a minute. Of course, whether we can do this at the scale suggested in that blog entry will likely remain an open question for some time.
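As an illustration of the two changes, the sketch below weights recent samples more heavily when estimating demand and then starts all the needed instances concurrently. It is not the CWS agent code: the linear weighting, the per-server capacity figure, the instance names, and the short sleep standing in for instance start-up are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch; names and numbers are illustrative only.
class FlexSketch {
    /** Weighted average of samples (oldest first); sample i gets weight i + 1. */
    static double weightedDemand(double[] samples) {
        double sum = 0, weights = 0;
        for (int i = 0; i < samples.length; i++) {
            sum += samples[i] * (i + 1);
            weights += i + 1;
        }
        return weights == 0 ? 0 : sum / weights;
    }

    /** Servers needed to carry the demand, rounded up. perServerCapacity is an assumed figure. */
    static int serversNeeded(double demand, double perServerCapacity) {
        return (int) Math.ceil(demand / perServerCapacity);
    }

    /** Submits `count` allocations at once and waits for all of them to finish. */
    static List<String> allocateInParallel(int count, int firstId) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, count));
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                final String name = "cws-" + (firstId + i);
                tasks.add(() -> {
                    Thread.sleep(100); // stand-in for the real instance start-up time
                    return name;
                });
            }
            List<String> started = new ArrayList<>();
            for (Future<String> f : pool.invokeAll(tasks)) { // futures come back in task order
                started.add(f.get());
            }
            return started;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        double[] samples = {100, 100, 100, 1000}; // requests/sec, oldest first; spike in last tick
        double demand = weightedDemand(samples);
        int needed = serversNeeded(demand, 100);
        System.out.println("estimate=" + demand + " servers=" + needed);
        System.out.println(allocateInParallel(needed - 1, 2)); // one server assumed already running
    }
}
```

With these sample values the weighted estimate is 460 requests per second, against a plain mean of 325, so the spike in the latest tick pulls the estimate up sooner. Because the allocations overlap, the total wait is roughly the time of the slowest single start, not the sum of all of them.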
Can pC provide a solution where others might fall short? As we've yet to deal with the demand levels in the outlined scenario, it is far too early to say. But do we hold out the promise of an architecture capable of such feats? Personally, I'm more concerned about rapid, accurate and timely detection of demand increases, not to mention the raw throughput of our individual servers, than I am about how quickly we are able to deploy new ones. So in my mind, the answer is a qualified "yes". Hopefully we will find out over the coming months.