Hal Stern's thoughts on the economy, software, services, technology, and snowmen. Hal Stern: The Morning Snowman

Thursday May 28, 2009

I love when customers play "stump the geek" and ask really insightful, serious questions. It's partly what makes being a systems engineer at Sun challenging and fun (and yes, I consider myself an SE within my own group, but I'll pass on the is-a has-a polymorphism jokes, thank you). Yesterday's question scored an 8 for style and 9 for terseness (usually a difficult combination to execute):

What are the top developer problems we haven't run into yet? I gave an answer in three parts.

1. Unstructured data management and non-POSIX semantics. Increasingly, data reliability is taking the shape of replication handled by a data management layer, using RESTful syntax to store, update, and delete items with explicit redundancy control. If you're thinking of moving an application into a storage cloud, you're going to run into this. Applications thriving on read()/write() syntax are wonderful when you have a highly reliable POSIX environment against which to run them. And no, don't quote me as saying POSIX filesystem clusters are dead - the Sun Storage 7310C is an existence proof to the contrary. Filesystems we loved as kids are going to be around as adults, and probably with the longevity of the mainframe and COBOL: they'll either engineer or survive the heat death of the universe. There is an increasing trend, however, toward WebDAV, Mogile, SimpleDB, HDFS and other data management systems that intermediate the block level from the application. New platforms, not at the expense of old ones.
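To make the "explicit redundancy control" idea concrete, here is a minimal in-memory sketch. It is not the API of WebDAV, Mogile, SimpleDB, or HDFS; the class ReplicatedStore, its methods, and the replicas parameter are all hypothetical names made up for illustration. The point is only that the application stores, fetches, and deletes items by key and states how many copies it wants, instead of calling read()/write() against a filesystem it trusts to be reliable.

```python
import hashlib


class ReplicatedStore:
    """Toy object store (hypothetical): put/get/delete by key, replica count per item."""

    def __init__(self, node_count):
        # Each "node" is just a dict standing in for a storage server.
        self.nodes = [{} for _ in range(node_count)]

    def _placement(self, key, replicas):
        # Pick `replicas` distinct nodes, starting from a hash of the key.
        start = int(hashlib.sha256(key.encode()).hexdigest(), 16) % len(self.nodes)
        return [(start + i) % len(self.nodes) for i in range(replicas)]

    def put(self, key, value, replicas=2):
        if not 1 <= replicas <= len(self.nodes):
            raise ValueError("replicas must be between 1 and the node count")
        self.delete(key)  # drop stale copies in case the replica count changed
        for n in self._placement(key, replicas):
            self.nodes[n][key] = value

    def get(self, key):
        # Any surviving replica can answer; the caller never sees blocks or paths.
        for node in self.nodes:
            if key in node:
                return node[key]
        raise KeyError(key)

    def delete(self, key):
        for node in self.nodes:
            node.pop(key, None)

    def fail_node(self, index):
        # Simulate losing a server and everything stored on it.
        self.nodes[index] = {}


store = ReplicatedStore(node_count=4)
store.put("photos/snowman.jpg", b"...bytes...", replicas=3)
store.fail_node(store._placement("photos/snowman.jpg", 3)[0])
print(store.get("photos/snowman.jpg"))  # still readable from a surviving copy
```

In a real storage cloud the three verbs would be HTTP requests and the placement logic would live on the service side; the sketch only shows where the reliability decision moves to.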

2. Software reliability trumps hardware replacement. An application analog to the first point. Historically, we've used high availability clusters, RAID disk configurations and redundant networks to remove single points of failure, and relied on an active/active or active/passive application cluster to fail users from one node over to a better, more healthy one. But what if the applications are highly distributed, recognize failure, and simply restart a task or request as needed, routing around failure? IP networks work (quite well) in that sense. It requires writing applications that package up their state, so that the recovery phase doesn't involve recreating, copying or otherwise precipitating state information on the new application target system. There's a reason REST is popular - the ST stands for "state transfer". And yes, this worked really well for NFS for a long time. Can I get an "idempotent" from the crowd?
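A rough sketch of what "package up the state and restart the request" can look like, under some loud assumptions: Worker, WorkerDown, and dispatch are hypothetical names, the "work" is just summing a list, and the shared results dict stands in for whatever durable store a real system would use. The request carries everything a worker needs, and results are recorded by request id, so running the same request twice is harmless.

```python
class WorkerDown(Exception):
    """Raised by a worker that cannot serve requests (hypothetical failure signal)."""


class Worker:
    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy

    def handle(self, request, results):
        if not self.healthy:
            raise WorkerDown(self.name)
        # The request carries all of its state; nothing is read from
        # worker-local memory, so any worker can pick it up.
        # Recording by request id makes a repeat of the same request a no-op.
        results.setdefault(request["id"], sum(request["values"]))
        return results[request["id"]]


def dispatch(request, workers, results):
    """Try each worker in turn; restart the request on the next one after a failure."""
    for worker in workers:
        try:
            return worker.handle(request, results)
        except WorkerDown:
            continue  # route around the failed node
    raise RuntimeError("no healthy worker available")


results = {}
workers = [Worker("a", healthy=False), Worker("b"), Worker("c")]
request = {"id": "req-42", "values": [1, 2, 3]}

first = dispatch(request, workers, results)
second = dispatch(request, workers, results)  # a duplicate or retried request
print(first, second, len(results))
```

No failover choreography is needed here: worker "a" is down, the request simply lands on "b", and the retry changes nothing because the operation is idempotent.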

3. Parallelism. If you weren't bound by single-thread performance, what would you waste, pre-compute, or do in parallel? This isn't about parallelizing loops or using multi-threaded libraries; it's about analyzing large-scale compute tasks to determine which pieces could be partitioned and done in parallel. I call this "lemma computing" -- in pure mathematics, a lemma is a partial result that you take as given; someone spent a lot of time proving the lemma so that you can leverage the intermediate proof point. When you have a surfeit of threads in a single processor, you need to consider what sidebar computation can be done with those threads that will speed up the eventual result, which is otherwise bound by single-thread performance. This isn't the way we "think" computer science; we think either of a single thread or of multiple copies of the same single thread.
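Here is one way the "lemma computing" idea might look in code; treat it as a sketch, not a recipe. The functions expensive_lemma and serial_chain are hypothetical, and the divisor count is just a stand-in for any costly partial result. The main chain is inherently sequential, so spare threads speculatively compute every lemma it might need, wasting some work on results that are never used.

```python
from concurrent.futures import ThreadPoolExecutor


def expensive_lemma(n):
    """Stand-in for a costly partial result: the number of divisors of n."""
    return sum(1 for d in range(1, n + 1) if n % d == 0)


def serial_chain(start, steps, lemmas):
    """Inherently sequential: each step depends on the previous value."""
    x = start
    for _ in range(steps):
        # .result() blocks only if that lemma has not finished yet.
        x = (x * 7 + lemmas[x].result()) % 100
    return x


with ThreadPoolExecutor(max_workers=8) as pool:
    # Speculatively start every partial result the chain might need,
    # accepting that some of this work will be thrown away.
    lemmas = {n: pool.submit(expensive_lemma, n) for n in range(100)}
    answer = serial_chain(start=3, steps=20, lemmas=lemmas)

# The same computation with no precomputation, for comparison.
check = 3
for _ in range(20):
    check = (check * 7 + expensive_lemma(check)) % 100
print(answer == check)
```

One caveat: in standard CPython the global interpreter lock keeps threads from running CPU-bound work truly in parallel, so this shows the structure of the idea, not a speedup. Swapping in ProcessPoolExecutor, or using a runtime with real hardware threads, is where the sidebar computation would start to pay off.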

That was my somewhat top-of-mind list, based partly on the talk I gave at CloudSlam 09, which will be updated for SIFMA in New York later this month.

Thursday May 07, 2009

Sys-Con.TV has posted the video of our "Cloud Computing Round Table" held on March 29 in Times Square. It was a fun exchange with a lot of sharp dialogue and discussion about reliability, application fit and function, and whether or not amazon.com is going to eat your data center. Of course, having amazon.com CTO Werner Vogels on the panel made that discussion topical and lively.

Tuesday May 05, 2009

Had breakfast with a friend this morning who commented on the state of the economy in and around our neighborhood by saying that "there are many free agents available." He wasn't talking about the Yankees, Mets, Devils, Rangers, Knicks, Nets, or any other sports franchise that funnels ticket revenue into the hands of free agent players who haven't delivered a local championship since 2003 (Devils, Stanley Cup). His perception was that with many technology people on the move, the market is ripe for new ideas coming to fruition in new (and old) companies; cyclical unemployment injects strategy and experience into companies that invest in newly available players. Friend's summary comment: "In two years, we'll see another wave of breakthrough innovations." It would be an early indicator of technology helping the economy innovate its way out of the current slump.

Why would this work for technology companies and not sports franchises? Quite simply, the acquisition of a free agent is unlikely to change the basic strategy of a team or the rules of a game. Strategic changes in a game almost always result from a lack of talent, not the sudden availability of creative people.

In this current NHL season, the NJ Devils changed from a defensive-minded style to a goal-scoring, offensive strategy when goaltender Martin Brodeur suffered an injury requiring four months of recovery. Late San Francisco 49ers coach Bill Walsh perfected the West Coast Offense (read Michael Lewis' The Blind Side for a compelling story wrapped in a West Coast Offense) and forced strategic defensive changes in the game. And the grandfather of several current NBA offensive schemes is Pete Carril's Princeton Offense. What do all three have in common? They were designed to deal with a deficit of talent or skill: goaltending and first-rate defense (Devils), rushing (West Coast Offense), and height (Princeton Offense).

The barriers to entry for new ideas have never been lower: you can develop your idea using a wealth of open source software, deploy it to test in a cloud infrastructure and leverage social networking mechanisms to spread awareness. It's a ripe environment for engineers to give us something (locally) to celebrate.