How to tell a hardware from a software person
They both write code on a screen at a very high abstraction level, and they test their code before integrating it into a larger blob through highly abstracted interfaces. Verilog looks just like a structured programming language to the uninitiated, and tools keep most coders equally removed from the ultimate assembly language and transistor-level details.
Some argue that the large tooling costs of modern semiconductors put a burden of perfection on hardware designers that, through insomnia, molds them into somber personalities. Software engineers are pictured, by contrast, as carefree characters always able to land on their feet by recompiling and patching. This is all passé. Hardware design has adopted a train model where fixes are phased in at multiple pre-defined points, and software defects can cause damages exceeding semiconductor tooling costs.
To tell software and hardware people apart, just ask: CUI BONO?
Specifically: cui bono from Moore's law? Who benefits from Moore's law?
Moore's law renders hardware achievements obsolete, while turning slow or bloated software into achievements. A colleague and I designed the first LAN controller with embedded memory, a first that enabled packaging the entire controller in 24 pins. We put in 2 kilobytes of RAM buffers, solved the embedded memory yield issues, went for beers and felt great about our achievement. Our bragging rights for that chip lasted about as long as our beers. Darn law. So the test is simple: if you are a victim of Moore's law you are in hardware; if you are a beneficiary of Moore's law you are in software.
Hardware is further victimized by the constant pressure Moore's law puts on price. Sometimes this leads to steadily falling prices for a given hardware function, and other times to increased capacity for a roughly constant price. Server processors have followed the latter path, namely the speed bump regime. Successive processors push the clock frequency higher and thus deliver a performance benefit instead of a cost reduction. Software executes faster, which in turn makes a bigger and more complex software edifice viable.
But if all good things must come to an end, how long will this regime last? Moore's law is not out of gas yet, but cranking up processor clocks is getting harder and less productive. You have heard the reasons: power dissipation vs. frequency, and the impact of system memory latencies. Interestingly, the UltraSPARC T1 CMT anticipates this new regime of exponential growth in transistors without a good incentive to push the frequency further. Will customers demand a price reduction now that the speed bump is dead? Not if we transition instead to a thread bump regime. UltraSPARC T1 inaugurates this transition, and successive CMT generations offering thread increases commensurate with Moore's law should provide the bumps. (Note to self: Contact Niagara ad agency with idea "We are the Bumps in Thread Bumps".)
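To put rough numbers on what a thread bump could look like, here is a minimal Python sketch. It starts from the 32 hardware threads a single UltraSPARC T1 socket presents to Solaris and assumes, purely for illustration, that the thread count doubles with each generation while the clock stays flat. The function name thread_bump and the doubling factor are my own assumptions, not a roadmap.

```python
def thread_bump(start_threads, generations, growth=2):
    """Project hardware thread counts per socket over successive generations.

    Assumption (illustrative only): each generation multiplies the thread
    count by `growth`, mirroring a Moore's-law-style transistor doubling,
    while the clock frequency is left alone.
    """
    counts = [start_threads]
    for _ in range(generations):
        counts.append(counts[-1] * growth)
    return counts


# Hypothetical projection: the starting point plus three more generations.
projection = thread_bump(32, 3)
print(projection)
```

Under those assumptions the list grows geometrically, which is the whole point of the regime: the bump the customer gets is in threads, not in megahertz.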
But wait a minute, does this mean that our carefree software developers are no longer automatic beneficiaries? Indeed, this time around they may have to sweat a bit more to turn additional threads into software achievements. Oh, and maybe multi-core processor designers can have simpler lives now that there is more repetition and fewer unique circuits to design, verify, and lose sleep over. Not quite a reversal between victim and beneficiary, but we may need a better test than CUI BONO in the future.
[ Technorati: NiagaraCMT, Solaris ]
Posted at 08:32AM Dec 07, 2005 by Ariel Hendel in Sun | Comments[1]
The warmth of vacuum tubes
I grew up listening to vacuum tube nostalgia. Radio technicians could diagnose a radio receiver with just a screwdriver, and sometimes even fix it. But beyond that, the transition from vacuum tubes to semiconductors was a religious topic within the Radio Amateur circles. It got harder to build your own gear, some said it didn't sound the same, the non-linearities of transistor amplifiers, you know. But the main complaint was not technical. Radio Amateur operators missed them because vacuum tubes kept their hands warm during the cold winter nights.
Radio Amateur anecdotes are some of the most memorable stories I could tell, maybe some other day, on some other blog. And I would also join the mourning of the vacuum tube, if it weren't for a more profound and recent displacement I need to mourn: the displacement of the HF Radio Amateur at the hands of the Internet... A shared 3 kHz voice channel that may or may not work on a given day to a given place, displaced by a DSL line and a browser. I am not the only deserter. Just look at the roofs of a city like Montevideo, once home to the highest density of Yagi antennas on the planet: walk its streets today and there are few antennas to be seen. Victims of the ubiquitous Internet.
Yet I appreciate irony, and with modern life's Internet addiction filling my once ham radio nights, I discover that the Athlon laptop gets warm just like the old vacuum tube transceivers. Déjà vu. We replaced hot vacuum tubes with cooler solid state radios, then things got pretty hot when we put lots of transistors in NMOS integrated circuits. I recall my first IC design in NMOS: clocked at a meager 10 MHz, it required a ceramic package and got too hot to touch. Things got cooler again with CMOS, so we started building bigger and faster semiconductors, up to the point that the semiconductors running a lowly laptop keep my hands warm. The Internet server infrastructure (replacing the ham radio ether) requires major ventilation and air-conditioning. At the rate we are going we might have to host the planet's infrastructure at the poles. How is that for an idea: dual-home the entire net infrastructure to the North and South poles, no single point of failure, affordable land, maximum redundancy, cooled by keeping the windows open, and solar cells for 24 hours a day (well, half a year). I digress, but you read it here first. Hosting the net at the poles...
What is next? How do we pull the CMOS cool device trick again? For the moniker we can certainly reuse the letters CM, that is a start. As for the substance, let me narrate a customer lab visit we had here in Newark. We were showing off our first UltraSPARC T1 bringup machine, verbally conveying how naturally Horizontal Micro-scaling fits telephony infrastructure network elements. We brought Solaris up, showed our demo, and asked the customer to do the honors and check how many processors Solaris reported. Impressing somebody by printing the number 32 on a screen may not get you very far socially with your friends, or at a bar, but for a skeptical techie the number 32 out of a single processor socket was meaningful. He lived on that side of the fine line between skepticism and paranoia. He touched the processor and, feeling it cold, accused us of smoke and mirrors; basically of running the demo from a different machine. We proved the accuser wrong, and ended up making the unintended point that Niagara really is a Cooler technology. We earned the right to reuse the letters for the next cool technology, CMT.
But before you start asking about how to keep your hands warm in the CMT era, fear not, there is still Memory, that is, plenty of DIMMs to keep the operator warm. What a coincidence, every train of thought takes me back to the Memory theme.
Posted at 12:14PM Dec 06, 2005 by Ariel Hendel in Sun | Comments[1]
Turning the tables on Horizontal Scaling
While previewing the UltraSPARC T1 CMT processor to many customers I have been saying that it turns the tables on horizontal scaling. Here I am to expand on what was just a bullet in my slides, through this and other CMT-related musings.
We hear about Google indexing and search server farms as the ultimate in horizontal scaling: 5000, 6000, 10000 servers. Every time I hear about Google the number goes up. Not the stock, the number of servers in the farm. I also remember Subodh stating that whatever can be horizontally scaled, will. Essentially the wisdom of Horizontal Scaling is to realize that the cost per processor tends to be higher in large SMP systems than in uni- or dual-processor systems. If the application (or the workload) is amenable to separation into loosely coupled networked processing, then the Horizontal Scaling (a.k.a. scale out) architecture is appealing on both cost and service availability metrics.
We all know that web-facing workloads are horizontally scalable by virtualizing the service IP address through a load balancer box, or by using Round Robin DNS. Wireless telephony servers also tend to scale out for most network elements, generally with a clustering or HA layer instead of the simplistic intercepting load balancer used for IP networks. Sunray servers are a hybrid species: they are deployed horizontally as groups of servers, but users also want fast Sunray sessions on fast SMPs. This point is proven by my colleague Jochen, who has mastered the art of manual load balancing, that is, repeatedly sliding a Javacard until you get the fastest Sunray server in the building.
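For readers who have not met the round robin idea, here is a minimal Python sketch of the rotation a load balancer box or a Round Robin DNS applies when handing requests to a farm. The server names and the round_robin_balancer helper are hypothetical, and a real balancer would also track health and load, which this sketch deliberately ignores.

```python
from itertools import cycle


def round_robin_balancer(servers):
    """Return a picker that hands out servers in strict rotation.

    Illustrative only: no health checks, weights, or session affinity.
    """
    rotation = cycle(servers)

    def pick():
        # Each call advances the rotation by one server.
        return next(rotation)

    return pick


# Hypothetical three-server farm behind one service address.
pick = round_robin_balancer(["web1", "web2", "web3"])
assignments = [pick() for _ in range(5)]
print(assignments)
```

Five requests against three servers wrap around to the first server again, which is all the "simplistic" balancer does; Jochen's manual method is the same loop with a human deciding when to stop.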
Does Subodh's prediction mean that the world is inexorably converging on server farms or clusters? Does it mean that commoditized generic 1P/2P whiteboxes will underpin our Internet and telephone networks? Not if we realize that there is one act left to Horizontal Scaling, that is, scaling processors inside a chip rather than scaling them out across the sheet metal of discrete servers. CMT is exactly that, turning the tables on Horizontal Scaling by keeping the cost and availability benefits, but throwing out its drawbacks. And one of the drawbacks of the scale-out model is the theme that got my blogging started: Memory.
Next time somebody tells you about the Google server farm, forget the number of servers they claim by then, just visualize the number of memory DIMMs, imagine them all lined up and warm just in case you want to search for something at that very moment. Now ask yourselves, is there a way of getting the computation to scale horizontally without having to spread all these memory DIMMs all over the place? CMT is exactly that, distributing computation while consolidating the system memory. Need a shorter description of that? How about Horizontal Micro-scaling. You heard it first in my blog.
Now I'll stop and wait for a comment or two before I go on. And the one comment I need is: But, ain't memory cheap anyways?
Posted at 09:05AM Dec 06, 2005 by Ariel Hendel in Sun | Comments[2]

