Tuesday December 28, 2004 | Scott & The CEO | General |
A couple years ago, a senior field systems engineer wrote to me and said, in part:
I recently had lunch with Scott McNealy and the CEO of one of our largest customers. Scott launched into his BFWTS pitch and I watched the CEO play with his pasta salad. When Scott switched topics to talk about what we can do together to increase the CEO's competitive situation, the CEO paid close attention, and offered to pay for lunch!
I still find this interesting and enlightening. Too often, I think, we (Sun's field) focus on the wrong value proposition when we talk to business execs. Scott can teach us an important lesson.
We know how to talk to IT execs....
There are those who need to solve a specific point problem. We seek to understand their challenges and show how our (and our partners') products and technologies and services can be tied together to provide a complete solution to that problem. We make sure the customer appreciates why Sun's solution is superior to what our competitors might be proposing. We can also describe a collection of proof points related to our vision and experience and market share and investments and quality initiatives and technologies, etc, that paints a picture of a full solution provider that "gets it".
There are those IT customers who are focused on big picture. We describe how we can work together to create a more efficient IT organization that is more tightly bound to the business. We give examples of what we've accomplished with others, and explain how our partnership and initiatives will create a more robust and adaptable infrastructure with reduced IT cost, freeing up cash to fund incremental innovation and progress. We can provide ideas about initiatives they might like to investigate, and offer to develop a Proof of Concept around some of them.
But, outside of IT, how do you help a CEO/COO/CFO increase their success on *their* competitive battleground? Discussing the latest in dual-core superscalar CPU designs or OS partition strategies probably isn't the right approach :-)
A good start may be to help the business exec understand how others are effectively using IT as a competitive weapon, rather than just a captive overhead cost-center. We need to offer C-level ideas about how IT can be retooled to make their company's core value proposition more attractive and/or accessible in the marketplace. Ultimately this is what IT can do to increase their competitive situation. IT can help generate wealth. In addition, we can also show how IT can drive cost out of operations, and how new services can streamline existing B2C and B2B channels or open up new ones.
While Sun is a powerful crucible of IT innovation, we (each one of us, talking to our customers) need to spend more time demonstrating how our technology has transformed businesses into leaders in their market segment. And how we can do the same for the customers with whom we are interacting. Technology discussions and sales (the cart) will follow the value proposition (the horse).
December 28, 2004 09:20 AM EST Permalink
Sunday December 26, 2004 | NextGen MP3 Player | Personal |
There are several devices that have features that I desire. But they need to converge. I'm betting they will by next Christmas. For example, I'd like a Digital Audio Player that has:
This is closer than you might think... I bought a tiny 512MB SanDisk MP3 player with an FM radio and voice recorder. Perfect for commute time and jogging. But the voice recorder is poor, and it records voice to WAV, not MP3. It also doesn't have a flash expansion slot or a line-in jack. Some other MP3 players have flash expansion slots (eg: Rio). The iRiver MP3 players have great MP3 encoders and line-in jacks that encode up to 320kbps quality MP3s. Those also have a mic that works well for dictation *and* recording general meetings. Boomgear (and others) have started to add Bluetooth to their MP3 players... but the current generation simply routes BT-enabled cellphone audio to your MP3 player's headset... They can't send your music content to a stereo BT headset.
Of course, by then, I'll have added more to my list, and will have to wait until the following Christmas!
December 26, 2004 11:43 PM EST Permalink
| Santa & CS Lewis | Personal |
Being Christmas, I can relate to the old "joke" that there are three stages in a man's life:
I'm in Stage 2, trying my best to delay the onset of Stage 3 :-)
On a more serious note... and speaking of Christmas and Santa... I'm sure we've all found ourselves, at some point this season, reflecting on that epoch event in history for which the Christmas holiday (root: Holy Day) is fashioned. Church services around the world are standing room only events, filled with many folks who will not set foot in a church again until next Christmas... unless of course someone they care about gets married or dies in the meantime.
We all learned in The Great Election (Bush -vs- Kerry) what a farce exit polls can be, but I was just thinking that it would be interesting to perform an exit poll following Christmas Eve church services, with a single multiple choice question. A question that CS Lewis developed in his classic book "Mere Christianity" [1952], particularly in Chapter 4. It might go something like this:
Christmas Eve Church Service Exit Poll Questionnaire:
Q: Which of these best describes your opinion of who Jesus was?
And compare the results to the same population as they exited from work some mid-week day in, say, March. Hmmm.
I added choice #3 to Lewis' choices, because I think some might actually pick that as the "safe" option, even though most serious scholars and historians, even those who are devout atheists, would not deny that Jesus existed, was a great teacher, made claims of divinity and equality with God, performed wonders and healings, and was eventually crucified, ostensibly because his teachings threatened the elite power base. Regardless, it is a revealing question... rather the answer to it is revealing. But it's hard for many to think about, because the answer has profound implications for the rest of one's life (and beyond). There is actually a Stage #4 in the "Santa Progression"... when the bell tolls, the No. 2 pencil is finally put down, and the test is handed in for grading. This is one test where Gene Kranz's (NASA, Apollo 13) famous saying rings true... "Failure is not an option". Thankfully, we've all been given the answer in advance, and it's an open book exam!
December 26, 2004 09:50 PM EST Permalink
Friday December 24, 2004 | /kevin: A BSC Giant | General |
Kevin (BSC's /kevin) helped me figure out how to customize my Weblog files and the look of my Blog page. Thanks a lot Kevin! It's easy, once you get pointed in the right direction.
I find that to be the case in many of life's challenges. Just dive in, and don't be afraid to ask a few questions of those who have gone before. And then be willing to help others who will follow, expecting/hoping that many will exceed even your own contributions.
One of my favorite expressions came from a letter Sir Isaac Newton wrote to his friend and colleague Robert Hooke in 1676:
"If I have seen further it is by standing on the shoulders of Giants"
I like to think that, combined, we all make up the shoulders on which each one of us stands!
http://scienceworld.wolfram.com/biography/Newton.html
December 24, 2004 12:03 PM EST Permalink
Thursday December 23, 2004 | Big Sun Clusters!! | Computers |
The Center for Computing and Communication (CCC) at the RWTH Aachen University has recently published details about two interesting clusters they operate using Sun technology. RWTH Aachen is the largest university of technology in Germany and one of the most renowned technical universities in Europe, with around 28,000 students, more than half of whom are in engineering (according to their website).
Check this out!
First, there is a huge Opteron-Linux-Cluster that consists of 64 of Sun's V40z servers, each with four Opteron CPUs. The 256 processors total 1.1TFlop/s (peak) and have a pool of RAM equal to 512GB. Each node runs a 64-bit version of Linux. Hybrid Programs use a combination of MPI and OpenMP, where each MPI process is multi-threaded. The hybrid parallelization approach uses a combination of coarse grained parallelism with MPI and underlying fine-grained parallelism with OpenMP in order to use as many processors efficiently as possible. For shared memory programming, OpenMP is becoming the de facto standard.
See: http://www.rz.rwth-aachen.de/computing/info/linux/primer/opteron_primer_V1.1.pdf
Another Cluster is based on 768 UltraSPARC-IV processors, with an accumulated peak performance of 3.5 TFlop/s and a total main memory capacity of 3 TeraByte. The Operating System's view of each of the two cores of the UltraSPARC IV processors is as if they are separate processors. Therefore from the user's perspective the Sun Fire E25Ks have 144 "processors", the Sun Fire E6900s have 48 "processors" and the Sun Fire E2900s have 24 "processors" each. All compute nodes also have direct access to all work files via a fast storage area network (SAN) using the QFS file system. High IO bandwidth is achieved by striping multiple RAID systems.
See: http://www.rz.rwth-aachen.de/computing/info/sun/primer/primer_V4.0.pdf
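The figures quoted for the two clusters can be cross-checked with a little arithmetic. The sketch below uses only numbers from the text above, and it assumes that the "768 UltraSPARC-IV processors" are physical dual-core chips; the variable names are mine, not RWTH Aachen's.

```python
# Back-of-the-envelope check of the cluster figures quoted above.
# All inputs come from the text; nothing here is measured.

# Opteron cluster: 64 V40z nodes, 4 CPUs each, 1.1 TFlop/s peak, 512 GB RAM.
opteron_nodes = 64
cpus_per_node = 4
opteron_peak_tflops = 1.1
opteron_ram_gb = 512

opteron_cpus = opteron_nodes * cpus_per_node
gflops_per_opteron = opteron_peak_tflops * 1000 / opteron_cpus
ram_per_node_gb = opteron_ram_gb / opteron_nodes

# UltraSPARC IV cluster: 3.5 TFlop/s peak.
# ASSUMPTION: the 768 "processors" are dual-core chips, not cores.
usiv_chips = 768
cores_per_chip = 2
usiv_peak_tflops = 3.5

usiv_cores = usiv_chips * cores_per_chip
gflops_per_usiv_core = usiv_peak_tflops * 1000 / usiv_cores

# OS-visible "processors" per node type, converted back to physical chips.
os_visible = {"Sun Fire E25K": 144, "Sun Fire E6900": 48, "Sun Fire E2900": 24}
chips_per_node = {name: n // cores_per_chip for name, n in os_visible.items()}

print(f"Opteron CPUs: {opteron_cpus}, peak per CPU: {gflops_per_opteron:.1f} GFlop/s")
print(f"RAM per V40z node: {ram_per_node_gb:.0f} GB")
print(f"UltraSPARC IV cores: {usiv_cores}, peak per core: {gflops_per_usiv_core:.2f} GFlop/s")
for name, chips in chips_per_node.items():
    print(f"{name}: {chips} chips")
```

The per-CPU and per-core peaks are simple averages, so they say nothing about how clock speeds vary between node types.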
December 23, 2004 06:34 PM EST Permalink
| Big -vs- Small Servers? | Computers |
Big Iron -vs- Blades. Mainframe -vs- Micro. Hmmm. We're talking Aircraft Carriers -vs- Jet Skis, right?
Sun designs and sells servers that cost from ~$1000 to ~$10 million. Each! We continue to pour billions into R&D and constantly raise the bar on the quality and performance and reliability and feature set that we deliver in our servers. No wonder we lead in too many categories to mention. Okay, I'll mention some :-)

While the bar keeps rising on our "Enterprise Class", the Commodity/Volume Class is never too far behind. In fact, I think it may be inappropriate to continue to refer to our high-end as our Enterprise-class Servers, because that could imply that our "Volume" Servers are only for workgroups or non-mission-critical services. That is hardly the case. Both are important and play a role in even the most critical service platforms.
Let's look at the next generation Opterons... which are only months away. And how modern S/W Architectures are fueling the adoption of these types of servers...
Today's AMD CPUs, with on-board hypertransport pathways, can handle up to 8 CPUs per server! And in mid-2005, AMD will ship dual-core Opterons. That means that it is probable for a server, by mid-2005 or so, to have 16 Opteron cores (8 dual-core sockets) in just a few rack units of space!! If you compare SPECrate values, such a server would have the raw compute performance capability of a full-up $850K E6800. Wow!
AMD CPU Roadmap: http://www.amd.com/us-en/Processors/ProductInformation/0,,30_118_608,00.html
AMD 8-socket Support: http://www.amd.com/us-en/Corporate/VirtualPressRoom/0,,51_104_543~72268,00.html
SPECint_rate: http://www.spec.org/cpu2000/results/rint2000.html
E6800 Price: http://tinyurl.com/3xbq2
Clearly, there are many reasons why our customers are buying, and will continue to buy, our large SMP servers. They offer Mainframe-class on-line maintenance, redundancy, upgradability. They even exceed the ability of a Mainframe in terms of raw I/O capability, compute density, on-the-fly expansion, etc.
But H/W RAS continues to improve in the Opteron line as well. One feature I hope to see soon is on-the-fly PFA-orchestrated CPU off-lining. If this is delivered, it'll be on Solaris x86 rather than Linux. Predictive Fault Analysis would detect when one of those 16 cores or 32 DIMMs starts to experience soft errors, in time to fence off that component before the server and all the services crash. The blacklisted component could be serviced at the next scheduled maintenance event. We can already do that on our Big Iron. But with that much power, and that many stacked services in a 16-way Opteron box, it would be nice not to take a node panic and extended node outage.
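To make the idea concrete, here is a toy sketch of the decision logic such a feature might use: count correctable (soft) errors per component in a sliding time window and fence the component off once a threshold is crossed. The class name, component labels and thresholds are all hypothetical; this is not how Solaris or any Sun product actually implements it.

```python
from collections import defaultdict, deque


class SoftErrorMonitor:
    """Toy predictive-fault monitor (hypothetical, for illustration only).

    A component is offlined after `max_errors` correctable errors
    fall inside a sliding window of `window_seconds`.
    """

    def __init__(self, max_errors=3, window_seconds=3600.0):
        self.max_errors = max_errors
        self.window_seconds = window_seconds
        self.events = defaultdict(deque)  # component -> recent error timestamps
        self.offlined = set()             # components fenced off so far

    def record_soft_error(self, component, timestamp):
        """Log one correctable error; return True if the component is offlined."""
        if component in self.offlined:
            return True
        recent = self.events[component]
        recent.append(timestamp)
        # Drop errors that have aged out of the window.
        while recent and timestamp - recent[0] > self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_errors:
            self.offlined.add(component)
        return component in self.offlined


monitor = SoftErrorMonitor(max_errors=3, window_seconds=3600.0)

# Two early errors age out before the third arrives, so the DIMM stays online.
for t in (0.0, 100.0, 5000.0):
    monitor.record_soft_error("dimm-17", t)
still_online = "dimm-17" not in monitor.offlined

# Two more errors in quick succession push it over the threshold.
for t in (5100.0, 5200.0):
    monitor.record_soft_error("dimm-17", t)

print("online after sparse errors:", still_online)
print("offlined components:", sorted(monitor.offlined))
```

A real implementation would also have to migrate work off the component before fencing it; the sketch only covers the detection step.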
On the other hand, 80% of the service layers we deploy are already on, or are attempting to move to, the horizontal model. And modern S/W architectures are increasingly designed to provide continuity of service level even in the presence of various fault scenarios. Look at Oracle RAC, replicated state App Servers with Web-Server plug-ins to seamlessly transfer user connections, Load Balanced web services, TP monitors, Object Brokers, Grid Engines and Task Dispatchers, and SOA designs in which an alternate for a failed dependency is rebound on-the-fly.
These kinds of things, and many others, are used to build resilient services that are much more immune to component or node failures. In that regard, node level RAS is less critical to achieving a service level objective. Recovery Oriented Computing admits that H/W fails [http://roc.cs.berkeley.edu/papers/ROC_TR02-1175.pdf].
We do need to reduce the failure rate at the node/component level... but as Solution Architects, we need to design services such that node/component failure can occur, if possible, without a service interruption or degradation of "significance".
In the brave new world (or, the retro MF mindset) we'll stack services in partitions across a grid of servers. Solaris 10 gives us breakthrough new Container technology that will provide this option. Those servers might be huge million dollar SMP behemoths, or $2K Opteron blades... doesn't matter from the architectural perspective. We could have dozens of services running on each server... however, most individual services will be distributed across partitions (Containers) on multiple servers, such that a partition panic or node failure has minimal impact. This is "service consolidation" which includes server consolidation as a side effect. Not into one massive server, but across a limited set of networked servers that balance performance, adaptability, service reliability, etc.
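The placement idea above can be sketched in a few lines: spread each service's partitions round-robin over the servers, then ask how much of each service survives the loss of one node. The server names, service names and instance counts are made up for illustration, and the round-robin policy is just one simple choice, not a description of any Solaris Container tooling.

```python
from itertools import cycle


def place_services(services, servers):
    """Spread each service's instances round-robin over the servers.

    services: dict of service name -> number of instances (partitions).
    Returns a dict of server -> list of service names hosted there.
    """
    placement = {server: [] for server in servers}
    ring = cycle(servers)
    for name, instances in services.items():
        for _ in range(instances):
            placement[next(ring)].append(name)
    return placement


def surviving_fraction(placement, failed_server):
    """Per service, the fraction of instances left if one server fails."""
    total, lost = {}, {}
    for server, hosted in placement.items():
        for name in hosted:
            total[name] = total.get(name, 0) + 1
            if server == failed_server:
                lost[name] = lost.get(name, 0) + 1
    return {name: (total[name] - lost.get(name, 0)) / total[name] for name in total}


# Hypothetical grid: four servers, three stacked services.
servers = ["node-a", "node-b", "node-c", "node-d"]
services = {"web": 4, "app": 4, "db": 2}

placement = place_services(services, servers)
after_failure = surviving_fraction(placement, "node-a")

for server, hosted in placement.items():
    print(server, hosted)
print("capacity left after node-a fails:", after_failure)
```

With this layout every service keeps at least half of its instances when any single node is lost, which is the "minimal impact" property the paragraph describes.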

Server RAS matters. Competitive pressure will drive continuous improvement in quality and feature sets in increasingly powerful and inexpensive servers. At the same time, new patterns in S/W architecture will make "grids" of these servers work together to deliver increasingly reliable services. Interconnect breakthroughs will only accelerate this trend.
The good news for those of us who love the big iron is that there will always be a need for aircraft carriers even in an age of powerful jet skis.
December 23, 2004 04:57 PM EST Permalink
Tuesday December 21, 2004 | Hotel Soap: Humor | Humor |
Comic Shelley Berman wrote this back in the mid-70s. Maybe it's because I travel so much for Sun (Delta Platinum for 5 years now) that I find this so funny. Enjoy!
Dear Maid,
Please do not leave any more of those little bars of soap in my bathroom since I have brought my own bath-sized Dial. Please remove the six unopened little bars from the shelf under the medicine chest and another three in the shower soap dish. They are in my way.
Thank you,
S. Berman
Dear Room 635,
I am not your regular maid. She will be back tomorrow, Thursday, from her day off. I took the 3 hotel soaps out of the shower soap dish as you requested. The 6 bars on your shelf I took out of your way and put on top of your Kleenex dispenser in case you should change your mind. This leaves only the 3 bars I left today which my instructions from the management is to leave 3 soaps daily. I hope this is satisfactory.
Kathy, Relief Maid
Dear Maid - I hope you are my regular maid.
Apparently Kathy did not tell you about my note to her concerning the little bars of soap. When I got back to my room this evening I found you had added 3 little Camays to the shelf under my medicine cabinet. I am going to be here in the hotel for two weeks and have brought my own bath-size Dial so I won't need those 6 little Camays which are on the shelf. They are in my way when shaving, brushing teeth, etc. Please remove them.
S. Berman
Dear Mr. Berman,
My day off was last Wed. so the relief maid left 3 hotel soaps which we are instructed by the management. I took the 6 soaps which were in your way on the shelf and put them in the soap dish where your Dial was. I put the Dial in the medicine cabinet for your convenience. I didn't remove the 3 complimentary soaps which are always placed inside the medicine cabinet for all new check-ins and which you did not object to when you checked in last Monday. Please let me know if I can of further assistance.
Your regular maid,
Dotty
Dear Mr. Berman,
The assistant manager, Mr. Kensedder, informed me this morning that you called him last evening and said you were unhappy with your maid service. I have assigned a new girl to your room. I hope you will accept my apologies for any past inconvenience. If you have any future complaints please contact me so I can give it my personal attention. Call extension 1108 between 8 AM and 5 PM. Thank you.
Elaine Carmen
Housekeeper
Dear Miss Carmen,
It is impossible to contact you by phone since I leave the hotel for business at 7:45 AM and don't get back before 5:30 or 6 PM. That's the reason I called Mr. Kensedder last night. You were already off duty. I only asked Mr. Kensedder if he could do anything about those little bars of soap. The new maid you assigned me must have thought I was a new check-in today, since she left another 3 bars of hotel soap in my medicine cabinet along with her regular delivery of 3 bars on the bath-room shelf. In just 5 days here I have accumulated 24 little bars of soap. Why are you doing this to me?
S. Berman
Dear Mr. Berman,
Your maid, Kathy, has been instructed to stop delivering soap to your room and remove the extra soaps. If I can be of further assistance, please call extension 1108 between 8 AM and 5 PM. Thank you,
Elaine Carmen,
Housekeeper
Dear Mr. Kensedder,
My bath-size Dial is missing. Every bar of soap was taken from my room including my own bath-size Dial. I came in late last night and had to call the bellhop to bring me 4 little Cashmere Bouquets.
S. Berman
Dear Mr. Berman,
I have informed our housekeeper, Elaine Carmen, of your soap problem. I cannot understand why there was no soap in your room since our maids are instructed to leave 3 bars of soap each time they service a room. The situation will be rectified immediately. Please accept my apologies for the inconvenience.
Martin L. Kensedder
Assistant Manager
Dear Mrs. Carmen,
Who the hell left 54 little bars of Camay in my room? I came in last night and found 54 little bars of soap. I don't want 54 little bars of Camay. I want my one damn bar of bath-size Dial. Do you realize I have 54 bars of soap in here. All I want is my bath size Dial. Please give me back my bath-size Dial.
S. Berman
Dear Mr. Berman,
You complained of too much soap in your room so I had them removed. Then you complained to Mr. Kensedder that all your soap was missing so I personally returned them. The 24 Camays which had been taken and the 3 Camays you are supposed to receive daily. I don't know anything about the 4 Cashmere Bouquets. Obviously your maid, Kathy, did not know I had returned your soaps so she also brought 24 Camays plus the 3 daily Camays. I don't know where you got the idea this hotel issues bath-size Dial. I was able to locate some bath-size Ivory which I left in your room.
Elaine Carmen
Housekeeper
Dear Mrs. Carmen,
Just a short note to bring you up-to-date on my latest soap inventory. As of today I possess:
Please ask Kathy when she services my room to make sure the stacks are neatly piled and dusted. Also, please advise her that stacks of more than 4 have a tendency to tip. May I suggest that my bedroom window sill is not in use and will make an excellent spot for future soap deliveries. One more item, I have purchased another bar of bath-sized Dial which I am keeping in the hotel vault in order to avoid further misunderstandings.
S. Berman
December 21, 2004 01:17 PM EST Permalink
| Physics of Santa | Humor |
You might have seen this in years past. If not, here is the analysis from Spy mag in January 1990, as well as a rebuttal.
1. No known species of reindeer can fly. BUT there are 300,000 species of living organisms yet to be classified, and while most of these are insects and germs, this does not COMPLETELY rule out flying reindeer which only Santa has ever seen.
2. There are 2 billion children (persons under 18) in the world. BUT since Santa doesn't (appear) to handle the Muslim, Hindu, Jewish and Buddhist children, that reduces the workload to 15% of the total -- 378 million according to Population Reference Bureau. At an average (census)rate of 3.5 children per household, that's 91.8 million homes. One presumes there's at least one good child in each.
3. Santa has 31 hours of Christmas to work with, thanks to the different time zones and the rotation of the earth, assuming he travels east to west(which seems logical). This works out to 822.6 visits per second. This is to say that for each Christian household with good children, Santa has 1/1000th of a second to park, hop out of the sleigh, jump down the chimney, fill the stockings, distribute the remaining presents under the tree, eat whatever snacks have been left, get back up the chimney, get back into the sleigh and move on to the next house. Assuming that each of these 91.8 million stops are evenly distributed around the earth (which, of course, we know to be false but for the purposes of our calculations we will accept), we are now talking about .78 miles per household, a total trip of 75-1/2 million miles, not counting stops to do what most of us must do at least once every 31 hours, plus feeding and etc. This means that Santa's sleigh is moving at 650 miles per second, 3,000 times the speed of sound. For purposes of comparison, the fastest man-made vehicle on earth, the Ulysses space probe, moves at a poky 27.4 miles per second -- a conventional reindeer can run, tops, 15 miles per hour.
4. The payload on the sleigh adds another interesting element. Assuming that each child gets nothing more than a medium-sized lego set (2 pounds), the sleigh is carrying 321,300 tons, not counting Santa, who is invariably described as overweight. On land, conventional reindeer can pull no more than 300 pounds. Even granting that "flying reindeer" (see point #1) could pull TEN TIMES the normal amount, we cannot do the job with eight, or even nine. We need 214,200 reindeer. This increases the payload - not even counting the weight of the sleigh -- to 353,430 tons. Again, for comparison -- this is four times the weight of the Queen Elizabeth.
5. 353,000 tons traveling at 650 miles per second creates enormous air resistance -- this will heat the reindeer up in the same fashion as spacecrafts re-entering the earth's atmosphere. The lead pair of reindeer will absorb 14.3 QUINTILLION joules of energy. Per second. Each. In short, they will burst into flame almost instantaneously, exposing the reindeer behind them, and create deafening sonic booms in their wake. The entire reindeer team will be vaporized within 4.26 thousandths of a second. Santa, meanwhile, will be subjected to centrifugal forces 17,500.06 times greater than gravity. A 250-pound Santa (which seems ludicrously slim)would be pinned to the back of his sleigh by 4,315,015 pounds of force.
In conclusion -- If Santa ever DID deliver presents on Christmas Eve, he's dead now.
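For anyone who wants to check the analysis, the core arithmetic in points 2 through 4 can be replayed in a few lines. This sketch uses only the figures quoted in the piece (91.8 million homes, 3.5 children per home, 31 hours, 2-pound gifts, 300 pounds per reindeer times ten, .78 miles per stop) and assumes a 2,000-pound ton. The trip distance and sleigh speed computed this way come out somewhat below the rounded figures in the article.

```python
# Replaying the arithmetic from the Santa analysis above.
# Inputs are the figures quoted in the text; a 2,000 lb ton is assumed.

homes = 91.8e6               # households with at least one good child
children_per_home = 3.5
hours = 31                   # hours of Christmas available to Santa
pounds_per_gift = 2
pounds_per_ton = 2000        # ASSUMPTION: short tons
pull_per_reindeer_lb = 300 * 10   # "TEN TIMES the normal amount"
miles_per_stop = 0.78

seconds = hours * 3600
visits_per_second = homes / seconds

children = homes * children_per_home
payload_tons = children * pounds_per_gift / pounds_per_ton
reindeer_needed = payload_tons * pounds_per_ton / pull_per_reindeer_lb

trip_miles = miles_per_stop * homes
speed_miles_per_second = trip_miles / seconds

print(f"visits per second:  {visits_per_second:.1f}")
print(f"payload:            {payload_tons:,.0f} tons")
print(f"reindeer needed:    {reindeer_needed:,.0f}")
print(f"trip distance:      {trip_miles / 1e6:.1f} million miles")
print(f"sleigh speed:       {speed_miles_per_second:.0f} miles per second")
```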
=============
A rebuttal:
If you're going to criticise Santa Claus on physical grounds, you may at least do it right.
The payload calculations are nonsense. Adding, say, 1000 stops back at the North Pole for reloading adds only a few percent to the entire distance covered, while reducing the payload by a factor of 1000. This is clearly the way to go.
The nonuniform distribution of children has a tremendous effect on the routing. With sensible routing, the average distance from a good child to the next good child is only a couple hundred feet in suburban conditions (this is clearly higher in the country, but is much less in, say, New York City). With only .05 miles between average good children, Santa need only travel at Mach 200, just a little faster than Ulysses. This reduces the force of air resistance by a factor of 200, and the power absorbed by the reindeer by 3000.
Of course, if Santa stops to give coal to bad children it could slow things down a bit. But it appears that increasing population has made Santa give up that trick. When was the last time you heard of anybody getting a lump of coal?
We all saw the pictures of a smart bomb falling through an Iraqi smokestack. Clearly Santa uses the same technology for toys and chimneys. By dropping, say, 100 toys at a time from high altitude, Santa can reduce his speed by another factor of 10. While still supersonic, this is now slightly less than orbital velocity, sparing Santa and his team the trauma of extreme centrifugal force.
Santa's trip IS a remarkable feat of aeronautics, but please don't say it's impossible.
December 21, 2004 12:03 PM EST Permalink
Monday December 20, 2004 | HeartRates & Health | Exercise |
So I recently played Racquetball, for the first time in 8 years! I have been jogging some, so my heart is in pretty good shape. But RB stresses muscles you probably forgot you had! At least for me, it was serious wind sprints :-)
Being the electronics gadget collector that I am, I wanted to plot my heart rate chart during the game, and compare it to a recent jog. So I wore my Polar HR monitor. Racquetball, it turns out, sustained an elevated HR that exceeds my jogging! And three games totaled 41 minutes, or almost twice my jogging time. So, from a cardio perspective, this is a really good exercise. But I knew that :-). I've included the chart from my Polar HR monitor below.
The following link is a 5-page scientific report on the statistics of Heart Rate Recovery and how it correlates to heart disease and death. It is pretty intense reading. But the bottom line is that it found that a HR recovery rate of less than 22 "beats per minute" during the 2 minutes following intense cardio exercise, is a bad thing. My 2 minute recovery was 34 bpm for racquetball, and closer to 50 for jogging.
http://www.cardiology.palo-alto.med.va.gov/recentpapers/AJCHRR.pdf
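The 2-minute recovery figure is just a subtraction followed by a threshold check, as in the small sketch below. The 22 bpm cutoff is the one quoted above; the peak and 2-minute readings are hypothetical values picked so that the result matches my 34 bpm racquetball recovery, and the function names are mine.

```python
RECOVERY_CUTOFF_BPM = 22  # cutoff quoted above from the linked report


def hr_recovery(peak_bpm, bpm_after_2_min):
    """Heart rate recovery: the drop in bpm over the 2 minutes after exercise."""
    return peak_bpm - bpm_after_2_min


def is_below_cutoff(recovery_bpm, cutoff=RECOVERY_CUTOFF_BPM):
    """True if the recovery is under the cutoff the report flags as a bad sign."""
    return recovery_bpm < cutoff


# HYPOTHETICAL readings chosen to reproduce a 34 bpm recovery.
racquetball = hr_recovery(peak_bpm=170, bpm_after_2_min=136)

print("racquetball recovery:", racquetball, "bpm")
print("below cutoff:", is_below_cutoff(racquetball))
```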
I'm guessing my jogging HR recovery is better than RB because my body is more acclimated to jogging (I've been doing that for a year now) than the wind sprints of racquetball. I stopped the watch before the 2 minute mark when I finished jogging, so I only show a 1-min recovery for jogging.
It is probably worth checking 2-minute HR Recovery every 6 months or so to see how the old ticker is doing.

December 20, 2004 11:44 AM EST Permalink
| Oracle RAC's Secret | Computers |
I'm a big fan of Oracle's RAC technology. I (speaking for myself, not Sun) think it is the only database product out there that can solve the challenge of near continuous database transaction access to a single (complete) data set even when the database node that a client is connected to experiences a catastrophic failure. Traditional failover can incur a 10x longer service disruption, and multi-site state "replicated" designs are complex and subject to sync skew.
However, there is a little secret associated with the magic of Oracle RAC. Well, it isn't really a secret, it is just something that most people don't like to talk about, an elephant in the room that people choose to ignore. In fact, it is a very natural consequence of a NUMA design. NUMA, of course, means "non-uniform memory access", and is generally a serious issue when frequent memory accesses take place in which there is a latency ratio from best to worst of >10x. Local SGA memory access latency on an SMP node takes from 100 to 400ns (depending on the type of node). However, if that node is part of a RAC Cluster and it needs to access a memory block on another RAC node's shared SGA (via cache fusion) the latency to retrieve that block will be measured in micro-seconds, often 1000x worse! Here is an illustration of the NUMA aspect of Oracle RAC:

Oracle published a paper recently in which it lists GBE as having an average latency from 600us to over 1000us. That is well over 1000x worse than the local SMP node! Even Infiniband has a latency of almost 200us, which is 1000x worse than a 200ns local SMP node. Ouch. That is a serious performance hit! Here is a graphic from Oracle's paper:

There is also the issue of bandwidth. An older server from Sun, the F15K, has over 172GB/s of internal bandwidth. That's aggregate B/W among 18 boards, but it is still a TON of bandwidth. GBE, bless its heart, can only push about 70MB/s of user data. Even with 18 of those links (if you attempted to build an "F15K" from blades), that adds up to only about 1.2GB/s. And consider the CPU utilization needed to drive each GBE link. Hmmm. Let's see what Oracle says about Bandwidth, Latency, and CPU:

You can get an idea of why this is a problem when you understand the internal structure of an Oracle database. It's amazing what Oracle can do w.r.t. data integrity and performance. It takes a lot of behind the scenes action. Here is a peek:

And when you try to spread this out among even two nodes, you suffer the consequences of 1000x higher latency, and 100x less throughput. Here is a look at the protocol mgmt that must take place for every node sync or transfer, which can happen thousands of times per second:

So it is no wonder that RAC can run into scaling issues if the data is not highly partitioned, to reduce to a trickle the amount of remote references and cache fusion transfers. TPC-C is an example of a type of benchmark in which the work is split between each node without inter-node interaction. RAC scales wonderfully in that benchmark. The problem is that most ad-hoc databases that customers are attempting to use with RAC involve significant inter-node communications. You can imagine the challenge, even with Infiniband, which still has 1000x higher latency (according to Oracle's tests).
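The ratios above, and the reason partitioning matters so much, can be worked through numerically. The sketch below uses the latency and bandwidth figures quoted in this entry and a deliberately naive weighted-average model of access time. The model and the 1% remote fraction are my own simplifying assumptions; they ignore overlap, queuing and protocol overhead.

```python
# Latency and bandwidth ratios from the figures quoted above.
# Times are kept in integer nanoseconds to avoid rounding noise.

LOCAL_NS = (100, 400)            # local SGA access on an SMP node
GBE_NS = (600_000, 1_000_000)    # Gigabit Ethernet: 600us to 1000us
INFINIBAND_NS = 200_000          # Infiniband: almost 200us

ib_ratio = INFINIBAND_NS / 200                       # vs a 200ns SMP node
gbe_ratio_range = (GBE_NS[0] / LOCAL_NS[1],          # best case for GBE
                   GBE_NS[1] / LOCAL_NS[0])          # worst case for GBE


def effective_latency_ns(remote_fraction, local_ns, remote_ns):
    """Naive weighted average of local and remote access latency.

    ASSUMPTION: accesses are serialized and independent, which real
    database workloads are not; this only shows the trend.
    """
    return (1 - remote_fraction) * local_ns + remote_fraction * remote_ns


# Even 1% remote references over Infiniband dominate the average.
avg_ns = effective_latency_ns(0.01, 200, INFINIBAND_NS)

# Bandwidth: F15K backplane vs 18 GBE links at ~70 MB/s of user data each.
f15k_gb_per_s = 172
blade_gb_per_s = 18 * 70 / 1000
bandwidth_ratio = f15k_gb_per_s / blade_gb_per_s

print(f"Infiniband vs 200ns local: {ib_ratio:.0f}x")
print(f"GBE vs local: {gbe_ratio_range[0]:.0f}x to {gbe_ratio_range[1]:.0f}x")
print(f"average latency with 1% remote: {avg_ns:.0f} ns")
print(f"18 GBE links: {blade_gb_per_s:.2f} GB/s, {bandwidth_ratio:.0f}x less than F15K")
```

Under these assumptions, a workload with just one remote reference in a hundred sees its average access time grow roughly elevenfold, which is why keeping data and queries partitioned is the key to RAC scaling.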
Compare this to an SMP node, in which we have shown near 100% linear scale to 100+ CPUs running real world workloads that involve intense remote memory access. Thankfully, a "remote" access on an SMP box (a CPU asking for a block that is cached by another CPU) is still in the nano-second range. Here is a look at what SMP can do:

I have graphs that show Oracle RAC performance on real-world workloads, but Oracle doesn't allow anyone to publish Oracle performance results without their permission. So I will only suggest that the graph has a much different shape, and that anyone contemplating Oracle RAC should run full load testing and compare the results against a non-RAC SMP baseline.
Okay... what does all this mean? Well, just that Oracle RAC, as I started out saying, is incredible technology that solves a particularly nasty problem that many customers face. But, you must enter the decision to deploy RAC with full knowledge of the engineering trade offs. RAC can be made to perform well in many environments, given a proper design and data/query partitioning and proper skills training. But in general, if there is appreciable inter-node communications, then you should consider using fewer RAC nodes (eg: 2 or 3), in which each node is larger in size. This keeps memory accesses as local as possible.
For many customers, a traditional HA-Failover is actually a very good design choice, in which you leverage the linear scale of an SMP box, and let SunCluster restart the database on some other node if there is a problem. This generally takes ~5-10 minutes, which is an acceptable service disruption duration for many, especially since that might only happen a couple times per year. And, for clusters with more than 4 CPU cores, Oracle charges $70K per CPU core for RAC+Partitioning, whereas Oracle "only" charges $40K per CPU core for an HA-Failover environment (and for failover, you only pay Oracle for the active node if you only expect to run Oracle on the failover node for less than 10 days per year).
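To put rough numbers on that trade-off, here is a small sketch comparing license cost and expected availability for the two designs. The per-core prices and the 10-minute, twice-a-year outage figures come from the paragraph above; the node sizes are a hypothetical example, and the functions are mine, not an Oracle pricing tool.

```python
RAC_PER_CORE = 70_000        # RAC + Partitioning, per CPU core (from the text)
FAILOVER_PER_CORE = 40_000   # HA-Failover, per CPU core (from the text)


def rac_license_cost(nodes, cores_per_node):
    """Every RAC node is active, so every core is licensed."""
    return nodes * cores_per_node * RAC_PER_CORE


def failover_license_cost(active_cores):
    """Only the active node is licensed, assuming the standby runs
    Oracle for fewer than 10 days per year as described above."""
    return active_cores * FAILOVER_PER_CORE


def availability(outages_per_year, minutes_per_outage):
    """Fraction of the year the service is up, given failover outages."""
    minutes_per_year = 365 * 24 * 60
    return 1 - outages_per_year * minutes_per_outage / minutes_per_year


# HYPOTHETICAL sizing: two 8-core RAC nodes vs one active 16-core SMP
# node with an unlicensed standby.
rac_cost = rac_license_cost(nodes=2, cores_per_node=8)
ha_cost = failover_license_cost(active_cores=16)
ha_availability = availability(outages_per_year=2, minutes_per_outage=10)

print(f"RAC licenses:         ${rac_cost:,}")
print(f"HA-Failover licenses: ${ha_cost:,}")
print(f"HA-Failover availability: {ha_availability:.5%}")
```

Whether the extra license cost buys enough additional uptime is exactly the engineering trade-off each customer has to weigh.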
December 20, 2004 05:00 AM EST Permalink
Sunday December 19, 2004 | A Wife's Christmas Present | Personal |
I enjoyed Scott Dickson's blog entry on the Joy of Christmas, as well as his quest for a gift for his wife.
http://blogs.sun.com/roller/page/scottdickson/20041218#fourth_sunday_in_advent_christmas
I just got home from Best Buy on a similar quest (my wife, not his :-). My wife has been on me for a year to upgrade our home network. Which is strange, since I'm a computer professional and general techno-geek, and she is a wonderful stay-at-home homeschooling mother and wife who can't quite figure out that you don't have to hit CR after each line in a word processor.
Anyway, she now has a laptop. So do I. My Toshiba M2 has built-in wireless. We both need to print. So I bought her a Linksys Wireless-G PCMCIA card for her laptop, a Linksys Wireless-G Router for our home, as well as a new HP 6840 Wireless-G printer (our current printer is a few years old). The printer is on sale for $150.00 (if you also buy a Linksys router, which I did), and is natively network-aware and can be shared with up to 5 computers concurrently.
I know... you might think those were for me :-). Yes, I'll have to get something else for her as well. But I'm going to have fun refreshing my home network over the break. And she will think I'm the smartest man alive. At least, she'll lead me to believe she thinks that :-) I might even buy a wireless bridge and hook a SunRay 1 up in the living room, using a PC running the new SunRay Server S/W that runs on Linux.
December 19, 2004 12:22 AM EST Permalink
Friday December 17, 2004 | History Lessons by Bill Petro | Personal |
Bill Petro worked for Sun for 11 years in a senior marketing position. He is now with EMC in a similar role. However, I only knew him thru a series of automated e-mail notes I get from time to time throughout the year. He has a passion for researching and writing short historical narratives on several topics. One topic that I find interesting is a series called "History of the Holidays". You can sign up so that you get these delivered to your in-box just in time for the specific holiday. Or, you can go here and simply read them on-demand: http://www.billpetro.com/HolidayHistory/
Given the current season, here are two examples: a possible scientific explanation for the Christian star of Bethlehem, and the event that is remembered by Jewish people during Chanukah:
http://www.billpetro.com/HolidayHistory/hol/xmas/star12.html
http://www.billpetro.com/HolidayHistory/hol/xmas/chan12.html
December 17, 2004 09:16 AM EST Permalink
Thursday December 16, 2004 | What is Truth? Not Scientific Theory | Personal |
Masood Mortazavi and Geoff Arnold recently debated the concept of "truth" on BSC (blogs.sun.com). I appreciated the pointer to:
http://www.iep.utm.edu/t/truth.htm
Here, the "International Encyclopedia of Philosophy" [IEP] defines "truth". They suggest that "The principal issue is: What is truth?". Ironically (I say that because philosophy and religion seldom mix) in John 18:38, that exact question is asked of Jesus:
http://www.biblegateway.com/passage/?book_id=50&chapter=18&verse=38
At the end of the page on truth, the IEP summarizes that science really isn't about truth, but simply about describing hopefully useful models that allow us to predict and extrapolate the impact of actions and responses in meaningful ways. For example, a scientific model of gravity was useful to the original human "computers" that calculated ballistic trajectory for weapon systems. The formula was certainly useful, but not "true". Here is what the IEP says at the end of their description:
Giere recommends saying science aims for the best available 'representation', in the same sense that maps are representations of the landscape. Maps aren't true; rather, they fit to a better or worse degree. Similarly, scientific theories are designed to fit the world. Scientists should not aim to create true theories; they should aim to construct theories whose models are representations of the world.
That really makes sense to me. It is important to remember that representative models (aka: "scientific theories") are often incrementally refined (and occasionally rejected and reworked) over time, as we attempt to improve the approximation of perceived reality. This is a natural result of our insatiable appetite for scientific discovery. Even when a theory is repeatable and validated thru experimentation, this does not "prove" the theory as true... but simply gives us additional confidence that the model is useful within the context and range of the observations and experiments to which the model has been subjected.
Too often some will lock on to a particular model (aka: scientific theory) in its current form and declare it as irrefutable truth! Just look at the theory of evolution! Wow - talk about a religious topic for some who hold that science has "proven" a model as fact. Evolution is particularly weak because it is based on blind/partial observation without experimental validation. This model is under increasing attack from the scientific community itself. As a lover of science, I applaud those who look forward to the evolution of the theory of evolution (and every other scientific theory), and who promote with an open mind scientific discovery and the reworking of obsolete or incomplete models. I'm particularly interested in "myth busting" theories to which the scientific community holds with religious fervor. That in itself should be a red flag that a philosophical agenda has blinded their purity.
December 16, 2004 06:00 PM EST Permalink
Wednesday December 15, 2004 | Boeing & Root Cause of Failure | Computers |
I found the following very interesting! It is buried in a 22 page report on Boeing's web site: http://www.boeing.com/news/techissues/pdf/statsum.pdf
Statistical Summary of Commercial Jet Airplane Accidents Worldwide Operations (1959 - 2003)
On page 19, you'll find the following graphic (I've added some context elements) that describes the root cause of hull loss and/or loss of life in the worldwide commercial air fleet over the last 10 years:

It is interesting to note that the large majority of cases of, um, "down" time, were due to people either making mistakes (or acting maliciously), or people correctly following faulty or incomplete procedures (which were written by people). It is rarely the products (airplanes) or the environmentals (weather).
In the same way, Gartner and others have long held that complex IT systems fail to deliver expected service levels mostly because of people and process related root causes (est. ~80%). Product failures actually account for a tiny fraction of IT service disruptions.
This seems to point to a general pattern: whenever complex systems expose their complexity to human touch points, even in situations in which those humans are psychologically screened, highly trained, highly paid, and limited in number, catastrophic failures will occur that impact business and/or life.
This is probably no surprise to us. Each one of us has made mistakes behind the wheel, in social settings, etc, due to a variety of reasons (boredom, over-confidence, etc). But the implication of the Boeing and Gartner studies is that we should strive to abstract complexity away from human touch points at every opportunity. Think of "fly-by-wire" controls, in which a pilot's actions are constrained by a flight control system that will not allow actions that could harm the airplane or its passengers. Freedom and flexibility are permitted up to, but not exceeding, a "pain" threshold.
In professional audio systems, a "compressor" is often used. Dynamic response is not affected unless it reaches a threshold in which it might distort or consume undesired energy. Then the system steps in and cleanly limits further dynamic range. As long as you operate in the expected range, you have freedom. If your actions threaten the quality of the output, your action is constrained. Seems like a fair trade-off of freedom and control.
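The compressor idea is easy to sketch. This is a deliberately simplified version that works on raw sample amplitude (real compressors usually work on a signal envelope in decibels, with attack and release times); the function name, threshold, and ratio are illustrative assumptions.

```python
def compress(sample: float, threshold: float = 0.8, ratio: float = 10.0) -> float:
    """Pass samples through untouched below the threshold; above it,
    let only 1/ratio of the excess through (a simple hard-knee curve)."""
    magnitude = abs(sample)
    if magnitude <= threshold:
        return sample                      # expected range: full freedom
    limited = threshold + (magnitude - threshold) / ratio
    return limited if sample >= 0 else -limited   # preserve the sign

if __name__ == "__main__":
    for s in (0.25, 0.8, 1.8, -1.8):
        print(s, "->", compress(s))
```

Inputs inside the threshold come out unchanged. An input of 1.8 is pulled back to roughly 0.9, because only a tenth of the excess over 0.8 survives.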
The ultimate expression of an automated datacenter would be to define desired service levels (and the cost and reward sensitivities as actual service levels vary from the nominal/desired) and let "fly-by-wire" micro-adjustments to the IT Infrastructure control optimization. This could radically reduce IT Service Disruptions as complexity is managed by highly-available and hardened controllers, rather than distracted operators. The sensitivity parameters allow the system to distribute excess resources to those services that could benefit most from better than desired performance, or degrade the least sensitive services if a shortfall were to occur.
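As a toy illustration of those sensitivity parameters, here is a minimal sketch of one allocation pass. Everything in it is hypothetical: the `Service` fields, the single abstract "resource unit", and the greedy policy are assumptions chosen for brevity, not a description of any shipping product.

```python
from dataclasses import dataclass

@dataclass
class Service:
    name: str
    desired: float       # units needed to hit the desired service level
    sensitivity: float   # value gained or lost per unit above/below desired

def allocate(services, capacity):
    """Start every service at its desired level, then hand any surplus to
    the most sensitive service, or take a shortfall from the least
    sensitive services first."""
    alloc = {s.name: s.desired for s in services}
    delta = capacity - sum(alloc.values())
    if delta >= 0:
        if services:
            best = max(services, key=lambda s: s.sensitivity)
            alloc[best.name] += delta
        return alloc
    shortfall = -delta
    for s in sorted(services, key=lambda s: s.sensitivity):
        cut = min(alloc[s.name], shortfall)
        alloc[s.name] -= cut
        shortfall -= cut
        if shortfall <= 0:
            break
    return alloc

if __name__ == "__main__":
    fleet = [Service("billing", 40, 5.0), Service("reports", 30, 1.0)]
    print(allocate(fleet, 80))   # surplus of 10 goes to billing
    print(allocate(fleet, 60))   # shortfall of 10 comes out of reports
```

A real controller would run this continuously and in small steps, which is where the "micro-adjustment" part comes in.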
Of course, cascading failures are still possible. Remember the recent blackout in the NE! Codified heuristics that control optimization decisions are simply human designed algorithmic procedures. And procedures can be flawed or reach an "if" part of a decision tree based on stimuli for which there is no "then" statement. But, once solved and hardened, this datacenter control "product" will be much more dependable at delivering desired service levels than an army of humans manually adjusting knobs.
Can this go too far? Sure! I'm not sure I'd want to fly in a pilotless helicopter around Kauai... There are limits to the value of automated services that pre-define concepts of optimization. However, a helicopter with controllers that prevent potentially harmful actions from an error-prone human pilot would be comforting, and not only might keep the charter service in business (and me alive!), but be leveraged as a way to drive more business.
December 15, 2004 04:30 PM EST Permalink
Sunday December 12, 2004 | ZFS: Boils the Ocean, Consumes the Moon | Computers |
ZFS (aka: Zettabyte Filesystem), Sun's newest filesystem that will ship with an update of Solaris 10 in 2005, can address 128-bit filesystems! Let's explore how insanely huge this is from various perspectives. http://wwws.sun.com/software/solaris/10/ds/zfs.jsp
First, I've heard several times now that to construct/power a storage farm of this size would boil the world's oceans. Is this just hyperbole? Presenters typically say something like: "someone in engineering ran the math and this is amazing but true". The latest was from Larry Wake's presentation in which he said:
If we could implement a physical system with the storage capacity that matches the 128-bit address range of ZFS, that we would "literally evaporate all the oceans on earth".
This got me a little curious... Let's see: ZFS = 128-bit = 3*10^26 [3E26] TB (per filesystem)
Using 300GB spindles, you'd need about 1E27 spindles. Seagate's modern drives consume ~10W idle, and ~14W both at startup and in operation. So, let's go with 10W each, for round numbers. That's 1E28 W, or 1E25 KW, which comes to 8.8E28 KW-hr over a full 24x7 year. That's 3.2E35 Joules.
http://www.seagate.com/cda/products/discsales/enterprise/tech/0,1084,362,00.html
If we apply E=mc^2, we'd need to annihilate 3.5E18 kg of something (old beer cans?) to produce this much energy (power those spindles for a year). The National Oceanic and Atmospheric Administration (NOAA) experts figure that the world's oceans consist of 275 million cubic miles. Seawater weighs 1027 kg/m^3. That means all the oceans of the world weigh about 1.2E21 kg. Perfect conversion of the oceans to energy would spin those disks for only about 340 years!!
Total World Consumption of Energy in 2002 was about 450 Quadrillion BTUs, or about 1.3E14 (130 Trillion) KW-hrs. (Note: quadrillion means 10^15 in the US, and 10^24 in Europe... this stat is in US units.)
http://www.eia.doe.gov/neic/infosheets/electricgeneration.htm
Therefore we'd need about 7E14 times the current worldwide energy consumption just to power this storage farm. If all the world's energy consumption were a *single* grain of sand, then we'd need about 700 trillion grains to produce enough power for this storage farm. For scale, the estimated number of grains of beach sand on the entire planet is 7.5E18.
http://www.hawaii.edu/suremath/jsand.html
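With this many exponents in play, it helps to carry the units explicitly. The following is a rough sketch that redoes the energy chain from the post's round-number inputs; the constant names, the speed-of-light value, and the rounding of the spindle count to 1E27 are my assumptions.

```python
# Back-of-envelope energy check using the post's round numbers.
SPINDLES = 1e27               # ~2**128 bytes / 300e9 bytes per drive, rounded
WATTS_PER_SPINDLE = 10.0      # round-number draw per drive
HOURS_PER_YEAR = 24 * 365
JOULES_PER_KWH = 3.6e6
SPEED_OF_LIGHT = 2.998e8      # m/s
OCEAN_MASS_KG = 1.2e21        # the post's estimate for all oceans
WORLD_KWH_PER_YEAR = 1.3e14   # ~450 quadrillion BTU

def farm_kwh_per_year() -> float:
    """Watts -> kilowatts -> kilowatt-hours over a 24x7 year."""
    kilowatts = SPINDLES * WATTS_PER_SPINDLE / 1000.0
    return kilowatts * HOURS_PER_YEAR

def mass_equivalent_kg(kwh: float) -> float:
    """Mass that must be fully converted to energy (E = m c^2)."""
    return kwh * JOULES_PER_KWH / SPEED_OF_LIGHT ** 2

def ocean_years(kwh_per_year: float) -> float:
    """Years the oceans would last as perfectly converted fuel."""
    return OCEAN_MASS_KG / mass_equivalent_kg(kwh_per_year)

if __name__ == "__main__":
    kwh = farm_kwh_per_year()
    print(f"energy per year : {kwh:.2e} kWh")
    print(f"mass equivalent : {mass_equivalent_kg(kwh):.2e} kg")
    print(f"ocean lasts     : {ocean_years(kwh):.0f} years")
    print(f"vs. world usage : {kwh / WORLD_KWH_PER_YEAR:.1e} x")
```

The step most worth watching is the watts-to-kilowatts division, since dropping it shifts every downstream figure by a factor of 1000.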
As for the physical size of this storage farm (just laying the drives as close as possible to each other): each drive is ~600K mm^3 (the size of Seagate's 180GB disk), so you'd need 6E32 mm^3. The land surface of the earth is about 1.5E8 km^2. You'd cover the earth's land surface with disks to a depth of 2.5 million miles to get this capacity (about 10x the distance to the moon!).
http://hypertextbook.com/facts/2001/DanielChen.shtml
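The volume estimate can be checked the same way. This sketch uses the post's drive count, drive volume, and land area. The mean Earth-Moon distance of 384,400 km and the constant names are my additions.

```python
# Rough check of the "stack of disks" depth, using the post's inputs.
SPINDLES = 1e27
DRIVE_VOLUME_MM3 = 6e5        # ~600K mm^3 per drive
LAND_AREA_KM2 = 1.5e8         # earth's land surface
MM3_PER_KM3 = 1e18            # (1e6 mm per km) cubed
KM_PER_MILE = 1.609344
MOON_DISTANCE_KM = 384_400    # assumption: mean Earth-Moon distance

def stack_depth_km() -> float:
    """Depth of drives if spread evenly over all land, in km."""
    volume_km3 = SPINDLES * DRIVE_VOLUME_MM3 / MM3_PER_KM3
    return volume_km3 / LAND_AREA_KM2

if __name__ == "__main__":
    depth = stack_depth_km()
    print(f"depth: {depth:.2e} km = {depth / KM_PER_MILE:.2e} miles")
    print(f"that is {depth / MOON_DISTANCE_KM:.1f}x the distance to the moon")
```

This comes out to about 4 million km, or roughly 2.5 million miles.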
Okay. The oceans are history! So are we. But ZFS will live forever. :-)
Let's look from another perspective. Lloyd tells us that the subnuclear limit for storage is 10^25 bits/kg. That means that a fully populated 128-bit storage pool would have to weigh at least 600 trillion pounds, just for the recording surface. Any less, and you can't fill the 128-bit space. Sun employees see: http://zfs.eng/faq.shtml
A combat ready aircraft carrier weighs only 194 million pounds! The Empire State building weighs only 1.1 billion pounds! A solid cube made up of 1 *trillion* pennies (273 feet/side, about the length of a football field in each dimension) weighs ~5.5 billion lbs. That is 300% more pennies than the US mint has ever produced!
http://www.kokogiak.com/megapenny/thirteen.asp
A penny made after 1982 weighs just 2.5 grams (5.5116E-3 lbs). That site suggests that 1 trillion (+ 16k or so to make a cube) pennies weigh 3.125 million tons. But in 1982, the penny's composition was altered from 95% copper, 5% zinc to the current 97.5% zinc, 2.5% copper mix, which made it "cheaper" and lighter. That many pennies now weigh just 2.75 million tons (US) or 2.5 million tons (metric), so we need a few more cubes.
You'd need 110 thousand of those cubes to equal the mass of the theoretically perfect mass-efficient storage pool.
In reality, the latest Seagate 300GB disk weighs 1.6 lbs. You'd need 1E27 of these, or 1.6E27 lbs. The moon weighs 1.6E23 lbs. So you'd need the weight of 10,000 moons!! And that's just in the spindles (sans racks, air handlers, etc). http://nssdc.gsfc.nasa.gov/planetary/factsheet/moonfact.html
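For anyone who wants to replay the weight comparisons, here is a small sketch. It treats the 128-bit range as 2^128 bytes, uses the post's 10^25 bits/kg limit, 2.5 g penny, and 1.6 lb drive, and assumes a lunar mass of about 7.35E22 kg. The constant and function names are illustrative.

```python
# Rough mass comparisons; pounds are used to match the post.
LB_PER_KG = 2.20462
POOL_BITS = 2 ** 128 * 8          # assumption: 2**128 bytes, 8 bits each
LLOYD_BITS_PER_KG = 1e25          # storage density limit quoted in the post
PENNY_KG = 2.5e-3                 # post-1982 penny
SPINDLES = 1e27
DRIVE_LB = 1.6
MOON_KG = 7.35e22                 # assumption: approximate lunar mass

def ideal_pool_lb() -> float:
    """Minimum mass of a fully populated pool at the quoted limit."""
    return POOL_BITS / LLOYD_BITS_PER_KG * LB_PER_KG

def penny_cube_lb(pennies: float = 1e12) -> float:
    """Weight of a cube of one trillion modern pennies."""
    return pennies * PENNY_KG * LB_PER_KG

def moons_of_drives() -> float:
    """Weight of all the drives, expressed in moons."""
    return SPINDLES * DRIVE_LB / (MOON_KG * LB_PER_KG)

if __name__ == "__main__":
    print(f"ideal pool   : {ideal_pool_lb():.2e} lb")
    print(f"penny cube   : {penny_cube_lb():.2e} lb")
    print(f"cubes needed : {ideal_pool_lb() / penny_cube_lb():.0f}")
    print(f"moons needed : {moons_of_drives():.0f}")
```

Under these assumptions the ideal pool lands near 600 trillion pounds and about 110 thousand penny cubes, and the real drives come to roughly ten thousand moons.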
Hmmm. I'm thinking 128-bit filesystems might just be enough for a few years. :-)
December 12, 2004 11:19 PM EST Permalink