The Navel of Narcissus
Josh Simons' Coordinates in the Blogosphere

20080903 Wednesday September 03, 2008

New England OpenSolaris User Group Meeting: Wednesday, September 10th!

The fifth meeting of NEOSUG (New England OpenSolaris User Group) will be held next Wednesday, September 10th at Sun's Burlington, Massachusetts site. The featured speaker will be Jim Mauro, who will talk about Solaris 10 and OpenSolaris Performance, Observability, and Debugging. Full details below.

The New England OpenSolaris User Group (NEOSUG) Meeting

Topic for this meeting:

Solaris 10 and OpenSolaris Performance, Observability and Debugging (The Abridged Version)

Who should attend? : UNIX Developers, Solaris users, System Managers and System Administrators.

AGENDA:

New England OpenSolaris User Group Meeting (NEOSUG)
Sept 10, 2008 6:30-9:30 pm (registration opens @5:30)
Sun Microsystems
One Network Drive
Burlington, MA

5:30-6:30: Registration, Refreshments
6:30-6:40: Introductions, Peter Galvin
6:40-8:30: Solaris 10 and OpenSolaris Performance, Jim Mauro, Sun Microsystems
8:30-9:00: Questions and Discussion

Please RSVP at : https://www.suneventreg.com//cgi-bin/register.pl?EventID=2341

TALK DESCRIPTION:

Solaris 10 and OpenSolaris Performance, Observability and Debugging (The Abridged Version)

The observability toolbox in Solaris 10 and OpenSolaris is loaded with powerful tools and utilities for analyzing applications and the underlying system. Solaris Dynamic Tracing (DTrace) allows you to connect the dots between the process- and thread-centric tools and the system utilization tools, and get a complete picture of what your applications are doing, how they are interacting with the kernel, and to what extent they are consuming hardware resources (CPU, memory, etc.).

This two-hour talk walks through the tools, utilities and methods for analyzing workloads on your Solaris systems.

NEOSUG BIOs:

Peter Galvin : Chief Technologist, Corporate Technologies Inc.
Peter Baer Galvin is the Chief Technologist for Corporate Technologies, Inc., a systems integrator and VAR, and was the Systems Manager for Brown University’s Computer Science Department. He has written articles for Byte and other magazines. He wrote the Pete’s Wicked World and Pete’s Super Systems columns at SunWorld Magazine. He is currently contributing editor for SysAdmin Magazine, where he managed the Solaris Corner. Peter is co-author of the Operating Systems Concepts and Applied Operating Systems Concepts textbooks. Blog: http://pbgalvin.wordpress.com

Jim Mauro: Principal Engineer in the Systems Group, Sun Microsystems, Inc.
Jim Mauro works on improving delivered application performance on Sun hardware and Solaris. Jim's recent project work includes Solaris performance as a guest operating system on Xen and VMware virtual machines, Solaris large memory page performance, and Solaris performance on large SPARC systems. Jim co-authored Solaris Internals (1st Ed, Oct 2000), Solaris Internals (2nd Ed, June 2006) and Solaris Performance and Tools (1st Ed, June 2006).

ug-neosug mailing list: ug-neosug@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/ug-neosug


(2008-09-03 13:41:34.0) Permalink Comments [1]

20080902 Tuesday September 02, 2008

Another Worm in My Apple: iPhone 3G Woes

[generic iphoto shot]

I thought I was smart to wait for the second version of Apple's iPhone after having suffered through a host of early-adopter issues with my first-generation MacBook Pro. Apparently not.

Up until last Saturday, I had been mostly satisfied with the iPhone 3G, having resigned myself to the poor battery life, the intermittent switching between Edge and 3G networks, and the occasional Call Failed error. Even with these problems, the iPhone experience had been a compelling one for me.

On Saturday, I went away for the holiday weekend. On Saturday afternoon, all of my third-party applications -- both free and those I had paid for -- stopped working. Every such application would immediately exit after I launched it. Power cycling had no effect. I could not try re-syncing until Monday evening when I got home, though in retrospect I could have tried deleting the apps and downloading them again from the iTunes store (though with the Edge/3G flipping I'm not sure I would have wanted to try that). In any case, syncing to my MacBook Pro did not help. So I deleted the applications and tried to sync again, hoping this would clear the problem. No joy. This time, iTunes complained my computer was not authorized to use any of the applications I had previously downloaded and refused to reinstall them on the phone.

I called Apple support and we fixed the problem by re-authorizing my computer and then completing a sync that reloaded the apps, which now seem to be working again. The rep told me this is a known problem that sometimes needs to be fixed by deleting the apps from both the iPhone and the computer and reloading them from the iTunes store (which keeps track of purchased apps so you do not need to pay again.)

Just before calling Apple, I had two Call Failed incidents during one conversation and had to switch to a landline to complete the call. Not a great advertisement for a phone or for either Apple or AT&T, I'm afraid.


(2008-09-02 10:10:03.0) Permalink Comments [0]

20080803 Sunday August 03, 2008

Wordle: What the Heck Have I Been Talking About?

According to Wordle, here is what I've been talking about lately on the Navel...

Via Joan, Wordle is a cool cloud tool written by Jonathan Feinberg, a member of IBM Research’s Collaborative User Experience Group. Have fun changing colors, text directions, fonts, etc. Make your own beautiful word cloud.


(2008-08-03 07:24:24.0) Permalink Comments [0]

20080801 Friday August 01, 2008

Redmond: Still Banging Rocks Together

I never published the following blog entry, which I wrote in October 2007, prior to Apple's release of Leopard, the latest version of Mac OS X.

I find the fanboy enthusiasm a little embarrassing now given how much pain and suffering Leopard inflicted on me and many others up until recently when the 10.5.4 (FOUR!) release finally seems to have fixed things. It took Apple eight months to fix what they broke! Shame on them, but at least they fixed it. Fanboy exuberance aside, the main point of the original blog entry is still valid so I decided to publish it.

The original entry...

Last night I watched Apple's Leopard preview video, which highlights a few of the major new features which will be included in their upcoming operating system release.

This morning I read some advice for Windows users in PC Magazine and was struck by the contrast. A reader had asked how to read the information on the blue screen of death before his machine rebooted. As Loyd Case noted, "it is strange that a screen meant to convey critical system error messages should disappear before the average human could possibly read it, much less copy down the often huge amounts of information on it."

Never fear: As Neil Rubenking explains in the same article, "you can keep that blue screen visible. Right-click My Computer, Choose Properties. Click to select the Advanced tab (or the Advanced system properties link in Vista). Click the Settings button in the Startup and Recovery pane. Uncheck Automatically restart, Click OK | OK. Now the blue-screen information will remain visible on your screen until you force a reboot with Ctrl-Alt-Del."

They really are still banging rocks together up in Redmond, still hoping someday they'll figure out how to make fire. Meanwhile, Apple is kicking butt and delivering an absolutely unparalleled and increasingly jaw-dropping user experience to its customers.


(2008-08-01 13:45:43.0) Permalink Comments [0]

20080730 Wednesday July 30, 2008

Fresh Bits: Attention all OpenMP and MPI Programmers!

The latest preview release of Sun's compiler and tools suite for C, C++, and FORTRAN users is now available for free download. Called Sun Studio Express 07/08, this release of Sun Studio marks an important advance for HPC customers and for any customer interested in extracting high performance from today's multi-threaded and multi-core processors. In addition to numerous compiler performance enhancements, the release includes beta-level support for the latest OpenMP standard, OpenMP 3.0. It also includes some nice Performance Analyzer enhancements that support simple and intuitive performance analysis of MPI jobs. More detail on both of these below.

As the industry-standard approach for achieving parallel performance on multi-CPU systems, OpenMP has long been a mainstay of the HPC developer community. Version 3.0, which is supported in this new Sun Studio preview release, is a major enhancement to the standard. Most notably it includes support for tasking, a major new feature that can help programmers achieve better performance and scalability with less effort than previous approaches using nested parallelism. There are a host of other enhancements as well. The OpenMP expert will find the latest specification useful. For those new to parallelism who have stumbled into a maze of twisty passages all alike, you may find Using OpenMP: Portable Shared Memory Parallel Programming to be a useful introduction to parallelism and OpenMP.


A parallel quicksort example, written using the new OpenMP tasking feature supported in Sun Studio Express 07/08

Sun Studio Express 07/08 also includes enhancements for programmers of parallel, distributed applications who use MPI. With this release of Sun Studio Express we have introduced tighter integration with Sun's MPI library (Sun HPC ClusterTools). Sun's Performance Analyzer has been enhanced to include the ability to examine the performance of MPI jobs by viewing information related to message transfers and messaging performance using a variety of visualization methods. This extends Analyzer's already-sophisticated on-node performance analysis capabilities. Some screenshots below give some idea of the types of information that can be viewed. You should note the idea of viewing "MPI states" (e.g. MPI Wait and MPI Work) to get a high level view of the performance of the MPI portion of an application: an ability to understand how much time is spent doing actual work versus sitting in a wait state can motivate useful insights into the performance of these parallel, distributed codes.

A source code viewer window augmented with several MPI-specific capabilities, one of which is illustrated here: the ability to quickly see how much work (or waiting) is performed within a function.

In addition to supporting direct viewing of specific MPI performance issues within an application, Analyzer now also supports a range of visualization tools useful for understanding the messaging portion of an MPI code. Zoomable timelines with MPI events are supported, as is an ability to map various metrics against the X and Y axes of a plotting area to display various interesting characteristics of the MPI run, as shown below.

Just one example of Sun Studio's new MPI charting capabilities. Shown here is a display showing the volume of messages transferred between communicating pairs of MPI processes during an application run.

This blog entry has barely scratched the surface of the new OpenMP and MPI capabilities available in this release. If you are a Solaris or Linux HPC programmer, please take these new capabilities for a test drive and let us know what you think. I know the engineering teams are excited by what they've accomplished and I hope you will share their enthusiasm once you've tried these new capabilities.

Sun Studio Express 07/08 is available for Solaris 9 & 10, OpenSolaris 2008.05, and Linux (SLES 9, RHEL 4) and can be downloaded here.


(2008-07-30 10:00:00.0) Permalink Comments [0]

20080728 Monday July 28, 2008

The Deep Blue Sea: Technology in the Service of Safety

My friends Jamie and Lori left Sunday on their annual month-long sailing trip from Boston to the Canadian maritime provinces. As usual, the trip begins with an open ocean sail across the Gulf of Maine directly from Boston to Cape Sable, Nova Scotia.

This year they are carrying a Spot satellite messenger on board. This neat little device can report its location every ten minutes, allowing others to track their progress over the course of the trip. It can also transmit a 911 emergency message, if needed. It is quite a nifty device and surprisingly inexpensive given its capabilities. Jim Gray should have had one of these on his boat last year when he went missing. Of course, boating is only one application--I can imagine this would be useful in any number of situations in which people may need to be rescued.

Here is a screenshot I took this morning of their progress towards Nova Scotia:


You can also view the live interface here.


(2008-07-28 08:22:32.0) Permalink Comments [2]

20080718 Friday July 18, 2008

Two MEASURED TeraFLOPs in a Box: Now THAT is Big Iron!

I love the smell of Big Iron in the morning.

We just announced new versions of our M-series midrange and high-end SMPs, the M4000, M5000, M8000, and M9000 systems, that sport the latest Fujitsu quad-core, dual-threaded SPARC64 VII processor. These systems, a co-development effort between Sun and Fujitsu, are traditionally viewed as high-end enterprise-class systems. With up to 64 quad-core processors, up to 2 TBytes of memory, and up to 288 PCIe or PCI-X IO slots, these systems are clearly high-end datacenter workhorses. But they kick butt on HPC workloads as well. No surprise given the tight coupling of compute and memory in such an SMP system, which is especially valuable for computations involving large amounts of very fine-grained communication between cooperating parallel processes.

We've published world record benchmark numbers on a standard OpenMP benchmark, besting the competition by some considerable margins. We've also shown new world record benchmarks on a prominent standard floating-point benchmark. My favorite result, however, is a LINPACK score of over 2 TeraFLOPs with a single M9000 system using Solaris 10 and our latest compilers, Sun Studio 12. This result is almost 2X higher with the new 2.52 GHz SPARC64 VII processor than with the previous 2.4 GHz SPARC64 VI processor. Impressive--and yet another example of why shopping based on processor clock speeds is an increasingly bad idea. In any case, you can read more details about these benchmark results and others here and here.


(2008-07-18 18:24:28.0) Permalink Comments [0]

20080714 Monday July 14, 2008

Solaris InfiniBand: A Big Day!

Yesterday, the Sun InfiniBand engineering team released Solaris 10 driver support for ConnectX (a.k.a. Hermon), the latest generation of InfiniBand silicon from Mellanox. This is important news both for Solaris HPC customers and for those enterprise customers interested in the best bandwidth and latencies available for applications like Oracle RAC. Congratulations to the team!

In addition to the driver, the update also includes a new flash updating tool for ConnectX, a uDAPL update, and several additional components, all of which are described in the documentation.

The specific ConnectX-related Sun part numbers supported by this release are: X4217A-Z HCA card, X4216A-Z EM, and X5196A-Z, the 24 Port NEM for the Sun Blade 6048 family of servers. It also supports third-party cards based on the following Mellanox chips: MT25408, MT25418, and MT25428.

The release, called "Solaris InfiniBand Updates 2", is available for free download here.


(2008-07-14 12:30:48.0) Permalink Comments [5]

Innovation@Sun Conference

Each year Sun holds two internal technical conferences that bring together all of Sun's most senior engineers (Principal Engineers, Distinguished Engineers, and Fellows) at an offsite location for two days of technical presentations and networking. The conferences alternate between a smaller and larger venue. The smaller conference (Technology@Sun) accommodates just the PEs, DEs, and Fellows. At the larger conference (Innovation@Sun) we are able to include approximately 50 other attendees from around Sun. This year, we will choose them based on the merits of their poster or demo proposals.

The deadline for submissions closed last Friday and we received a total of 227 technical proposals from around the company. The program committee will now review all proposals and decide who will be invited to attend. We are also busy planning the rest of the technical content and events for the conference, which promises to be at least as interesting and useful as prior events.

(2008-07-14 08:19:41.0) Permalink Comments [0]

20080624 Tuesday June 24, 2008

Josh Simons, CEO Sun Microsystems

I received a phone call yesterday morning from a firm claiming to be preparing a plaque recognizing Sun's selection as one of the 20 best large companies to work for in Massachusetts by the Boston Business Journal. They wanted to send a mock-up of the plaque to me for my approval. A weird request to make of a random engineer in a 30K+ person company. Figuring this was some sort of headhunter scam to extract additional information from me, I asked how they had found my name and phone number. Whereupon she asked, "Aren't you the CEO?" I responded by giving her Sun's main phone number in Burlington and suggested she call there for help.

I had a good laugh about this with Eric, the engineer in the office next to mine. As it happens, Eric's office is across from our mailstop. While talking with him, I was idly sorting through my mail when I came across a piece with this mailing label:

The sender of this letter has nothing whatsoever to do with the firm that had just called me about the plaque. Oh boy. I have no idea how this happened and I can't imagine what kinds of mailings, phone calls, and invitations I'll now receive as a result. Jonathan, see you in Davos? :-)

(2008-06-24 05:27:41.0) Permalink Comments [5]

20080623 Monday June 23, 2008

ClusterTools 8: Early Access 2 Now Available

The latest early access version of Sun HPC ClusterTools -- Sun's MPI library -- has just been made available for download here. As an active member of the Open MPI community, we continue to build our MPI offering on the Open MPI code base, making pre-compiled libraries freely available and offering a paid support option for interested customers. Wondering why we would base our MPI implementation on Open MPI? Read this.

What is particularly cool about CT 8 is that in addition to supporting Solaris, we've added Sun support for Linux (RHEL 4 & 5 and SLES 9 & 10), including use of both the Sun Studio compilers and tools and GNU C. We've also included a DTrace provider for enhanced MPI observability under Solaris as well as additional performance analysis capabilities and a number of other enhancements that are all detailed on the Early Access webpage.


(2008-06-23 14:38:41.0) Permalink Comments [0]

Open MPI on the Biggest Supercomputer in the World

Los Alamos National Laboratory and IBM recently announced they had broken the PetaFLOP barrier with a LINPACK run on the Roadrunner supercomputer. The Open MPI community, including Sun Microsystems, was proud to have played a role in this HPC milestone. As described by Brad Benton, member of the Roadrunner team, the 1.026 PetaFLOP/s LINPACK run was achieved using an early, unmodified snapshot of Open MPI v1.3 as the messaging layer that tied together Roadrunner's 3000+ AMD-powered nodes. For more details on specific MPI tunables used, read this subsequent message from Brad and this follow-up message from Jeff Squyres, Open MPI contributor from Cisco.

About two years ago, we decided to change Sun's MPI strategy from one of continuing to develop our own proprietary implementation of MPI to instead joining a community-based effort to create a scalable, high-performance, and portable implementation of MPI. We joined the Open MPI community because we felt (and still feel) strongly that combining forces with other vendors and other organizations is the most effective path to creating the middleware infrastructure needed to support the needs of the HPC community into the future.

Sun was the 2nd commercial member to join the Open MPI effort, which at the time consisted of a small handful of research and academic organizations. Two years later, the community looks like this:


This mix of academic/research members and commercial members brings together into one community a focus on quality, stability and customer requirements on the one hand, with a passion for research and innovation on the other. Of course, it does also create some challenges as the community works to achieve an appropriate balance between these sometimes opposing forces, but the results to date have been impressive, as witnessed by the use of Open MPI to set a new LINPACK world record on the biggest supercomputer in the world.

(2008-06-23 14:07:14.0) Permalink Comments [0]

20080622 Sunday June 22, 2008

Lufthansa Schadenfreude

[WARNING: This blog entry is primarily about vomit.]

I flew home from Dresden via Frankfurt on Friday, boarding LH 422 for the eight-hour flight to Boston. Sitting just forward of me was a coed group of boisterous college-aged kids who mostly quieted down once they had stowed their gear and found their seats. With the exception of Oscar, who was sitting two seats in front of me across the aisle.

Oscar's behavior went beyond boisterous, well into the realm of obnoxious. He was loud, he was rude, he was up out of his seat repeatedly, unable to sit still. I'm a seasoned air traveler and very used to ignoring the various mis-behaviors of my fellow passengers, but there was something about Oscar I found particularly grating. He persisted in this behavior up until the first meal service, at which point some god out of some pantheon smote him but good and he threw up all over himself in spectacular fashion. When he stood up after being directed aft by an attendant, I saw that he was literally covered in vomit--all over his shirt and down his pant legs and in considerable quantity. And there was apparently enough left over to have covered the seat as well, since the attendant later placed a pillow on it to make the seat usable again.

Peace reigned for most of an hour while Oscar presumably cleaned himself up. He then reappeared--shirtless. Perhaps a little quieter, but still with plenty of swagger. Which I must say I viewed with some amusement since his cool demeanor did not jibe with the fact that from the rear one could see that the entire crotch of his pants was completely packed with now-drying vomit which he had apparently missed in the clean-up effort. Ah, I thought to myself. This was schadenfreude.

Eventually one of his traveling companions gave Oscar a t-shirt and he fell asleep sitting on his vomit-laden pillow for most of the remainder of the flight. Later, mention of a $120 bar bill led me to conclude that Oscar had had far too much to drink prior to boarding the flight.


(2008-06-22 15:54:13.0) Permalink Comments [0]

20080619 Thursday June 19, 2008

Inside NanoMagnum, the Sun Datacenter Switch 3x24

Here is a quick look under the covers of the new Sun Datacenter Switch 3x24, the new InfiniBand switch just announced by Sun at ISC 2008 in Dresden. First some photos and then an explanation of how this switch is used as a Sun Constellation System component to build clusters with up to 288 nodes.

First, the photos:

NanoMagnum's three heat sinks sit atop Mellanox 24-port InfiniBand 4x switch chips. The purple object is an air plenum that guides air past the sinks from the rear of the unit.
Looking down on the NanoMagnum, you can see the three heat sinks that cover the switch chips and the InfiniBand connectors along the bottom of the photo. The unit has two rows of twelve connectors with the bottom row somewhat visible under the top row in this photo.
The NanoMagnum is in the foreground. The unit sitting on top of the NanoMagnum's rear deck for display purposes is an InfiniBand NEM. See text for more information.

You might assume NanoMagnum is either a simple 24-port InfiniBand switch or, if you know that each connector actually carries three separate InfiniBand 4X connections, a simple 72-port switch. In fact, it is neither. NanoMagnum is a core switch and none of the three InfiniBand switch chips is connected to the others. Since it isn't intuitive how a box containing three unconnected switch chips can be used to create single, fully-connected clusters, let's look in detail at how this is done. I've created two diagrams that I hope will make the wiring configurations clear.

Before getting into cluster details, I should explain that a NEM, or Network Express Module, is an assembly that plugs into the back of each of the four shelves in a Sun Blade 6048 chassis. In the case of an InfiniBand NEM, it contains the InfiniBand HCA logic needed for each blade as well as two InfiniBand leaf switch elements that are used to tie the shelves into cluster configurations. You can see a photo of a NEM above.

The first diagram (below) illustrates how any blade in a shelf can reach any blade in any other shelf connected to a NanoMagnum switch. There are a few important points to note. First, all three switch chips in the NanoMagnum are connected to every switch port, which means that regardless of which switch chip your signal enters, it can be routed to any other port in the switch. Second, you will notice that only one switch chip in the NEM is being used. The second is used only for creating redundant configurations and the cool thing about that is that from an incremental cost perspective, one need only buy additional cables and additional switches--the leaf switching elements are already included in the configuration.

If the above convinced you that any blade can reach any other blade connected to the same switch, the rest is easy. The diagram below shows the components and connections needed to build a 288-node Sun Constellation System using four NanoMagnums.

Clusters of smaller size can be built in a similar way, as can clusters that are over-subscribed (i.e., not non-blocking).


(2008-06-19 07:29:57.0) Permalink Comments [0]

20080618 Wednesday June 18, 2008

Sun Announces Hercules at ISC 2008 in Dresden

Last night in Dresden at the International Supercomputing Conference (ISC 2008), Sun unveiled Hercules, our newest Sun Constellation System blade module. Officially named the Sun Blade X6450 Server Module, Hercules is a four-socket, quad-core blade with Xeon 7000 series processors (Tigerton) that fits into the Sun Blade 6048 Chassis, the computational heart of Sun's Constellation System architecture for HPC. According to Lisa Robinson Schoeller, Blade Product Line Manager, the most notable features of Hercules are its 50% increase in DIMM slots per socket (six instead of the usual four), the achievable compute density at the chassis level (71% increase over IBM and 50% increase over HP), and the fact that Hercules is diskless (though it does also support a 16 GB on-board CF card that could be used for local booting.) A single Constellation chassis full of these puppies delivers over 7 TeraFLOPs of peak floating-point performance.

Lisa Schoeller and Bjorn Andersson, Director for HPC, showing off Hercules, Sun's latest Intel-based Constellation blade system

(2008-06-18 05:29:48.0) Permalink Comments [0]

