Productivity Matters
High Productivity Computing Systems

Friday July 28, 2006
Good
This is my last day at Sun, and it's with truly mixed feelings that I say goodbye. It's been a great 13 years. This blog never got going as well as I hoped. We got busy putting together new plans and a proposal for the future HPCS project, and though there was lots of interesting stuff, I couldn't write about any of it! I didn't have the time or energy to make stuff up just to post to the blog. Besides, I think that sort of defeats the purpose of a log. I'm hoping that my fellow bloggers, Sue and Michael, will carry on. I'm going to be working for Microsoft in a new group aimed at multi-threading tools and compilers, possibly moving to low-end HPC. I don't know what MS's blog policy will be, but I'll find out. When and if that gets going, I'll write about it in my personal blog, which up to now has been aimed strictly at family and friends. I wish Sun and all of my friends there the very best, and I hope to see many of you over the next few years.

posted by ball Jul 28 2006, 11:00:57 AM PDT

Friday May 26, 2006
A Quick Update
My apologies for dropping this blog for the last couple of months, but we had a milestone to prepare for and the final proposal for HPCS Phase III to write, and it was rather all-consuming. Things are returning to normal, and we have several interesting posts planned for the next few days. I've also updated some links in an earlier entry, the High Productivity Computing one. They had changed out from under me.

posted by ball May 26 2006, 02:52:01 PM PDT

Monday February 27, 2006
Sue's comments on "Expertise Gaps" in HPC Software Development
I think Mike's right about there being different kinds of expertise, and it wasn't our intention in the expertise gap paper to conflate them. Sue Squires, the anthropologist member of the HPCS Productivity Team (who has been interviewing and surveying people in the HPC community), commented on this in a recent email. She's out of town right now, so I'll post that comment for her.
Now as for the mission partners - After having talked to quite a few, I have come to the conclusion that they are not in agreement on the exact nature of the expertise gap although they all agree that there is one. I am reminded of [the blind men and the] elephant. The expertise gap perceived by a mission partner is dependent on the type of projects they undertake and the type of expertise they most need.

No matter what the type of gap - at the system level I believe complexity is the cause.

So we can fix the complexity or we can address each of the expertise domains and try to address the gap at this level.

And yes I think, if we take this second approach, that there are different solutions dependent on the type of expertise gap: education, appropriate team mix and practices, abstract languages, etc. In fact we may have to do this as an intermediate step.

The problem is multifaceted and the near term solutions will have to be as well.

At least we are beginning to understand the highly variable nature of the expertise gap.


posted by michaelvdv Feb 27 2006, 09:54:09 PM PST

Wednesday February 22, 2006
On Gaps
It seems to me that what you have documented in this paper is the existence of two very different expertise gaps, with different causes and possibly different remedies. One is expertise in the program itself, and the other is expertise in the technology and technique of performance tuning on parallel systems. There's a third gap, which is expertise in the programming tools themselves, but we can handle that pretty well by education.

Expertise in the program itself, exemplified by Don's case, is a problem seen in the maintenance of all large and complex programs. It is actually unrelated to expertise in the tools, since that is easily acquired. There's a lot of work going into this area, but nobody actually seems to use the results. Most tools for understanding programs become shelfware, a fact that should deeply worry tool developers. Of course this is made much worse by complexity added into the program in the tuning process.

I don't think this is the "expertise gap" bemoaned by the HPC community.

I really think that they are talking about the second gap, in the techniques of performance tuning. You don't demonstrate that education is not a potential solution for this problem, though it certainly hasn't been a good solution up to now. In fact, I think that we don't know how to educate people in scaling and tuning, though we have some success with apprenticeships and similar techniques. Maybe we need performance tuning workshops, similar to writers' workshops.

Reducing complexity helps both gaps, of course, but reducing the complexity of performance tuning has double leverage. It aids the complex job of performance tuning directly, and it reduces the complexity of the code itself, reducing the first gap. If programmers didn't have to worry about performance, they could all write compact, beautiful code.

I think that you need to call out both problems explicitly, and propose separate remedies for both of them. The different solutions may well be at odds with one another, too. In particular, the solution to individual program expertise is often more abstraction. The solution to tuning is often less abstraction. We need to be very explicit about what part of the problem we are attacking with what tools. I think separating the kinds of "expertise gap" might be a good start.


posted by ball Feb 22 2006, 08:39:15 PM PST

Is There an "Expertise Gap" in HPC Software Development?

There really is an "expertise gap" out there in the High Performance Computing (HPC) community. One of the goals of Sun's HPCS Productivity Team has been to understand what's really going on, as opposed to what people like to complain about. Susan Squires, our group's anthropologist, would be the first to tell you that those aren't always the same thing.

Some key people in the HPC community have been telling us (in various ways) that they are constrained by the expertise needed for HPC application development, so we looked into it. We gathered several kinds of data and analyzed it using a variety of methods. It turns out that this "expertise gap" idea is right on target, and we can now say quite a bit about how it looks. It takes lots of education and years of experience for people to learn this kind of programming, and only a very few ever get really good at it. We see the problem as an inevitable consequence of the way HPC software gets developed; this will have to change if we're going to get the kind of dramatic productivity increase that DARPA is seeking with their funding of the HPCS program.

Sue Squires, Larry Votta, and I wrote up some of these conclusions for the recent Workshop on Productivity and Performance in High-End Computing (P-PHEC) in a paper we titled "Yes, There Is an 'Expertise Gap' in HPC Application Development" (download the PDF). Here's the abstract:

The High Productivity Computing Systems (HPCS) program seeks a tenfold productivity increase in High Performance Computing (HPC), where productivity is understood to be a composite of system performance, system robustness, programmability, portability, and administrative concerns. Of these, programmability is the least well understood and perceived to be the most problematic. It has been suggested that an “expertise gap” is at the heart of the problem in HPC application development. Preliminary results from research conducted by Sun Microsystems and other participants in the HPCS program confirm that such an “expertise gap” does exist and does exert a significant confounding influence on HPC application development. Further, the nature of the “expertise gap” appears not to be amenable to previously proposed solutions such as “more education” and “more people.” A productivity improvement of the scale sought by the HPCS program will require fundamental transformations in the way HPC applications are developed and maintained.

posted by michaelvdv Feb 22 2006, 04:00:00 PM PST

Abstraction Levels
I'm reading an interesting book by Don Knuth, Things a Computer Scientist Rarely Talks About. A lot of it has nothing to do with computer science, though it is very interesting, but I found the following gem:
One of the main characteristics of a computer science mentality is the ability to jump very quickly between levels of abstraction, between a low level and a high level, almost unconsciously. Another characteristic is that a computer scientist tends to be able to deal with nonuniform structures — case 1, case 2, case 3 — while a mathematician will tend to want one unifying axiom that governs an entire system. ... Experience shows that about one person in 50 has a computer scientist's way of looking at things.

That's a fascinating observation that tends to align with a lot of the data about the expertise gap. It's not just expertise, so training won't really handle it. Education, in the true sense, might, and apprenticeship might help, but it might just take genetics. It also fits what I was saying about levels of abstraction.

It makes you really want something like Fortress, so the library writer, who is a computer scientist, can paper over the gaps while the mathematician/physicist can think at his own level.


posted by ball Feb 22 2006, 01:28:36 PM PST

Thursday February 16, 2006
Introducing Michael Van De Vanter

Allow me to introduce myself. Mike Ball has been inviting me to join in the conversation here, and I'm finally able to jump in, once we dispense with a few formalities.

I'm Michael Van De Vanter, and I've been working with Mike and others at Sun on the DARPA-funded supercomputer project for about a year and a half. It is a real privilege to participate in such an ambitious program. We've been chartered to do no less than rethink completely, from the ground up, how computing systems help people get real work done. The focus of the program is very specifically on the HPC world, but that doesn't take away from the scope of the challenge.

I'm part of the Core Productivity Team, along with Larry Votta (our lead), Susan Squires (our anthropologist), and most recently Victoria Livschitz; we work with lots of other HPCS groups inside Sun to build an understanding of Productivity as a relationship between a whole system (not just the various hardware and software parts) and the context (human, organizational, political) in which it is deployed. We're also chartered to take a deeper look into programmability, which is one of the key aspects of Productivity, and that's where my background comes into play. I've been writing software, teaching programming, and doing research into how to build tools that help people write software for many years (but not in HPC, where I'm a newbie). You can learn more about my background on my personal home page.

I've just returned from the Third Workshop on Productivity and Performance in High-End Computing held last Sunday in Austin, TX. It was a great chance to talk with some of the other folks looking at these questions, not only from the other HPCS Program Vendors (IBM and Cray) but also the broader research community. There are lots of people right now working on this big question, and we're all trying to learn as much from one another as we can.

I'll say more about the workshop and the paper we wrote for it in a subsequent post; I'll also mention what a pleasure it was to visit Ira Baxter, CEO of Semantic Designs, while I was in Austin.


posted by michaelvdv Feb 16 2006, 12:30:00 AM PST

Monday February 06, 2006
Abstract thought
Sorry there isn't much going up here, but we are all busily preparing for a huge presentation to DARPA. This is where we tell them all of the wonderful things we've done, and the wonderful plans we've made. Then we hope that they want us to do more of them.

Anyway, though my partners are busy preparing posters for the presentation, my talk is frozen and I have a little bit of time. So I thought I'd talk a little about abstraction.

The mantra among the software engineering community is that increasing the level of abstraction (the language level) increases productivity. There is certainly some truth to that. One poster child for abstraction is garbage collection, which replaces allocation and deallocation with the abstraction of infinite memory. For programs where this works, it truly does improve productivity. All is not, however, sweetness and light.

I recently sat in on a panel of HPC managers and programmers, talking about what would improve their lives. A manager got up and said that he wanted a high level language that matched the abstraction of mathematics, in which the compiler and runtime took care of all of the performance issues. A programmer got up and said that he wanted a high level language that gave him explicit control over data layout and similar performance-related issues. Of course, as is usual, both are right. The manager is right to want an abstract language that will make it easier to translate mathematics into programs. The programmer, on the other hand, has had to face the fact that the abstraction offered is never perfect. In particular, programs, as opposed to equations, have performance characteristics. They may be too slow, or too big, and the programmer, armed only with the high-level code, has no control over either factor. What he needs is a way to ignore the abstraction, which is imperfect in the ways that he cares most about, and to manipulate those performance elements directly.

Guy Steele has been designing a language that tries to meet both requirements. You can read about it in general or in excruciating detail, but the important idea related to our current discussion is that it provides a separation between the abstract algorithm specification and the important implementation details that determine speed. The other HPCS vendors have their own languages with similar characteristics. Cray is developing Chapel and IBM is developing X10. The PGAS languages UPC and Co-Array Fortran, which I mentioned earlier, are another, though lower level, attempt to add data placement to the language while keeping algorithmic development simple. These languages are a recognition that the abstractions offered for programming mathematics are broken in ways that are important for HPC.
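To make the idea of separating the algorithm from data placement a little more concrete, here is a small sketch in Python. It is not Fortress, Chapel, X10, or UPC syntax, and it runs on a single machine; the names `block_owner`, `cyclic_owner`, and `distributed_sum` are invented for illustration. The point is only that the algorithm is written once, while the rule deciding which process owns which element is supplied separately.

```python
def block_owner(i, n, nprocs):
    # Block distribution: each process owns one contiguous run of indices.
    size = (n + nprocs - 1) // nprocs
    return i // size

def cyclic_owner(i, n, nprocs):
    # Cyclic distribution: indices are dealt out round-robin.
    return i % nprocs

def distributed_sum(data, nprocs, owner):
    # The algorithm (add everything up) is stated once. The placement
    # policy is a separate argument and can change without touching it.
    partial = [0.0] * nprocs
    n = len(data)
    for i, v in enumerate(data):
        partial[owner(i, n, nprocs)] += v
    return partial, sum(partial)
```

Calling `distributed_sum(data, 2, block_owner)` and `distributed_sum(data, 2, cyclic_owner)` on the same data gives different per-process partial sums but the same total, which is the separation the paragraph above describes.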

Sometimes, though, abstraction fails simply because it's harder to use than more direct approaches. Sometimes, you are much better off with direct manipulation of important objects. My favorite example is the game editor offered with products like WarCraft. (Note that I resisted the temptation to add a link to the gaming company!) When you are building a field to play the game, you don't want to write a loop that puts mountains, N, E, S, but not in the middle, and a ridge running NW to SE. No, you want to say "Mountain. There. There. There, (no, remove that one), There....". If you don't like computer games, consider a GUI builder; it's the same sort of thing. This is a case where direct manipulation of the objects of interest beats going through a set of indirect steps. Oh, yes, this is a lot more abstract than bits, but it reduces the task to a concrete one, not an abstract one.

More on this later, I need to knock off for some sleep.


posted by ball Feb 06 2006, 11:12:49 PM PST

Monday January 30, 2006
Hummingbird Simulation using UPC
I showed a couple of slides from a PGAS conference below, but today I just got a real treat. It's a DVD with videos of the conference plus some very interesting animations. This is an animation of a simulation of a hummingbird implemented using UPC (Unified Parallel C), a PGAS language. UPC is an alternative to using MPI for cluster computing, and many find its higher-level communications easier to use than MPI. The video of the talk should give you an idea what is happening here. I've loaded these videos onto my own web server, since Sun won't let us load such large objects here. The animation is 20 MB and the talk is 65 MB. I may be forced to take them down one day, but for now, enjoy. Oh, you might want to download them before viewing.

posted by ball Jan 30 2006, 07:34:16 PM PST

Sunday January 29, 2006
HPC Programming Models

Essentially all modern HPC code makes use of parallelism to speed up execution. Parallelism isn't the only way to speed things up, but it's the most general way, and the other approaches are usually used in massively parallel systems anyway. We'll talk about these other approaches some other time.

There are two common approaches to tying computers together for HPC. One is a symmetric multiprocessor, or SMP, which consists of a group of processors sharing a common memory. These are very common, and are becoming even more common with multi-core chips. The other is a cluster of computers interconnected with some high-speed network. These techniques can be used together to build a cluster of SMPs. In fact, the most common HPC system in the near future will probably be a cluster of multi-core Opterons.

The most common tool for writing parallel programs is MPI, or the Message Passing Interface library. This is a library, used from either Fortran or C/C++, that handles data transfer and synchronization among processes.
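To give a feel for the style of programming this implies, here is a minimal sketch in Python. It does not use MPI at all: threads and `queue.Queue` stand in for processes and the network, and the names `worker` and `run_sum` are invented for illustration. What it tries to show is the pattern: each rank owns its own data, and the only way to share a result is an explicit message.

```python
import queue
import threading

def worker(rank, chunk, to_root):
    # Each "rank" owns its chunk privately and computes a partial sum.
    partial = sum(chunk)
    # The only way to share the result is to send an explicit message.
    to_root.put((rank, partial))

def run_sum(data, nranks):
    # Scatter: split the data into nranks contiguous chunks.
    size = (len(data) + nranks - 1) // nranks
    chunks = [data[i * size:(i + 1) * size] for i in range(nranks)]
    to_root = queue.Queue()
    threads = [threading.Thread(target=worker, args=(r, chunks[r], to_root))
               for r in range(nranks)]
    for t in threads:
        t.start()
    # Gather: the root receives exactly one message per rank.
    partials = dict(to_root.get() for _ in range(nranks))
    for t in threads:
        t.join()
    # Combine in rank order so the result does not depend on arrival order.
    return sum(partials[r] for r in range(nranks))
```

Even in this toy, the programmer has to decide how to cut up the data, who sends what to whom, and how many messages to wait for. A real MPI program has the same bookkeeping, plus data types, buffers, and communicators.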

If we look at real-world usage we discover that it's dominated by MPI.

NERSC Top 20 Applications (52% of Cycles)
F90/F95 17
C/C++ 3
MPI 20
OpenMP 4 (as an alternative to MPI)
(Sca)LAPACK 5
NetCDF 2
PETSc 1

A larger survey of about 300 users at NCSA asked about how the program is parallelized.

How is the program parallelized?
MPI 44%
OpenMP 14%
Mixed MPI/OpenMP 11%
Automatic 8%

Unfortunately, though MPI works well, it's not easy to use, since it's a very low-level library. A good part of our productivity study has been spent looking for alternatives.


posted by ball Jan 29 2006, 12:14:28 PM PST

High Productivity Computing
I started writing a blog on HPCS, and some other members of the team thought that we really ought to do a group blog on the results we've been getting. I thought that was a wonderful idea, so I've set this up and copied the first posts from my blog over to here. From now on, I'll be using my original blog for more general tools subjects, and this one for HPCS topics. My fellow bloggers are Michael Van De Vanter and Susan Squires. You can read about us on the "About" page. -Mike Ball-

HPCS

We've been working on the High Productivity Computing Systems (HPCS) project within Sun for the past two years. This is a project sponsored by DARPA to make supercomputer systems more productive as well as faster and bigger. At the very least, it's a noble effort and will increase our understanding of productivity. I'm in the Developer Products group and am working on developer tools for highly parallel programs.

I'm going to concentrate on Performance, Productivity, and tools, but with a bit of a different emphasis from most of the Sun blogs.

Performance

First, let's define the term High Performance Computing (HPC). This used to be High Performance Technical Computing (HPTC), and I don't know why it changed. The major features of HPC are floating point computation, arrays as a data structure, Fortran, and simply enormous problem size. Let's discuss each point individually, keeping in mind that I can't include everything in this initial post. There are unmentioned exceptions, overgeneralizations, and significant omissions from each discussion. I'll try to cover those in later posts.

Floating Point Computation

Most HPC programs make heavy use of floating point computation. In fact a count of Floating Point Operations Per Second (FLOPS) is commonly used as a figure of merit for HPC systems. There are even one or two common applications for which this makes sense. Most HPC applications, though, are not so simple, and spend more time moving data around than actually doing floating point arithmetic. Nonetheless, FLOPS is an easy number to measure, and is to this day the basis for inclusion in the TOP500 list of supercomputers.
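As a rough illustration of where a FLOPS figure comes from, the sketch below counts the floating point operations in one kernel and divides by the elapsed time. The kernel is a dot product, which does one multiply and one add per element, so 2n operations for vectors of length n. The function names are invented for illustration, and since this is plain interpreted Python the number it reports will be far below what real HPC hardware achieves.

```python
import time

def dot(x, y):
    # n multiplications and n additions: 2n floating point operations.
    total = 0.0
    for a, b in zip(x, y):
        total += a * b
    return total

def dot_flop_count(n):
    # Operation count for a dot product of two length-n vectors.
    return 2 * n

def measure_flops(n):
    x = [1.0] * n
    y = [2.0] * n
    start = time.perf_counter()
    result = dot(x, y)
    # Guard against a zero reading from the timer on very small inputs.
    elapsed = max(time.perf_counter() - start, 1e-12)
    # Operations divided by seconds gives FLOPS for this one kernel.
    return result, dot_flop_count(n) / elapsed
```

Note what the count leaves out: the time spent fetching `x` and `y` from memory is folded into the elapsed time but earns no credit as "operations", which is one reason a FLOPS rating says little about applications dominated by data movement.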

Arrays

Again, HPC programs tend to depend quite heavily on arrays rather than more complicated data structures. Even when there is some indirection involved, this is usually done with an array of indices. Scaling a program is usually a matter of changing the size of arrays. A great deal of discussion goes into just how these arrays are distributed in memory. Different programming models imply different data distribution.
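Indirection through an array of indices might look like the following Python sketch. The names `gather` and `scatter_add` are common terms for these two patterns, but the functions themselves are written here purely for illustration.

```python
def gather(values, indices):
    # Indirection through an index array: out[k] = values[indices[k]].
    return [values[i] for i in indices]

def scatter_add(target, indices, contributions):
    # The reverse pattern: accumulate each contribution into the slot
    # named by the matching index. Repeated indices add up.
    for i, c in zip(indices, contributions):
        target[i] += c
    return target
```

For example, `gather([10.0, 20.0, 30.0], [2, 0, 2])` picks out elements 2, 0, and 2 in that order. The data stays in flat arrays; only the index array encodes the irregular structure, such as which mesh cells are neighbors.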

Fortran

A skillful HPC programmer can write Fortran programs in any language, and usually does. Such programs have most of the computation in loops that iterate over arrays, doing some floating point operations on the elements. The calculations on an element are frequently independent or simply related to the calculations on other elements. This makes it possible to vectorize or parallelize the loop to improve the performance. If the program is written in Fortran 90 or some later version, array operations provide a way to apply calculations to every element without writing any loops. In fact, it was a pleasant surprise to me just how close F90 programs are to the underlying mathematical notation. I haven't written any Fortran programs since the 1970's, but I'd certainly use it now if I had any computationally intensive problems to solve.
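Here is a small Python sketch of the contrast, using the calculation c = x*a + b on arrays. In Fortran 90 the whole-array form can be written as the single statement `c = x*a + b`; Python lists have no such operator, so the second function only imitates the style. All function names are invented for illustration.

```python
def axpy_loop(x, a, b):
    # "Fortran in any language": an explicit index loop over arrays.
    c = [0.0] * len(a)
    for i in range(len(a)):
        c[i] = x * a[i] + b[i]
    return c

def axpy_whole(x, a, b):
    # Whole-array style: state the element rule once, no index bookkeeping.
    return [x * ai + bi for ai, bi in zip(a, b)]

def axpy_in_pieces(x, a, b, pieces):
    # Because no element depends on any other, the index range can be cut
    # into pieces and each piece computed separately. Here the pieces run
    # one after another; a parallel system could run them at the same time.
    n = len(a)
    step = (n + pieces - 1) // pieces if n else 1
    out = []
    for lo in range(0, n, step):
        out.extend(axpy_whole(x, a[lo:lo + step], b[lo:lo + step]))
    return out
```

All three functions return the same result for the same inputs. The third one works only because the iterations are independent, which is exactly the property a vectorizing or parallelizing compiler has to establish before it can transform the loop.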

Size

HPC programs tend to be big, and they tend to have very large amounts of data. There is no such thing as "big enough" for a real HPC program. If you tell a web applications programmer that his machine just got ten times faster, he will say something like "Now I only need a tenth the number of machines to do my job." If you tell an HPC programmer the same thing, he'll say something more like "Now I can decrease the mesh granularity and get better answers." In other words, system size and speed have moved from being a purely economic problem to being a technical problem. It's a very different attitude.

Productivity

Productivity is a big and complex subject, and I'll have lots to say about it over the coming weeks. For now, I'd like to leave you with a couple of teasers. These are two keynote talks from a recent conference on programming models for HPC. They are by people who have been in this field a long time, and understand the problems very well indeed. I just have slide sets here, so they are short. Take time to look at them.

posted by ball Jan 29 2006, 12:11:27 PM PST