We have seen numerous press releases on the Message Passing Interface
(MPI) lately, including those from Microsoft, which has been working with
Argonne National Laboratory (funding a Win32 port of MPICH2), and this most recent
announcement of Ohio State University's port of MVAPICH to Solaris over InfiniBand.
Sun has been collaborating with OSU for a long time, working with Linux
and Solaris on both SPARC and x64 based platforms. The current
announcement from OSU is a novel MPI-2 based design (at the ADI-3
level) providing uDAPL-Solaris support. So what is this acronym soup?
InfiniBand:
a high performance switched fabric providing high bandwidth (in excess
of 30 Gbps) and low latency (can be lower than 5 µs) for serial,
channel-based I/O between two host channel adapters
(HCAs, which are available at costs < $70). This fabric utilizes an
I/O and communications processor separate from the traditional node CPU
to allow the independent scaling of I/O and the offloading of I/O
responsibilities, allowing performance & cost tuning of computing
clusters. Typical per-port costs are in the $300 range (HCA & TCA)
vs. >$1k for 10GbE adapters, so performance@cost is definitely in
IB's favor for the highest of performance needs.
Message Passing Interface (MPI):
first standardized in 1994 to provide a standard set of message passing
routines that focus on both performance and portability, recognizing
that these goals are often at odds with one another. MPI-2, completed
in 1997, was designed to address areas where the MPI Forum was
initially unable to reach consensus, like one-sided communications &
file I/O. Basically, MPI-2's one-sided model makes use of GETting or PUTting (or
ACCUMULATE-ing) data from/to a remote window that reflects a shared memory
space in non-blocking ways for parallelized performance (see an older, but
still relevant, tutorial from the University of Edinburgh).
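To make the window idea a little more concrete, here is a toy, single-process sketch in Python. It is not real MPI: the Window class and its put, get, accumulate and fence methods are hypothetical names that only mirror the roles of MPI_Win_create, MPI_Put, MPI_Get, MPI_Accumulate and MPI_Win_fence. The sketch also assumes operations are applied in the order they were issued, which is a simplification of the real standard.

```python
# Toy, single-process model of MPI-2 one-sided (RMA) semantics.
# NOT real MPI: every name here is hypothetical and for illustration only.
import operator


class Window:
    """Memory that one rank exposes to its peers."""

    def __init__(self, size):
        self.memory = [0] * size   # the exposed buffer
        self._pending = []         # operations queued in the current epoch

    def put(self, offset, values):
        # Non-blocking: only record the request and return at once.
        self._pending.append(("put", offset, list(values), None))

    def accumulate(self, offset, values, op=operator.add):
        # Like put, but combines with what is already in the window.
        self._pending.append(("acc", offset, list(values), op))

    def get(self, offset, count):
        # The returned buffer is only filled in at the next fence.
        result = [None] * count
        self._pending.append(("get", offset, result, None))
        return result

    def fence(self):
        # Close the epoch: apply queued operations in issue order
        # (a simplifying assumption of this sketch).
        for kind, offset, data, op in self._pending:
            for i in range(len(data)):
                if kind == "put":
                    self.memory[offset + i] = data[i]
                elif kind == "acc":
                    self.memory[offset + i] = op(self.memory[offset + i], data[i])
                else:  # "get"
                    data[i] = self.memory[offset + i]
        self._pending.clear()


def demo():
    win = Window(4)
    win.put(0, [1, 2])
    before = list(win.memory)      # put has been issued but not completed
    win.fence()
    win.accumulate(0, [10, 10])
    win.fence()
    snapshot = win.get(0, 2)
    win.fence()
    return before, win.memory, snapshot
```

The point of the sketch is the split between issuing an operation and completing it: put, get and accumulate return immediately, and nothing is visible in the window until the fence closes the epoch. In a real MPI-2 program the window lives on another process and the fence is a collective call.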
User Direct Access Programming Library (uDAPL):
there has been a need to standardize a set of user-level APIs across a
variety of RDMA-capable transports such as InfiniBand (IB), VI and
RDDP. The model is a familiar one to most infrastructure programmers:
that of an interface provider (both local and remote) and an interface
consumer that has visibility as to the localness of the provider. uDAPL
is designed to be transport-agnostic, freeing consumers
(like MPI) from the intricacies of the underlying transport (such as IB) in a
standardized way. Within this layer cake, it is expected that a uDAPL
consumer will talk across a fabric to another uDAPL consumer; though
this is not mandated, it is common practice.
MPICH & MVAPICH2:
implementations of MPI provided by a variety of entities (mostly
government agencies/labs and universities), which are frequently
compared on features and performance. MVAPICH2 has been focused on IB,
whereas MPICH2 supports other interconnects including Quadrics and
Myrinet. Either way, the goal is to create a high performance consumer
(programmer) interface that can sit on standard or customized
interconnection stacks. Where MVAPICH2 tends to shine is in larger
packets, providing higher bandwidth (though at a cost to small packet
latency). A reasonable comparison is available from OSU and Dr. Panda (though we have to remember Dr. Panda's sponsorship of MVAPICH).
So that was a short summary, but hopefully this just whets your appetite
for looking at architectures like InfiniBand for constructing highly
performant Grids/Clusters, and some of the techniques that you might
request from Sun Grid to accelerate your parallel applications.
BTW: Sun Grid has MPICH 1.2.6 pre-installed, including Java wrappers; here is a sample deployment script: