Message Passing Toolkit

Many important scientific and engineering applications use message passing for their parallel implementations, since it delivers both portability and performance across a variety of computer architectures and systems.

The SGI® Message Passing Toolkit (MPT) provides versions of industry-standard message-passing libraries optimized for SGI® computer systems running SGI® IRIX® and Linux® operating systems. These high-performance libraries permit application developers to use standard, portable interfaces for developing applications while obtaining the best possible communications performance.

MPI

The Message Passing Interface (MPI) was developed by a group of industry, academic, and government representatives with experience in developing and using message-passing libraries on a variety of computer systems. It was designed to serve as a common standard, bringing together years of research and experience with message passing. SGI has been a key member of the MPI Forum. The MPI implementations in MPT are fully compliant with the current MPI-1.2 specification.

The MPI-2 specification extends the MPI-1.2 specification by defining interfaces in key areas that MPI-1.2 does not cover, such as dynamic process management and one-sided communication. MPT contains a number of MPI-2 features, and additional highly sought MPI-2 features continue to be added.

SHMEM™ Programming Model

The explicit parallel programming capabilities of MPI are further extended through the innovative SHMEM parallel programming model. The SHMEM library provides the fastest interprocessor communication in the industry using data passing or one-sided communication techniques. The SHMEM library also contains a facility for assigning global pointers that allow data in another cooperating process to be accessed directly via load/store for communication or synchronization.

In addition, the SHMEM library includes a number of highly optimized routines for collective operations such as global reductions. Because SHMEM can be implemented very efficiently on globally addressable shared- or distributed-memory systems, it improves communication latency by an order of magnitude over optimized MPI implementations on all SGI® architectures. Some of the one-sided communication concepts introduced in SHMEM have been incorporated into the MPI-2 specification, providing future portability to application programmers who choose to use SHMEM today.

Development Tools

Development tools support these communications libraries. The TotalView Technologies TotalView® and Allinea Distributed Debugging Tool (DDT) debuggers can be used to debug and optimize message-passing applications for SGI computer systems running IRIX or Linux operating systems.

MPI Library Features on IRIX and Linux Systems

  • MPI-1.2 compliant implementations
  • Scales to support 48x128 clusters or 512P hosts
  • NUMAlink® optimizations for single-host and multi-host systems
  • Optimized MPI collective routines
  • MPI-2 capabilities:
    • MPI I/O
    • One-sided communication, including PUT, GET, FENCE, LOCK/UNLOCK
    • C++ bindings
    • MPI_Type_create_hindexed and other replacements for deprecated MPI-1 datatype routines
    • Generalized requests
    • MPI-2 attributes
    • USE MPI Fortran 90 statement support
    • Thread safety
  • MPI statistics
  • Automatic aborted job cleanup
  • Auto selection of NUMAlink or InfiniBand interconnect
  • TotalView Technologies TotalView message queue display

MPI Features Only on IRIX Systems

  • Auto selection of GSN®, Myrinet®, or NUMAlink interconnect
  • Fortran 90 compile-time MPI interface checking
  • MPI-2 capabilities:

SHMEM Library Features

  • Simple shared-memory interfaces
  • One-sided "get" and "put" operations
    • Less overhead (and better performance) than traditional two-sided message-passing
    • More natural interfaces for many applications
  • High-performance collective operations, including reductions and broadcasts
  • Remote global pointer capability
  • Optimized implementations

MPT can easily be installed in alternate locations, so sites with many users can introduce new releases to their user communities gradually.

MPT Key Features

Some of the features of MPT are listed below. The SGI Technical Publications site (techpubs.sgi.com) contains more information regarding each feature in the release notes and associated documentation.

  • MPI_IN_PLACE
    Collective communications can occur "in place" for intra-communicators, with the output buffer being identical to the input buffer.
  • Support of MPI_Accumulate
    This is a one-sided communication function that combines the data moved to the target process with the data that already resides at that process, rather than replacing the data there.
  • MPI and OpenMP™ hybrid NUMA placement
    NUMA placement can be critical for MPI codes on IRIX and Linux, and it is equally important for hybrid OpenMP and MPI codes. This release adds support for "containing" the OpenMP threads near the MPI ranks.
  • Fast barrier synchronization
    MPT includes optimized implementations of the MPI_Barrier, shmem_barrier_all and MPI_Win_fence functions.
  • Enhanced Altix partitioned support
    The SHMEM programming model is supported for partitioned Linux Supercluster systems. On Linux operating systems, the mpirun multiple host syntax is required to launch a SHMEM application on multiple partitions. SHMEM routines can be used exclusively in an application, or can be used in conjunction with MPI message-passing routines in the same application.
  • Stack Traceback and Core Dump Control
    If a rank aborts, either by calling MPI_Abort() or by receiving an unhandled signal that normally results in a core dump, MPT displays a stack traceback showing the location of the abort. By default, MPT limits core file creation to the first rank on each host that aborts.
  • Optimized MPI on the InfiniBand interconnect fabric
    MPT supports optimized MPI communication on clusters of SGI servers that are interconnected using InfiniBand hardware and software provided by Voltaire, Inc.
  • Global shared memory allocator
    SGI® Altix® servers have global shared memory. This means that memory segments can be allocated on or striped across memory nodes anywhere in the system and be referenced via ordinary load/store instructions. MPT provides a global shared memory allocator that may be called within MPI or SHMEM programs to allocate remote or distributed memory segments.
  • Run-time optimization of large messages via direct-copy, unbuffered transfer
    Large message bandwidth is optimized by transferring data directly between the user's send and receive buffers, without buffering in the library. MPT detects opportunities for this optimization automatically, and its behavior can also be tuned through environment variables.