DeinoMPI

The Great and Terrible implementation of MPI-2

FEATURES

DeinoMPI is derived from MPICH2, which is provided by Argonne National Lab.  By starting from the MPICH2 code base, DeinoMPI inherits a stable and complete implementation of the MPI-2 standard.  DeinoMPI heavily modifies the original code base and does not rely on any additional software from Argonne National Lab.  Everything needed to build and execute MPI applications can be downloaded here.

DeinoMPI extends MPICH2 with the following support:

USABILITY:

  • UNICODE support.  Every function that takes char* arguments now has a second version that takes wchar_t*.  DeinoMPI is implemented using wide characters and provides wrapper functions for ASCII char* strings.  Macros in mpi.h take care of the name conversion automatically, so all the user program has to do is define UNICODE and compile.  The functions with dual implementations are listed under: MPI functions with string arguments
  • Binary Win32-Win64 compatibility.  In clusters that mix 32-bit and 64-bit Windows machines, a single job can span both kinds of machine.
  • Singleton init supports MPI-2 spawn functions.  DeinoMPI allows single processes started without the process manager to call the spawn functions just as if they had been started by mpiexec.  In other words, whether you start your application like this, mpiexec -n 1 myapp.exe, or like this, myapp.exe, it can call MPI_Comm_spawn.
  • Directory staging.  You can automatically copy a directory with all your data files, and even the MPI executable, out to the worker nodes and then start your job from this directory.  After the job completes you can choose to have any new or modified files in the directories on the worker nodes copied back to the source directory.  Files copied back are automatically renamed if necessary to avoid collisions.
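
The macro-based name conversion described in the UNICODE item can be pictured with a small sketch.  The names demo_name_length_w, demo_name_length_a, demo_name_length and DEMO_TEXT are hypothetical and are not part of DeinoMPI; the real mappings live in mpi.h.  The sketch only illustrates the general technique: one implementation written for wide characters, a narrow wrapper that converts and forwards, and a macro that picks between them depending on whether UNICODE is defined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical wide-character implementation: the one real function. */
size_t demo_name_length_w(const wchar_t *name)
{
    assert(name != NULL);
    return wcslen(name);
}

/* Hypothetical narrow wrapper: converts char* to wchar_t* and forwards.
 * Names longer than 255 characters are truncated in this sketch. */
size_t demo_name_length_a(const char *name)
{
    wchar_t wide[256];
    size_t n;

    assert(name != NULL);
    n = mbstowcs(wide, name, 255);
    if (n == (size_t)-1)
        return 0;               /* invalid multibyte sequence */
    wide[n] = L'\0';            /* mbstowcs does not terminate when full */
    return demo_name_length_w(wide);
}

/* The user-visible name resolves to one of the two versions at
 * compile time, so the calling code is identical in both builds. */
#ifdef UNICODE
#define demo_name_length demo_name_length_w
#define DEMO_TEXT(s) L##s
#else
#define demo_name_length demo_name_length_a
#define DEMO_TEXT(s) s
#endif
```

With this arrangement a call such as demo_name_length(DEMO_TEXT("node01")) compiles against the narrow wrapper by default and against the wide version when the program is built with UNICODE defined.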

OPTIMIZATIONS:

  • Collective operations have been optimized for clusters of SMP machines.  They minimize network traffic when multiple processes reside on each node.  The new functions currently affect only MPI_COMM_WORLD; in the future this support will be extended to derived communicators.  This functionality is off by default and has to be turned on with an environment variable.  If you want to try it out, add "-env DeinoMPI_USE_SMP_OPTIMIZATIONS 1" to your mpiexec command line.
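
To see why node-aware collectives reduce network traffic, the following sketch counts the messages that cross node boundaries for two simplified communication patterns.  It is an illustration only and not the algorithm DeinoMPI uses: the function names are hypothetical, the flat pattern is a plain linear gather to rank 0, and the two-level pattern assumes that processes on the same node combine their data locally before a single process per node talks to the root.

```c
#include <assert.h>
#include <stddef.h>

/* Flat pattern: every non-root rank sends directly to rank 0.
 * node_of_rank[r] is the node that hosts rank r.
 * Returns the number of messages that cross a node boundary. */
int flat_internode_messages(const int *node_of_rank, int nranks)
{
    int count = 0;
    int r;

    assert(node_of_rank != NULL && nranks > 0);
    for (r = 1; r < nranks; r++)
        if (node_of_rank[r] != node_of_rank[0])
            count++;
    return count;
}

/* Two-level pattern: ranks on one node combine locally, then one
 * leader per node (other than the root's node) sends to rank 0.
 * The network cost is therefore (number of distinct nodes - 1). */
int two_level_internode_messages(const int *node_of_rank, int nranks)
{
    int distinct = 0;
    int r, s;

    assert(node_of_rank != NULL && nranks > 0);
    for (r = 0; r < nranks; r++) {
        int seen = 0;
        for (s = 0; s < r; s++) {
            if (node_of_rank[s] == node_of_rank[r]) {
                seen = 1;
                break;
            }
        }
        if (!seen)
            distinct++;
    }
    return distinct - 1;
}
```

For eight processes placed four per node on two nodes, the flat pattern sends four messages over the network while the two-level pattern sends one.  The gap grows with the number of processes per node.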

SECURITY:

  • A new startup mechanism and process manager.  The DeinoPM (process manager) uses public and private keys to establish secure connections between machines in the cluster.  All traffic between the process managers is encrypted.  Each user controls their own keys (similar to the way they would for ssh).

DEBUGGING:

  • An abort callback function has been added.  This allows a function to be called asynchronously when a job is about to be aborted.  The current implementation uses this function to write out logging buffers to disk before the process is killed.  The resulting log files contain more data.
  • An MPI message queue printing callback function has been added.  While an MPI application is running you can request that the internal MPI message queues be printed out.  This can be helpful if your application hangs and you want to see what MPI messages the processes are waiting on.
  • A runtime option has been added to save a textual description of each MPI function call in a ring buffer.  The user can choose to print this function call history while the application is running.  This can be helpful while debugging because it shows the most recent MPI function calls for each process.
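
The call history idea can be sketched with a fixed-size ring buffer that overwrites its oldest entry once it is full.  Everything here is an assumption made for illustration: the call_history type, the history_* functions, the capacity of four entries and the text length are hypothetical, and DeinoMPI's internal data structure may look quite different.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define HISTORY_SLOTS 4         /* hypothetical capacity */
#define HISTORY_TEXT_LEN 64     /* hypothetical maximum description length */

typedef struct {
    char text[HISTORY_SLOTS][HISTORY_TEXT_LEN];
    unsigned long total;        /* number of calls ever recorded */
} call_history;

void history_init(call_history *h)
{
    assert(h != NULL);
    memset(h, 0, sizeof *h);
}

/* Store one description, overwriting the oldest entry when full. */
void history_record(call_history *h, const char *description)
{
    char *slot;

    assert(h != NULL && description != NULL);
    slot = h->text[h->total % HISTORY_SLOTS];
    strncpy(slot, description, HISTORY_TEXT_LEN - 1);
    slot[HISTORY_TEXT_LEN - 1] = '\0';
    h->total++;
}

/* Return the i-th retained entry, oldest first, or NULL if out of range. */
const char *history_get(const call_history *h, unsigned long i)
{
    unsigned long kept, first;

    assert(h != NULL);
    kept = h->total < HISTORY_SLOTS ? h->total : HISTORY_SLOTS;
    if (i >= kept)
        return NULL;
    first = h->total - kept;    /* sequence number of the oldest entry */
    return h->text[(first + i) % HISTORY_SLOTS];
}

/* Print the retained history, oldest call first. */
void history_print(const call_history *h, FILE *out)
{
    unsigned long i;
    const char *entry;

    for (i = 0; (entry = history_get(h, i)) != NULL; i++)
        fprintf(out, "%s\n", entry);
}
```

A process would call history_record on entry to each MPI function and history_print when the user asks for the history.  Because the buffer has a fixed size, the memory cost stays constant no matter how long the application runs.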

The future of DeinoMPI:

  • A scheduler to accompany the DeinoPM process management mechanism.
  • A parallel scratch file system for Windows.