MPJ Express: An implementation of MPI in Java

Beta Release (0.27)
Aamir Shafi, Bryan Carpenter, and Mark Baker
Last updated: $Date: 2007/07/10 11:30:34$

The latest version of this document is available here

Table of Contents

  1. Introduction
  2. Getting started
  3. Writing and compiling MPJE programs
  4. Running MPJE programs with the MPJ Express runtime
  5. Running MPJE programs without the MPJ Express runtime (manually)
  6. The MPJ Express test suite
  7. Compiling MPJ Express source code and test suite
  8. Tested platforms
  9. Java-docs
  10. Contact and support
  11. Miscellaneous
    1. Turning debugging on and off
    2. Running daemons in console mode
    3. Running MPJ Express daemons on Solaris, PowerPC Linux, and PowerPC Mac OS X
    4. Changing protocol switch limit
    5. API changes between mpiJava-1.2.x and MPJ Express
    6. Acknowledgements
  12. Known issues and limitations
    1. Test cases hang on Solaris and Windows
    2. Takes a long time to bootstrap MPJ processes
    3. Intercomm.Merge(..) -- limited functionality
    4. Using MPI.PACK datatype -- tweaked
    5. MPI.PACK with buffered mode -- tweaked
    6. Cartcomm.Dims_Create(..) -- limited functionality
    7. Request.Cancel(..) -- not implemented
    8. Printing long lines (>500 characters) with runtime -- limitation
    9. Exception "Another mpjrun module is already running on this machine"
    10. Permission issues while using MPJ Express runtime on Windows
    11. Mixing remote loading and local loading may end up in ClassNotFoundException for $MPJ_HOME/src/mpi/MPI.java
  1. Introduction

    This software (MPJ Express, or MPJE) is a reference implementation of the MPI bindings defined for the Java language. The current version follows the mpiJava 1.2 API specification. We plan to add support for the MPJ API in a subsequent release. Note that the difference between these two APIs lies in the naming schemes for classes and methods; the functionality provided to users is essentially the same for both.

    This release contains the source code and binaries of the MPJ Express library, as well as the runtime infrastructure. We have developed a test suite that imports various test cases from mpiJava and adds a number of new ones. This test suite checks the functionality of almost every MPI function. See the section "The MPJ Express test suite" for further details. This software has been tested on various UNIX and Windows operating systems. See the section "Tested platforms" for the list.

    There are two fundamental ways of running MPJE applications. The first, and recommended, way is to use the MPJ Express runtime infrastructure. The second way involves the 'manual' start-up of MPJE processes.

    The MPJ Express runtime infrastructure consists of daemons and the mpjrun module. Users first start daemons on a number of compute-nodes, which in this document means the machines that execute MPJE processes; these can be thought of as the compute-nodes of a cluster. Once the daemons are running, users invoke the mpjrun module (through the mpjrun.sh or mpjrun.bat scripts) on the cluster's head-node. The mpjrun module contacts the daemons, starts the MPJE application, and transports output back to the head-node so that users can follow the progress of their programs during execution.

    The runtime infrastructure can run code packaged as JAR files or as class files, and it provides the notion of local loaders and remote loaders. A user may prefer local loaders if the compute-nodes and the head-node have a shared file system, and the MPJ Express JAR files as well as the user application are available locally on the compute-nodes. Remote loaders, on the other hand, can be used when there is no shared file system on the compute-nodes, and the MPJ Express JAR files and the user application have to be fetched from the head-node.

    The second way, which is referred to in this document as 'manual', is to run the shell script 'runmpj.sh', which uses SSH to execute the code. This script can run JAR or class files, but it can only be used on UNIX-based operating systems. On Windows, running test cases and applications manually means starting each MPJE process with the java command.

    The MPJ Express infrastructure does not deal with security in the current release. The MPJ Express daemons could be a security concern, as they are Java applications that listen on a port and execute user code. It is therefore recommended that the daemons run behind a suitably configured firewall that only admits connections from trusted machines. In a normal scenario, these daemons would run on the compute-nodes of a cluster, which are not accessible from the outside world. Alternatively, MPJE processes can be started 'manually', which avoids the runtime daemons. In addition, each MPJE process starts at least one server socket, and is thus assumed to be running on a machine protected by a configured firewall. Most MPI implementations rely on firewalls as the protection mechanism against the outside world.

  2. Getting started

    1. The pre-requisite for using MPJ Express is Java 1.5 (stable) or higher. Make sure that you use a stable version, because there is a bug in the Java 1.5 beta that affects MPJ Express. If you are interested in compiling the source code of MPJ Express, see the section "Compiling MPJ Express source code and test suite".

    2. Download MPJ Express and unpack it. This should create a folder named "mpj-v<version_number>".

    3. Set the MPJ_HOME and PATH environment variables.
      • Linux (assuming MPJ Express is in '/home/aamir/mpj')
               export MPJ_HOME=/home/aamir/mpj
               export PATH=$PATH:$MPJ_HOME/bin 
        These lines may be added to ~/.bashrc
      • Windows (assuming MPJ Express is in 'c:\mpj')
        • Right-click My Computer->Properties->Advanced tab->Environment Variables and set the following system variables (user variables are not enough)
        • Set the value of the variable MPJ_HOME to c:\mpj
        • Append c:\mpj\bin to the value of the variable PATH
      • Windows with cygwin (assuming MPJ Express is in 'c:\mpj')
        • The recommended way is to set the variables as in Windows
        • If you want to set the variables in the cygwin shell instead:
                    export MPJ_HOME="c:\\mpj"
                    export PATH=$PATH:"$MPJ_HOME\\bin" 
          These lines may be added to ~/.bashrc
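
A quick way to confirm the settings from step 3 on Linux or cygwin is sketched below. The function name check_mpj_env is our own and is not part of MPJ Express; it only checks that the given directory contains a bin directory and that this bin directory appears in the given PATH value.

```shell
# check_mpj_env MPJ_DIR PATH_VALUE
# Hypothetical helper (not shipped with MPJ Express).
# Prints "ok" when MPJ_DIR/bin exists and is listed in PATH_VALUE,
# otherwise prints a short diagnostic and returns 1.
check_mpj_env() {
    mpj_dir=$1
    path_value=$2
    if [ -z "$mpj_dir" ]; then
        echo "MPJ_HOME is not set"
        return 1
    fi
    if [ ! -d "$mpj_dir/bin" ]; then
        echo "$mpj_dir/bin does not exist"
        return 1
    fi
    # Surround both sides with ':' so that only whole PATH entries match.
    case ":$path_value:" in
        *":$mpj_dir/bin:"*) echo "ok" ;;
        *) echo "$mpj_dir/bin is not on PATH"; return 1 ;;
    esac
}

# Typical call with the current environment:
#   check_mpj_env "${MPJ_HOME:-}" "$PATH"
```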

    4. Create a new working directory for MPJE programs. This document assumes that the directory is named mpj-user. Its location does not matter for the execution of the code. This directory will hold the user's MPJE programs, the machines file, and the configuration file (for manual execution).

    5. Start the daemons
      • cd mpj-user
      • Write a machines file that lists one machine name or IP address per line. Save this file as 'machines' in the mpj-user directory. More details on the format of the machines file can be found here
      • Installing and starting daemons
        • Linux: mpjboot machines
        • Windows: on each machine listed in the machines file:
          • Run %MPJ_HOME%/bin/installmpjd-windows.bat
          • Go to Control-Panel->Administrative Tools->Services->MPJ Daemon and start the service. It is important to start the daemon as a user process instead of a SYSTEM process. Click here to see how this can be done.
        • To check whether the daemons have started on the compute-nodes
          • Linux only: each daemon produces an MPJ-Daemon<machine_name>.pid file in the $MPJ_HOME/bin directory.
          • Each daemon produces a log file named daemon-<machine_name>.log in $MPJ_HOME/logs directory.
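
The log-file check above can be scripted. The sketch below is not part of MPJ Express and makes two assumptions: the logs directory is visible from the machine where the script runs (for example through a shared file system), and the <machine_name> part of each log file name is exactly the entry used in the machines file.

```shell
# daemons_missing MACHINES_FILE LOG_DIR
# Hypothetical helper: prints every machine listed in MACHINES_FILE
# for which LOG_DIR contains no daemon-<machine_name>.log file.
daemons_missing() {
    machines_file=$1
    log_dir=$2
    # The "|| [ -n ... ]" part also handles a last line without a newline.
    while IFS= read -r host || [ -n "$host" ]; do
        # Skip blank lines in the machines file.
        if [ -z "$host" ]; then
            continue
        fi
        if [ ! -f "$log_dir/daemon-$host.log" ]; then
            echo "$host"
        fi
    done < "$machines_file"
}

# Typical call from the mpj-user directory:
#   daemons_missing machines "$MPJ_HOME/logs"
```

An empty result means that a log file was found for every machine in the list.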

    6. Running test cases
      • cd mpj-user
      • Linux: mpjrun.sh -np 2 -jar $MPJ_HOME/lib/test.jar
      • Windows: mpjrun.bat -np 2 -jar %MPJ_HOME%/lib/test.jar
      • You may view sample output.

    7. Running your first MPJE application
      • Write an MPJE program, and save it as World.java. This document assumes that you have a 'machines' file in the mpj-user directory.
      • cd mpj-user
      • Compile
        • Linux: javac -cp .:$MPJ_HOME/lib/mpj.jar World.java
        • Windows: javac -cp .;%MPJ_HOME%/lib/mpj.jar World.java
      • Execute
        • Linux: mpjrun.sh -np 2 World
        • Windows: mpjrun.bat -np 2 World
      • You may also make a JAR file 'hello.jar' that contains World.class (see section "Writing and compiling MPJE programs" for details) and execute it
        • Linux: mpjrun.sh -np 2 -jar hello.jar
        • Windows: mpjrun.bat -np 2 -jar hello.jar
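
As an illustration of step 7 on Linux, the sketch below writes a minimal World.java and then compiles and runs it with the commands shown above. The Java source uses the mpiJava 1.2 naming (MPI.Init, MPI.COMM_WORLD.Rank, MPI.COMM_WORLD.Size, MPI.Finalize). The function name write_world_java and the guard that checks for MPJ_HOME and javac are our own additions.

```shell
# write_world_java DIR
# Hypothetical helper: writes a minimal MPJE program to DIR/World.java.
# The quoted 'EOF' stops the shell from expanding anything in the Java source.
write_world_java() {
    cat > "$1/World.java" <<'EOF'
import mpi.MPI;

public class World {
    public static void main(String[] args) throws Exception {
        MPI.Init(args);                    // start this MPJE process
        int rank = MPI.COMM_WORLD.Rank();  // id of this process
        int size = MPI.COMM_WORLD.Size();  // total number of processes
        System.out.println("Hello from process " + rank + " of " + size);
        MPI.Finalize();                    // shut down cleanly
    }
}
EOF
}

# Compile and run only when MPJ Express and a JDK are available.
if [ -n "${MPJ_HOME:-}" ] && command -v javac >/dev/null 2>&1; then
    write_world_java .
    javac -cp ".:$MPJ_HOME/lib/mpj.jar" World.java
    mpjrun.sh -np 2 World
fi
```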

  3. Writing and compiling MPJE programs

  4. Running MPJE programs with the MPJ Express runtime

    One of the challenging aspects of a Java messaging system is creating a portable mechanism for bootstrapping MPJE processes across various platforms. If the compute-nodes run a UNIX-based OS, it is possible to execute commands remotely using RSH/SSH, but if the compute-nodes run Windows, these utilities are not available. The MPJ Express runtime provides a unified way of starting MPJE processes on compute-nodes irrespective of the operating system they use. The runtime system consists of two modules. The 'daemon' module runs on the compute-nodes and listens for requests to start MPJE processes. The daemon is simply a Java application listening on an IP port, which starts a new JVM every time there is a request to run an MPJE process. The 'mpjrun' module acts as a client to the daemon module. It is started on, for example, the cluster head-node, contacts the daemons, and returns standard output for the user to view.

    With Java, it is possible to run applications using class files, or class files bundled as a JAR file. The MPJ Express runtime allows the execution of MPJE applications both as JAR files and as class files. Users may want to load MPJE JARs and classes either remotely or locally on the compute-nodes. With the remote loader, all classes (application and MPJ Express code) are loaded from the head-node. This is useful when there is no shared file system and the code is constantly being modified at the head-node. With the local loader, all classes (application and MPJ Express code) are loaded from the compute-node. This might be useful if there is a shared file system. As all classes are loaded locally, this might provide better performance than the remote loader. The default loader in the MPJ Express runtime infrastructure is the remote loader. The 'mpjrun' module provides the -jar switch to execute JAR files; no switch is required to execute class files. Users can select local loading with the -localloader switch. The -wdir switch can be used to run the code in the appropriate directory on the remote node. When running JAR files with -localloader, users should put the JAR in the CLASSPATH using the -cp switch.
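
The combinations of switches described above can be summarised by the following sketch. It only prints a command line and does not run anything. The function name mpjrun_command is our own, and the order of the switches, as well as the assumption that -wdir and -cp each take a single argument, is illustrative rather than taken from the mpjrun documentation.

```shell
# mpjrun_command NP TARGET [LOADER] [WDIR]
# Hypothetical helper: prints an mpjrun.sh command line.
#   TARGET  a class name, or a file ending in .jar
#   LOADER  "remote" (the default) or "local"
#   WDIR    optional working directory on the remote node
mpjrun_command() {
    np=$1
    target=$2
    loader=${3:-remote}
    wdir=${4:-}
    cmd="mpjrun.sh -np $np"
    if [ "$loader" = "local" ]; then
        cmd="$cmd -localloader"
    fi
    if [ -n "$wdir" ]; then
        cmd="$cmd -wdir $wdir"
    fi
    case "$target" in
        *.jar)
            # With -localloader the JAR must also be on the CLASSPATH.
            if [ "$loader" = "local" ]; then
                cmd="$cmd -cp $target"
            fi
            cmd="$cmd -jar $target"
            ;;
        *)
            # Class files need no switch.
            cmd="$cmd $target"
            ;;
    esac
    echo "$cmd"
}
```

For example, mpjrun_command 2 World prints the class-file form used in "Getting started", and mpjrun_command 2 hello.jar prints the -jar form.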

    MPJ Express uses the Java Service Wrapper Project software to install daemons as a native OS service. This means that some platform-specific code is needed. Currently, MPJ Express distributes only Linux- and Windows-specific native code. If you are interested in running MPJ Express daemons on other platforms such as AIX, FreeBSD, HP-UX, HP-UX64, IRIX, or MacOS, you can download the platform-specific code from the Java Service Wrapper Project. Some PATH variables in the scripts for these platforms will have to be changed. Feel free to contact us if you need any help with this. The rest of this section explains how to install, start, stop, and uninstall MPJ Express daemons on Linux and Windows. It also shows how to run your MPJE programs using the mpjrun module on these platforms.

  5. Running MPJE programs without the MPJ Express runtime (manually)

    We do not recommend starting programs manually as the normal procedure. This section documents the procedure for manual start-up, mainly to allow developers the flexibility to create their own initiation mechanisms for MPJE programs. The runmpj.sh script can be considered one example of such a mechanism.

  6. The MPJ Express test suite

    MPJ Express contains a comprehensive test suite that tests the functionality of almost every MPI function. This test suite consists mainly of mpiJava test cases, MPJ JGF benchmarks, and MPJ micro-benchmarks. The mpiJava test cases were originally developed by IBM and later translated to Java. As this software follows the mpiJava API, these test cases can be used with little modification. The MPJ JGF benchmarks are developed and maintained by EPCC at the University of Edinburgh. MPJ Express redistributes these benchmarks as part of its test suite. The original copyrights and license remain intact, as can be seen in the source files of these benchmarks in $MPJ_HOME/test/jgf_mpj_benchmarks. Further details about these benchmarks can be seen here. MPJ Express also redistributes micro-benchmarks developed by Guillermo Taboada. Further details about these benchmarks can be obtained here

    The suite is located in the $MPJ_HOME/test directory. The test cases have been changed from their original versions in order to automate testing. TestSuite.java is the main class that calls each of the test cases present in this directory. The build.xml file in the test directory compiles all test cases and places test.jar into the lib directory. By default, the JGF MPJ benchmarks and MPJ micro-benchmarks are disabled. Edit $MPJ_HOME/test/TestSuite.java to uncomment these tests and execute them. Note that after changing TestSuite.java, you will have to recompile the test suite by executing 'ant' in the test directory.


  7. Compiling MPJ Express source code and test suite


  8. Tested platforms


  9. Java-docs

    Java-docs can be seen in $MPJ_HOME/doc/javadocs

  10. Contact and support


  11. Miscellaneous

    1. Turning debugging on and off

    2. Changing protocol switch limit

    3. Running daemons in console mode

       For debugging purposes, it is sometimes useful to run the daemons in console mode. This can be done as follows:
      1. cd $MPJ_HOME/bin
      2. On UNIX systems, execute ./mpjdaemon_linux_x86_32 console. This starts the daemon on a 32-bit x86 processor; choose the appropriate script for your machine.
      3. On Windows, execute cd %MPJ_HOME%/bin and then wrapper.exe -c ../conf/wrapper.conf

    4. Running MPJ Express daemons on Solaris, PowerPC Linux, and PowerPC Mac OS X

       With default settings, attempting to start MPJ Express daemons on UltraSPARC Solaris, PowerPC (PPC) Linux, or PPC Mac OS X results in an error like this:
       
           mpjboot machines 
           Starting mpjd... 
           ./mpjdaemon_linux_x86_32: line 1: ./daemon_linux_x86_32: cannot execute binary file 
         
      The reason is that by default x86 based code is called, which naturally does not work on PPCs and UltraSPARCs. We are currently in the process of writing smart scripts that call the appropriate libraries based on the processor architecture and operating system. In the meantime, this problem can be fixed in the following way:
      • Solaris
         
        	  a. Edit $MPJ_HOME/bin/mpjboot and $MPJ_HOME/bin/mpjhalt
        	  b. Comment the line ssh $host "cd $MPJ_HOME/bin;./mpjdaemon_linux_x86_32 start;"
        	  c. Uncomment the line #ssh $host "cd $MPJ_HOME/bin;./mpjdaemon_solaris_sparc_64 start;"
        	  d. cd $MPJ_HOME/lib
        	  e. cp libwrapper.so_solaris_sparc_64 libwrapper.so
        	  
      • PPC64 Linux
         
        	  a. Edit $MPJ_HOME/bin/mpjboot and $MPJ_HOME/bin/mpjhalt
        	  b. Comment the line ssh $host "cd $MPJ_HOME/bin;./mpjdaemon_linux_x86_32 start;"
        	  c. Uncomment the line #ssh $host "cd $MPJ_HOME/bin;./mpjdaemon_linux_ppc_64 start;" 
        	  d. cd $MPJ_HOME/lib
        	  e. cp libwrapper.so_linux_ppc_64 libwrapper.so
        	  
      • PPC32 Mac OS X
         
        	  a. Edit $MPJ_HOME/bin/mpjboot and $MPJ_HOME/bin/mpjhalt
        	  b. Comment the line ssh $host "cd $MPJ_HOME/bin;./mpjdaemon_linux_x86_32 start;"
        	  c. Uncomment the line #ssh $host "cd $MPJ_HOME/bin;./mpjdaemon_macosx_ppc_32 start;"
        	  d. cd $MPJ_HOME/lib
        	  e. cp libwrapper.jnilib_macosx_ppc_32 libwrapper.jnilib
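
The platform selection that the manual steps above perform by hand could be automated along the following lines. This is only a sketch of the idea, not the script shipped with MPJ Express. The function name mpj_daemon_script is our own, the script names are the ones listed above, and the mapping assumes the usual values reported by uname -s and uname -m on these systems (for example sun4u on UltraSPARC Solaris and "Power Macintosh" on PPC Mac OS X).

```shell
# mpj_daemon_script OS ARCH
# Hypothetical helper: prints the daemon script name for a platform,
# where OS is the output of `uname -s` and ARCH the output of `uname -m`.
# Returns 1 for platforms not covered in this document.
mpj_daemon_script() {
    os=$1
    arch=$2
    case "$os:$arch" in
        Linux:i?86)               echo "mpjdaemon_linux_x86_32" ;;
        Linux:ppc64)              echo "mpjdaemon_linux_ppc_64" ;;
        SunOS:sun4*)              echo "mpjdaemon_solaris_sparc_64" ;;
        Darwin:"Power Macintosh") echo "mpjdaemon_macosx_ppc_32" ;;
        *)
            echo "unsupported platform: $os $arch" >&2
            return 1
            ;;
    esac
}

# Typical call on a compute-node:
#   mpj_daemon_script "$(uname -s)" "$(uname -m)"
```

A script such as mpjboot could then call the selected daemon script instead of the hard-coded mpjdaemon_linux_x86_32. The matching wrapper library would still have to be copied as described in step e.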
        	  

    5. API changes between mpiJava-1.2.x and MPJ Express

       To see the API differences between mpiJava-1.2.x and MPJ Express, read $MPJ_HOME/doc/APICHANGES.txt

    6. Acknowledgements

       We would like to thank:

  12. Known issues and limitations