Open Channel Foundation

MPICH Support

Fortran Compilers

Some users have had difficulty mixing C and Fortran compilers from different vendors. For example, when using GNU gcc and the Portland Group pgf77, C programs would fail to link. This is because, in MPICH 1.2.3, the definitions of some Fortran types are computed at MPI_Init time by a small Fortran routine, and in some cases this routine refers to symbols that are defined only in the Fortran run-time libraries. To fix this, add the Fortran libraries when building C programs. You can do this with the -lib switch in configure:
configure -cc=gcc -fc=pgf77 -lib="-L/usr/local/pgi/linux86/lib -lpgftnrtl -lpgc" ...
The next release of MPICH will attempt to avoid the need for these libraries when possible and will determine them for you otherwise.

If you are using a Fortran 90 compiler as your Fortran compiler (e.g., -fc=pgf90), you'll need a longer list of libraries. Look in your Fortran compiler's library directory to find them. For example,

configure -cc=gcc -fc=pgf90 -lib="-L/usr/local/pgi/linux86/lib -lpgf90 -lpgf90rtl -lpgftnrtl -lpgc" ...

Red Hat 7.0 Compilation Problems

Red Hat 7.0 ships with gcc 2.96, a development version of gcc that should not have been used by Red Hat. See http://gcc.gnu.org/gcc-2.96.html for GNU's discussion of this.
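To check whether a given build is being compiled by this gcc version, the compiler's predefined version macros can be tested. The following is a minimal sketch, assuming the compiler identifies itself through __GNUC__ and __GNUC_MINOR__ as 2 and 96; the function names are hypothetical and are not part of MPICH.

```c
#include <stdio.h>

/* Returns 1 when this file is compiled by a gcc that reports itself as
 * version 2.96, and 0 for any other compiler or version.
 * __GNUC__ and __GNUC_MINOR__ are macros predefined by gcc. */
int compiled_with_gcc_2_96(void)
{
#if defined(__GNUC__) && defined(__GNUC_MINOR__)
    return (__GNUC__ == 2 && __GNUC_MINOR__ == 96);
#else
    return 0;
#endif
}

/* Hypothetical helper: prints a one-line report about the compiler. */
void report_compiler(void)
{
    if (compiled_with_gcc_2_96())
        printf("gcc 2.96 detected; see http://gcc.gnu.org/gcc-2.96.html\n");
    else
        printf("compiler does not report itself as gcc 2.96\n");
}
```

Calling report_compiler() from a small main program built with the same compiler command that configure will use shows which case applies.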


LINUX TCP Performance

There is a bug in the implementation of TCP in some versions of Linux. This has been documented by Josip Loncaric. His description of the problem and fix are available on the Web. He reports that as of Linux 2.4.3, this TCP patch is no longer necessary.


Fortran 77 and Fortran 90/95

MPICH currently supports Fortran 77. There is some support for using Fortran 90/95, but this is still imperfect. There is a table that shows some of the combinations of configure options that can be used to make MPICH work with Fortran 90 and Fortran 95 compilers.


Failures in System Includes on Solaris

Some users have had problems with the make failing when compiling the routines in mpid/ch_p4 on Solaris. A typical output is
...
In file included from /usr/include/rpc/rpc.h:38,
from p4/include/p4.h:16,
from chdef.h:9,
from ../ch2/packets.h:350,
from mpiddev.h:15,
from adi2recv.c:9:
/usr/include/rpc/auth_des.h:58: field `adv_ctime' has incomplete type
...
This appears to be caused by the MPICH configure rejecting cc as the C compiler (because it does not support function prototypes) and picking gcc instead. Unfortunately, configure still uses cc for the C preprocessor and gets confused. This is actually a bug in autoconf: it should not use a C preprocessor to decide what the C compiler will find, because the compiler's search paths might depend on various options.

To fix this, add -cc=gcc to the configure line, reconfigure and remake.


Unsolved Problems

errno=110 or connection timed out in Linux
This indicates that Linux has closed a TCP connection. This can happen if, just when MPICH is trying to communicate a message, the network interconnect becomes very busy. Other operating systems have less fragile TCP connections. We are working on a fix for this, but a better one would be for Linux to provide more robust TCP connections.
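For reference, on most Linux architectures (including x86) errno value 110 is the symbolic constant ETIMEDOUT. The sketch below shows one way an application could recognize and report this error by name rather than by number. Both function names are hypothetical; they are not MPICH routines.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if err is the "connection timed out" error, 0 otherwise.
 * Comparing against ETIMEDOUT avoids hard-coding the number 110,
 * which is not the same on every platform. */
int is_connection_timeout(int err)
{
    return err == ETIMEDOUT;
}

/* Hypothetical helper: writes "errno=<number> (<system message>)"
 * into buf, truncating if buf is too small. */
void describe_errno(int err, char *buf, size_t len)
{
    snprintf(buf, len, "errno=%d (%s)", err, strerror(err));
}
```

Passing the errno value saved immediately after a failed socket call to describe_errno() gives a message that can be compared with the one reported here.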

Shared Memory in LINUX

Shared memory in LINUX (ch_shmem and, we expect, ch_p4 with -comm=shared) may fail. On some Linux systems, the system call semget() fails to create a new semaphore set. In particular, MPICH returns the following error message when attempting to run an executable:

semget failed for setnum = 0
This seems to be a problem with LINUX and not with MPICH. While attempting to uncover the problem, we found that most LINUX machines worked but some did not. Further investigation led us to two C programs from the "LINUX Programmer's Guide":

semtool.c - command line tool for tinkering with SysV style semaphore sets
semstat.c - command line tool for the semtool package that displays the current value of all semaphores in the set created by semtool (not necessary)

These tools are useful in determining whether your LINUX machine will run MPICH in shared memory. If you are able to create a semaphore by running semtool, it is highly probable that MPICH will run on your LINUX machine with shared memory. At the moment, we are attempting to get to the root of the problem and would appreciate any suggestions or comments from the LINUX world.
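If semtool is not at hand, a similar check can be made with a few lines of C. The following is a minimal sketch, not taken from MPICH or from semtool: it asks the kernel for a private System V semaphore set and removes it again. The function name probe_semget is hypothetical, and the permission bits 0600 are an assumption chosen for the example.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Tries to create a private System V semaphore set holding nsems
 * semaphores, then removes it again.
 * Returns 0 on success, or the errno value of the call that failed. */
int probe_semget(int nsems)
{
    int semid = semget(IPC_PRIVATE, nsems, IPC_CREAT | 0600);
    if (semid == -1)
        return errno;

    /* Remove the set so the probe does not leak kernel resources. */
    if (semctl(semid, 0, IPC_RMID) == -1)
        return errno;

    return 0;
}
```

A return value of 0 from probe_semget(1) suggests that semaphore sets can be created on the machine. A nonzero value can be passed to strerror() to see why the kernel refused.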

One user found that the include files
/usr/include/linux/sem.h
(used by the kernel) and
/usr/include/sys/sem.h
and
/usr/include/sys/ipc.h
were not consistent, causing programs using semaphores to fail. If you have this problem, you need to correct your installation of Linux.


Problems with MPI-IO and NFS

The network file system (NFS) must be configured extremely carefully for MPI-IO (and many other programs) to work correctly. Unfortunately, few systems are so configured, and doing so can adversely impact performance. As a result, programs using files on an NFS system may hang or produce incorrect results. Note that this is, officially, a design feature of NFS: unless the NFS system is configured with no attribute caching, any two processes accessing the same file may produce incorrect results. You can use the -file_system=ufs option of configure to build an MPICH that supports only UFS (Unix File System). MPI-IO works correctly with UFS, XFS, PIOFS, HFS, SFS, etc. (more precisely, with those file systems that correctly implement the basic Unix I/O system calls, which NFS does not).
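When it is unclear whether a file used by a program lives on NFS, the file system type can be queried on Linux. The sketch below is an illustration only and is not part of MPICH or ROMIO. It relies on the Linux-specific statfs() call, the function name path_is_on_nfs is hypothetical, and the constant 0x6969 is the value that Linux documents for NFS_SUPER_MAGIC.

```c
#include <sys/vfs.h>   /* statfs(), Linux-specific */

/* Value of NFS_SUPER_MAGIC as listed in the Linux statfs(2) manual page. */
#define EXAMPLE_NFS_MAGIC 0x6969UL

/* Returns 1 if path is on an NFS file system, 0 if it is on some other
 * file system, and -1 if statfs() fails (for example, because the path
 * does not exist). */
int path_is_on_nfs(const char *path)
{
    struct statfs fs;

    if (statfs(path, &fs) == -1)
        return -1;

    return (unsigned long)fs.f_type == EXAMPLE_NFS_MAGIC;
}
```

A result of 1 for the directory that holds the MPI-IO files indicates that the NFS configuration issues described above apply to them.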

Detailed instructions on setting up NFS can be found in the installation manual or online.


Problems not in MPICH

Some problems are caused by compiler bugs. Some of these are:

Solaris C
Version SC3.0.1 will fail with
cg: assertion failed in file ../src/regman/regman_reporter.h at line 36
cg: add_edges_to_new_node -- unexpected edges
cg: 42 warnings, 1 errors
cc: cg failed for ad_read_coll.c
Removing -O from all of the Makefiles (particularly in ROMIO) may fix this.

GNU gcc
Version 2.8.x does not handle the command line argument -I./ correctly.

Compaq Fortran
Version 5.3-915 of f95 has a bug that causes it to fail on files that have the extension .F (i.e., files to be processed with the preprocessor). In this case, specify the Fortran 90 compiler to be f90 instead of f95.

An update to Compaq Fortran 5.3 exists in Compaq's FTP repository at ftp://ftp.compaq.com/pub/products/fortran/Tru64/. Please see the file readme.txt in this directory.

You may also want to check on patches for the system that you are running on.

HP patch database:
Europe or US




Open Channel Software runs entirely on Open Source Software. We return value to the Software community in the form of services and original software. Most of our content is currently available as source code, with the copyright owned by the original author, All Rights Reserved. Everything else is Copyright ©2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010 Open Channel Software.
