
That's a good ploy, Dave, to pretend that the ship is sinking.

Linux really sucks sometimes.

I'm working heavily on Tucson, and since yesterday morning I've been fighting a bug where the manager wouldn't spawn children properly. LAM/MPI would return an error saying that rpcreatev() -- one of the underlying functions beneath MPI_COMM_SPAWN that LAM uses to actually spawn a remote process -- had failed.
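For reference, the kind of call that was failing looks roughly like this. This is just a minimal sketch -- not the actual Tucson code -- and the "worker" program name and process count are made up:

    #include <mpi.h>

    int main(int argc, char *argv[])
    {
        MPI_Comm children;
        int errcodes[4];

        MPI_Init(&argc, &argv);

        /* The manager asks MPI to start 4 copies of "worker" and gets back
           an intercommunicator connecting it to them.  Under the covers,
           LAM implements this with rpcreatev() -- the call that was
           failing. */
        MPI_Comm_spawn("worker", MPI_ARGV_NULL, 4, MPI_INFO_NULL,
                       0, MPI_COMM_SELF, &children, errcodes);

        MPI_Comm_free(&children);
        MPI_Finalize();
        return 0;
    }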

I couldn't figure out why this routine was failing -- it's used successfully in many different places. It's used in mpirun itself, and isn't failing there, for example. So why is it failing in MPI_COMM_SPAWN?
I tried to use gdb and ddd to track the problem down, but gdb kept seg faulting. Sigh. Linux debuggers are generally useless. I was reduced to printf debugging in a multi-threaded, parallel program. Do you have any idea how painful that is? Sigh.

It took me quite a while, but I finally figured out what the problem was.

Each LAM client has a global structure named _kio that contains, among other things, the PID of the process that is using LAM. That is, each MPI program has to call MPI_INIT, which, in turn, calls the internal LAM function kinit, which opens a socket to the local LAM daemon and does some other bookkeeping. One of the things it does is cache the PID of the kinit-calling (i.e., MPI_INIT-calling) process on this global _kio struct. That way, if you fork and the child then invokes a LAM function, LAM will notice that the child process is not registered with the LAM daemon and can therefore throw an error.

Note that only some MPI functions end up doing this compare-the-PID check. One class of examples is MPI functions that need to send out-of-band (OOB) information, such as MPI_COMM_SPAWN.
This scheme actually works fine, and has prevented me from doing stupid things in the past.
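Roughly, the idea looks like this. This is just a hypothetical sketch -- the function names are made up and the real _kio struct in LAM has a lot more in it -- but it shows the cache-and-compare pattern:

    #include <stdio.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/wait.h>

    /* Hypothetical sketch of the PID-caching scheme; the real _kio struct
       in LAM has many more fields, and these function names are made up. */
    struct kio_t {
        pid_t pid;   /* PID of the process that called kinit/MPI_INIT */
        /* ... socket to the local LAM daemon, etc. ... */
    };

    static struct kio_t _kio;

    /* Done inside kinit (i.e., MPI_INIT): remember who registered. */
    static void kinit_cache_pid(void)
    {
        _kio.pid = getpid();
    }

    /* Done by OOB-using calls such as MPI_COMM_SPAWN: a mismatch means the
       caller never registered with the daemon (e.g., a forked child), so
       LAM can throw an error instead of misbehaving. */
    static int lam_check_registered(void)
    {
        return (getpid() == _kio.pid) ? 0 : -1;
    }

    int main(void)
    {
        kinit_cache_pid();
        printf("parent check: %d\n", lam_check_registered());        /* 0 */

        if (fork() == 0) {
            printf("forked child check: %d\n", lam_check_registered()); /* -1 */
            return 0;
        }
        wait(NULL);
        return 0;
    }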

However, it has caused me much grief over the last 24 hours, because Linux implements threads as processes. Hence, each thread has a different PID. End result: MPI_COMM_SPAWN will end up comparing the calling thread's PID with the one cached on _kio. If they don't match, boom.
This is a problem if any thread other than the one that invoked MPI_INIT invokes these MPI functions. That is, even if we guarantee that only one thread is "in MPI" at any given time, the current scheme in LAM will fail, because each thread has a different PID.
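You can see this for yourself on a LinuxThreads-era box with a tiny standalone program (not from LAM or Tucson) -- each thread reports a different getpid():

    #include <pthread.h>
    #include <stdio.h>
    #include <unistd.h>

    /* Each thread prints its own getpid().  Under LinuxThreads (the
       threading implementation on 2001-era Linux), every thread is a
       separate process, so the PIDs differ; under later NPTL kernels
       they would all be the same. */
    static void *report_pid(void *arg)
    {
        printf("thread %d: getpid() = %d\n", *(int *) arg, (int) getpid());
        return NULL;
    }

    int main(void)
    {
        pthread_t threads[3];
        int ids[3] = { 0, 1, 2 };
        int i;

        printf("main: getpid() = %d\n", (int) getpid());
        for (i = 0; i < 3; ++i)
            pthread_create(&threads[i], NULL, report_pid, &ids[i]);
        for (i = 0; i < 3; ++i)
            pthread_join(threads[i], NULL);
        return 0;
    }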


I don't quite know how to solve that in LAM yet (there's probably some way to get a unified PID for all the threads in a single process... need to look that up...), but I do know how to solve it in Tucson: force all MPI calls to be in a single thread. What a pain.
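The pattern I mean is basically funneling: one dedicated thread (the one that called MPI_INIT) makes all the MPI calls, and every other thread hands its MPI work to it. Here's a bare-bones sketch of that pattern -- hypothetical names, not the actual Tucson code:

    #include <pthread.h>
    #include <stddef.h>

    /* A simple LIFO work list: worker threads enqueue requests, and a
       single dedicated "MPI thread" (the one that called MPI_INIT)
       pulls them off and makes the actual MPI calls.  That way the PID
       that LAM sees always matches the one cached at MPI_INIT time. */

    struct mpi_request {
        void (*run)(void *arg);     /* function that makes the MPI calls */
        void *arg;
        struct mpi_request *next;
    };

    static struct mpi_request *work_list = NULL;
    static pthread_mutex_t work_lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t work_cond = PTHREAD_COND_INITIALIZER;

    /* Called by any thread: hand the MPI work off instead of doing it. */
    void submit_mpi_request(struct mpi_request *req)
    {
        pthread_mutex_lock(&work_lock);
        req->next = work_list;
        work_list = req;
        pthread_cond_signal(&work_cond);
        pthread_mutex_unlock(&work_lock);
    }

    /* Run only in the thread that called MPI_INIT. */
    void *mpi_service_thread(void *unused)
    {
        (void) unused;
        for (;;) {
            struct mpi_request *req;

            pthread_mutex_lock(&work_lock);
            while (work_list == NULL)
                pthread_cond_wait(&work_cond, &work_lock);
            req = work_list;
            work_list = req->next;
            pthread_mutex_unlock(&work_lock);

            req->run(req->arg);     /* e.g., calls MPI_Comm_spawn() */
        }
        return NULL;
    }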


There are 377 copies of xmms running, out of 460 total processes (81%).
