Dave, we're *not* sinking!

I started the C++ Friday Lunch list today.

I subscribed everyone. We'll start next week after everyone has a chance to get the book.

Been working on Tucson heavily.

It took a lot longer to do the MPI queue than I thought. Particularly with respect to arrays of requests. Every time I thought I had it right, I realized that the abstractions were just slightly off, and that would cascade into a whole chain of side-effects and whatnot.


Took quite a while to get it right. I think I've got it right now -- it all compiles -- but I'm too tired to try it (it can't possibly work -- it's hundreds of lines of code that's all brand new). I'll debug tomorrow.

I really want to have it working -- or at least major parts of it working -- so that I can have some kind of reportable results on Tuesday for my meeting w/ Lummy.

I'm seeing some really weird cron behavior on queeg. Until now, I thought the problem was with my script somehow and so I ignored it. The problem is that I sometimes get double entries in my checking-DSL-connectivity log. The script is fired up by cron every minute to check my DSL connectivity, and sometimes I get entries in the log at both xx:xx:59 and xx:xx:00.

I thought my script was just mucking up somehow (it is actually somewhat complicated), so I never bothered to check, because both entries in the log were correct. But today I noticed that cron itself is actually launching the script twice.

My line in crontab is:

 * * * * * /usr/local/bin/check_up.pl 

Watching /var/log/messages, sometimes I see double entries:

 May 12 22:42:59 queeg CROND[15952]: (jsquyres) CMD (/usr/local/bin/check_up.pl)
 May 12 22:43:00 queeg CROND[15954]: (jsquyres) CMD (/usr/local/bin/check_up.pl)


DSL dropped out twice today, each for <= 30 minutes. But still annoying, nonetheless. Same old problem -- packets stopping in Atlanta. Gumdangit, BellSouth!

Can't get to anything, though -- not even Excite.


Stupid Linux thread model. I know that I saw a web page once that went through it and said why it was a good thing that threads are different processes (other than "it was an easy hack"). I did some web searches and can't find it.


This is going to be a problem for LAM itself when we make it multithreaded, because of what I described in a previous journal entry. I did find the function pthread_atfork, though, and I think it can be used to fix this problem. There will have to be a cached value of getpid(), and at fork time, we'll have to zero out the cached value.

This can work. I haven't fully thought this out yet, but I'm quite sure that this scheme can work. It may require an additional configure test, too, which may be a bummer, but possibly not.

xmms crashed earlier today. I notice that I have xmms 1.2.3, and 1.2.5pre1 was announced on freshmeat today.

There are 98 copies of xmms running, out of 173 total processes (57%).

