My car looks fantastic!
I had to take it in to be detailed to get rid of the mildew smell from when my AC self-imploded (read: the output valves got clogged and all the water ran off into my front passenger footwell. Eeewww!!). I took the car in this morning, and when I went to pick it up, I was amazed: the car looks 5 years younger. They vacuumed and shampooed everything, and used the make-the-plastic-look-new stuff. They buffed and shined, and gave my car a complete exterior car wash. It looks amazing.
I could see people that I drove by gaping at my car, then touching their nose, pointing to my car and saying to their neighbor, "You see? That's what a 1993 Honda Civic is supposed to look like."
Spent too much time on LAM/MPI today. But I resolved some important bugs:
- We finally got confirmation that we fixed the lamboot race condition. Hurray for the good guys!
- I found a bug in the lamd today such that any new process that it forked (e.g., via mpirun) would inherit all the file descriptors of the named unix socket client connections that the lamd had open. Oops. The spawn code now closes everything except stdin, stdout, and stderr (which it replaces with whatever mpirun/lamexec gives it, anyway).
- I made the show_help() function a bit more robust in that it will try harder (and smarter) to find the helpfile. It will even display a specific error if it finds the helpfile but can't open it (e.g., if the process is out of file descriptors). Indeed, we now save errno properly so that when we use the %perror or %errno tokens in the LAM helpfile, it will display the correct errno, not just the last one.
We still may be having issues with really large numbers of nodes, though. Theoretically, we should be able to go up to 1024 - (stdin, stdout, stderr, and a socket to the local lamd) ranks, since that's how many descriptors the fd_set type used with select(2) can handle, but we seem to be falling way short of that for some reason. There's a user in Germany who is trying to use LAM with 528 nodes (he was thrilled when I gave him a copy of the 6.3.3 beta with lamhalt in it -- he says that a lamboot can take up to 10 minutes!). I am still investigating this.
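The arithmetic behind that ceiling can be written down as a small sketch. FD_SETSIZE is the standard constant from <sys/select.h>; the helper names and RESERVED_FDS are made up here for illustration and are not LAM functions. The 1024 figure above is what FD_SETSIZE commonly is, but the code just uses whatever the platform defines.

```c
#include <sys/select.h>

/* The descriptors the text counts as already taken in each process:
 * stdin, stdout, stderr, and one socket to the local lamd. */
enum { RESERVED_FDS = 4 };

/* Theoretical number of peer sockets one select(2)-based process could
 * watch (hypothetical helper, purely to show the subtraction). */
int max_selectable_peers(void)
{
    return FD_SETSIZE - RESERVED_FDS;
}

/* A descriptor can only be stored in an fd_set if it is below FD_SETSIZE;
 * calling FD_SET() with a larger value is undefined behavior. */
int fits_in_fd_set(int fd)
{
    return fd >= 0 && fd < FD_SETSIZE;
}
```

Checking fits_in_fd_set() before every FD_SET() call would at least turn a silent overflow into a visible error.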
An engineer from GE Aircraft Engines mailed me today, concerned about the [accidental] inclusion of the GNU license in LAM 6.3.2, because they want to use LAM internally. I told him that all was well -- its inclusion was accidental and I would never cut off my shuga-momma's company like that.
Other random acts of goodness:
- Hooked Janna up with John's extra ND/Stanford tickets.
- Saw a neat article today (from dad) about how Scott Malpass has really, really grown the ND endowment since he started managing it. Did you know that ND was one of the initial investors in Yahoo!?
- Got into an interesting discussion with Arun and Rich yesterday about barber shops when Rich said something about "Arun-esque". This triggered a long forgotten memory about the word "Arunesque", which I shared with them. Long story short: "Arunesque" means "to celebrate", or "to perform a ceremony for".
- Since they don't seem to broadcast News Radio down here, I have had to replace it with something else. The Drew Carey show seems to do nicely. I've always liked Drew Carey, and his shows are pretty funny. I highly recommend them to anyone who hasn't seen them -- I'd rate most of them at 17.5 minutes.
- I took the most recent copy of LAM's inetexec.c (the code that uses rsh to spawn things on remote machines), C++-ized it, and started working on it to do tree-based boots, and to allow nodes to fail during the boot. I stole a bunch of minime code to do this as well -- the result will get merged back into minime before it gets merged back into LAM -- because I wanted to do it in a small system first. Minime isn't large, but it sure isn't small (12,000+ lines of C++ code).
- Tracy's music group at church had a little "congrats" reception for us last night. Free food and wine, plus they gave us a bunch of gift certificates. I love all the free stuff that you receive when you get married; I should do it more often. No, wait...
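For the tree-based boot idea mentioned in the list above, here is a rough sketch of how the node numbering could work. It assumes a plain k-ary tree rooted at node 0, which is only one possible shape. The function names are hypothetical, none of this comes from inetexec.c or minime, and it does not attempt the failure handling.

```c
/* Parent of `rank` in a k-ary tree rooted at node 0; -1 for the root.
 * (Hypothetical layout, assumed for illustration.) */
int tree_parent(int rank, int fanout)
{
    return rank == 0 ? -1 : (rank - 1) / fanout;
}

/* Fill `out` with the ranks that `rank` is responsible for booting and
 * return how many there are. `out` must have room for `fanout` entries. */
int tree_children(int rank, int fanout, int nnodes, int *out)
{
    int count = 0;

    for (int i = 1; i <= fanout; ++i) {
        long child = (long)rank * fanout + i;
        if (child >= nnodes)
            break;
        out[count++] = (int)child;
    }
    return count;
}

/* Number of hops from the root to the deepest node. */
int tree_depth(int fanout, int nnodes)
{
    int depth = 0;

    for (int last = nnodes - 1; last > 0; last = (last - 1) / fanout)
        ++depth;
    return depth;
}
```

With 528 nodes and a fanout of 2, tree_depth() gives 9, so the farthest node is nine hops from the origin instead of waiting behind hundreds of sequential spawns.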
Miles to code before I sleep...
(I've pointed this out before, but I just love jjc. It pointed out 3 places where I didn't close my HTML tags properly, and let me go back and edit it before I submitted. With all the <code> tags that I used in this entry [which pine does not show, sadly -- see the web page at http://jeff.squyres.com/journal/], I accidentally repeated <code> instead of the proper closing tag a few times. Happy, happy, joy, joy...)