## September 1, 2000

### All things being equal, LAM rocks

And version 1.0.6 of the MPI 2 C++ bindings has been released with extraordinarily little fanfare. See what's new (it's actually nothing very interesting :-). The test suite still hangs in MPICH, but they say that that's ok, 'cause neither they nor I can figure out why... Seems to be some kind of Heisenbug in MPICH itself (shudder).

Took the red eye with Lummy last night. Got to Cincinnati at 6am. Got to South Bend around 9am. Came to the lab and have been here ever since.

I gave my talk on the generalized master/slave parallelism stuff at lunch. It seemed to go well, but I wish that I had had a blackboard or whiteboard to use. :-(

Had a code review with Arun w.r.t. LAM/gm. Arun seems to have some kind of medical condition in his thumbs that prevents him from hitting the spacebar -- for this, I forgive him for the enormous lack of white space in his code (making it squished together and hard to read -- but who am I to judge? Oh... wait. I'm his boss). We recompiled LAM and his test program with the Solaris compilers so that he can use bcheck to find some Random Badness (there's at least one write to unallocated memory in a simple MPI_INIT/MPI_FINALIZE program -- oops).

Spent the rest of the afternoon finishing up the MPI 2 C++ bindings so that they could be released, and so that Elliott can continue working on what Mike Shepherd started -- finishing the rest of the C++ bindings for the MPI-2 functions. So 1.0.6 has been released and I created a tag in CVS; now I'll go commit all of Mike Shepherd's stuff. Woo hoo! (I also have to re-import the C++ bindings to LAM/MPI... mmm... find the stupid CVS manual for 3rd party imports... ggggrrrrrphhhh...)

Gonna go meet Lynzo and some other random bones for dinner after the pep rally. Go Irish, beat Aggies! (I have to admit, I'm not hopeful this year, but good ol' #87 Jabari Holloway is one of the captains -- if anyone can lead that team to victory, it is he).

## September 3, 2000

### I am serious. And stop calling me Shirley.

Notre Dame won yesterday vs. Texas A&M -- 24 to 10. It was our home opener, and it was hotter than two dogs... er... lying around after a big run (it was apparently 116 degrees on the field).

Quote of the Day from Arun, when we briefly discussed yesterday's game as I came in today: I made some remark about how I got a little sunburn and showed him my ultra-cool watch band tan line (chicks dig it, just like chicks dig MPI). Arun replied, "It must suck to be genetically inferior that way."

Classic.

Saw many old friends this weekend -- had dinner with Schleggue (although he joined us late), Lynzo, Vern, Pam Tyner, and Rachel Canata at Macri's on Friday. We then went to Corby's and then Senior Bar. As is my moral obligation, I got Lynn nicely drunk on vodka tonics. Game day was fun; hung out with Ed and Suzanne a bit and then saw them later about 10-15 rows below us in the student section. We sat with Dog, Jeremy Siek, Katie, Mike Niemier, Brian Bussing and his fiancée Dana Collins. A good time was had by all, and we all drank a lot of water ("it's not just water -- it's Notre Dame Water"). We saw our boy Jabari out on the field, and he looked good -- he made some good plays, had some good catches and blocks; he generally did us proud. After the game, I even saw him take a bit of a leadership role with the guys on the field, further confirming my previous journal entry: if this team is going anywhere this semester, Jabari is going to have a lot to do with it. For those who have never met Jabari, he's a great guy -- really nice, tries to study hard (I can't even imagine trying to get all my work done *and* have a hellish practice and travel schedule; it's hard to be an NCAA athlete at Notre Dame...), etc. Jabari rocks.

We went out to dinner last night at Outback and all of us had too much to eat (Mary and Pete Calizzi joined us, too). A good time was had by all. We saw Ruth Riley there with some of her family/friends/whatever, but we didn't bother her. We went back to Dog's place afterwards and watched the Matrix. Then everyone hit the road (no one was staying close). It was good to see them all again.

I'm here in the lab for an inilib code review (and I'm late, 'cause I'm typing this entry...) with Brian, so with a big shout out to all my homies out there, PEACE, OUT.
(BTW, we're listening to "The Moog Cookbook" here in the lab. Does life get any better than this?)

## September 4, 2000

### Wedding 2K

Spent the day continuing to set up my new machine; still haven't got X quite right because I can't get KDE working properly. I'm working in plain old twm, and it's stifling. Ugh.

Did some more cleanup around the house (it's still a wreck from all the wedding presents), and finally watched our wedding video with Tracy (it actually came last week, but we were both traveling). There are some utter classic moments in there (funny how everyone else's wedding video is cheesy, but yours is fantastic...):

• Renzo, while we're standing around before the ceremony: "You just give the signal, and we'll get you right outta here."

• Fr. Hesburgh: "Jerry and Tracy..." (actually, I have to provide some context here -- Fr. Hesburgh was fantastic, and he recovered quite well from his little error)

• Faller (off camera), "Hey Jeff -- seafood!" (the camera caught this whole scene quite well. Had to back it up and watch it a few times)

• Dog: "We couldn't get that bastard Sepeta up here because he's hitting on their dates!" (pointing at Barker and Faller)

There was much Meghan in the video as well. It was funny, too, to notice that Patrick got just about all the face time in the ceremony, and Chris got just about all the face time during the reception.

Some other funny scenes as well; some classic dancing/reception footage. One that Tracy didn't even see right away (it's off to the side of the frame, and it happens very quickly) -- we had to back this up and watch it a few times. After the wedding party dance, I stole Diann away from Darrell, who is left standing on the dance floor, looking forlorn. Shipman notices this, runs over into Darrell's outstretched arms, and they start dancing. The look on Courtney's face and her resulting body language is absolutely classic. Renzo quickly steps in with Courtney, and the camera pans away. The whole thing takes about 3-4 seconds.

Gotta answer some LAM mail now, then go to sleep...

## September 5, 2000

### Miles of code before I sleep

I was updating my xmms RPMs today (for Mandrake), and noticed that they have an ogg vorbis xmms plugin RPM. I installed it and played some .ogg files with it. I was pleased to notice that my previous concern about the vorbis xmms plugin hogging the CPU while playing songs has been fixed (or they just compiled it better than I did); playing a 160+kbps ogg stream keeps the load hovering around 0.05 (i.e., comparable to .mp3). Very nice; perhaps this vorbis stuff has promise!

Spent much of the day working on pending LAM issues:

• Finally fixed the SCO user's problem. It turned out to be a race condition in the file descriptor passing code. Interesting that it never showed up on any other operating system; it may be a SCO-specific issue (the sender was sending three file descriptors and then closing the pipe; SCO apparently discards any unreceived messages when the sender closes, even if the receiver still has the pipe open). Who knows. Putting in a simple sender-waits-for-an-ACK scheme fixed the problem. It's interesting to note how hard the problem was to find, and how trivial it was to fix once the exact problem had been determined. It was really hard to find because my troubleshooting was limited to e-mail only; I do not have a SCO machine to test on, and the user's boss ixnay'ed the possibility of me getting a guest account to test with.

• Found a real race condition in the LAM code to launch executables on remote nodes (at lamboot time, not at mpirun time). It is possible for output from remote nodes to be dropped before mpirun has a chance to see it if rsh exits too quickly. It's not immediately clear to me how to fix this problem... It seems to have become evident for only a few LAM users, and only with the advent of faster processors and networks.

• Fixed a minor issue with the --with-rsh logic in configure.in that a helpful user pointed out.

• Added some much more user-friendly "there is no lamd running" messages (via the lam-helpfile) to all the LAM executables and to MPI_Init.

Released 6.3.3b32 with these changes. Pending issues:

• The race condition with rsh.

• The MPI 2 C++ bindings still seem to be broken under some conditions (e.g., when using --without-fc). @#$%#@$%#@$%#@!!!!!

• An IRIX user is complaining about some socket issue at mpirun time. I've pinged him to try the 6.3.3 beta, but I doubt that this will fix his problems. We'll have to see how this one pans out.

My 800MHz machine is fast (provided that it's only doing one thing at a time -- it is still an Intel box, after all...). Times expressed in min:sec:

| Task | 800MHz machine | Ultra 30 (athos) |
| --- | --- | --- |
| Run autoconf and friends for LAM/MPI | 0:07 | 0:23 |
| Run configure for LAM/MPI | 0:32 | 1:22 |
| Full build of LAM/MPI | 3:20 | 12:56 |

I did the build on athos, which is admittedly not the fastest machine (not only is it only 300MHz, it has limited memory; I should have used a hydra, I suppose, which would have been half the MHz of the Intel machine and had a lot more RAM). But the build was about 4x faster on the Intel box (again, with the big caveat that the machine is doing little else at the time). These figures certainly do inspire me to do some development locally rather than remotely at nd.edu. Happiness all around!

### We are pleased

I got KDE working on my new box! And there was much rejoicing.

## September 6, 2000

### Fun for the whole family

Here's a fun trick that anyone with an Intel-class machine, 600MHz or below, can try! Want to make your MP3s have that cool "Max Headroom" effect? While playing your MP3s, do some task that is even mildly CPU or I/O intensive. It's as simple as that! Wow -- listen to that distortion and "echo". And to think that real recording studios pay big money to make effects like this!

### T.P.R. Report / Initech

When I drove down here a few days ago, I noticed some water dripping behind my glove compartment. We didn't go out and have a good look at it until today. We picked up the floor mat (which was still good and wet), and it was soaked underneath, with a healthy chunk of mildew growing on the bottom carpet. Bonk.
I have an appointment on Friday morning to take the car in and have whatever it is that is broken fixed (I am a code wizard, not a car wizard).

Finally got my IO streams book from Amazon today -- I accidentally put the wrong apartment number on the "ship to" address, and UPS got really confused. I called yesterday and they re-shipped it to the right address (no charge, whoo hoo!).

Got the Office Space CD, too. Yummy (already ripped into MP3s, and I'm listening to them right now...).

ROMIO and MPICH released new versions today. Luckily, the new ROMIO is just about the same as the old one (configure/build-wise), so since I had the foresight to document what I did last time, I mainly followed the same steps and ROMIO seems to be integrated into LAM/MPI just fine.

\begin{bitch}

CVS third party importing sucks, for multiple reasons:

• It does not record which files have disappeared or moved from release to release. That is, the initial import is fine. But when you import a new version over the old one, you would think that it would just snapshot the new one and keep the old one as history -- i.e., files that existed in the first version but do not exist in the second version should not show up upon checkouts. Not so. For example, in the MPI 2 C++ bindings, we moved a bunch of header files from one directory to another. I did the 3rd party import in CVS of the new version, and then updated my local copy of LAM. Suddenly I had 2 copies of all the header files -- one in the old location, and one in the new location. Other than cvs remove'ing each old .h file, I didn't see any way to correct the situation. So I just blew away the old 3rd party imports (well, actually, I just moved them... never delete!!), and imported the C++ bindings as if it were their first import.

• If you third party import a distribution tarball that uses automake, plan to be hosed. It screws up all the timestamps such that it tries to invoke automake and friends when you ./configure/make it. And since it's a distribution tarball, you don't have things like acconfig.h, so autoheader will fail. And it goes downhill from there. The only solution that I found was to do a massive touch of all the files in the third party source directory tree such that every file in the tree has the same timestamp. Icky. Horrible. Shudder. But it works. But we shouldn't need something like this -- I'm open to better solutions (perhaps just including the tarball itself...? Hmmmm...!)

\end{bitch}

Did a bunch of LAM work today, but I might have just found a new issue under Solaris. It seems that mpirun is hanging. Ugh!!! Was it something that I did in the extra synchronization that I added for SCO?

Miles to code before I sleep...

## September 8, 2000

### If Madonna calls, I'm not here

Spent much of yesterday thinking about the generalized manager/worker problem, and spent most of today re-writing my paper about it (I had a good quote to Tracy today: "if I were a theoretician, the paper would be done". But I'm not, and there were still a number of non-trivial practical issues that concerned me. So I re-wrote it).

I solved the problem with the variable rate input/output in the input data unit queue -- a cool use of a condition variable and something I call a "reservation system" (which really amounts to two queues managed by a mutex and condition variable... which are more or less the same thing anyway ;-). You get all the benefits of variable rate I/O and blocking (instead of spinning).

Here's the setup: The input thread reads in chunks of input at a time and preprocesses them (working on the idea that preprocessing takes much less time than the "real" calculation). The preprocessed data units go into a data queue. The calculate entities (which can be local threads or remote threads that are serviced by MPI proxies) remove units from the input data unit queue (which will involve an MPI_SEND/MPI_RECV pair if the thread is remote from the manager) and process them.
This is the step that takes the most time -- the actual calculation. When each unit is finished processing, it is placed in the output data unit queue (which, again, will involve MPI_SEND/MPI_RECV if the thread is remote). The output thread removes the output data unit from the queue, postprocesses it (again, taking orders of magnitude less time than the actual calculation), and then writes it out to the output datafile. Believe it or not, this is not rocket science. Pretty standard stuff, actually.

Here's the problem: When we mix local and remote threads, we need to give different amounts of work to threads based upon their location -- this can help hide latency for remote threads, because we give them larger amounts of work. They request work less often, which directly translates to fewer messages, and therefore less latency (latency is bad, Bad, BAD!).

Since we're dealing with multiple threads that have to share the same data structures (i.e., the input data queue), that data structure must be locked in some fashion to prevent multiple threads from entering it simultaneously. That would also be BAD! So we lock it with a mutex -- again, pretty simple. The input thread locks the mutex and adds one or more input units. It will probably be more than one, actually, for efficiency. Just like sending a smaller number of large messages is more efficient than sending a larger number of small messages -- even if the total number of data bytes is the same in both cases -- it is more efficient to lock something fewer times for longer periods than many times, each for a short period. The total lock time is the same in both cases; it's the overhead time (i.e., the time necessary to lock and unlock) that is different. But that's not the whole story -- there's concurrency as well!

So it's not quite so simple; hopefully our worker threads will be busy enough with real calculations that their blocking time will be minimal in comparison. And it's even more complicated than that. In order to save latency again (provided that the input units are somewhat small -- and yes, this is problem-specific), we want to send only one message to each remote node, not one message per input unit. So we have to request M input units per CPU per node. Hence, the proxy MPI thread must determine at the beginning of time how many CPUs each worker node has (no problem -- MPI_GATHER into num_cpus[]). For each remote node, the MPI proxy requests M*num_cpus[i] units, packs them into a single message, and sends them off. A similar thing happens with the output data on the way back -- the MPI proxy on the worker node packs all the output data into a single message and sends it back to the manager MPI proxy.

Anyway, back to the problem. The input data unit queue is locked with a mutex. We want local threads to be able to retrieve X data units from the queue, but want remote threads to retrieve Y data units. This is for several reasons:

1. The latency reasons that were cited above; when Y > X, the remote threads do more work than the local threads to hide latency.

2. When X and Y are relatively prime, given that one input unit translates to some T amount of time of computation, it can help offset the synchronizations necessary between local and remote threads. That is, it adds "jitter" to the scheduling, such that local and remote threads will be synchronizing at different times, which can reduce contention by preventing bottlenecks.

The question is how to do this. My previous scheme used a semaphore and mutex such that any thread (either a local thread or an MPI proxy thread) could remove single units at a time. This is no good, because retrieving a series of individual units does not guarantee their contiguousness -- another thread may slip in and grab a unit from the middle of your stream. I needed a way to atomically get an arbitrary number of units from the queue, and to be able to do it with blocking, not spinning (spinning == eating CPU cycles, and therefore taking them from someone else; blocking == allowing the thread to be swapped out until some external event wakes it up).

So here's what happens: the input works as before, putting in Q input units at a time (where Q >= 1). It then broadcasts to the condition variable (RTFM Tannenbaum). The calculate and MPI proxy threads are a bit more complicated. Each one gets the mutex and puts (threadID, num_requested_units) at the tail of the reservation queue (this is a different queue than the input data unit queue, but it is protected by the same mutex). If it is actually at the head of the reservation queue, it checks to see if there are already enough units in the input data unit queue. If so, it takes them, removes itself from the reservation queue, and unlocks the mutex. If there aren't enough input data units, or this thread is not at the head of the reservation queue, it goes to sleep on the condition variable.

The input thread's broadcast to the condition variable wakes up all the calculate threads that are waiting on it (if any) and they all check to see if a) they are at the head of the reservation queue, and b) there are enough input data units to service their request. If a thread finds that both a) and b) are true, it services itself, removes itself from the reservation queue, and broadcasts to the condition variable again to wake up the next thread in line. And so on.
Of course, it's not quite as clean as this -- there are other sticky issues, like the input queue being drained (i.e., the input is exhausted) when there aren't enough units left to fulfill a thread's request, etc. So the extra logic gets a bit corn-fuzing.

Interestingly, the output thread protection is much simpler -- the output thread comes in and takes as many contiguous output units off the output queue as possible. It gets more complicated when we can have an arbitrary number of input and output files being processed simultaneously; dequeuing from the output queue becomes an absolute nightmare. Consider just one issue: if we're processing B input files into B output files, all B input files will be read fairly quickly. Their data will be processed in order, and the processing will take some time. But we want the output files to be ready upon output of the final output unit in each file -- i.e., we have to close() the file. Hence, since only the input thread knows when the input has been exhausted, it has to submit some kind of sentinel value into the input data unit queue for a given output file that filters all the way through the pipeline to the output thread, such that it says, "when you get all the output data units from this file, you can close it."

Pair that with the fact that the output data units will be coming in a random order from the calculation threads, since they may all be operating at different speeds (indeed, since we're using MPI, the remote worker nodes may not be the same kind of machine as the manager machine). So some kind of order has to be associated with the input/output data (i.e., sequence numbers). The output thread has to re-order the output units, postprocess them, and then write them out to disk. And know when to close() the file, since most POSIX systems work with write-on-close semantics. Whew!

The paper is up to 18 pages now. I'll probably spend more time re-writing the paper again. It seems that I get all the way through it and then realize one more critical threading/synchronization issue that throws the whole thing off, and makes me start the pipeline from scratch all the way back at the input stage. Still, it's all good, because it's way cool stuff. I've redesigned the pipeline from input to output 3-4 times now, each time better than before. :-)

I talked to Lummy about the paper today, and told him that I'd like to have something for him (and possibly Jeremy, Rich, and maybe even Jeremiah) to read in the next day or two.

### Hey Pac Man, what's up?

I noticed today that Mandrake is shipping Netscape 4.75 with full 128 bit encryption through its normal "update" channels. Pretty cool -- you no longer have to go through hoops and hurdles to get a fully-secured (hah!) Netscape with all the kewl plugins and whatnot (i.e., in RPM form).

## September 10, 2000

### Laser printer toner flavored ice cream

I learned an important fact about searching the internet today: When searching for movie soundtrack CDs, thou shalt not include the word "soundtrack" in the search query, lest one incur the wrath of the Search Gods and receive their angry reply, "No matches found." It seems that "soundtrack" is frequently a category, not part of the title. Whoops.

Network music search
Naively using "soundtrack"
No one hears my screams

Surfing for music
I dare to use the "s word"
Search God anger shows

A quest for soundtracks
Do not utter their true names
For fear of evil

Soundtrack soundtrack sound
Track soundtrack soundtrack soundtrack
Soundtrack soundtrack sound?

### Singing backup chicken

Ahh... the Nebraska game. It was an amazing game. I really did not expect that ND would play so well -- we were ranked 23/25 (depending on which poll you looked at), yet we stayed head to head with Nebraska (#1) for nearly the whole game. Our offense was a little off, but then again, Nebraska has a great offense.
We had 2 amazing runbacks (one from a punt, the other from a kickoff) for 14 of our 21 points. At the end of regulation play, we were tied at 21. We lost in overtime; we got a field goal, they got a touchdown (sadly, overtime has never been good to us). So we lost by 3 points. But it's a helluva lot better than the spread -- 13.5 points. It was a fantastic game.

Tracy, Jim, Anna, and I were watching it in a local Damon's (sports bar). When Nebraska finally won, a few Nebraska fans started cheering loudly. I turned to them and said, "You just beat #25." That shut them up immediately.

So even though we lost, I can only picture it as a win. They won by a fluke (and a really, really fast quarterback); it really could have gone either way (and yes, I would have been saying that if we had won, too). And they didn't play down to us -- we played right on par with what the news media calls the #1 team in the nation. We must go up in the polls for this (it doesn't look like they've been updated yet). It would be nice to see Nebraska go down, but I don't know if that will happen (FSU, #2, barely won against Georgia Tech -- who isn't even ranked -- yesterday; it looks like Georgia Tech had a pretty amazing game as well). Michigan (#3) had a pretty convincing win over Rice, so maybe...? Who knows. I've become convinced over the years that the two sports polls are based on a random function, anyway.

On a lighter side, Tracy, Jim, Anna, and I went to a restaurant (can't remember the name...) after seeing "The Cell" (which I give about 2 minutes; it was... ok, but not good or great). We caught the tail end of the University of Louisville vs. Grambling football game. I've never seen comedy in football before, but this was definitely it. UL won the game 52 to nothing, and the score said it all. The Grambling players really looked like they were trying hard, but their attempts were just comical. I can only imagine that they don't get a lot of funding, or perhaps their coaching and practices are terrible, or... I have no idea. But it was the funniest thing that I've seen in quite a while. UL just stomped all over them (and I'm not even a UL fan!).

## September 11, 2000

### Do elephants sweat?

Wooooo-eeeeee.... the paper is up to 25 pages now (and I haven't written the majority of section 7 yet!). I spent the entire day revising it. Properly designing a software system is a lot of work. But (like I've said countless times before), it's cool stuff. There are some really delicious issues and problems that you would never expect from a plain ol' manager/worker problem. I think I've got one more major revision before I unleash it on some others to read.

Had some more interaction with the guys who are having rsh/lamboot issues. Seems like rsh is not the problem after all. It may be faulty handling of stderr/stdout processing. The guy was running some simulations on his cluster; he said that he would try some new code of mine when that finished. We'll see if we can finally solve this problem.

Blockbuster sucks. They sent a threatening letter to me at my parents' house in Philadelphia claiming that I had not returned the Fight Club DVD to the Berkeley Blockbuster store for almost two weeks. They happened to mention that the matter had already been turned over to a collection agency. Great. I checked with Lummy, and he definitely remembers returning it (we rented 3 DVDs; Fight Club had to be back in 1.5 days, the others were 5 day rentals). We returned Fight Club before it was due, and watched the other 2 later. I was with Lummy on one of the "return to Blockbuster" trips, but not the other, and I couldn't remember which was which. Anyway, I called the Berkeley store and told them that I was absolutely positive that I had returned the DVD. The guy looked it up in the computer and said, "Oh yeah... we found it on the shelf later." Over 2 weeks later, apparently!!
So I was about to be fined and have a big bad black spot put on my credit record because of some clerk kid's stupid mistake in Berkeley. Blockbuster was about to fine me without even checking with me (the letter claimed that Blockbuster tried to call and snail mail me, but I never got any messages or snail mail). What the heck is that all about? And then they sent the final notice to somewhere that I haven't lived for well over a decade. The whole thing kinda pisses me off. I don't know how excited I'll be to go back to a Blockbuster.

Ahhh.... screw it. Miles to code before I sleep. Mmmmm... code..... mmm.....

## September 12, 2000

### The cockpit? What is it?

I received 2 packages today -- how exciting! God, Internet shopping is great.

• The first package was from Amazon, and it contained all the CDs that I ordered (I finally have all the CDs for the MP3s that I own -- some of which I have been looking for for quite some time. See yesterday's journal entry about the word "soundtrack" in internet music search engines... grr...): the MI-2 soundtrack, Chemical Brothers/Surrender, the Groove soundtrack, the Go soundtrack, the Fight Club soundtrack, and Various Artists/For the Masses. The ones that weren't already ripped are finishing MP3 encoding right now... I'm listening to the Groove soundtrack. Sounds like hip stuff. Nothing earth shattering so far, but it's good coding background music. It's really heavy on the bass (even on my mondo sub-woofer's minimum setting!), so I can't turn it up very much because I live on the second floor of an apartment building. Since I like to have semi-loud music on while I'm coding/working, does this justify my saying "I need a house to support my coding style"?

• The second package was the book Advanced Programming in the Unix Environment by W. Richard Stevens. It came highly recommended by fellow Llama Nick. This book has everything -- would that I had known of its existence before! It could have saved me much exploration and experimentation with pseudo-ttys, various IPC mechanisms, passing file descriptors, random issues with SIGABRT, and other sundry bits of Unix system-level things. <sigh> I was glad to see that I had gotten 5 of the 6 guidelines for daemon processes right in Minime, though (I didn't set minimed's umask to 0 -- oops. I was very careful about every file that it opened, but setting the umask would be better). I can't remember where I ordered this book from; I found it on www.bestbookbuys.com. I highly recommend this URL for anyone who is buying books off the web -- it saved me somewhere between $10 and $20 on this book.

Speaking of handy URLs, someone pointed out http://www.amazing-bargains.com/ to me the other day, particularly their section about buy.com. They always list some good deals for buy.com, like coupons for "$10 off any order of $50 or more" and whatnot. I wish that I had known about that a month or two ago -- I bought a PCMCIA network card from them. Ah well -- next time.

Still working on the paper. The text of the first half still heavily reflects the fact that I originally wrote it as a list of bullets, and is requiring much re-writing. The second half was mostly ok 'cause I had already re-written much of it. :-)

### Happy, happy, joy, joy...

I think we finally fixed the race condition in booting LAM. Many thanks to some helpful LAM users and their patience for helping slog through this obscure issue. We've got a few more tests to run to ensure that it's done, and I sent the new code out to the Debian user who initially reported the bug, but I think that I finally understand what the problem was, and how I fixed it.

I found a new <blockquote> attribute the other day -- type=cite -- that looks really cool in netscape (be sure to check this journal entry out on the web). Doesn't appear to do much over normal <blockquote> in lynx. I wonder what it will do in pine.

Blockquote type eek cite
How I love thee, let me count
the ways, 1, 2, 3...

Rich Murphy points out that if I had listened to his wisdom, I would have owned Advanced Programming in the Unix Environment long ago. Oops.

Rich Murphy, wise man
Woe is he who ignores Right
Yea, a life of pain

I like strawberries.

Strawberry red car
My sugar momma wants one

The new Chemical Brothers CD that I bought, Surrender, simply rocks. I highly recommend it to others.

Chemical Brothers
Their rocktitude humbles me
True block rockin' beats

### Lights! Camera! Act... shit. Call makeup.

I really can't type. I'd like to correct some typos in the last journal entry (the humor value was probably lost because of the mistakes. Sigh)...

Rich Murphy, wise man
Woe is he who ignores Rich
Yea, a life of pain

This was a minor mistake, but the English major in me cringed when I saw it (what does it say that an English degree provides an inner sense of Badness about a Japanese form of poetry?):

Strawberry red car
My sugar momma likes them

## September 14, 2000

### Mysteries of the milkshake

Exciting changes today...

Darrell called me with the joyous news that PacBell finally hooked up his DSL today (it only took 2 months. The most comical part of the saga was that, after 1.5 months and 2 house calls from PacBell technicians, Darrell got a call saying, "We finally figured out what the problem with your DSL line is. Your local Bell office doesn't support DSL.")

I spent about an hour or two with Darrell setting up our DNS servers. Darrell already had experience with this, so most of the pain and learning curve was avoided. Seemed pretty straightforward afterwards, but took a little understanding to get there. So Darrell and I are now secondary DNS servers for each other (kresge.com and squyres.com). We did some testing and it all seems to be working. Pretty cool stuff.
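For the curious, a secondary zone in BIND's named.conf really is only a few lines -- something like the following, where the IP address and zone file name are made up for illustration:

```
// in named.conf: be someone else's backup DNS server
zone "kresge.com" {
    type slave;
    masters { 10.0.0.1; };   // the primary server's address (made-up IP)
    file "db.kresge.com";    // local cached copy of the zone data
};
```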

Darrell's with NSI (the evil empire), and he submitted his DNS change to them earlier today. They supposedly updated at 5pm EST, but as of now (12:23am EST the following day), nd.edu machines still don't see the change.

I'm with register.com, and it took a little explaining to convey to them exactly what I wanted to do (had to do it on the phone). Turned out that it was their silly web interface that confused me, and we submitted my DNS change as well. They supposedly update tomorrow morning. Indeed, nd.edu machines don't see the change yet, but when I'm on my machines, "whois squyres.com" shows all the new stuff. Cool!

I've already added a few names to squyres.com --
introducing the new, improved JeffJournal! When the DNS change propagates out to the world, the JeffJournal archives will be located at the following URL:

http://jeff.squyres.com/journal/

If that isn't vain, I don't know what is. But hey, I only do it... because I can.

Had to do some screwing with my Apache settings to get the virtual hosting stuff working with www.squyres.com, wedding.squyres.com, and www.fhffl.com. Learned some things about how to get Apache really confused today. Could be useful someday.
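Name-based virtual hosting in Apache 1.3 boils down to a NameVirtualHost directive plus one <VirtualHost> block per site -- roughly like this (the IP address and paths are made up; see the Apache docs for the real details):

```
# httpd.conf -- one IP address, several web sites
NameVirtualHost 10.0.0.2

<VirtualHost 10.0.0.2>
    ServerName www.squyres.com
    DocumentRoot /home/httpd/squyres
</VirtualHost>

<VirtualHost 10.0.0.2>
    ServerName www.fhffl.com
    DocumentRoot /home/httpd/fhffl
</VirtualHost>
```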

Arf -- just got a bounce from nd.edu for an automated message sent from wedding.squyres.com. It seems that I had router.squyres.com as the first entry for that machine in /etc/hosts, and that name doesn't exist in DNS. Oops. Fixed.

On other fronts, I continued to be distracted by figuring out what the numbered ports that showed up in netstat -a were on my 2 machines. Turns out that most of them had to do with NFS (which I used between my router and my desktop so that I can serve my MP3s from the big disk on my router to the xmms on my desktop).

I got further inspired to ditch NFS because I thought of a truly cool way to serve up my MP3s without NFS -- using http and the streaming capabilities of xmms (I already have a web server running, so...). I wrote up a minimalistic PHP script that allows me to navigate the directories and files in my MP3 directory tree. Clicking any of them invokes a PHP thingy to generate an .m3u MP3 playlist file on the fly, and send it to xmms. With the directory-browsing aspects of the scripty-foo, I can queue up multiple levels of MP3s:

• My entire MP3 tree (and click the "random" button on xmms for [probably] weeks of no-repeat play!)
• The directory for any artist (which contains all their CDs)
• The directory for any CD
• An individual file

Actually, I could have just said "I can enqueue any tree of files, to include the special case of a tree of one file."
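The playlist trick is simple enough to sketch: a plain .m3u file is just a list of file names or URLs, one per line. Something like this (Python instead of PHP, and the paths are hypothetical) captures the logic:

```python
import os

def m3u_lines(root):
    """Return the lines of an .m3u playlist for `root`: a single file
    yields a one-song playlist, a directory enqueues its whole
    subtree of MP3s."""
    if os.path.isfile(root):
        return [root]
    lines = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # keep artists/CDs/tracks in a stable order
        for name in sorted(filenames):
            if name.endswith(".mp3"):
                lines.append(os.path.join(dirpath, name))
    return lines
```

The script then just prints those lines (with an audio playlist content type) so that netscape hands the result off to xmms.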

It was surprisingly easy. It's truly cool. I may someday be inspired to make it a bit more aesthetic and have more options... but why?

xmms stops just short of offering a full set of remote controls from the command line (I had to add an appropriate application handler for .m3u in netscape to call xmms), but I guess it's sufficient.

Ok, back to work now... the paper is really almost finished. I was halfway through the last code review when Darrell called me today...

(BTW, the jeffjournal client is fantastic -- it just informed me that I left a <CENTER> unclosed from line 37 before I mistakenly submitted it, causing all kinds of formatting madness, and potentially threatening the world's existence. We are pleased.)

## September 15, 2000

### How to Succeed In Coding Without Really Trying

nd.edu finally joins the rest of the masses in recognizing my new DNS server. Welcome to the new and improved JeffJournal! For all of you out there who bookmarked the JeffJournal in your web browser, it has now moved:

http://jeff.squyres.com/journal/

And remember... I only do it because I can.

Had to re-rip some CDs 'cause their MP3s seemed to be a bit skewed. Sometimes they cut off right in the middle of a song or something like that. I attribute this to when I was ripping CDs on my laptop, which has limited disk space. Turns out that when grip runs out of disk space, it just merrily stops the current song and goes on to the next with no warning or indication. Hence, I believe that some percentage of my MP3s are flawed, so I think I'll have to re-rip some of them over the next few months.

I finally finished a first copy of the manager/worker paper yesterday. There really are some delicious complications in the whole aspect of Things that make it fun. I even wrote the whole paper without writing a single line of code -- it's 100% pure design. There's a good chance that I'll use that paper as a guideline to write a parallel vorbis encoder. Gotta practice what I preach, after all. And it can only make the paper better.

I missed an MPI talk at ND yesterday. Bonk. It sounded like it would have been interesting. :-(

Tracy and I won't be going up to the Purdue game this weekend; her travel schedule was too much this week. Oh well. :-\ Hopefully, the boys will rally with the loss of Arnaz and Irons and the Irish will still prevail.

I'm noticing that my bandwidth between my desktop and my router is really crappy -- I'm just copying over the MP3s that I ripped on my desktop and only getting anywhere between 47 and 69 kB/s. Ick. I see the collision light coming on on my hub a lot; seems like there may be too much binary exponential backoff going on. Might be time to invest $50 in a switch...

Spent some time on LAM yesterday. I noticed an annoying security issue, and spent some time hacking around in the lamd and the rest of the user-level LAM libraries ensuring that all internal files that LAM uses are opened with "other" and "group" permissions zeroed out. And then it turns out that Solaris doesn't like to abide by the umask when it opens named sockets. Ugh. So I had to go the ssh route and move all the LAM sockets and temporary internal files into their own directory (which does abide by the umask) to guarantee security. Ugh.

That's all for now; more news from Washington as our reporters check in.

## September 16, 2000

### Do you Yahoo?

A good day. We beat Purdue with a last second field goal to make the score 23-21 in favor of the Good Guys.

We watched the game at the local BW-3's, and met some subway alums there. I guess I haven't really watched too many games away from South Bend (where most everyone is an ND fan), and I haven't really met/talked to too many subway alums. They're interesting folk -- no ties to ND, but completely rabid about ND and its football program. The people that we met were really nice and we had a good time with them. I'm sure that we'll see those folks again, as well as other subway alums here in Louisville (the NBC affiliate down here broadcasts SEC games, not ND games, hence we have to go to sports bars to see the game). There were some Purdue folks in the bar, too, and they were dumbfounded when the field goal actually went in (to be fair, we were too :-).
By the numbers, we probably should have lost that game -- I don't know for a fact, but I'd be willing to bet that Purdue has us beat in just about every stat. Our guys played well, but we lost two key players (our QB on offense, and a starter on defense), so both squads were critically short. The new QB stepped up pretty well, but it was his first college game and he made a few mistakes. Still, he did pretty well and I certainly don't fault him for anything. At the end of the day, he delivered, and we won the game. He's got lots of time to improve, and I'm certainly pleased with what he did today. Good job, Greg.

Looks like the students were pretty pleased at the end of the game; they were all over the field in and around the players. Rock on.

So we'll see what happens in the polls tomorrow. Purdue was 11 or 12 or something, and we were 21 or something, and I think we'll both be 2-1. We'll see.

We went to dinner with Janna (Jim+Anna) again, which was fun. New microbrew here in town. Not bad beer, but a little too sweet for me. Good conversation, and much fun was had. Janna has a satellite dish, and next week's game is on PPV, so we'll be heading over to their place to watch it. Hmm... actually, checking the network schedules, it looks like it's on ABC. That would make it a bit more convenient...

I finished my paper the other day (I think that I mentioned this in a previous journal entry), and posted it to the vorbis-dev list yesterday, just for the heck of it. Finally got a response from someone today who said that it was good stuff. Good to hear, but they didn't have any ideas, suggestions, or comments. Oh well.

Since my computer has been idle most of the day, I started running the distributed.net stuff. It appears that they're focusing on the OGR project. I don't really know what it is, but judging from the stats graph, it appears that most of the keyspace has already been exhausted. It's really slow.
Since I started the client last night around 11:30pm on my 800MHz machine, it's only done about 4.3 OGR packets. Wow.

I haven't been running bind for 72 hours yet, and they just released a new version. Apparently bind 9.0.0 has been released. I'm a lazy bastard -- I'll wait for the Mandrake RPM. :-)

## September 17, 2000

### Your spleen and you: do you have a good relationship?

Not much to report today.

Spent a little time upgrading my PGP tie-ins to pine, so that it actually does things correctly (been meaning to do this for quite a while, actually). It will now decrypt multi-part messages, messages that are signed, and messages that have additional content besides just encryption. Happiness.

Did some more organizing of my finances and finally got my credit card statement to balance with what is on my bank's web page. Woo hoo!

Signed up for a better AT&T plan today. The service is exactly the same; it's just an arbitrarily complicated pricing scheme to make plans seem different. It's amusing, though: 4 of AT&T's big plans (and don't consider these descriptions legally binding -- go to AT&T's web site for full descriptions) are:

• $0.10/minute, any time of day. This is apparently the plan we were on.

• $0.05/minute from 7pm-7am, $0.09/minute from 7am-7pm ($5/month minimum).

• $0.05/minute, any time of day, with a $7.95/month additional charge.

• $0.07/minute, any time of day, with a $4.95/month additional charge.

The interesting thing is that AT&T marketing makes it sound like they have actually calculated the mathematical breakeven point for each plan. For example, one says, "You should use this plan if you are spending over $x.xx a month," or, "if you are spending over $y.yy, you should use this plan..." But here's the kicker (as I'm sure all good, thinking people out there noticed): spending $x.xx on which plan?!? I hate marketing dweebs. Do people actually fall for this stuff?

Anyway, we did the math (i.e., compared the plans over our last 3 phone bills), and signed up for the $0.05/minute any-time plan. Indeed, 2 months ago, this would have saved close to $40 on our bill. Yikes! (Granted, there were some pretty long wedding planning phone calls, but still...)
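The arithmetic is trivial to reproduce. A sketch (my paraphrase of the plans above, not AT&T's actual terms):

```python
def monthly_costs(day_minutes, night_minutes):
    """Monthly cost of each of the four plans described above, given
    minutes of daytime (7am-7pm) and nighttime calling."""
    total = day_minutes + night_minutes
    return {
        "$0.10 flat":            0.10 * total,
        "$0.05 night/$0.09 day": max(5.00, 0.05 * night_minutes
                                           + 0.09 * day_minutes),
        "$0.05 flat + $7.95":    0.05 * total + 7.95,
        "$0.07 flat + $4.95":    0.07 * total + 4.95,
    }

def best_plan(day_minutes, night_minutes):
    """Name of the cheapest plan for this month's usage."""
    costs = monthly_costs(day_minutes, night_minutes)
    return min(costs, key=costs.get)
```

For a heavy month (say 400 total minutes), the $0.05 flat plan wins despite the monthly fee; for a very light month, the plain $0.10 plan does.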

## September 18, 2000

### The Art of Barbering

When was the last time you were in a barber shop?

I just got a haircut today in a local Louisville barber shop. I have a long-standing theory that you can tell a lot about a town from their barber shops.

Barber shops are a mostly male-oriented club. True, you'll see mothers in barber shops to bring their sons in for haircuts, and you'll even see the not-too-uncommon female barber (indeed, the barber shop where I went in South Bend had one male barber -- the owner -- and two female barbers). I guess it would be more correct to say that the clientele is almost entirely male.

Humorous anecdote: I went to my typical barber in South Bend a few days before my wedding to get a trim. The woman asked me if I wanted my normal military high-and-tight cut. I told her no, I was getting married in a few days and my bride-to-be told me that she wanted "some hair on my head lest flashbulbs reflect off my head and ruin all the pictures." An older guy was getting his haircut down the row from me. In a low, gravelly voice, he said, "You're getting married? Come over here, boy, we gotta talk."

The conversations that flow around barber shops tend to reflect the popular attitudes of the area. Here in Kentucky, I hear about tobacco crops (they actually have pro-tobacco ads on TV here), the military, and University of Louisville and University of Kentucky football.

Frank's barber shop on the campus of Notre Dame is filled with ND memorabilia. Frank loves to hear about student perceptions on campus, football, the band, ROTC, or any other ND-related or military-related topic (he was in the military himself, in younger years).

At the Ft. Knox barber shops, the talk is actually fairly sparse. There's some chatter, but mostly people are there because they have to be there (regulation haircuts and all); it's part of the job. But there are some retired folk who still come on base for haircuts and to gossip with the barbers and soldiers.

The barber shops that I used to visit back outside of Philadelphia are much the same. Typically 40- to 60-year-old male barbers who have the look and feel of someone who has seen and done everything, with the ability to strike up a conversation about any random topic. Sports are common; the military is another. Politics, of course (especially with this being an election year), is a big topic as well.

My conclusion is that the barber shop is a social island in the midst of hustling and bustling metropolises. The pace tends to be a little bit slower there than the rest of life. Granted, South Bend and Louisville aren't huge cities, and neither are the suburbs outside Philadelphia where I would get my hair cut. There's typically some kind of talk going on about something, and -- especially in a small barber shop -- the barber knows many of the patrons by name and how they usually want their hair cut.

Indeed, I've asked most of my barbers why they chose to cut hair for a living. Most of them laugh and make some kind of remark about the never-ending demand (how often have you walked into a barber shop and been seated immediately?), but then they have all said that it's for the people. Many had careers before becoming barbers, but left them for one reason or another and became barbers because of the wide variety of people that they would meet. Hence, they're using barbering as a vehicle -- it's not for love of cutting hair, for example -- to see a sample of the world that we live in. The local barber probably has a pretty good feel for the community around him/her -- probably more so than most. Indeed, the Art of Barbering (as I call it) seems to have little to do with cutting hair. It seems to be very similar to salesmanship, or bartending. Some people are good at it -- naturally easy to talk to, good listeners (yet still expressing their opinions in order to keep the conversation going), etc., etc.

This is hardly a startling conclusion by any stretch of the imagination. But I sometimes wonder what a long-term study of barber shops, their clientele, and the conversations that occur there would show. Who knows -- it might even be worth some kind of degree in Sociology or something. :-) But the barber shop is something that many of us take for granted and rarely notice. It's just something that you have to do once a month or so.

There was no point to any of the above. I'm just pointing out something that most of us take for granted, and that we rarely notice. No real reason.

...but at the same time, has anyone else ever noticed this?

### The Art of Barbering Too

Followups for the Art of Barbering. Any other comments are welcome:

From Rich:

Absolutely true! In San Diego (I believe America's 6th largest city), the barber shops are remarkably similar to South Bend, or anywhere else I've been. (Ask Jason about Vitos... the cops... etc.)

There's just something about going to a place where they do your sideburns and the back of your neck with hot shaving cream and a straight-edged razor. (To me, there's something particularly Arun-esque about this line of conversation.)

From Arun:

Interesting comments. I hadn't really thought about it, but thinking back, it must be quite interesting. I imagine the barber shops/beauty salons of Las Vegas hotels must be especially interesting. I got my hair cut at one, and in the short time I was there, there were 3 wedding parties passing through in one stage or another.

This raises an interesting point -- are there [at least] two fundamental kinds of barbers? Those who have a handle on the local community, and those whose community is mainly composed of transients (e.g., tourists)? And of the second type (I have to admit, I don't think that I've met any of that type):

• Why did they get into barbering? The same reasons?

• What do they yield from the Art of Barbering? It certainly isn't a feel for the local community -- there isn't one. What do they get a feel for? What are the conversations in their shops like?

And in this case, I suppose the Art of Barbering can be abstracted to a higher level, such as those who primarily interact with tourists (but then again, Vegas is truly unique!). For example, what are the differences between clientele of the T.G.I. Friday's in South Bend vs. the clientele of the T.G.I. Friday's in Vegas?

Again, this has no point. Just idle wonderings of someone waiting for X latency between squyres.com and nd.edu...

## September 19, 2000

### Goulash or spackle: you decide

My car looks fantastic!

I had to take it in to be detailed to get rid of the mildew smell from when my AC self-imploded (read: the output valves got clogged and all the water ran off into my front passenger footwell. Eeewww!!). I took the car in this morning, and when I went to pick it up, I was amazed: the car looks 5 years younger. They vacuumed and shampooed everything, and used the make-the-plastic-look-new stuff. They buffed and shined, and gave my car a complete exterior car wash. It looks amazing.

I could see people that I drove by gaping at my car, then touching their nose, pointing to my car and saying to their neighbor, "You see? That's what a 1993 Honda Civic is supposed to look like."

Spent too much time on LAM/MPI today. But I resolved some important bugs:

• We finally got confirmation that we fixed the lamboot race condition. Hurray for the good guys!

• I found a bug in the lamd today such that any new process that it forked (e.g., via mpirun) would inherit all the file descriptors of the named unix socket client connections that the lamd had open. Oops. The spawn code now closes everything except stdin, stdout, and stderr (which it replaces with whatever mpirun/lamexec gives it, anyway).

• I made the show_help() function a bit more robust in that it will try harder (and smarter) to find the helpfile. It will even display a specific error if it finds the helpfile but can't open it (e.g., if the process is out of file descriptors). Indeed, we now save errno properly so that when we use the %perror or %errno tokens in the LAM helpfile, it will display the correct errno, not just the last one.
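The fix for the descriptor-inheritance bug amounts to this (sketched in Python using Linux's /proc; the real code is in the lamd's spawn path, and the names here are my own):

```python
import os

def close_inherited_fds(keep=(0, 1, 2)):
    """Close every descriptor a freshly fork()ed child inherited
    except those in `keep`, so it doesn't hold the parent's client
    sockets open.  /proc/self/fd lists the open descriptors on
    Linux."""
    for name in os.listdir("/proc/self/fd"):
        fd = int(name)
        if fd in keep:
            continue
        try:
            os.close(fd)
        except OSError:
            pass  # e.g., the descriptor listdir itself was using
```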

We still may be having issues with really large numbers of nodes, though. Theoretically, we should be able to go up to 1024 - 4 = 1020 ranks (subtracting stdin, stdout, stderr, and a socket to the local lamd), since 1024 descriptors is all that the fd_set type used with select(2) can handle, but we seem to be falling way short of that for some reason. There's a user in Germany who is trying to use LAM with 528 nodes (he was thrilled when I gave him a copy of the 6.3.3 beta with lamhalt in it -- he says that a lamboot can take up to 10 minutes!). I am still investigating this.

An engineer from GE Aircraft Engines mailed me today, concerned about the [accidental] inclusion of the GNU license in LAM 6.3.2, because they want to use LAM internally. I told him that all was well -- its inclusion was accidental and I would never cut off my shuga-momma's company like that.

Other random acts of goodness:

• Hooked Janna up with John's extra ND/Stanford tickets.

• Saw a neat article today (from dad) about how Scott Malpass has really, really grown the ND endowment since he started managing it. Did you know that ND was one of the initial investors of Yahoo!?

• Got into an interesting discussion with Arun and Rich yesterday about barber shops when Rich said something about "Arun-esque". This triggered a long-forgotten memory about the word "Arunesque", which I shared with them. Long story short: "Arunesque" means "to celebrate", or "to perform a ceremony for".

• Since they don't seem to broadcast News Radio down here, I have had to replace it with something else. The Drew Carey show seems to do nicely. I've always liked Drew Carey, and his shows are pretty funny. I highly recommend them to anyone who hasn't seen them -- I'd rate most of them at 17.5 minutes.

• I took the most recent copy of LAM's inetexec.c (the code that uses rsh to spawn things on remote machines), C++-ized it, and started working on it to do tree-based boots, and to allow nodes to fail during the boot. I stole a bunch of minime code to do this as well -- the result will get merged back into minime before it gets merged back into LAM -- because I wanted to do it in a small system first. Minime isn't large, but it sure isn't small (12,000+ lines of C++ code).

• Tracy's music group at church had a little "congrats" reception for us last night. Free food and wine, plus they gave us a bunch of gift certificates. I love all the free stuff that you receive when you get married; I should do it more often. No, wait...

Miles to code before I sleep...

(I've pointed this out before, but I just love jjc. It pointed out 3 places where I didn't close my HTML tags properly, and let me go back and edit it before I submitted. With all the <code> tags that I used in this entry [which pine does not show, sadly -- see the web page], I accidentally repeated <code> instead of the proper closing tag a few times. Happy, happy, joy, joy...)

## September 20, 2000

### Life is like a box of vanillas

Today's /bin/fortune wisdom:

If all the Chinese simultaneously jumped into the Pacific off a 10 foot platform erected 10 feet off their coast, it would cause a tidal wave that would destroy everything in this country west of Nebraska.

Where does potpourri go when it dies?

Having a /proc filesystem when you're tracking down a file descriptor leak is really helpful.

10/100 switch
I ordered from buy.com
Anxiously await

How do you normalize the average number of cars per hour (at any given point on a random freeway) in terms of kumquats?

GNU Mailman rocks.

Apparently, Industry Day at ND was a success today. Lots of new toys in the LSC.

We will now go back in time a year or two, and you find me at someone else's house, working on his phone line...

Me: (with phone line in mouth for safe storage... still plugged in) whoopty whoo... i'll have this done in a jiffy!

Him: Are you sure you should have that in your mouth?

Me: Sure.... I'm being 'careful'.

Me: <pauses and contorts my body> F*CK!

Him: What?

Why is the above funny? Because I've done the same damn thing!! It's good to know that I'm not alone.

## September 21, 2000

### El Blockbuster sucketh

The saga continues.

Blockbuster sucks.

How much do they suck? Let me count the ways...

My dad mailed me today that I got a nasty letter from a collection agency demanding the return of the Fight Club DVD to the Berkeley Blockbuster. This is after I got a threatening letter from Blockbuster a while ago saying "return the Fight Club DVD or else". I had already called them and got it straightened out (I did return it on time -- they lost it... and later found it). See previous journal entries for the story so far.

So anyway, this collection agency is threatening to screw with my credit for some mistake that I had nothing to do with. I had to call the Berkeley Blockbuster store again to figure out what was wrong. The manager pulled up my account and said, "I see we cleared you on the Fight Club problem, but I see a late charge on Hot Boys..."

WHAT?!?!

I've never even heard of such a movie, nor does it sound like I would want to see it. Ever. I conveyed this to the manager and he sounded very skeptical.

"No -- I have it right here in my wallet".

Puzzled silence from California.

"Oh wait... I'm looking at someone else's account; they rented Fight Club as well. How do you spell your name again?"

<sigh>

So he finally pulls up my account. "Oops... looks like we marked you as credited here in the store, but no one notified the collection agency..."

Yeah, no kidding.

Thanks Blockbuster. You suck. Let's hope you get it right this time.

### Does anyone know what time it is?

I feel the need to announce this amazing fact...

My friend Darrell just bought himself a 1PPS GPS device so that he can run a stratum-1 time server for his home DSL network.

"When you really, really need to know what time it is..."

## September 23, 2000

### Bring it on!!

The threaded version of the booter, indeed, seems to improve performance. Again, these are not on unloaded machines, so we can't say for 100% sure, but it certainly seems like it (I know pine will display this table badly; deal):

| Number of nodes | 2-way  | 3-way  | 4-way  | 5-way  |
|-----------------|--------|--------|--------|--------|
| 32              | 0:37.5 | 0:29.1 | 0:28.3 | 0:21.2 |
| 147             | 1:01   | 0:48.5 | 0:55.4 | 0:43.4 |

(same conditions as before, AFS-cached, etc., etc.)

We have weirdness with the trinary and quad trees in the 147 again. :-( I'm still guessing that there are some strategically "bad" (i.e., heavily loaded) machines in the mix that are causing this. Indeed, it seems to "hang" on the last few nodes on the 4-way in the 147 tests. But again, the only real way to test this would be with a large number of unloaded nodes. :-\

### Flying monkeys

Archiving some more test results...

Per Lummy's suggestion, I have compared lamboot vs. a serial ring-like boot of several different sizes to compare the two different topologies. My hypothesis was that they would be roughly equivalent -- the rsh latency would dominate any bookkeeping and efficiency of the two codes.

I used the threaded scaleboot version -- not that it mattered, 'cause there would only be one thread/child anyway. Here are the results:

| Program   | 8 nodes | 32 nodes | 147 nodes |
|-----------|---------|----------|-----------|
| lamboot   | 0:23.1  | 3:18     | 15:xx     |
| ring boot | 0:22.6  | 3:15     | 15:06     |

I unfortunately forgot to run /bin/time on the biggest lamboot, so I could only go off the timestamps from my unix prompt. Doh...

Also, with all this big testing with lamboot, I am soooo glad that I wrote lamhalt (to replace wipe) -- it takes down a running LAM by simply sending messages to all the lamds, as opposed to doing a whole new set of rsh's to each machine to kill the daemons.

As Arun says, "'wipe' sounds silly and doesn't have the syllable 'lam' in it." lamhalt rocks.

### I'm wearing velvet pants

Got my 10/100 switch yesterday. Woo hoo! Finally plugged it in this morning and got it operating (plug-n-play -- hah!); no more ethernet collisions -- yeah! All 3 of my linux boxen immediately switched to 100Mbps/full duplex, but the windoze box stayed at 10Mbps/half duplex. Figures.

Yeah, ok, I'm still behind a 1.5Mbps DSL line. So what. (actually, I'm streaming MP3s around here behind my firewall, so ethernet collisions were becoming a bit of a problem in terms of performance)

I got version one of my scalable booting working. It does an n-ary tree-based boot across a group of machines. Seems to work pretty well, but is not 100% bug-free yet. That is, it still hangs sometimes -- I think this happens when an rsh to a remote node fails. Here are some preliminary results (times are min:sec):

| Number of nodes | lamboot | Binary tree | Trinary tree | Quad tree |
|-----------------|---------|-------------|--------------|-----------|
| 16              | 1:02    | 0:12.8      | 0:09.8       | 0:06.4    |
| 32              | 3:07    | 0:46.5      | 0:29.3       | 0:27.5    |
| 147             | 14:06   | 1:28        | 1:00         | 1:07      |

Pretty good looking so far. Some notes...

• All results were with the binary already AFS cached.

• The 16 node tests were conducted on unloaded machines. The 32 and 147 node tests contained nodes that were in use, some of which were heavily loaded (shh!). So these numbers are not perfect. But they are a good ballpark.

• The difference between 3 and 4 children is sometimes small. This can make sense -- consider the 32 node case. With 3 children each, the farthest leaf from the root will be 3 hops. With 4 children each, it is the same. Hence, with each of 3 and 4 children, we still have the same number of "timesteps".

• Also, the algorithm is sub-optimal, particularly where there are heavily loaded hosts. I believe that this explains why 3 and 4 children on the 147-node test results seem weird (it seems that some of the key parent nodes in the 4-way tests were heavily loaded -- I checked). This is not conclusive proof -- I would need a large number of unloaded machines to be able to test this theory. :-( See below.
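The "same number of timesteps" observation is easy to check mechanically. A tiny sketch (my own formulation, not LAM code): level d of a k-ary tree holds k^d nodes, so just fill levels until all the nodes fit.

```python
def boot_depth(nodes, children):
    """Hops from the root to the farthest node when each node boots
    `children` children: level d of the tree holds children**d nodes."""
    depth, have, width = 0, 1, 1
    while have < nodes:
        width *= children
        have += width
        depth += 1
    return depth
```

boot_depth(32, 3) and boot_depth(32, 4) both come out to 3, which is why the 3-way and 4-way times can be so close on 32 nodes.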

I had a brief discussion last night with Lummy about this. I presented some timings of lamboot vs. the tree boots. He wants me to run a ring boot as well, and compare. I initially didn't see why he wanted me to do this -- indeed, I thought that it would be the same as lamboot (and I'm still not convinced that it's not the same -- the majority of time is dominated by rsh latency), but he made the good point that I don't have any numbers to back this theory up. As such, I don't know for a fact that they're not different. And I do agree, they are different topologies, so there could be a difference. They're different code bases, too, so subtle differences could mean a lot (although the scaleboot stuff derived from the inetexec.c that is central to LAM's lamboot). I'll code up the ring and see what happens...

The current implementation essentially works like this:

1. Invoke the program with the -master switch and provide a hostfile.

2. The program figures out that it is the master, so it a) decides that it has no parent, and b) reads in the hostfile.

3. Switching into "parent" mode, it does what I call "multi-rsh" for its number of children (default is 2, but can be overridden on the command line). i.e., it fork/exec's rsh's into the background to the children's hosts. This is more complicated than it sounds...

• The multi-rsh routine is given a list of username/hostname pairs, and a list of argv to execute on each.

• First, you have to send a "echo $SHELL" command to the remote host to see what the user's shell is. • When that comes back, if they are running Bourne shell (and you'd be surprised at how many people do...), the Real argv (denoted by food) has to be surrounded with "( . ./profile foo )" so that their .profile will be executed, and paths will be setup, etc., etc. Goofy, but true. • Once this is determined, fork/exec the rsh with the real command to be executed. • Keep in mind that there are multiple rsh's fork/execed into the background simultaneously; they all have to be tracked by watching their stdout and stderr to determine where they are. • Additionally, when an "echo$SHELL" finishes, it has to be replaced with the real argv and re-launched.

This results in one big-ol' state machine. It's somewhat hairy, but once I figured out some nice abstractions in C++, it worked out ok.

4. After all the commands are executed, the parent waits for its children to contact it (we passed some command line parameters to each child indicating the parent's IP address and the port number that it was waiting on). This means sitting on an accept() N times, waiting for each of the N children to connect.

5. As each child connects, give them a list of (M - N) / N username/hostname pairs to boot (where M == total number of hosts that this parent has to boot, and N == its number of children).

6. The children go off and do their thing, potentially booting grandchildren.

7. As each child finishes multi-rsh'ing its children (but before doing the accept()s to give its children work to do), it sends a number upstream to its parent indicating how many children were launched. These numbers all filter up to the root/master so that cumulative stats can be kept about how far along the boot is.

8. The cycle is broken in two conditions (they're actually the same condition, but I call it two conditions here for ease of explanation):

• A child is executed who has no children. When it contacts its parent to get a list of children to boot, it will receive "0" and therefore recognize that it is a leaf in the overall boot tree. It will then send a "-1" up to the parent and close the socket.

• When a child has received "-1"'s from all of its children, it will send a "-1" up to its parent and close the socket.

Hence, these "-1"s are propagated up the tree to the root/master, so that when everyone finishes booting, the master knows. It would be fairly easy to put in a fan out after this fan in and complete the barrier process so that the whole tree knows when it has booted, but it wasn't necessary for these tests.
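The arithmetic in step 5 is easy to get wrong off-by-one-wise, so here's a minimal sketch of it (my own throwaway code, not scaleboot's; the function name and layout are made up):

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical sketch of step 5: the parent boots the first n_children
// hosts directly, then deals the remaining (M - N) hosts out so that
// each child gets (M - N) / N of them to boot as grandchildren.
// (Names are mine, not scaleboot's.)
std::vector<std::vector<std::string> >
partition_hosts(const std::vector<std::string>& hosts, size_t n_children)
{
    std::vector<std::vector<std::string> > work(n_children);
    size_t m = hosts.size();
    size_t per_child = (m - n_children) / n_children;
    size_t next = n_children;        // hosts[0..N) are this parent's children
    for (size_t c = 0; c < n_children; ++c)
        for (size_t i = 0; i < per_child && next < m; ++i)
            work[c].push_back(hosts[next++]);
    while (next < m)                 // remainder of the division: last child
        work.back().push_back(hosts[next++]);
    return work;
}
```

Each child then recurses with its own sub-list, which is what makes the grandchildren (and the whole tree) happen.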

There is a limitation to this approach: we have to wait for the multi-rsh to finish before we can give work to our children. Depending on the number of children used, and depending on the relative speed of the children of a given parent, this may involve some children waiting for a period of time before being given work. This is conjecture at this point, but 1) it seems reasonable, and 2) I hope to prove it with the following...

A single-threaded approach was actually fairly difficult. It involved some big select() statements, and a lot of lookups and extra bookkeeping. i.e., when select() returns, you have to scan down the list of available file descriptors, figure out which socket is ready, figure out where in the process that socket is, and then react. This created a lot of code (thank god for the STL and hash maps!). While the approach seemed to work, I think a multi-threaded approach will be much simpler in design.
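To make that concrete, here's roughly what one pass of such a loop looks like (a toy sketch, not the scaleboot code; the State/Conn names and the function are made up):

```cpp
#include <sys/select.h>
#include <unistd.h>
#include <cassert>
#include <map>
#include <string>

// Hypothetical per-connection state for the single-threaded design:
// which step of the boot protocol this fd is in, plus buffered output.
enum State { PROBING_SHELL, RUNNING_RSH, DONE };
struct Conn { State state; std::string buf; };

// One pass of the select() loop: wait for any fd to become readable,
// then scan the map to find which connection it is and consume its data.
// Returns the number of fds serviced.
int service_once(std::map<int, Conn>& conns)
{
    fd_set rd;
    FD_ZERO(&rd);
    int maxfd = -1;
    for (std::map<int, Conn>::iterator it = conns.begin();
         it != conns.end(); ++it) {
        FD_SET(it->first, &rd);
        if (it->first > maxfd) maxfd = it->first;
    }
    if (select(maxfd + 1, &rd, 0, 0, 0) <= 0)
        return 0;
    int serviced = 0;
    for (std::map<int, Conn>::iterator it = conns.begin();
         it != conns.end(); ++it) {
        if (!FD_ISSET(it->first, &rd))
            continue;                        // this fd isn't the ready one
        char chunk[256];
        ssize_t n = read(it->first, chunk, sizeof(chunk));
        if (n > 0)
            it->second.buf.append(chunk, n); // react based on ->state here
        else
            it->second.state = DONE;         // EOF: remote side finished
        ++serviced;
    }
    return serviced;
}
```

Even this stripped-down version shows the bookkeeping: the real thing also has to re-launch commands, hand out work lists, and track the "-1" fan-in, all inside that one loop.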

With a multi-threaded design, we can have a thread for each rsh. It therefore only needs to monitor its own progress. We don't even need to have select() statements, because it's ok for each thread to block on read() statements, waiting for I/O from the remote process. I believe that the whole programming model will become significantly easier. And, as I mentioned above, there's a chance that there will be greater performance because each child will be able to go at its own speed and not be forced to wait for any of its siblings.
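The thread-per-rsh idea, sketched (again hypothetical, and the names are mine; a plain pipe stands in for one rsh's stdout):

```cpp
#include <unistd.h>
#include <cassert>
#include <string>
#include <thread>

// Hypothetical sketch of the multi-threaded design: each thread owns one
// child's output fd and just blocks on read() until EOF -- no select(),
// no fd scanning, no global state machine.
struct Child { int fd; std::string output; };

void watch_child(Child* c)
{
    char chunk[256];
    ssize_t n;
    while ((n = read(c->fd, chunk, sizeof(chunk))) > 0)
        c->output.append(chunk, n);   // accumulate at this child's own pace
}                                     // read() returned 0: EOF, child done
```

Each thread monitoring only its own fd is exactly what lets a fast child race ahead without waiting on its siblings.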

So I'm off to go implement the multi-threaded approach. I should be able to scavenge parts from lamboot and the scaleboot stuff...

## September 24, 2000

### 24 hours of non-stop Wham!

My 100Mbps switch rocks. I'm ripping a few CD's for Janna, and I noticed that one of them was The Matrix soundtrack. No problem -- I already have it ripped, so why bother ripping/encoding it again? So I scp'ed it from my router machine (where the big MP3 hard disk is), and it shot across my LAN like a bat out of hell. And especially considering that it was The Matrix soundtrack, shooting like a bat out of hell is probably quite appropriate.

Not only was it way faster, it confirmed my beliefs that on my old hub, collisions were killing my performance (from watching the throughput and collision lights). With every collision, there would be a delay before transmission started again (binary backoff and all that). Hence, performance sucked.

But no longer. Wooo hooo!!

The new GNU Mailman (2.0b6) came out last night. Good stuff! Anyone who's running Mailman -- go update.

Perhaps its coolest new feature (IMHO) is that it now inserts special headers in the messages that it sends across lists that some mail clients (including pine, of course) understand. These special headers tell the mail client how to subscribe, unsubscribe, post to the list, etc. For example, at the end of a message with these headers in it, pine provides a link to "email list management functions". Selecting that link provides a bunch of links to subscribe, unsubscribe, post, etc., etc.
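For the curious: these appear to be the RFC 2369 "List-*" headers. My reconstruction of a typical set (the list name and domain here are made up):

```text
List-Help: <mailto:foolist-request@example.com?subject=help>
List-Post: <mailto:foolist@example.com>
List-Subscribe: <mailto:foolist-request@example.com?subject=subscribe>
List-Unsubscribe: <mailto:foolist-request@example.com?subject=unsubscribe>
List-Archive: <http://www.example.com/pipermail/foolist/>
```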

Way cool.

I'm still trying to get used to 6 virtual desktops in KDE. I finally made the switch from 4 desktops earlier this past week when I found myself [shudder] overlapping windows just because I had too many things actively running at once.

You wouldn't think that this would be a major change -- I've been running tvtwm for years at Notre Dame with 8 desktops. However, I didn't use the linear key bindings for next-virtual-desktop and previous-virtual-desktop to switch between desktops with tvtwm. I do this all the time with KDE. Hence, I now sometimes have to hit next/prev 2 more times to get to where I want to go. Takes a bit of getting used to. But it's all for the best.

I've considered switching to other desktops -- SawMill, for example. But why? Everyone complains that KDE is huge and sluggish (and indeed, on Solaris, it was way too slow for me -- I stuck with tvtwm for that very reason. I think tvtwm is about the most bare-bones virtual-desktop-enabled window manager that you can get), but on my 800MHz machine, I don't notice it being slow at all.

That may well be a chicken/egg problem -- since it's a fairly hefty machine, KDE's slowness is not evident. But I've been using KDE on my little 233 laptop for quite some time (a year or two?), and haven't found it to be too bad.

So I think I'm reluctant to change mainly because I don't want to have to learn new key bindings. That is, I'm not impressed with new, cool features in a window manager. I want it to be fast -- that's the most important feature for me. And now, since I've started using KDE, key bindings for just about everything are important. (It really bothers me that I can't usefully use key bindings to switch between windows in tvtwm. It does have key bindings for switching between windows -- and I've used its key bindings to switch between virtual desktops for years -- but the window bindings are global, not per virtual desktop, which is useless, IMHO.) That is, I rarely use the mouse with KDE; I only use it for selecting and pasting things in non-emacs environments (xterm, pine, netscape). And of course for GUI programs (like grip) that don't have command line or key binding equivalents.

So here's what I look for in a window manager:

• Virtual desktops. This is a must. I can't work in a single desktop anymore. I am inherently multi-tasking; while something is running in a window that takes more than 5-10 seconds, I will likely go to something else.

• Speed. If the window manager can't keep up with me, forget it. I only discovered that this was a prerequisite when I tried to use KDE under Solaris. It was so slow that it would sometimes lag my actions by multiple keystrokes, which was completely unacceptable.

• Key bindings for navigation. i.e., using key bindings to switch between windows and desktops. This is now just about a must as well
-- I'm so acclimated to KDE's key bindings that it has become a part of the way that I work.

Things that do not impress me in a window manager (admittedly, some of these are functionality items which can be turned off. KDE, for example, has many of these. I turn them all off):

• 6 billion widgets and gadgets. I won't use them.

• 6 billion options for how my background and windows and displays and ... look. I won't use those, either. A plain color or gradient color background is fine. Most default color schemes are fine. I'm trying to work, not customize up the wazoo.

• Animated events. Scrolling title bars really annoy the crap out of me. And who needs to wait for windows to appear and disappear? When I dismiss or iconize a window, I want it gone -- I don't want it swirling around the vortex of an imaginary drain in the center of my screen for the next 30 seconds.

• Non-arrow pointer icons. These also annoy the crap out of me. If it's not an arrow, there's always the question of "which side is actually pointing?" i.e., what part of the icon do I have to have over a widget to be able to click on it?

• Sounds that accompany window manager functions. I don't need noises to tell me that I just opened or closed a window -- I just did it, I don't need an audio cue to remind me of what I just did.

It has been my experience that all of these things simply waste time, not just in the time that you have to wait for them to execute, but in the time that you spend setting them up. And then, a week later, when you are tired of all of them, you spend more time setting them up again, or perhaps you'll make or go download a new theme. Who needs it?

To be fair, I haven't tried KDE under Solaris in quite a while --
there have been quite a few releases since then. Indeed, KDE 2.0 is on the horizon, which may make it worthwhile on Solaris now. I'll probably give it a whirl in the not-too-distant future.

### Internet, internot

Bummer. We lost to Michigan State yesterday, and in the last few minutes of the game, too. Bonk. So much for the season...

We watched the game at Janna's house, and had a good time with them. We stayed for dinner. I hooked Jim up with a new version of WinAmp afterwards, and I have a bunch of his and Anna's CD's to rip this week.

Many errands to do today -- clean the apartment, thank you notes (no, really!), etc.

## September 27, 2000

### I am pepperoni

Heisenlocks are hard to fix (where "Heisenlock" == "a deadlock where you can't know the deadlock and its location at the same time", a la Heisenbugs). Particularly the ones that seem to move around.

How do you know when you have fixed it? You stop getting deadlocks. But if it only locked periodically to begin with (as is the nature of Heisenlocks), how do you know that you just haven't tested enough to run into a deadlock?

I pose this question because a) it's happening to me today, and b) it happened to me with PIPT. After months of testing, the PIPT decided to lock up right in front of our sponsors. After I finally figured out the problem (several days later, mind you), I noticed that I hadn't changed the problematic code in a long time. That is, the bug had survived for months without causing deadlock. But then it suddenly did. <sigh>

It's rare to encounter Heisenlocks, understand the whole picture, and say "Aaaahhhh.... yes, this is exactly the problem that I am looking for." Indeed, the code is typically so complex and the race condition so thorny that it is difficult to get the overall picture until after the fact.

Hence, we have one of Jeff's laws of multithreaded programming:

Easy race conditions are typically obvious to find. Heisenlocks tend to be caused by extremely subtle race conditions that usually "could never happen" because of x, y, and z, where one or more of x, y, or z (or, more likely, some previously unconsidered "tautology" w) is proven to be false -- typically after multiple days of hacking, around 3am amidst much wailing, gnashing of teeth, and caffeine.

I certainly do not believe in changing random things until something seems to work as a whole solution. Sometimes I am reduced to this behavior (e.g., when I run out of ideas), but I always work to pin down the exact reason for success/failure after I find something that "seems to work". It is crucial to understand why it works, lest you fix only a symptom of a problem, not the real problem. This is the only way to be sure to fix a problem rather than guess that it is fixed because it "seems" to be fixed.

Heisenlock quandary
How can this be happening?
Effect without cause

## September 28, 2000

### Soup for everyone

The bandwidth between squyres.com and nd.edu is great in the morning (particularly since we're currently an hour apart). I can edit, CVS, and type with ease -- very little latency. Within about 2-3 hours, it all goes to hell, though. I attribute this to undergraduates waking up and realizing that their napster clients are no longer downloading free music. Even in a groggy wake-up state, experienced napsterphiles can restart downloads in 3.7 seconds or less.

Truly, a sight to behold (but not a pretty one -- they did just wake up, after all).

Hence, I have been forced to switch down to "emacs -nw" for my coding on nd.edu machines. Regular GUI emacs was just too darn slow after 9-10am. I suppose I'll live, but I really do miss the context color highlighting...

In the universal scheme of things, I guess I'm helping prevent the heat death of the universe (using the rationale that X interfaces generate more heat, 'cause, well, they make the computer think more). Plus, I'm freeing up cycles for my distributed.net client. So all things being equal, and all colors being black and white, all is well.

Emacs interface
Bandwidth forces termcap mode
Save the universe

So I'm working on making my thread booter skip failed nodes today (yesterday, too). It's easy to do stuff to skip nodes where rsh/ssh is rejected -- they return right away, and you can just go to the next node in the list. But when you try to rsh to a node that is down, rsh takes a long time to time out... What to do here? I don't know yet. Perhaps playing three songs at once through my speakers will help me understand...
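I don't know the right answer yet, but one idea I might try -- a sketch under my own assumptions, not what the booter does today -- is to probe the host with a non-blocking TCP connect() and a short select() timeout before fork/exec'ing the rsh at all. A down host simply never answers, so it can be skipped after a couple of seconds instead of waiting out rsh's long timeout:

```cpp
#include <arpa/inet.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cassert>
#include <cerrno>

// Hypothetical pre-check: does (ip, port) answer within timeout_sec?
// A dead host never answers, so we give up quickly here instead of
// letting rsh stall the whole boot.  A refused connection still counts
// as "up" -- the machine answered; it just isn't listening on that port.
bool host_answers(const char* ip, int port, int timeout_sec)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return false;
    fcntl(fd, F_SETFL, O_NONBLOCK);

    sockaddr_in addr;
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    inet_pton(AF_INET, ip, &addr.sin_addr);

    bool up = false;
    if (connect(fd, (sockaddr*)&addr, sizeof(addr)) == 0) {
        up = true;                           // connected immediately
    } else if (errno == EINPROGRESS) {       // still connecting: wait a bit
        fd_set wr;
        FD_ZERO(&wr);
        FD_SET(fd, &wr);
        timeval tv = { timeout_sec, 0 };
        if (select(fd + 1, 0, &wr, 0, &tv) > 0) {
            int err = 0;
            socklen_t len = sizeof(err);
            getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len);
            up = (err == 0 || err == ECONNREFUSED);
        }
    } else if (errno == ECONNREFUSED) {
        up = true;
    }
    close(fd);
    return up;
}
```

The obvious catch: I'd have to pick a port that's guaranteed to be there (the rsh port itself, presumably), and a host behind a packet-dropping firewall would look "down". So this is a maybe.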

Sidenote: this is a trick that I found that I can do -- run grip for each of my 2 CD devices and xmms to play an MP3. They all send their output to the same device -- my speakers. You can have interesting musical deathmatches this way. Consider: Amy Grant vs. Metallica vs. John Denver (clear winner here). Or The Matrix vs. Enya vs. a data CD (a tough call).

Tracy has lots of soft music CD's for me to draw upon whenever I wish to "fix" the competition and increase my winnings (shh!). The gambling commission is starting to snoop around, though. May need to lie low for a while.

### Be the ball

By request, I did a pseudo-release of the jeffjournal client and server today to a limited audience. We'll see if it works out for them.

Dog just installed the patches for the Solaris Forte6 compiler today, and I gave it a whirl. Initial impressions:

• It's slow.

• They still didn't fix the linker bug. Compiling minime (which uses STL heavily) still gets all the same STL linker errors. <sigh>

• They seem to have fixed much of the Memory Badness with using bcheck in C++, but I still get a fairly lengthy "blocks in use" report at the end of my run. <sigh> At least these are not potentially fatal errors, though...

• Jeremy claims that they don't have iterator_traits. I hope he's wrong...

• Trippy message from running a multithreaded program through the new bcheck:
 RTC: Enabling Error Checking...
 RTC: Patching executable code.
 RTC: Done patching code.