
October 2005 Archives

October 4, 2005

Lotsa quickies

  • I went to Italy for the Euro PVM/MPI 2005. It was ok. Not a darn thing started on time, which was amusing (just a contrast between the Italian way of doing things vs. the American way of doing things). Ironically, my best meal was at the hotel in Naples where I stayed overnight to catch a 6:40am flight home: I had rabbit and some kind of pasta that I had never heard of in a pumpkin sauce. Wow, it was fantastic…
  • At the same time as my Italy trip, the girls went to Florida and saw the grandparents and had a great time. Bethany, our nanny, went with Tracy to help manage the trip.
  • The squyres.com server has moved to a new, faster, bigger, and generally mo’ better machine. We’re still slowly making changes here and there (e.g., my photo album is half broken — we need to figure out how to fix it in light of some security issues; arrgh!), but it’s performing quite well. Part of the delay in this entry was getting the blog software going again.
  • Kaitlyn is now walking up a storm. Some video clips (having your audio volume up is essential — be sure to watch the “Rampage” video first) available at http://jeff.squyres.com/pictures/ in the 2005 October folder.
  • Kathryn just took her first 2 steps, but doesn’t really show as much interest in it as Kaitlyn (yet).
  • Took the munchkins in for their 1 year old pictures. It was a total disaster, but we managed to get a small number of half-decent pictures. We went to http://www.portraitinnovations.com/, and although we purchased too many physical pictures, the nice thing is that you get everything on CD (including the pictures that we didn’t buy).
  • I’d just like to say: shared memory is the devil. Whoever thought that it would be a good programming model for high performance computing was on crack. Sure, any idiot can share a big, honkin’ matrix among several threads/processes and get reasonable performance. But to get really, really high performance, especially in NUMA machines where you really have to use process and memory affinity to minimize non-local memory accesses, is a PITA. Writing the code is a bit complex (if you don’t understand pointer math, and I mean really understand pointer math, you’re sunk), but debugging it is a RPITA.
  • Kaitlyn likes to fling things.
  • Kathryn likes to hug people (quite aggressively).
  • Saw Kicking and Screaming on the flight to Italy. 1 minute.
  • Saw The Wedding Date on the way back. 2 minutes.
  • Saw The Replacements on TBS. Some fun moments, even though it was wildly unbelievable. 7.5 minutes.

October 18, 2005

Linux processor affinity: a rant

Update in September 2007: Google Analytics tells me that people are continually finding this page while searching for terms like “linux processor affinity”. You should know that I created the Portable Linux Processor Affinity project to address the problems stated in this blog entry. Please go there after reading this entry. Thanks.


This is a technical rant that can be summarized quickly: the current state of Linux processor affinity sucks.

There are essentially three different variants of the API (that I can find); which one you have depends on a combination of several factors:

  • your Linux distribution/vendor
  • what version of kernel you are using
  • what version of glibc you are using

Annoyingly, regardless of which variant of the API that you have on your system, the man page for sched_setaffinity(2) and sched_getaffinity(2) is the same. Specifically, it looks like this one man page has been copied everywhere and never updated to be what you actually have on your system. So you have — at best — a 1 in 3 shot of having these functions correctly documented.

As far as I can tell, here’s what the three variants are:

  1. int sched_setaffinity(pid_t pid, unsigned int len, unsigned long *mask);

    This originated in 2.5 kernels (which we won’t worry about) and some distros back-ported it to their 2.4 kernels. It’s unknown (to me) if this appears in any 2.6 kernels.

  2. int sched_setaffinity (pid_t __pid, size_t __cpusetsize, const cpu_set_t *__cpuset);

    This appears to be in recent 2.6 kernels (confirmed in Gentoo 2.6.11). I don’t know when #1 changed into #2. However, this prototype is nice — the cpu_set_t type is accompanied by fdset-like CPU_ZERO(), CPU_SET(), CPU_ISSET(), etc. macros.

  3. int sched_setaffinity (pid_t __pid, const cpu_set_t *__mask);

    (note the missing len parameter) This is in at least some Linux distros (e.g., MDK 10.0 with a 2.6.3 kernel, and SGI Altix, even though the Altix uses a 2.4-based kernel and therefore likely back-ported the 2.5 work but modified it for their needs). Similar to #2, the cpu_set_t type is accompanied by fdset-like CPU_ZERO(), CPU_SET(), CPU_ISSET(), etc. macros.
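For variant #2, a call might look like the sketch below. It assumes a glibc that exposes the three-argument prototype and the CPU_* macros when _GNU_SOURCE is defined; the helper names first_allowed_cpu() and pin_to_cpu() are made up for illustration and are not part of any library.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>

/* Hypothetical helper: lowest-numbered CPU the calling process is
   currently allowed to run on, or -1 if the query fails. */
int first_allowed_cpu(void)
{
    cpu_set_t set;
    int cpu;

    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) != 0)
        return -1;
    for (cpu = 0; cpu < CPU_SETSIZE; cpu++)
        if (CPU_ISSET(cpu, &set))
            return cpu;
    return -1;
}

/* Hypothetical helper: bind the calling process (pid 0 means "self")
   to a single CPU. Returns 0 on success, -1 with errno set on failure. */
int pin_to_cpu(int cpu)
{
    cpu_set_t set;

    /* CPU_SET is only meaningful for indices inside the fixed-size set. */
    assert(cpu >= 0 && cpu < CPU_SETSIZE);

    CPU_ZERO(&set);        /* start from an empty mask */
    CPU_SET(cpu, &set);    /* allow exactly one CPU    */
    return sched_setaffinity(0, sizeof(set), &set);
}
```

Under variant #3 the same code would drop the sizeof(set) argument; under variant #1 there is no cpu_set_t at all, so the mask has to be built by hand.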

Also note that at least some distros of Linux have a broken CPU_ZERO macro (a pair of typos in /usr/include/bits/sched.h). MDK 9.2 is the screaming example, but it’s pretty old and probably only matters because I use that as a compilation machine :-) (it also appears to have been fixed in MDK 10.0, but they also changed from #2 to #3 — arrgh!). However, there’s no way of knowing where these typos came from and if they exist elsewhere. So it seems safest to have a configure script to check for a bad CPU_ZERO macro.
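The body of such a configure check could be as small as the sketch below. It assumes a cpu_set_t flavor of the API (variant #2 or #3), and cpu_zero_works() is a hypothetical name. I don't know whether the typos break compilation or silently produce a wrong mask, so the check is written to catch either: a compile failure fails the configure test outright, and a mask that is not actually cleared makes the function return 0.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <string.h>

/* Hypothetical configure-time probe: returns 1 if CPU_ZERO really
   clears every bit in the set, 0 otherwise. */
int cpu_zero_works(void)
{
    cpu_set_t set;
    int cpu;

    /* Fill the set with 1 bits so a no-op CPU_ZERO is detectable. */
    memset(&set, 0xff, sizeof(set));
    CPU_ZERO(&set);

    for (cpu = 0; cpu < CPU_SETSIZE; cpu++)
        if (CPU_ISSET(cpu, &set))
            return 0;   /* a bit survived: the macro is broken */
    return 1;
}
```

A configure test would wrap this in a main() that exits 0 when the function returns 1 and non-zero otherwise.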

Glibc itself shares a bunch of the blame. Case in point — look at this implementation of sched_setaffinity from Glibc 2.3.2:

int
sched_setaffinity (pid, len, mask)
     pid_t pid;
     unsigned int len;
     unsigned long int *mask;
{
  __set_errno (ENOSYS);
  return -1;
}

Why even have the function there if all it’s going to do is return an error? It’s better to not have it at all (because we already have to have a complex configure script to figure out which one to use) than to provide one that is simply broken. Arrrggggghhhh!!
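Because a stub like this links fine and only fails when called, a compile-and-link check is not enough; you have to actually run the function. Here is a minimal sketch of a run-time probe, assuming the variant #2 prototype and assuming that a Glibc which stubs out sched_setaffinity() stubs out sched_getaffinity() the same way (I probe the getter because it doesn't change anything). The name affinity_is_implemented() is hypothetical.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>

/* Hypothetical probe: returns 0 if the affinity call is a stub that
   reports ENOSYS, 1 otherwise (including failures for other reasons,
   which mean the call exists but was refused). */
int affinity_is_implemented(void)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) == 0)
        return 1;
    return errno != ENOSYS;
}
```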

Finally, note that even the syscall() interface won’t help: apparently the back-end kernel function has changed the number and type of parameters multiple times (so that may not actually be Glibc’s fault). So there appears to be no portable way to use sched_setaffinity() and sched_getaffinity() without a complex configure script and multiple implementations in your code. That totally, totally sucks.
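To make the "multiple implementations" point concrete, here is roughly what the wrapper ends up looking like. This is only a sketch: the macros AFFINITY_API_1 and AFFINITY_API_3 are hypothetical names that a configure script would have to define after figuring out which of the three variants is present (neither defined means variant #2), and bind_self_to_cpu() is a made-up name.

```c
#define _GNU_SOURCE
#include <limits.h>
#include <sched.h>

/* Hypothetical wrapper: bind the calling process to one CPU using
   whichever sched_setaffinity() variant configure detected.
   Returns 0 on success, -1 on failure or an out-of-range CPU. */
int bind_self_to_cpu(int cpu)
{
#if defined(AFFINITY_API_1)
    /* Variant #1: raw bitmask. This sketch only handles CPUs whose
       bit fits in a single unsigned long. */
    unsigned long mask;

    if (cpu < 0 || (unsigned) cpu >= CHAR_BIT * sizeof(mask))
        return -1;
    mask = 1UL << cpu;
    return sched_setaffinity(0, sizeof(mask), &mask);
#else
    /* Variants #2 and #3 share cpu_set_t and the CPU_* macros. */
    cpu_set_t set;

    if (cpu < 0 || cpu >= CPU_SETSIZE)
        return -1;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
# if defined(AFFINITY_API_3)
    return sched_setaffinity(0, &set);               /* no len parameter */
# else
    return sched_setaffinity(0, sizeof(set), &set);  /* variant #2 */
# endif
#endif
}
```

Three code paths for one system call, and that is before the matching getter, the broken CPU_ZERO check, and the ENOSYS stub are accounted for.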

This rant is therefore an open appeal for the Linux development community to get its act together and figure this darn thing out once and for all, and standardize on a single API.



October 21, 2005

Linux as a desktop... err... "needs a lot of work"

Ok, another tech rant. Sorry!

Earlier this week, I had to turn in my Mac laptop (read: my primary working device) for service; its keyboard was going bad. As a temporary replacement, I have an IBM laptop running Fedora Core 4 Linux (I could not bear the thought of using Windows for 1-3 weeks). I had used Linux on a laptop and various desktops for about a decade; I thought it should be pretty easy to adjust for the duration while my Mac is gone.

Wrong.

Linux sucks as a desktop. I don’t think I ever realized how much until I was totally spoiled by a Mac for the last 1.5 years. I spent 5+ hours yesterday morning getting a [pseudo-]reliable set of Mail, Calendar, and IM working. I’m certainly not going to claim that Macs are perfect — they’re not (far from it, actually). But they do a lot more things Right compared to most other platforms.

Don’t get me wrong: the Linux desktop is way better than it used to be. But I’ve come to realize just how far it has to go. Mac’s philosophy is to make tiny little tools and then integrate the heck outta them. For example, on a Mac, there’s an addressbook. It’s not a calendar, it’s not an e-mail client, it’s not a kitchen recipe database. It’s just an addressbook. But that one addressbook is integrated with everything, meaning that it can talk to all those other applications: Mail, Calendar, Instant Messenger, Kitchen Recipe Database, etc. In this way, Mac reflects the BSD philosophy of “one system” rather than Linux’s philosophy of “lots of parts put together.”

Why did it take 5+ hours before I got something sorta-reasonable? Here are some points:

  • FC4’s “install/uninstall software” tool still resolutely shows that KDE is not installed, even though it’s running as my main desktop.
  • If I shut down my laptop or put it to sleep with the ethernet networking active, and then boot/restore it with no ethernet cable plugged in, I have to wait for DHCP on ethernet to time out (60+ seconds?) before it will finish booting/restoring. That’s just absurd; why doesn’t launching the network occur in the background?
  • Thunderbird refused to import my addressbook entries. That’s a total non-starter (I have hundreds of e-mail addressbook entries, and I’m not going to re-type them manually).
  • Thunderbird also makes you wait while it sends every single message. For someone that sends dozens of e-mails a day, that’s also a non-starter (one of my students later gave me a workaround for this; apparently you can go into an obscure panel and change some hidden setting to make it not show the progress while it’s sending).
  • So I switched to Evolution. It loaded up my addressbook ok; cool. But it’s slow. It doesn’t handle multiple identities without adding multiple accounts (pretty non-intuitive, if you ask me: an “account” with no incoming mail server… pretty weird).
  • The Evolution calendar sometimes locked up (I had to have KDE kill Evolution after waiting for 10+ minutes) when importing my .ics files from my Mac calendar.
  • The Evolution calendar definitely has bugs in it. Here’s a humorous example: some of the day-long events that I imported set the “mark time as busy” flag. If I disable that flag on one of these events, it automatically changes the recurrence of the event from once a year (e.g., someone’s birthday) to every day. How these two are related, I have no idea.
  • Evolution periodically locks up and is essentially unresponsive for minutes at a time. This can happen when I click on “reply” to a mail, or when I simply try to go to the next mail in my index (e.g., click on “reply” and don’t get a compose window for 60+ seconds). Quite frustrating, since e-mail is a central focus of my work.
  • I was editing my signature blocks in Evolution when it crashed. Twice. Resulting in me [somehow] sending the same message to a public mailing list twice (how does editing a signature cause re-sending of an e-mail?).
  • Every time you make a change in your account settings, Evolution re-scans the IMAP server for all your folders (and re-caches everything). This is painful (I have a lot of server-side folders).
  • Right now, Evolution is refusing to update my INBOX. The last mail it shows is from around midnight, but it’s giving some obscure IMAP error every time it checks for new mail. So I quit Evolution and restarted; voilà, problem solved. Oh, look: I suddenly have lots of mail from after midnight.
  • Gaim did horrid things to my buddy lists. Both my AOL and MSN buddy lists got horribly re-ordered (buddies mysteriously went to different groups).
  • I can’t seem to load more than the 16 basic smileys in Gaim.
  • There’s a million other little usability issues (e.g., clicking on a http link in mail or IM — after tweaking both the mail and IM clients — finally does bring up a new tab in my already-open web browser [which is my desired behavior], but then I have to manually go switch to the browser application, which is sometimes in a different virtual desktop. One would think that when I click on a link, I want to see that link, and that I would not have to initiate one or more actions to see that link), some of which are just “different” from my Mac, and others showing a lack of integration between various tools.

There are probably reasons for all of these items above. Indeed, I’m quite sure that there are hard-working programmers out there working to fix all these bugs (if they aren’t already fixed; FC4 is “new”, but software projects keep evolving even after a Linux distro releases a version). And I know that no software is perfect — even my own software has bugs that we continually work to fix. OSX software has plenty of bugs too. So don’t get my rant wrong — it’s certainly not an attack on any of these projects or the people working on them.

Although the individual applications are not entirely blameless, it’s mainly the level of integration that is the problem. The distros are getting better at making it better, but they still have a ways to go (and I’m sure my rant is not news to them). I’m sure that I could have fixed many of the problems that I listed above. I could have done something different and either not had the problem or gotten a workaround (the Thunderbird about: editor is a good example). But my question is: why? I didn’t have these kinds of problems with my Mac because someone thought through all these application and integration issues and distilled down the information to what 90% of the world wants and/or needs. I don’t see many useless controls on my Mac simply because I don’t need them; someone else put a lot of effort into trying to figure out what people really need to do their jobs. It’s for darned sure that your Grandmother does not want to have to go into an obscure about: editor to turn off a hidden setting in Thunderbird. It raises the question of why that progress box is there in the first place: what if I frequently send large attachments? Thunderbird makes me wait there for a positive acknowledgement that the mail was sent rather than later giving me a negative acknowledgement if something went wrong. The latter allows me to be much more productive: I can actively be doing stuff before a “hey, something went wrong with the last mail you sent…” notice comes up.

Also — and this is something that all programmers should take to heart — quitting and restarting an application to fix an error is not acceptable.

In short, the state of the Linux desktop is quite frustrating. My productivity yesterday was rock bottom because I was trying to get my machine to do what I wanted it to do (but inevitably resigning myself to letting it do whatever it wanted to do). I’m sure I’ll adjust better over the next 1-3 weeks, but I can’t wait to get my Mac back where things tend to “just work.”

October 25, 2005

More complaints about desktop Linux...

  • None of my USB jump drives are recognized or mounted.
  • Evolution is truly evil. I gave up using it; it kept failing and/or dying in strange and mysterious ways (e.g., restarting the app made everything work, but I should not have to restart the app to get new mail).
  • Thunderbird’s threading view is equally mysterious; why does it not show threads when “All” threads are selected? If you select any of the other threaded options, it doesn’t show you all the mail in your INBOX. Even more confusing, if you switch to another folder and then back to your inbox, even more mail disappears from the index.
  • Despite editing a text file and telling Thunderbird to disable the sending progress window, the compose window still remains visible (and in focus) when you “send” a message. You have to either wait for it to disappear or manually switch the focus back to the main window. Quite annoying.
  • I ran “yum install kmail” twice and got different results (!). Specifically, I ran it once, and it apparently updated a bunch of internal tables (“Added 79 new packages, deleted 40 old…”; the fact that there are 79 new packages in the last 6 days is somewhat frightening; all I want is stability, not bleeding edge!). It didn’t find the package I wanted, so I tried “yum install KMail”. yum then failed on the first mirror (it got an http 404!), so it moved on to another mirror. But then it said “Added 5 new packages, deleted 56 old…” This says to me that these mirrors are not in sync (and it frightens me: what just happened to all my internal yum tables?). But that’s not my problem; I can’t imagine any other command that I would expect to run twice in a row and get different results.
  • Thunderbird does not scroll the index when new messages arrive (especially if you have the newest messages at the bottom).
  • If you have the newest messages at the top, if you delete a message, Thunderbird goes to the next message down in your index. For example, say you’re on message X. The next message is Y. Then message Z arrives. If you delete X, you’d expect it to go to the next message (Z), but instead it goes to Y. So you have to do 2 actions to get to Z (delete X and then select Z). Quite annoying.

October 27, 2005

Relief

Back to my Mac; back to the land of [almost] sanity.

October 30, 2005

Paperclips are not tasty

Quickies:

  • I finally finished converting all my old MPI Mechanic columns (from the now-defunct Cluster World magazine) to be MPI Monkey columns at the new incarnation of our parallel/cluster-related knowledge repository — Cluster Monkey. As opposed to the Cluster World magazine, this is a free web site — anyone can read the material without signing up or paying a fee.
  • John S. and I went to the Notre Dame/BYU game last weekend up in South Bend. It was a weird game, but we smote them in the end — 49-23. Great fun was had by all.
  • We saw Lynne, Lorenzo, and Alex M. Aside from Alex sleeping through dinner Friday night (right at the table), all are doing quite well.
  • We tailgated at the siblings’ P. (Chris and Karen); they’re just as fun as ever and doing well.
  • Unexpectedly saw Stephanie R., who is still working in the ND Athletic Department and still lovin’ it. Here’s to working at your dream job!
  • My parents also showed up at the game (neither of us realized that we would both be there).
  • Finally, also saw V and Marty P. in the bookstore. Didn’t get to see their munchkin, but they are doing quite well with the whole parenthood thang.
  • ND just extended Charlie Weis’ football coaching contract until 2015. Wow.
  • We took the munchkins to a GE picnic thing yesterday and had them wear their pumpkin outfits. There’s nothing to draw a crowd like twin toddlers in pumpkin suits.
  • Got tickets to see Drew Carey and the Improv All-Stars on 20 Nov. That should be a hoot and a half.
  • Rich M. got married at ND yesterday. We couldn’t attend, but sent a Spiderman beach towel as a wedding gift (everyone needs a Spiderman beach towel).
  • We have far too much Halloween candy down on the kitchen table. Very, very dangerous…

October 31, 2005

Useless stat of the day

Notre Dame men’s football is undefeated on Halloween.

About October 2005

This page contains all entries posted to JeffJournal in October 2005. They are listed from oldest to newest.
