movies, stat latency

It's all Jon Johansen's fault. Without him, I wouldn't be able to watch the DVDs I borrow from the library. Now I get home and expect to be entertained every night. Just in the last week:

  • L'enfer: Watch a guy slowly go nuts. I think I prefer a plot that's not one long straight line.
  • Fierce Creatures: It has some pretty funny ideas, but just not nearly enough. In places where "Noises Off" would have a surprising event every second, "Fierce Creatures" would have a bunch of people running around looking frantic.
  • Earth Girls are Easy: OK, this was more fun, though honestly I'm not sure I'll remember it a week from now.

I should read more. In fact, our book group meets tomorrow to discuss "Flow My Tears", and I only read it once fairly quickly a couple weeks ago. I should give it another look before tomorrow night's meeting.

Sara and Paul went birdwatching Saturday morning. I couldn't see getting up before 7am just to look at some birds. But I guess it was quite the morning for it--when I met them for breakfast afterwards they were full of stories of baby swans, vultures, bird nests, and more.

The version control system we use for kernel work, "git", has a problem when running on an NFS filesystem--it detects when files change by calling stat() on them and looking for differences in modification time, size, inode number, etc. That means it can skip having to examine the data of unchanged files, but it still requires stat'ing every file in your working directory--and the Linux kernel source has over 20,000 of them. The average ping time to my NFS server is about .2 milliseconds, so a round-trip to the server to request stat information will take at least that long. That means the whole tree will take at least 20000*.0002 = 4 seconds. In practice it ends up being over 10 seconds. A lot of git operations require this, so the delay gets really annoying. On a local filesystem, by contrast, the time is less than a quarter-second, once you've done it once (and the operating system has cached all that stat data in memory).
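To make the idea concrete, here's a rough sketch in Python of stat-based change detection. It is not git's actual code (git is written in C and keeps this information in its index); the names `signature`, `snapshot`, `changed_files`, and `estimated_seconds` are made up for illustration, and the set of fields compared is just the three mentioned above.

```python
import os
from typing import Dict, List, Tuple

# Hypothetical "signature": the stat fields compared to decide whether a
# file is unchanged. Real tools may compare more fields than these three.
Signature = Tuple[int, int, int]  # (mtime in ns, size in bytes, inode number)


def signature(path: str) -> Signature:
    # One stat call per file. On NFS each call may cost a server round trip.
    st = os.lstat(path)
    return (st.st_mtime_ns, st.st_size, st.st_ino)


def snapshot(root: str) -> Dict[str, Signature]:
    """Record a signature for every non-directory entry under root."""
    result: Dict[str, Signature] = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            result[path] = signature(path)
    return result


def changed_files(old: Dict[str, Signature], root: str) -> List[str]:
    """Return paths whose signature is new or differs from the old snapshot.

    File contents are never read, but every file is still stat'ed.
    """
    new = snapshot(root)
    return sorted(p for p, sig in new.items() if old.get(p) != sig)


def estimated_seconds(n_files: int, round_trip_seconds: float) -> float:
    """Lower bound for stat'ing n_files one after another."""
    return n_files * round_trip_seconds
```

Calling `estimated_seconds(20000, 0.0002)` gives the lower bound of roughly 4 seconds for 20,000 files at .2 milliseconds per round trip.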

NFS does some caching too, but to make sure that it notices file changes made by other clients it has to go back to the server every now and then. And you can tell it to do that less frequently, but then it becomes annoying when you use two clients at once and have to wait for changes made on one to be noticed on the other.

One possible fix is to make the server give out delegations more aggressively--a delegation allows the server to tell the client when a file changes, instead of making the client ask all the time. Thanks in part to a bright intern who's been learning his way around the server code, I think I'll have some help with that.

Another approach is to teach git to do those stat calls in parallel, instead of sending all 20-thousand-some requests sequentially. That was my project for Sunday, but I didn't end up getting any further than finding the spot in the code that I'd need to modify to make it work. Maybe next weekend. This is sort of a hobby project, so I'm mostly ignoring it during the week.
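For what it's worth, the general shape of the parallel approach can be sketched in a few lines of Python. This is only an illustration of the technique, not the change I'd make to git (which is C): the function names and the worker count of 32 are assumptions picked for the example, and the right amount of parallelism would need measuring against a real NFS server.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, Iterable, Optional


def stat_or_none(path: str) -> Optional[os.stat_result]:
    """stat one path, returning None if it is missing or unreadable."""
    try:
        return os.lstat(path)
    except OSError:
        return None


def parallel_stat(
    paths: Iterable[str], workers: int = 32
) -> Dict[str, Optional[os.stat_result]]:
    """Issue stat calls from several threads so the round trips overlap.

    Each thread blocks in the stat system call while waiting on the
    server, so other threads can have their own requests in flight.
    """
    path_list = list(paths)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in the same order as path_list.
        results = pool.map(stat_or_none, path_list)
        return dict(zip(path_list, results))
```

With enough requests in flight, the total time should be bounded more by server throughput than by per-request latency, though I haven't measured that here.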