Friday, November 6, 2009

The first almost trivially tiny lil' thing...

(or: Chad spends way too much time researching trivial stuff.)

The locale code in [e]glibc isn't caching directory names correctly, so a system call trace shows something like this:

open("/usr/lib/locale/en_US.UTF-8/LC_TELEPHONE", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/locale/en_US.utf8/LC_TELEPHONE", O_RDONLY) = 3

about 12 times for each setlocale() call, which is made by most, if not all, processes.

I wrote this mini-program to see how many microseconds a bad open() takes:

----
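#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    int i;

    /* The path does not exist, so every open() fails with ENOENT;
       close() is then called on -1 and fails as well. */
    for (i = 0; i < 1000000; i++)
        close(open("/usr/lib/locale/en_US.UTF-8/LC_IDENTIFICATION", O_RDONLY));
    return 0;
}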
----

On a Core Duo locked to 1.0 GHz, it takes about 3.75 usec per call. This is minuscule, but multiply it by 12 (about 45 usec per setlocale() call), and then by the number of times setlocale() is called, and eventually you get to about a second.

So, instead of actually fixing the locale code (and dealing with getting it upstream), a very simple fix for Ubuntu 9.10 et al. is:

sudo ln -sf /usr/lib/locale/en_US.utf8 /usr/lib/locale/en_US.UTF-8
(Substitute your actual locale for en_US in both paths.)

With the symlink in place, the first open() succeeds instead of failing, and this should cache nicely.
