Interesting Thread-Caching malloc() Replacement from Google
This morning I was looking into the problem with my ticker plant clients and the older kernel shipping with CentOS 5. Basically, when I increased the message rate, we crossed some threshold on CentOS and started getting a lot of heap corruption, showing up as malloc() or free() failures in ZeroMQ. On Ubuntu 10.04.1 everything was fine, most likely because its kernel is significantly newer than the one in CentOS 5. So I went searching for a better malloc() and free().
What I came across was google-perftools. This is pretty amazing stuff. It includes TCMalloc, a thread-caching replacement for malloc() and free() that is as simple to use as adding -ltcmalloc to the build line. It's got profiling tools as well, but those don't interest me as much as the amazing speed gains it provides. The graphs in the paper show about a 4x increase in operations per second when using this.
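To check a claim like that on my own boxes, something along these lines should do: a minimal sketch that hammers malloc() and free() from several threads and times it. The function names (churn, timed_churn) and the block size are just my own placeholders, not anything from google-perftools, and this is a rough smoke test rather than a proper benchmark.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdlib>
#include <thread>
#include <vector>

// Allocate and free `rounds` blocks of `size` bytes on the calling thread.
// Returns how many allocations succeeded.
std::size_t churn(std::size_t rounds, std::size_t size) {
    std::size_t ok = 0;
    for (std::size_t i = 0; i < rounds; ++i) {
        void* p = std::malloc(size);
        if (p != nullptr) {
            // Touch the block so the compiler cannot drop the malloc/free pair.
            static_cast<volatile char*>(p)[0] = 1;
            std::free(p);
            ++ok;
        }
    }
    return ok;
}

// Run churn() on `threads` threads at once and return the elapsed seconds.
double timed_churn(unsigned threads, std::size_t rounds, std::size_t size) {
    std::vector<std::thread> pool;
    auto start = std::chrono::steady_clock::now();
    for (unsigned t = 0; t < threads; ++t) {
        pool.emplace_back([rounds, size] { churn(rounds, size); });
    }
    for (auto& th : pool) {
        th.join();
    }
    std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}
```

Call timed_churn(8, 1000000, 64) from main() and print the result, then build the same source twice: once as usual (g++ -O2 -pthread), and once with -ltcmalloc added to the link line. Nothing in the code changes between the two builds; only the allocator behind malloc() and free() does, so any difference in the timing comes from the allocator.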
It's not conceptually hard - the TCMalloc library grabs a large block of memory from the system and then offers it up to the application. This puts the calls in user space, and the control of memory there as well. Because the design keeps the smaller blocks in a per-thread cache, it's possible for malloc() and free() on those blocks to see no lock contention at all, which should be a major boon to me.
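The per-thread idea can be sketched in a few lines. To be clear, this is a toy of my own and not how TCMalloc is actually written: ThreadBlockCache and local_cache are hypothetical names, it handles a single block size, and plain malloc() stands in for the shared heap that the real library manages itself.

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>

// Toy cache of freed blocks for ONE block size, meant to be owned by one thread.
// Illustration only: hypothetical class, not part of google-perftools.
class ThreadBlockCache {
public:
    explicit ThreadBlockCache(std::size_t block_size)
        : block_size_(block_size) {}

    // Hand any blocks still cached back to the underlying allocator.
    ~ThreadBlockCache() {
        for (void* p : free_) {
            std::free(p);
        }
    }

    ThreadBlockCache(const ThreadBlockCache&) = delete;
    ThreadBlockCache& operator=(const ThreadBlockCache&) = delete;

    // Fast path: reuse a cached block, no lock needed because only the
    // owning thread touches this object. Slow path: go to the shared heap
    // (plain malloc() here, as a stand-in).
    void* allocate() {
        if (!free_.empty()) {
            void* p = free_.back();
            free_.pop_back();
            ++hits_;
            return p;
        }
        return std::malloc(block_size_);
    }

    // Keep the block in this thread's cache instead of returning it to the heap.
    void release(void* p) {
        if (p != nullptr) {
            free_.push_back(p);
        }
    }

    // Number of allocations served from the cache.
    std::size_t hits() const { return hits_; }

private:
    std::size_t block_size_;
    std::vector<void*> free_;
    std::size_t hits_ = 0;
};

// One cache per thread for 64-byte blocks (the size is an arbitrary choice).
ThreadBlockCache& local_cache() {
    thread_local ThreadBlockCache cache(64);
    return cache;
}
```

The point is the fast path: once a thread has freed a block, its next allocate() is a pop from a list nobody else can see, so there is nothing to lock. A block must go back to the cache of the thread that will reuse it; handling blocks freed on a different thread, multiple sizes, and returning memory to the system are exactly the parts this toy leaves out.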
I have to get it built by the Unix Admins for Ubuntu 10.04.1 - I've already built the x86_64 RPMs for CentOS 5 and installed them on a test box I have access to, but I really want to start on the Ubuntu boxes. Simple change, should see major improvement. Very exciting.
UPDATE: it's built on all my boxes and ready to go for tomorrow. I'm excited about the possibilities.