Very Non-obvious Memory Leak

Today has been spent trying to track down a very non-obvious memory leak in my ticker plant code. I've been watching it run over the last few days - fixing little things as I see them, and each time the app runs longer and better. Good... we're moving in the right direction.

But it's odd that a few of my ticker plants have a problem with a growing memory footprint. Very odd indeed. So I started digging into these exchange feeds. My first mistake was to ignore all common functionality - after all, if it's common with the exchange feeds that aren't leaking, then it can't be that code. Right?

Wrong.

What I found was that I needed to be exceptionally careful even when using compare-and-swap atomic operations. Two threads, on two CPUs, can each be doing their own thing on that one variable, and if the value is only in their caches, there might be a window when the caches have been updated and main memory hasn't. This could cause me to "lose" a message, and leak its memory.

What I did was put a simple Boost spinlock mutex on the value, and with that we got a lot more stability. That's good.
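
To make the idea concrete, here is a minimal sketch of guarding a shared value with a spinlock. It is not the ticker plant code: SpinLock is a hand-rolled stand-in built on std::atomic_flag rather than the Boost spinlock, and Message, Slot, g_live_messages, and leaked_after_run are hypothetical names made up for the illustration. The counter is only there so the sketch can check that every replaced message really gets freed.

```cpp
#include <atomic>
#include <memory>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for the Boost spinlock: busy-waits on an atomic flag.
class SpinLock {
public:
    void lock() {
        while (flag_.test_and_set(std::memory_order_acquire)) { /* spin */ }
    }
    void unlock() { flag_.clear(std::memory_order_release); }
private:
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};

// Counts messages currently alive, so a leak shows up as a non-zero value.
std::atomic<int> g_live_messages{0};

// Hypothetical message type that tracks its own lifetime.
struct Message {
    explicit Message(int i) : id(i) { ++g_live_messages; }
    ~Message() { --g_live_messages; }
    Message(const Message&) = delete;
    Message& operator=(const Message&) = delete;
    int id;
};

// Holds the latest message. Every read and write of the pointer happens
// under the spinlock, so a replaced message is always freed exactly once.
class Slot {
public:
    void put(std::unique_ptr<Message> m) {
        std::lock_guard<SpinLock> guard(lock_);
        latest_ = std::move(m);  // the previous message is deleted here
    }
    std::unique_ptr<Message> take() {
        std::lock_guard<SpinLock> guard(lock_);
        return std::move(latest_);
    }
private:
    SpinLock lock_;
    std::unique_ptr<Message> latest_;
};

// Hammers one slot from several threads and returns how many messages
// are still alive afterwards (0 means nothing leaked).
int leaked_after_run(int threads, int per_thread) {
    {
        Slot slot;
        std::vector<std::thread> workers;
        for (int t = 0; t < threads; ++t) {
            workers.emplace_back([&slot, per_thread] {
                for (int i = 0; i < per_thread; ++i) {
                    slot.put(std::make_unique<Message>(i));
                }
            });
        }
        for (auto& w : workers) w.join();
    }  // slot goes out of scope; the last message is freed with it
    return g_live_messages.load();
}
```

A spinlock only makes sense here because the critical section is a single pointer swap. For anything longer, a regular mutex would probably be the safer choice.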

Unfortunately, that's not the end of the story... but it's the end of the day. A long day for a little solution. Tomorrow I'll have to see what the remaining problems are.