Nasty Locking Issue with Boost Scoped Spin Locks
I'm typically a huge fan of the Boost libraries. They have saved me a ton of time in the last year, and they are about as rock-solid as any open-source software I've seen. But today I've been fighting a really vexing threading issue. I'm not 100% positive this afternoon, but the evidence points to a problem with using boost::detail::spinlock::scoped_lock in very tight loops, in code like the following:
size_t MessageSource::getCountOfListeners()
{
    boost::detail::spinlock::scoped_lock lock(mMutex);
    return mListeners.size();
}
where mListeners is just a boost::unordered_set containing pointers of some kind. The problem seems to show up when this call is used directly as an argument to another function, like the following:
cLog.info("[process] %d kids", getCountOfListeners());
I've clearly simplified things a bit, but my guess is that I'd have had much better luck had I done the following:
size_t MessageSource::getCountOfListeners()
{
    mMutex.lock();
    size_t sz = mListeners.size();
    mMutex.unlock();
    return sz;
}
Here I'm explicitly locking, getting the value, and unlocking the mutex. But this pattern occurred in a lot of places in the code, and rather than go that route, I chose to try the Intel TBB concurrent_hash_map, where the key is the pointer and the value is just a dummy uint8_t. This compiles and runs just fine, with the added benefit that I was able to remove all the locks:
size_t MessageSource::getCountOfListeners()
{
    return mListeners.size();
}
In this implementation, the hash map tracks its own size for me, and I don't have to worry about the locking or the scoped lock's lifetime. I believe this is going to make a huge difference tomorrow, but we'll have to wait and see. It's certainly in the area of the problem.
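For reference, here's a minimal sketch of the two locking styles discussed earlier (the scoped lock and the explicit lock/unlock), written against the standard library only rather than Boost. Everything named here is an assumption for illustration: SpinLock is a hypothetical stand-in built on std::atomic_flag, not boost::detail::spinlock itself, and countScoped(), countExplicit(), addListener() and lockIsFree() are made-up names. In this sketch both getters hold the lock across the size() call and release it before the caller sees the value, so it only shows the lifetime of the guard; it doesn't reproduce the problem I was chasing.

```cpp
#include <atomic>
#include <cstddef>
#include <unordered_set>

// Hypothetical stand-in for a spin lock, built on std::atomic_flag.
// This is NOT boost::detail::spinlock; it only mimics its general shape.
class SpinLock {
public:
    void lock() {
        // Spin until test_and_set reports the flag was clear (we now own it).
        while (mFlag.test_and_set(std::memory_order_acquire)) { }
    }
    bool try_lock() {
        // Succeeds only if the flag was clear before this call.
        return !mFlag.test_and_set(std::memory_order_acquire);
    }
    void unlock() {
        mFlag.clear(std::memory_order_release);
    }

    // RAII guard: locks in the constructor, unlocks in the destructor.
    class scoped_lock {
    public:
        explicit scoped_lock(SpinLock& sl) : mLock(sl) { mLock.lock(); }
        ~scoped_lock() { mLock.unlock(); }
        scoped_lock(const scoped_lock&) = delete;
        scoped_lock& operator=(const scoped_lock&) = delete;
    private:
        SpinLock& mLock;
    };

private:
    std::atomic_flag mFlag = ATOMIC_FLAG_INIT;
};

// Simplified, hypothetical version of the class from the post.
class MessageSource {
public:
    void addListener(const void* listener) {
        SpinLock::scoped_lock lock(mMutex);
        mListeners.insert(listener);
    }

    // Scoped style: the return value is computed first, then the guard's
    // destructor releases the lock as the function exits.
    std::size_t countScoped() {
        SpinLock::scoped_lock lock(mMutex);
        return mListeners.size();
    }

    // Explicit style: copy the size while locked, unlock, then return.
    std::size_t countExplicit() {
        mMutex.lock();
        std::size_t sz = mListeners.size();
        mMutex.unlock();
        return sz;
    }

    // Helper for checking that nothing left the lock held.
    bool lockIsFree() {
        if (!mMutex.try_lock()) return false;
        mMutex.unlock();
        return true;
    }

private:
    SpinLock mMutex;
    std::unordered_set<const void*> mListeners;
};
```

Calling countScoped() or countExplicit() on the same MessageSource should give the same number, and lockIsFree() should return true after either call, since both styles release the lock before returning to the caller.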