Interesting Application of Thread-Safe Lockless Data Structures

I was looking at lockless data structures for multi-threaded programming again this morning, and I came across a link to TransactionKit. This sounds very interesting to me. Basically, the group has implemented an NSDictionary and NSMutableDictionary pair that is completely safe in multi-threaded environments, uses no locks, and offers the basics of transactional integrity: begin transaction, commit, and rollback. And I have to admit I'm very intrigued.
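
Just to fix the idea in my head, here's a tiny single-threaded sketch of what the transactional part might look like in C++ terms. This is not TransactionKit's API (the names here are mine), and it deliberately ignores the hard lockless, multi-threaded part, which is where all the real cleverness is:

#include <map>
#include <string>

/*
 * Purely hypothetical illustration of the transactional idea - writes land
 * in a pending set until commit() folds them into the committed view, and
 * rollback() just throws them away.
 */
class TxMap
{
public:
    TxMap() : mInTx(false) { }

    void beginTransaction() { mPending.clear(); mInTx = true; }

    void set(const std::string & aKey, const std::string & aValue)
    {
        if (mInTx) {
            mPending[aKey] = aValue;
        } else {
            mCommitted[aKey] = aValue;
        }
    }

    void commit()
    {
        // fold the pending writes into the committed view
        std::map<std::string, std::string>::const_iterator   i;
        for (i = mPending.begin(); i != mPending.end(); ++i) {
            mCommitted[i->first] = i->second;
        }
        mPending.clear();
        mInTx = false;
    }

    void rollback() { mPending.clear(); mInTx = false; }

private:
    std::map<std::string, std::string>   mCommitted;
    std::map<std::string, std::string>   mPending;
    bool                                 mInTx;
};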

The biggest challenge to me in working with these lockless data structures is that you might not know you're going to need them when you start the project. It's only after you build the project, see it scale for a while, and then hit a fundamental limit that you see it's time to switch over. Problem is, by then, you're stuck. You've got all this code that uses one form of thread-safety, and what you now realize you need is something else altogether. But you can't justify the complete re-write.

What you need are lockless alternatives to the standard data components that can easily be swapped in when the going gets tough. For example, I like TransactionKit's approach to fitting into Objective-C - make it look like a simple NSDictionary and NSMutableDictionary so that the vast majority of the retrofitting is a simple search-and-replace. This makes things a lot easier.

But now I'm looking at the CKit stuff. It'd be one thing to have a truly lockless CKFIFOQueue, another to have one that uses mutexes for those times when speed isn't a major issue, and finally one that uses the lockless atomic calls so that locking isn't needed at all. In fact, you could simply no-op the lock() and unlock() methods in that subclass, and all the while the thread-safe behavior would still be in play.
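
To sketch what I'm picturing - and this is just my sketch, not CKit's actual class layout - the mutex-based queue would route all of its locking through a pair of virtual hooks, and the lockless variant would override them as no-ops once the internals are redone with the atomic calls:

#include <pthread.h>

/*
 * A hypothetical mutex-protected FIFO base class. All the locking runs
 * through the lock()/unlock() hooks so a subclass can change the policy.
 */
class FIFOQueueBase
{
public:
    FIFOQueueBase() { pthread_mutex_init(&mMutex, NULL); }
    virtual ~FIFOQueueBase() { pthread_mutex_destroy(&mMutex); }

    virtual void lock()   { pthread_mutex_lock(&mMutex); }
    virtual void unlock() { pthread_mutex_unlock(&mMutex); }

    void push(void *anElem)
    {
        lock();
        // ...the ordinary linked-list manipulation would go here...
        unlock();
    }

private:
    pthread_mutex_t   mMutex;
};

/*
 * The "lockless" flavor: the push/pop internals would be rewritten in
 * terms of atomic compare-and-swap operations, so the locking hooks can
 * be no-op'd while the thread-safe behavior stays in play.
 */
class LocklessFIFOQueue : public FIFOQueueBase
{
public:
    virtual void lock()   { }
    virtual void unlock() { }
};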

The problems in general application code are a lot more difficult, however. Take, for example, the retain/release code I posted a few days ago. That's not so simple. Previously, I had:

void Instrument::retain()
{
    // first, lock this guy up
    CKStackLocker      lockem(&mRetainReleaseMutex);
    // now up the count
    ++mRetainReleaseCount;
    // see if it's in the pool to be released, if so, remove it
    if (mRetainReleaseCount == 1) {
        CKStackLocker      lockem(&mReleasePoolMutex);
        mReleasePool.remove(this);
    }
}
 
void Instrument::release()
{
    // first, lock this guy up
    CKStackLocker      lockem(&mRetainReleaseMutex);
    // now decrease the count
    --mRetainReleaseCount;
    // see if it's an error
    if (mRetainReleaseCount < 0) {
        getLog() << l_error.setErrorId("Instrument.release")
                 << "the release count for " << mmSym() << " went negative! ("
                 << mRetainReleaseCount << ") This is a serious problem." << endl;
    }
    // see if it's at zero, and then if so, add it to the pool
    if (mRetainReleaseCount <= 0) {
        CKStackLocker      lockem(&mReleasePoolMutex);
        if (!mReleasePool.contains(this)) {
            // log that we're going to dump it in the trash
            getLog() << l_error.setErrorId("Instrument.release")
                     << "adding " << mmSym() << " with retain cnt="
                     << mRetainReleaseCount << " to garbage" << endl;
            // ...and then do it.
            mReleasePool.addBack(this);
        }
    }
}

where the critical points were one (1) in the retain() code and anything less than or equal to zero in the release() code. This was due to the logical assumption that the value of mRetainReleaseCount would be initialized to 1. But what if we have it initialized to zero? Then the critical points are zero for the retain() code and anything negative for the release() code.
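
Just to convince myself the shift in critical points is right, here's how the count moves for a simple release-then-retain sequence under each convention:

    initialized to 1:  start at 1,  release() -> 0   (<= 0, into the pool),  retain() -> 1  (== 1, pulled back out)
    initialized to 0:  start at 0,  release() -> -1  (< 0, into the pool),   retain() -> 0  (back to zero, pulled back out)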

#include <asm/atomic.h>
 
void Instrument::retain()
{
    /*
     * Increment the count. If it's zero, then we were ready to flush this guy
     * and now we need to pull him out of the garbage before he gets flushed.
     */
    if (atomic_inc_and_test(&mRetainReleaseCount)) {
        CKStackLocker      lockem(&mReleasePoolMutex);
        mReleasePool.remove(this);
    }
}
 
void Instrument::release()
{
    /*
     * Decrement the count, and if we are negative then it's time to add this
     * guy to the release pool for deletion.
     */
    if (atomic_add_negative(-1, &mRetainReleaseCount)) {
        CKStackLocker      lockem(&mReleasePoolMutex);
        if (!mReleasePool.contains(this)) {
            // log that we're going to dump it in the trash
            getLog() << l_error.setErrorId("Instrument.release")
                     << "adding " << mmSym() << " with retain cnt="
                     << atomic_read(&mRetainReleaseCount) << " to garbage" << endl;
            // ...and then do it.
            mReleasePool.addBack(this);
        }
    }
}

if I then initialize the value of mRetainReleaseCount with:

    ...
    mRetainReleaseCount(ATOMIC_INIT(0)),
    ...

in the constructor's code, then we'll start with zero, climb positive, and then when we go negative we'll be tossed into the release pool for the next cycle of the garbage collector thread. Since the garbage collector thread also checks the retain/release count before it wipes out the instance, we're safe.
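
And for the record, the check I'm counting on in that collector looks conceptually like this. This is only a sketch: I'm assuming mRetainReleaseCount is now an atomic_t, I've made up a static garbageCollectorPass() method, and I'm pretending the pool can be walked with size() and operator[], which isn't necessarily how the real container works:

void Instrument::garbageCollectorPass()
{
    // lock up the pool while we sweep it
    CKStackLocker      lockem(&mReleasePoolMutex);
    /*
     * Walk the pool and only delete the guys whose count still says there
     * are no outstanding retains. Anyone that's been re-retained since
     * being added gets left alone for a later pass.
     */
    for (int i = mReleasePool.size() - 1; i >= 0; --i) {
        Instrument    *guy = mReleasePool[i];
        if (atomic_read(&(guy->mRetainReleaseCount)) < 0) {
            mReleasePool.remove(guy);
            delete guy;
        }
    }
}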

At least I'm pretty sure we are...

Well... here's the problem I see now. Because we have removed the mutex from the retain() and release() methods, both can be running at once. The count will be handled properly, but what if the release() hits its counter operation first, yet the retain() method's if() body gets to the lock on the second mutex first?

Now we're in trouble. The counter will be saying 'leave the instrument out of the pool', but the bodies will execute in an order that puts the instance into the pool. Crud.
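
Spelled out, the bad interleaving looks something like this, with one thread releasing, another retaining, and the count starting at zero:

    thread A:  atomic_add_negative(-1, &mRetainReleaseCount)  -> count = -1, returns true
    thread B:  atomic_inc_and_test(&mRetainReleaseCount)      -> count =  0, returns true
    thread B:  locks mReleasePoolMutex, remove(this)          -> it isn't in the pool yet, so this is a no-op
    thread A:  locks mReleasePoolMutex, addBack(this)         -> now it's in the pool with a count of zero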

Yup, this is not easy. I was ready to put the code into play, but there's no guarantee that the body of the conditional will execute in the same order as the atomic operation. This means it's possible to have them run in the wrong order. This, in turn, tells me that the greatest application of these atomic operations is in the thread-safe lockless data structures. That's about it. When you have two operations that need to be treated as a single atomic unit, you're sunk. Darn.