Lockless Multiple-Producer/Single-Consumer Ring FIFO Queue
I'm still on the quest for better performance out of my ticker plants and their clients, and to that end today I spent quite a bit of time on building a multiple-producer/single-consumer, lockless, FIFO ring buffer. The problem I have with my LinkedFIFO queue is that it's constantly doing malloc and free - OK, new and delete, but it's the same thing. The problem is all this thrashing. What I want is to be able to pre-allocate the memory for the buffer and then just use it.
So I spent a lot of time - more than I'd like to have spent, getting this guy right. It wasn't as easy as I'd have hoped, but in the end, I think it's going to be very useful for me.
The hard part was the multiple-producers. No question. The obvious solution to the multiple-producers is to get a concept of the 'new' Tail, put your data at the old Tail, and update the new Tail. You have to check for queue crashes in there, and all this has to be atomic. It's that last part that's the kicker.
What I come up with was the idea that the queue had an insert point. This was not the Tail, because the insert point would be atomically moved by the push() method and then, after the crashing was checked and we've set the value, we can update the tail.
bool push( T & anElem ) { bool error = false; // move the insertion point and get the old value for me size_t insPt = __sync_fetch_and_add(&mInsPt, 1); // see if we've crashed the queue all the way around... if (__sync_bool_compare_and_swap(&mHead, (insPt + 1), mHead)) { error = true; cLog.error("[push] the queue is full! Can't push this item"); } else { // save the data in the right spot for the size const_cast<T &>(mData[insPt & eBitMask]) = anElem; // ...and finally update the 'tail' for a new item __sync_fetch_and_add(&mTail, 1); } return !error; }
If you move the tail before it's got a value, then it's possible that the consumer can see that the tail has moved, assume that it contains valid data, and process it. Bd move. So it's clear you have to update the tail last. But then how do you know where to put the new data? Hence the insert point.
This is working fine for me, and while it's still pretty new code, I have a feeling it's solid - there just isn't that much to it and the tests I've come up with are pretty inclusive. It's nice to have a choice of efficient FIFO queues to choose from.