Creating a Solid, Reliable C++ Wrapper for the hiredis Library
Most of today has been spent trying to get my simple C++ wrapper around the hiredis C library for redis to support a significantly more robust usage pattern than I originally had. Specifically, all was fine until I shut off the redis server; then my client would try to recover and reconnect, and end up dumping core. The problems were only made worse by the fact that I had no support docs on the hiredis site - only the source code, which is optimistic in the extreme. With no argument checks and the like, it's ripe for problems if it's not used exactly right.
Clearly, I wasn't using it exactly right, and those misuse patterns were what was causing the core dumps. So the first thing was to track down what I was doing wrong, and that meant that I needed to become much more familiar with the hiredis source code. To be fair, it's a decent open source library, but it's missing a lot of checking that would have added very little to the runtime load and would have made it far more robust to the kinds of misuse I had in place. After all, my code worked, so it's not that it was totally wrong. It's just that when things start to go badly, the things you need to do become far more important than when things are going well.
For example, if you want to send multiple commands to the redis server at once, you can run several redisAppendCommand() calls, but each one really needs to have its return value checked. This isn't clear in the code, but it's very important in the actual system. Then there are the calls to redisGetReply() - typically one for each call to redisAppendCommand() - but not always. Again, you need to check for the critical REDIS_ERR_IO error, which indicates that the redis context (connection object) is now so far gone that it has to be abandoned.
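The check-everything pattern for pipelined commands can be sketched roughly as follows. This is not the wrapper from this post: so that it compiles without hiredis installed, the library calls are routed through a hypothetical PipelineOps struct, and runPipeline and PipelineStatus are made-up names as well. In a real wrapper, append would forward to redisAppendCommand() and nextReply to redisGetReply(), each reporting success only on REDIS_OK. The sketch also assumes exactly one reply per queued command, which is the typical case but not the only one.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical seam standing in for the hiredis calls (an assumption made so
// the control flow can be compiled on its own). Each callable returns true
// only when the underlying hiredis call would have returned REDIS_OK.
struct PipelineOps {
    std::function<bool(const std::string&)> append;   // e.g. redisAppendCommand()
    std::function<bool(std::string&)> nextReply;      // e.g. redisGetReply()
};

enum class PipelineStatus { Ok, ConnectionLost };

// Queue every command, then read one reply per queued command.
// Any failure is treated as fatal for the connection: the caller is expected
// to abandon the context and reconnect rather than keep using it.
PipelineStatus runPipeline(const PipelineOps& ops,
                           const std::vector<std::string>& commands,
                           std::vector<std::string>& replies) {
    replies.clear();
    if (!ops.append || !ops.nextReply) {
        return PipelineStatus::ConnectionLost;  // nothing usable to call
    }
    // Phase 1: check the result of every append, not just the last one.
    for (const std::string& cmd : commands) {
        if (!ops.append(cmd)) {
            return PipelineStatus::ConnectionLost;
        }
    }
    // Phase 2: drain the replies, stopping at the first read failure.
    for (std::size_t i = 0; i < commands.size(); ++i) {
        std::string reply;
        if (!ops.nextReply(reply)) {
            return PipelineStatus::ConnectionLost;
        }
        replies.push_back(reply);
    }
    return PipelineStatus::Ok;
}
```

With the real library, a failed call leaves the reason in the context's err field, so a wrapper could look there for REDIS_ERR_IO before deciding to throw the context away. Collapsing every failure into a single ConnectionLost status is a simplification chosen for this sketch.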
Then there's the reconnection logic. It's not horrible, but you have to be careful that you don't pass in any NULLs. There simply is no checking in the hiredis code to ensure that NULL arguments are skipped. It would be simple to add, but it's not there - not at all.
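One way to keep NULLs away from the library is to put every guard in the wrapper itself. The sketch below is only an illustration under stated assumptions: GuardedConnection and its Ops callbacks are hypothetical names, and the context type is a template parameter so the logic compiles without hiredis. In a real wrapper, connect would forward to redisConnect(), release to redisFree(), and failed would inspect the context's err field.

```cpp
#include <functional>
#include <string>
#include <utility>

// Hypothetical wrapper: owns at most one context and never hands a null
// pointer to the underlying library calls.
template <typename Ctx>
class GuardedConnection {
public:
    struct Ops {
        std::function<Ctx*(const char* host, int port)> connect;  // e.g. redisConnect()
        std::function<void(Ctx*)> release;                        // e.g. redisFree()
        std::function<bool(const Ctx*)> failed;                   // e.g. ctx->err != 0
    };

    GuardedConnection(Ops ops, std::string host, int port)
        : ops_(std::move(ops)), host_(std::move(host)), port_(port) {}
    ~GuardedConnection() { drop(); }
    GuardedConnection(const GuardedConnection&) = delete;
    GuardedConnection& operator=(const GuardedConnection&) = delete;

    bool connected() const { return ctx_ != nullptr; }

    // Throw away whatever context we had and try to build a fresh one.
    bool reconnect() {
        drop();
        if (host_.empty() || !ops_.connect || !ops_.failed) {
            return false;  // refuse to call into the library with bad arguments
        }
        Ctx* fresh = ops_.connect(host_.c_str(), port_);
        if (fresh == nullptr) {
            return false;  // no context was created at all
        }
        if (ops_.failed(fresh)) {
            release(fresh);  // a context exists but is unusable: free it now
            return false;
        }
        ctx_ = fresh;
        return true;
    }

private:
    // The only place a context is freed, and it is guarded against null.
    void release(Ctx* c) {
        if (c != nullptr && ops_.release) {
            ops_.release(c);
        }
    }
    void drop() {
        release(ctx_);
        ctx_ = nullptr;
    }

    Ops ops_;
    std::string host_;
    int port_;
    Ctx* ctx_ = nullptr;
};
```

Because the old context is always dropped before a new connect attempt, a failed reconnect leaves the wrapper in a clean "not connected" state, and the caller can simply retry later without worrying about a stale pointer.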
In the end, I got something working, but it took hours of code dissection and gdb work to figure out what was going wrong and what needed to be done to handle the disconnected server and then the proper reconnection. Not fun, and several times I wondered if it wouldn't just be easier to write my own, as it's all TCP/telnet based anyway… but I kept going, and in the end I have something that's reliable and solid. But it was nasty to get here.