Creating a Nice C++ Binding for cURL

The other day a co-worker came up to me and asked me if I knew anything about this particular project in the Bank for getting index decompositions - basically, the instruments that make up the different indexes on the world markets. I hadn't heard about the project, but was very interested in the idea of adding it to my market data server as an additional data source.

So I looked at their web site and they had APIs in Java and C#, but the C++ API was still several months away. I could wait, but I decided to fire off a note to the developer, as I'd worked with him in the past and wondered whether he might have a pre-release C++ API.

He mentioned that for the request/response work I'd be doing, the WebServices API was every bit as fast as the C++ API was going to be, and since I could use the WebServices API right now, I decided to give it a go. Problem was, in CKit, I didn't have a nice way to get data back from URLs where you have any kind of complex request. So I decided to build one.

I looked around, and it was pretty clear that the best web services tool was going to be based on cURL. I know it's on every Linux box, I can get it from SunFreeware.com for my Solaris boxes, and it's part of Mac OS X. It handles a ton of different protocols and options, and it looked pretty simple to work with. So I set off wrapping up cURL into CKit.

I have to say that I was more than a little surprised about the state of cURL. First, it does cover a ton of platforms. It's also got a ton of features. But what amazed me was the seeming lack of attention to the details of really using the code. I mean it's not hard, but it is on ver. 7.18.1 and by that time I would have expected that they would have figured out how to get rid of these issues:

  • Global Initializer - this blows me away. I read the docs and they say that it's because some of the libraries cURL uses are not themselves thread-safe, but to say that the cURL global initializer is not only not thread-safe, but needs to be done with only one thread active in the application is downright crazy... and sad. I've done what I can to try and make it as nice as possible, but it's certainly possible that I'm going to run into serious problems because of this. I just hope it's a good plan for the simple stuff.
  • Keeping String Pointers - while I can certainly understand why they don't copy arguments passed in, there's no reason to have the 'easy' interface do that. Face it, 'easy' ought to mean 'fool-proof', and the way to do that is to control as much of the data as you can. Copy those arguments - don't require the caller to retain them for as long as you might need them. How's he to know when you're done with them?
  • URL Structure Knowledge - this is something that I think they should have done - don't require the developer to know how to form a URL. Why make the developer encode the data when you know full well how to do it? Have the user give the API the data it needs and then have the API piece it together, encode it as necessary and then ship it off.
  • Field Manipulation - when you add the POST variables to the handle, why not make it so that you can add them as key/value pairs? Why make the user encode them as a single string (which he has to keep around) and then pass them to you? Make it smarter than that. It's not hard - a list of key/value pairs - it's all strings, anyway. Make the data in the handle more manageable than it is now.
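
To make the last two points concrete, here is a minimal sketch of what key/value field handling and URL assembly could look like. It is not the CKit code: percentEncode, FieldList, and buildUrl are names made up for illustration, and the sketch is plain C++ with no cURL calls in it. It encodes a space as %20 rather than the '+' that HTML forms traditionally use.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Percent-encode every byte except the RFC 3986 "unreserved" characters
// (letters, digits, '-', '.', '_', '~').
std::string percentEncode(const std::string& in) {
    static const char* hex = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : in) {
        bool unreserved = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
                          (c >= '0' && c <= '9') ||
                          c == '-' || c == '.' || c == '_' || c == '~';
        if (unreserved) {
            out += static_cast<char>(c);
        } else {
            out += '%';
            out += hex[c >> 4];
            out += hex[c & 0x0F];
        }
    }
    return out;
}

// Hypothetical field list: it stores its own copies of each key and value,
// so the caller does not have to keep anything alive.
class FieldList {
public:
    void add(const std::string& key, const std::string& value) {
        assert(!key.empty());  // a field without a name makes no sense
        fields_.emplace_back(key, value);
    }

    bool empty() const { return fields_.empty(); }

    // Produce "k1=v1&k2=v2" with every key and value encoded.
    std::string encoded() const {
        std::string out;
        for (const auto& kv : fields_) {
            if (!out.empty()) out += '&';
            out += percentEncode(kv.first) + '=' + percentEncode(kv.second);
        }
        return out;
    }

private:
    std::vector<std::pair<std::string, std::string>> fields_;
};

// Assemble a GET URL from a base address and the fields; the caller never
// touches '?', '&', or '%'. Assumes the base has no query string already.
std::string buildUrl(const std::string& base, const FieldList& fields) {
    if (fields.empty()) return base;
    return base + '?' + fields.encoded();
}
```

With something like this, adding a field named index with the value "S&P 500" yields index=S%26P%20500, and the same encoded() string could be handed to a POST request as its body.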

These are just the biggest problems I have with cURL. I mean it works, and it's found on almost all platforms, so I'll keep using it, but when I think that there has to have been a ton of revisions on this and these things aren't addressed, it makes me think that the person writing it isn't really thinking about how it's being used.

That said, the CKURL does work, and does overcome each of these limitations. It doesn't require the user to do any global initializers, it copies all the data it needs from the arguments passed in, it creates the URL syntax from the general data you've given it, and it allows general field manipulation. All these things make it a much more enjoyable piece of code to work with. But at the heart, it's still cURL. I'd just love to see them make a really 'easy' version.
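
The first two of those fixes can be illustrated with a small sketch. Again, this is not the real CKURL source: Request, ensureGlobalInit, and fakeGlobalInit are hypothetical names, and fakeGlobalInit is only a stand-in for the spot where a real wrapper would call curl_global_init(CURL_GLOBAL_ALL). A function-local static makes the wrapper run the initializer exactly once, even if several threads create their first request at the same moment. It cannot, on its own, satisfy cURL's requirement that no other threads be active during that call, so it only covers the simple cases.

```cpp
#include <cassert>
#include <string>

// Counts how often the stand-in initializer ran (for demonstration only).
int g_initCount = 0;

// Stand-in for the library's process-wide setup. ASSUMPTION: a real wrapper
// would call curl_global_init(CURL_GLOBAL_ALL) here instead.
void fakeGlobalInit() { ++g_initCount; }

// Run the global initializer once. C++11 guarantees that a function-local
// static is initialized exactly once, even with concurrent callers.
void ensureGlobalInit() {
    static const bool done = (fakeGlobalInit(), true);
    (void)done;
}

// Hypothetical request object: it triggers the one-time setup itself and
// keeps its own copy of the URL, so the caller's buffer can be changed or
// freed as soon as the constructor returns.
class Request {
public:
    explicit Request(const char* url) : url_(checked(url)) {
        ensureGlobalInit();
    }

    const std::string& url() const { return url_; }

private:
    // Reject null before std::string tries to read from it.
    static const char* checked(const char* s) {
        assert(s != nullptr);
        return s;
    }

    std::string url_;  // owned copy, independent of the caller's pointer
};
```

The user of such a class never sees a global initializer and never has to wonder how long a string must stay alive, which is the kind of 'easy' the post is asking for.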