Today I spent most of the day on a few issues - the biggest of which was that a restart lost the Option calculation results I was working so hard to retain. What I needed to do was persist them to the redis cache server. No way around it. The problem was that I had nothing written for this, and a few other little testing bugs came up that I had to fix as well.
The persistence system is modeled after all the other save/load persistence methods I have in the code. I have a save state method, and a load state method, and all I needed to do was to put the save method where I was saving other state data, and put the loader after I'd loaded up the instruments and before I continued on with the initialization of the system. In theory, not too bad.
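To put a rough shape on those hooks - the class, method, and cache API names here are just placeholders for this sketch, not the actual ones in the code - the save side packs the map into a buffer and hands it to the redis cache, and the load side does the reverse before initialization continues:

// hypothetical sketch - the real save/load state methods and cache API differ
void Engine::saveResultsState()
{
    std::string buff;
    pack(buff, mResults);                   // serialize the ResultsMap to a buffer
    mCache.set("option.results", buff);     // hand it to the redis cache server
}

void Engine::loadResultsState()
{
    std::string buff;
    if (mCache.get("option.results", buff)) {
        uint32_t pos = 0;
        unpack(buff, pos, mResults);        // rebuild the map before init continues
    }
}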
In practice, I had to do quite a bit of detail work, but that was expected. It's non-trivial to persist a class that's never been fitted for persistence, but it's all straightforward. It just took time. No… the problems came after I had that code working and went to test it.
The original data structure I used to hold these Option calculation results was a boost::unordered_map:
struct Results {
    uint32_t        volatility;
    msg::greeks_t   greeks;
    uint32_t        impVolatility;
    msg::greeks_t   impGreeks;
    // …and more fields like this…
};
typedef struct Results results_t;
typedef boost::unordered_map<msg::ng::secID_t, results_t> ResultsMap;
The secID_t is a SecurityID - a 128-bit number that packs all the critical instrument identification information into one nice little package. This makes sense as a key in the map because it's unique, easy to use, and the core of a lot of look-ups in the rest of the system. Sounds great.
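I can't reproduce the real secID_t here, but as a simplified stand-in for the idea, think of it as a 128-bit value carried in two 64-bit words, with equality and a hash method on top:

// simplified stand-in for msg::ng::secID_t - the real class packs all the
// critical instrument identification fields into the 128 bits
struct secID_t {
    uint64_t hi;    // upper 64 bits of the identifier
    uint64_t lo;    // lower 64 bits of the identifier

    bool operator==( const secID_t & rhs ) const {
        return (hi == rhs.hi) && (lo == rhs.lo);
    }

    size_t hash() const {
        // fold the two words down into a single size_t
        return static_cast<size_t>(hi ^ (lo * 0x9e3779b97f4a7c15ULL));
    }
};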
When I made the deserialization method for the data coming out of the redis server, it looked very simple:
bool unpack( const std::string & aBuffer, uint32_t & aPos, ResultsMap & aValue )
{
    bool error = false;
    // first, let's get the number of pairs in the map
    int32_t sz = 0;
    if (!msg::ng::unpack(aBuffer, aPos, sz)) {
        error = true;
    } else {
        msg::ng::secID_t id;
        results_t res;
        // now, for each pair, read them in and set them
        for (int32_t i = 0; i < sz; ++i) {
            id.extractFrom(aBuffer, aPos);
            res.deserialize(aBuffer, aPos);
            aValue[id] = res;
        }
    }
    return !error;
}
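For example, once the serialized blob has been pulled back out of redis (the variable names here are just for illustration), using it is a one-liner:

std::string buff;       // the serialized map, as read from the redis cache
ResultsMap  results;
uint32_t    pos = 0;
// illustration only - rebuild the map and check for a corrupt/truncated state
if (!unpack(buff, pos, results)) {
    // bad state - fall back to an empty map and recompute
}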
I get the size of the map, then read in each pair and save it in the map. Simple.
Well… simple - yes. Fast? No.
The initial tests had this at 3 minutes for 88,000 pairs. Those are horrible results. I did some profiling, and it was all in this one line:
aValue[id] = res;
That makes no sense. Boost's unordered_map is the fastest around. I don't understand what's happening. So I did more digging. The next thing I found was that there was no exposed hash function for boost to use. So I added one:
namespace msg {
namespace ng {
size_t hash_value( const secID_t & anID );
}
}
and implemented it simply as:
namespace msg {
namespace ng {
size_t hash_value( const secID_t & anID )
{
    return anID.hash();
}
}
}
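For what it's worth, boost::hash finds a free function named hash_value in the key's namespace through argument-dependent lookup, which is why the declaration sits in msg::ng. And if the class hadn't already had a hash() method, one way to build one for a 128-bit ID - sticking with the two 64-bit words from the simplified sketch above - would be boost::hash_combine:

#include <boost/functional/hash.hpp>

namespace msg {
namespace ng {
size_t hash_value( const secID_t & anID )
{
    // sketch only - combine the two 64-bit halves into one size_t
    std::size_t seed = 0;
    boost::hash_combine(seed, anID.hi);
    boost::hash_combine(seed, anID.lo);
    return seed;
}
}
}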
I expected this to speed things right up. The actual results? Same slowness. Amazing!
Maybe it's the security ID? Maybe it'd be better with a Security Key - a std::string version of the same thing? Maybe boost is a lot more efficient dealing with string keys, even though they aren't any more space efficient? Worth a try… we can't leave it at a 3-minute start-up.
So I went to:
typedef boost::unordered_map<std::string, results_t> ResultsMap;
and then:
bool unpack( const std::string & aBuffer, uint32_t & aPos, ResultsMap & aValue )
{
    bool error = false;
    // first, let's get the number of pairs in the map
    int32_t sz = 0;
    if (!msg::ng::unpack(aBuffer, aPos, sz)) {
        error = true;
    } else {
        std::string key;
        results_t res;
        // now, for each pair, read them in and set them
        for (int32_t i = 0; i < sz; ++i) {
            if (msg::ng::unpack(aBuffer, aPos, key)) {
                res.deserialize(aBuffer, aPos);
                aValue[key] = res;
            }
        }
    }
    return !error;
}
With these changes, the insertion of the values into the map went from 3 minutes to 61 msec! That's a factor of almost 3000x! Wow! OK… now I know. Don't use security IDs as keys in the boost::unordered_map. It's just not even close to fast enough.
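For the curious, the timing itself was nothing fancy. A sketch of the kind of measurement wrapped around the deserialization - using std::chrono here rather than the actual profiling setup - looks like:

#include <chrono>
#include <iostream>

// rough sketch of the timing - the real profiling was more detailed
void timeUnpack( const std::string & aBuffer )
{
    ResultsMap results;
    uint32_t   pos = 0;
    auto       beg = std::chrono::steady_clock::now();
    unpack(aBuffer, pos, results);
    auto       end = std::chrono::steady_clock::now();
    std::cout << "unpacked " << results.size() << " pairs in "
              << std::chrono::duration_cast<std::chrono::milliseconds>(end - beg).count()
              << " msec" << std::endl;
}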