Boy, I Wish I Had an In-Memory Database


Today I realized that one of my price injectors wasn't properly updating the data structures. Well... that's not really right... it was working, it just wasn't doing what I needed. Basically, the first cut I had of the data structure was to have a map where the key was the price identifier (RIC for Reuters, BBG Symbol for Bloomberg, etc.) and the value was an array of instruments that would need to be updated if a price with this identifier came into the injector. Pretty simple. Price comes in... we pick off the identifier... we hit the map and get the array of instruments, and then for each instrument we send an update. Simple.

But it's got a flaw.

What if I received a new price identifier for an existing instrument? Then, I'd add another key to the map with one instrument on it. I wouldn't remove the old one, and so there might be two price injections for the same instrument. Bad idea.
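To make the flaw concrete, here's a rough C++ sketch of that first-cut structure. Everything named here (`Instrument`, `FirstCutIndex`, `subscribe`, `targets`) is made up for illustration and is not the real injector code; an instrument is just a string to keep it short.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for an instrument; the real class is surely richer.
using Instrument = std::string;

// First cut: price identifier -> instruments to update when that price arrives.
struct FirstCutIndex {
    std::map<std::string, std::vector<Instrument>> byIdentifier;

    // Register an instrument under a price identifier.
    // Note that nothing here removes the instrument from an older identifier.
    void subscribe(const std::string& identifier, const Instrument& inst) {
        byIdentifier[identifier].push_back(inst);
    }

    // Instruments that would be sent an update for this identifier.
    std::vector<Instrument> targets(const std::string& identifier) const {
        auto it = byIdentifier.find(identifier);
        return it == byIdentifier.end() ? std::vector<Instrument>{} : it->second;
    }
};
```

If the same instrument is subscribed first under one identifier and later under another (say an example RIC and then an example BBG Symbol), both keys keep it in their arrays, so a price on either identifier still triggers an update.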

The simple fix would be to remove the instrument from the array under its old identifier - but I'd have no idea which identifier that was, so it would mean a large scan: first over all the price identifiers, and then over each element in the array associated with each one. I didn't like this scanning as it was bound to be inefficient when the numbers got very large. So I had to change, or at least augment, the data structures.

What I chose to do was to add another map - this one from the instrument to the identifier - so that I could easily look up the identifier given an instrument. Between the two maps, I could quickly find all the instruments for an identifier, and the identifier for an instrument. With this, I was able to quickly remove an instrument if the identifier changed, and also easily send out the instrument updates when a price (with identifier) came in.
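The two-map arrangement can be sketched along these lines. This is only an illustration under assumptions: `PriceIndex`, `assign`, `instrumentsFor`, and `identifierFor` are hypothetical names, and instruments are plain strings again.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

using Instrument = std::string;  // hypothetical stand-in for the real type

class PriceIndex {
public:
    // Point an instrument at a price identifier, dropping any older mapping.
    void assign(const Instrument& inst, const std::string& identifier) {
        auto old = byInstrument_.find(inst);
        if (old != byInstrument_.end()) {
            if (old->second == identifier) return;  // nothing changed
            detach(old->second, inst);              // no full scan needed
        }
        byInstrument_[inst] = identifier;
        byIdentifier_[identifier].push_back(inst);
    }

    // All instruments to update when a price with this identifier arrives.
    std::vector<Instrument> instrumentsFor(const std::string& identifier) const {
        auto it = byIdentifier_.find(identifier);
        return it == byIdentifier_.end() ? std::vector<Instrument>{} : it->second;
    }

    // The one identifier for an instrument, or an empty string if unknown.
    std::string identifierFor(const Instrument& inst) const {
        auto it = byInstrument_.find(inst);
        return it == byInstrument_.end() ? std::string() : it->second;
    }

private:
    // Remove inst from the array stored under the given identifier.
    void detach(const std::string& identifier, const Instrument& inst) {
        auto it = byIdentifier_.find(identifier);
        if (it == byIdentifier_.end()) return;
        auto& list = it->second;
        list.erase(std::remove(list.begin(), list.end(), inst), list.end());
        if (list.empty()) byIdentifier_.erase(it);
    }

    std::map<std::string, std::vector<Instrument>> byIdentifier_;
    std::map<Instrument, std::string> byInstrument_;
};
```

The reverse map says exactly which array holds the instrument, so the only scan left is over that one array rather than over every identifier.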

But it got me thinking... what I really wanted was a simple database table. Something where I could say 'SELECT identifier WHERE instrument=blah', and then 'SELECT instrument WHERE identifier=zip'. This would allow me the freedom to look at the same data two ways, and even if I didn't have the complete relational database, a simple SELECT on a table would be all I'd really need.

There's a lot to think about here. Maybe it would be easier to just use the multimap in STL and see if that doesn't handle all my needs. If I did a multimap of identifier to instrument, I could easily find all the instruments for a given identifier, and with a reverse map, I could find the one identifier for a given instrument. I guess that making a simple template might be all I'd need and then I'd have all this functionality.
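A simple template over a `std::multimap` plus a reverse `std::map` might look something like the sketch below. `TwoWayIndex` and its member names are invented for this example; only `std::multimap`, `std::map`, and their `equal_range`, `insert`, `erase`, and `find` members come from the standard library.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical template: many instruments per identifier, one identifier
// per instrument.
template <typename Id, typename Inst>
class TwoWayIndex {
public:
    // Map inst to id, replacing whatever identifier it had before.
    void assign(const Inst& inst, const Id& id) {
        auto old = reverse_.find(inst);
        if (old != reverse_.end()) {
            // Walk only the entries stored under the old identifier.
            auto range = forward_.equal_range(old->second);
            for (auto it = range.first; it != range.second; ++it) {
                if (it->second == inst) { forward_.erase(it); break; }
            }
            old->second = id;
        } else {
            reverse_.insert(std::make_pair(inst, id));
        }
        forward_.insert(std::make_pair(id, inst));
    }

    // Every instrument currently stored under this identifier.
    std::vector<Inst> instrumentsFor(const Id& id) const {
        std::vector<Inst> out;
        auto range = forward_.equal_range(id);
        for (auto it = range.first; it != range.second; ++it) {
            out.push_back(it->second);
        }
        return out;
    }

    // Pointer to the instrument's identifier, or nullptr if it is unknown.
    const Id* identifierFor(const Inst& inst) const {
        auto it = reverse_.find(inst);
        return it == reverse_.end() ? nullptr : &it->second;
    }

private:
    std::multimap<Id, Inst> forward_;  // identifier -> instruments
    std::map<Inst, Id> reverse_;       // instrument -> identifier
};
```

Something like `TwoWayIndex<std::string, std::string>` would cover the price feeder case, with `equal_range` standing in for the array lookup.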

But that in-memory database table would be really nice. I can think of a lot of uses for it. I may have to spend more time on this tomorrow. It's a really interesting idea.

UPDATE: I looked more at the STL multimap and I don't like its insertion methods at all. Yuck. Why not put in the operator[] like the map? It's got to be possible - but I can't fit it in after the fact. The best I can do is to subclass it or something. I'm really a bit surprised at this. In any case, I'm not going to keep going after the multimap for the price feeder. Ick.
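For comparison, here is roughly what insertion looks like on each container. The functions `buildFeed` and `buildLookup` and the sample keys are hypothetical; the calls themselves are standard (`emplace` needs C++11 or later). One likely reason `std::multimap` has no `operator[]` is that a key there doesn't name a single element, so there's no one value for the subscript to return.

```cpp
#include <map>
#include <string>
#include <utility>

// Three ways to add entries to a multimap; none of them is operator[].
std::multimap<std::string, std::string> buildFeed() {
    typedef std::multimap<std::string, std::string> Feed;
    Feed feed;
    feed.insert(std::make_pair(std::string("IBM.N"), std::string("IBM-common")));
    feed.insert(Feed::value_type("IBM.N", "IBM-option"));
    feed.emplace("IBM.N", "IBM-future");  // C++11 and later
    return feed;
}

// The same idea with a plain map, where a key names exactly one value.
std::map<std::string, std::string> buildLookup() {
    std::map<std::string, std::string> lookup;
    lookup["IBM-common"] = "IBM.N";
    return lookup;
}
```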