Got My Archive Server Working

Building Great Code

Today I finally had time to devote to my archive server and get the final query form working the way I wanted. The server is really the reader half of a reader/writer pair, where the Feed Recorders are the writers of the data. The recorders are very simple little apps - they use the basic framework I've built for the exchange feed processing, but instead of processing the datagrams, they simply buffer them, and every 30 minutes or 10MB the buffer is written to disk in a directory structure that includes the feed name, the side, and the date. The point is that when we go to read the data, we don't want to have to look at thousands of files to get the few we want, so partitioning by directory lets the reader go straight to the handful of files it needs.
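As a rough illustration of the buffer-then-flush idea, here is a minimal Python sketch. It is not the recorder's actual code: the class name RecorderBuffer, the length-prefixed file layout, the timestamp-based file names, and the feed/side/date path order are all assumptions made for the example. Only the 30 minute and 10MB thresholds come from the description above.

```python
import os
import time

# Thresholds from the post: flush every 30 minutes or 10MB.
FLUSH_SECONDS = 30 * 60
FLUSH_BYTES = 10 * 1024 * 1024


class RecorderBuffer:
    """Hypothetical recorder buffer: collects raw datagrams, then writes
    them to <root>/<feed>/<side>/<YYYYMMDD>/<millis>.dat (assumed layout)."""

    def __init__(self, root, feed, side, clock=time.time):
        self.root, self.feed, self.side = root, feed, side
        self.clock = clock
        self.chunks = []
        self.size = 0
        self.started = clock()

    def add(self, datagram):
        """Buffer one datagram; return the file path if this triggered a flush."""
        if not self.chunks:
            # Start the 30 minute window at the first buffered datagram.
            self.started = self.clock()
        self.chunks.append(datagram)
        self.size += len(datagram)
        too_big = self.size >= FLUSH_BYTES
        too_old = self.clock() - self.started >= FLUSH_SECONDS
        return self.flush() if (too_big or too_old) else None

    def flush(self):
        """Write the buffered datagrams to disk and reset the buffer."""
        if not self.chunks:
            return None
        now = self.clock()
        day = time.strftime("%Y%m%d", time.gmtime(now))
        directory = os.path.join(self.root, self.feed, self.side, day)
        os.makedirs(directory, exist_ok=True)
        path = os.path.join(directory, "%d.dat" % int(now * 1000))
        with open(path, "wb") as f:
            for chunk in self.chunks:
                # Assumed framing: 4-byte big-endian length, then the bytes.
                f.write(len(chunk).to_bytes(4, "big"))
                f.write(chunk)
        self.chunks, self.size, self.started = [], 0, now
        return path
```

With a layout like this, a reader looking for one feed, side, and day only has to list a single directory.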

Once these files are written, it's a simple matter of reading them and parsing the datagrams into the messages and then serving them up. By having this stored in smallish files, it makes it easy to cache these files-turned-messages in the archive server. The only key left is to make the server smart about the requests it gets.
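One way to picture the caching: since a written file never changes (an assumption here), its parsed contents can be memoized by path. The sketch below uses Python's functools.lru_cache. The function name load_messages and the length-prefixed record format are invented for the example, and the "parsing" is reduced to splitting the file back into raw datagrams.

```python
from functools import lru_cache


@lru_cache(maxsize=256)
def load_messages(path):
    """Read one archive file and return its datagrams as a tuple.

    Assumed format: each record is a 4-byte big-endian length followed
    by that many bytes. Results are cached per path, which is only safe
    if files are never rewritten after the recorder closes them.
    """
    messages = []
    with open(path, "rb") as f:
        while True:
            header = f.read(4)
            if len(header) < 4:
                break  # end of file (or a truncated trailing record)
            size = int.from_bytes(header, "big")
            messages.append(f.read(size))
    return tuple(messages)
```

Because the files are smallish, evicting one entry only costs re-reading one small file.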

The current format of the requests is pretty simple: feed name, side, starting and ending times, list of instruments to return, types of messages to return, and an optional sequence number. The bulk of the requests won't use the sequence number, and that's OK - it's for special requests that I just finished, but more on that later. The vast majority are really about a time range and a message type: "Give me all the Quotes for IBM from 10:30 to 10:45" - that kind of stuff. For this, the service was working pretty well.
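To make the request shape concrete, here is a small Python sketch of the fields listed above and how a filter over parsed messages might use them. The names ArchiveRequest, Message, and matches are made up for illustration, and the real wire format isn't shown in this post.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Message:
    # Hypothetical parsed message; the field names are assumptions.
    timestamp: float
    sequence: int
    instrument: str
    kind: str  # e.g. "Quote"


@dataclass(frozen=True)
class ArchiveRequest:
    feed: str
    side: str
    start: float
    end: float
    instruments: FrozenSet[str]
    kinds: FrozenSet[str]
    after_sequence: Optional[int] = None  # only set for special requests

    def matches(self, msg):
        """True if msg falls inside the time range and passes the filters."""
        if not (self.start <= msg.timestamp <= self.end):
            return False
        if msg.instrument not in self.instruments:
            return False
        if msg.kind not in self.kinds:
            return False
        if self.after_sequence is not None and msg.sequence <= self.after_sequence:
            return False
        return True


# "Give me all the Quotes for IBM from 10:30 to 10:45", with times as
# seconds since midnight to keep the example short.
req = ArchiveRequest("feedX", "A", 10.5 * 3600, 10.75 * 3600,
                     frozenset({"IBM"}), frozenset({"Quote"}))
```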

But there was a slight hitch - what if the requested data wasn't on the filesystem yet? What if it was sitting in the recorder, buffered up and waiting to be written out? Well… then I had to put in a scheme where the recorders were actually services themselves. The recorders would then answer a simple request: give me your data. The archive server can then see if the request is fulfilled by the filesystem data, and if not, it'll go to the appropriate recorder service, ask it, and augment the response as necessary.
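The fallback logic might look something like the Python sketch below. The function serve, the (timestamp, payload) tuples, and the rule "ask the recorder only when the request ends after the newest message on disk" are assumptions for the example, since the actual coverage check isn't described here.

```python
def serve(start, end, disk_messages, fetch_recorder_buffer):
    """Answer a time-range request from disk, topping up from the recorder.

    disk_messages: (timestamp, payload) tuples already read from files,
        sorted by timestamp.
    fetch_recorder_buffer: callable standing in for the recorder's
        "give me your data" request; returns tuples of the same shape.
    """
    result = [m for m in disk_messages if start <= m[0] <= end]

    # Assumed coverage rule: the files cover everything up to the newest
    # message on disk; anything later may still be in the recorder.
    covered_until = disk_messages[-1][0] if disk_messages else float("-inf")
    if end > covered_until:
        # The recorder may have flushed in the meantime, so skip
        # anything we already picked up from disk.
        seen = set(result)
        for m in fetch_recorder_buffer():
            if start <= m[0] <= end and m not in seen:
                result.append(m)
    return result
```

The recorder is only contacted when the files can't cover the request, so the common case of a purely historical query never leaves the archive server.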

It was pretty neat.

It was also pretty fast, which is nice.

The final thing I needed was to have a time/sequence number request for restarting the feeds and greek engine. Basically, if the server goes down, even if it's got a saved state, there will be some time between the last save and the time it's back up and processing messages where it's lost the data and it doesn't have any way to get it.

Enter the time/sequence number request.

When the server gets back on its feet, it can look at the last data it has in the saved state, and then issue a request to the archive server and say "Hey, send me everything you have after this sequence number, which is about this time". Processing the returned messages means that the server will be able to catch up on the lost messages, and if they aren't needed - no big deal, we'll throw them away. But if they are needed, then we have them.
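On the requesting side, the catch-up step could be sketched roughly like this in Python. The names catch_up, request_archive, and apply are hypothetical, and the sketch assumes messages come back as (sequence, payload) pairs in sequence order.

```python
def catch_up(last_sequence, last_time, request_archive, apply):
    """Replay archived messages newer than the saved state.

    last_sequence, last_time: taken from the restored saved state.
    request_archive: callable standing in for the archive server request;
        it takes after_sequence and start_time and returns
        (sequence, payload) pairs in sequence order.
    apply: callable that processes one message.
    Returns the new last sequence number and how many messages were applied.
    """
    applied = 0
    for seq, payload in request_archive(after_sequence=last_sequence,
                                        start_time=last_time):
        if seq <= last_sequence:
            continue  # already reflected in the saved state; throw it away
        apply(seq, payload)
        last_sequence = seq
        applied += 1
    return last_sequence, applied
```

Dropping anything at or below the last known sequence number is what makes an overly generous response harmless.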

Well… today I finished the archive server part. I haven't yet worked the request for this data into the feeds and engine, but that shouldn't be too hard. I'm in the middle of trying to get a lot of little things fixed up for the greek engine in testing, so I'm liable to hold off a bit before pushing ahead with that feature. But it feels really good to get this part done and in the can.