Fun with Exchange Codecs – FIX Adapted for Streaming
Tuesday, August 24th, 2010
Well, it turns out that the ASCII-based exchange protocols that NASDAQ and some of the other lower-volume exchange feeds use are fine as far as they go, but OPRA decided that it had pushed the limits of an ASCII protocol, and adopted the FIX Adapted for Streaming protocol, or FAST. In a sense, I can see why they'd adopt it as opposed to writing their own, but I've read enough on the net to know that they really didn't adopt it 100% - just the data compression part.
Basically, the FAST protocol is based on a few ideas:
- Very little to no ASCII to decode - numbers are no longer represented as ASCII digits. Most numbers are now simply binary integers. In fact, they only allow for three data types: a signed 32-bit integer, an unsigned 32-bit integer, and a string. With those, and a few decoder tables, you can handle anything an exchange needs.
- Delta Encoding - some fields are required in every message, but for others the value is a simple increment of the previous one, and the field can be left out of the message entirely, with the decoder assuming the value was simply incremented. This helps a lot. Other fields are sent as changes from the last value seen in that field, so duplicates can be dropped. It's small, efficient, and makes for a compact encoded data stream.
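To make the first point concrete, here is a minimal Python sketch of how FAST packs integers on the wire: seven data bits per byte, with the high bit set on the last byte of a field (the "stop bit"). The function names are made up for illustration, and this covers only plain integers, not nullable fields or anything specific to OPRA's templates.

```python
def decode_uint(buf, pos=0):
    """Decode one stop-bit encoded unsigned integer starting at buf[pos].

    Returns (value, next_pos).
    """
    value = 0
    while True:
        byte = buf[pos]
        pos += 1
        # Low seven bits carry data, most significant group first.
        value = (value << 7) | (byte & 0x7F)
        if byte & 0x80:  # stop bit set: this was the field's last byte
            return value, pos


def decode_int(buf, pos=0):
    """Signed variant: bit 0x40 of the first byte is the sign bit."""
    # Start from all-ones for negatives so the shifts sign-extend.
    value = -1 if buf[pos] & 0x40 else 0
    while True:
        byte = buf[pos]
        pos += 1
        value = (value << 7) | (byte & 0x7F)
        if byte & 0x80:
            return value, pos
```

For example, `decode_uint(bytes([0x07, 0xE8]))` returns `(1000, 2)`: the value 1000 took two bytes instead of the four ASCII digits it would need in a text protocol.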
The problem is, of course, that there is now state in the decoder. In general, this isn't bad, but it requires me to completely decode every message I get, and the shortcuts I had for extracting just the sequence number, or just the flags for skipping a message, are tossed out the window. I need to get all the data, and then deal with it.
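A rough sketch of why that state matters, assuming the copy, increment, and delta field operators from the FAST specification. The class name, the string operator tags, and the use of `None` for "nothing on the wire" are illustrative assumptions, and the handling of initial and undefined values is simplified.

```python
class StatefulField:
    """Tracks the previous value of one field across messages (hypothetical helper)."""

    def __init__(self, operator, initial=0):
        if operator not in ("copy", "increment", "delta"):
            raise ValueError("unknown operator: " + operator)
        self.operator = operator
        self.previous = initial

    def decode(self, wire_value=None):
        """Return the field's value given what was (or wasn't) in the message."""
        if self.operator == "delta":
            # A delta is always transmitted; it is added to the previous value.
            if wire_value is None:
                raise ValueError("delta fields must be present")
            self.previous += wire_value
        elif wire_value is not None:
            # Copy and increment: an explicit value replaces the state.
            self.previous = wire_value
        elif self.operator == "increment":
            # Nothing sent: assume the value went up by one.
            self.previous += 1
        # Copy with nothing sent: the previous value is reused as-is.
        return self.previous
```

With `seq = StatefulField("increment", initial=100)`, three messages that omit the field decode to 101, 102, and 103. If the decoder skipped the second message without running it through `decode`, the third would come out as 102, which is why every message has to be fully decoded.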
This took a little while to work into my application, but in the end, I had the concept of a decoded message, and that message included the elements I had originally extracted, as well as the actual message. Thankfully, this is still pretty fast: OPRA isn't messing around with a slow decoding scheme, as it knows the point of all this is to get more data through the system.
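One way to picture that decoded message, as a minimal sketch: the field names (`sequence_number`, `skip`, `fields`) and the `wanted` helper are hypothetical stand-ins, not the layout of the actual application.

```python
from dataclasses import dataclass, field


@dataclass
class DecodedMessage:
    """A fully decoded message plus the bits that used to be peeked at."""
    sequence_number: int   # previously pulled straight out of the raw bytes
    skip: bool             # previously read from the flags without decoding
    fields: dict = field(default_factory=dict)  # the rest of the message


def wanted(messages):
    """Yield only messages not flagged for skipping.

    Filtering happens after decoding, so decoder state stays consistent.
    """
    for msg in messages:
        if not msg.skip:
            yield msg
```

The design point is only that the skip decision moves from before the decode to after it; the cost of decoding is paid for every message either way.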
I still need to do a lot of tests, and even finish writing my codec for the OPRA data, but at least I've got all the essentials of the FAST decoding working, and should be able to get moving forward again tomorrow with the messages.