Working on Magic Map Implementation in C++
Tuesday, August 31st, 2010Today I spent a good bit of time working on extending the variant class I have in my ticker plant to handle byte arrays and then adding in the map key space encoding and decoding. The idea is that map keys really don't need to be just any string - they can be a limited subset of the ASCII space, and in making that a limited set, we can pack more characters into fewer bytes, and then unpack them once they are on the receiver's machine.
Say we limit the ASCII space to 64 characters - any 64, really - just so long as there's a mapping from the 255 ASCII values to the 64 acceptable values, and back again. At that point, we know that we'll be able to store the mapped key space in 6 bits - 26 is 64. So if we look at a series of three bytes we can pack four of these characters into that 24-bit space:
Byte 1 | Byte 2 | Byte 3 | |||||||||||||||||||||
1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
Char 1 | Char 2 | Char 3 | Char 4 |
The code for the conversion is pretty simple - look for now many even 'blocks' of four characters to encode, map them into the limited key space, mash their bits and pack them into three bytes. The remainder is a simple process of doing part of that.
The decode is simple as well - we just need to have some terminal condition - and that can be a simple 0x3f (all 1s) in the last (terminal) 'character'. So we look at the length of the byte stream, see how many "full sets" of three bytes there are - convert each into four characters, and then based on what's left, we have only a few options. It's pretty simple to decode, and you have the original string back.
I needed to get this all in the variant class as the next thing I needed was to add in the update stream that will be sent by the data service in response to updates at the source. This is similar to the kind of updating I did in Java back in InfoShop: there was a HashMap that understood a 'path' concept, and transactions. You could put a map into a transaction, do things to it, and then commit the transaction and get a transaction log that could be sent to remote copies of the same map, and be applied to bring it up to date.
It worked very well for me at the time, and it's something very similar this time. There's no transaction, and the updates are all streams with an action, a path, and an optional value. It's simpler, but for the data sets we're dealing with, it's sufficient.
I got a lot of the decoding of the update stream done, but I think it's time to put together a client and see how it functions to see what I'm really getting from these servers. There's nothing like the socket-level byte stream to remove all the questions and ambiguity in the specification. So that's going to have to wait for tomorrow.