Archive for the ‘Coding’ Category

Finished the Good, Fast, Message Cache

Wednesday, September 8th, 2010


This morning I was able to finish up the message cache that I started work on yesterday afternoon. It's been pretty straightforward right up until I got to the part about being thread-safe and fast at the same time. That meant lockless data structures, and that meant compare-and-swap operations.

Since I only have one producer and many consumers, I thought maybe I didn't have to worry too much about this. Well... that's a mistake. I do. But thankfully, there's only one place I need to worry about the complexities of this map - in the "setter" of the individual value.

The code I ended up with looks like this:

  boost::unordered_map<uint64_t, Message *>   mCache;
 
  Message   *oldMsg = mCache[key];
  while (!__sync_bool_compare_and_swap(&(mCache[key]), oldMsg, newMsg)) {
    oldMsg = mCache[key];
  }
  // now handle the old message
  if (oldMsg != NULL) {
    delete oldMsg;
    oldMsg = NULL;
  }

where newMsg is the value to place in the map at the key given by key.

Sure, this is greatly simplified, but the single, core issue is there - we have to check to see if the value has been changed, and if it has, we need to get the new value and then try to set our value again. The point is that it's thread-safe - not that our value is necessarily the one that ends up in the map, but that whatever ends up there is the one that was put there last.

This is working great for me, and while I haven't had the chance to run tests against it, it's as fast as I'm going to be able to get, whether or not that turns out to be fast enough. I think it will be. I've used the boost::unordered_map, which is supposed to be faster than the STL std::map, which uses a red/black tree for the key space, and since my key space is a uint64_t, it's pretty easy to hash that guy - it's just the value.
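
Just to make that concrete - this isn't required, since boost::hash already handles integral keys, but an explicit hash functor for a uint64_t key space would be about as trivial as it gets. The names here are my own illustration, not anything from the real code:

  #include <cstddef>
  #include <stdint.h>
  #include <boost/unordered_map.hpp>

  class Message;   // the ticker plant's message type

  // hashing a uint64_t key is essentially the identity function
  struct IdentityHash {
    std::size_t operator()( uint64_t aKey ) const {
      return (std::size_t)aKey;
    }
  };

  typedef boost::unordered_map<uint64_t, Message *, IdentityHash>   MessageMap;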

I'm going to have to test this guy with the real data feeds, but I think I'm going to be OK. I feel good about it.

Working on a Good, Fast, Cache for Messages

Tuesday, September 7th, 2010


The last half of today I spent trying to figure out a really good way to incorporate a good, fast, caching scheme for these messages that are the heart of my ticker plant. The problem is that I know I'm going to be hammering the messages as fast as I can possibly take them, so a cache that locks is just right out. I wanted to make it simple as well - something parametric so it'd be easy to turn on or off and see what the difference in timing and throughput was.

The problem was, it wasn't really obvious how to do this with the given design I had.

Failed Attempts

I thought that it'd be best to put this into an existing class in the design - as low a level as possible. I was thinking maybe the component that handled simple message distribution to a list of registered listeners. It'd be nice there in that any component that "pushed" messages would be able to take advantage of this by simply "turning it on".

But there were a lot of conceptual problems. First, the messaging API used references for the messages, which is what I wanted to use, but that meant that in order to make the caching fit in without altering the API, I'd have to be making copies of the messages. That's not a good plan. Nothing good is going to come from copying a hundred thousand messages a second. That's a recipe for poor performance.

Next, if we changed the API to use pointers, and thought of the send() method as a "sink" of the message, then we'd have a nice, transparent, caching scheme, but the problem is that the messaging system relies on calling send() several times - once for each listener. This means I can't have the one method "eating" the instance as there'd be nothing left for the other listeners.

I messed around with this for a while, each time coming to the conclusion that the design I had wasn't really workable. I finally backed off and tried to think of the problem in vastly different terms.

What came to me was very simple, very clean, and in the end, far better than I'd ever have been able to do in the previous approaches.

The Component Approach

I began to think of the cache as something of a component as opposed to a capability of the objects in the design. I started to look at the cache as something that I could put as an ivar in the few components that needed it, and make it self-contained and easy to integrate and use, and all of a sudden things started to really look up.

Now I didn't have to worry about the passing of references and how many times something is called. In anything that deals with messages, it's possible to have a cache, and that cache will "eat" messages created on the heap. It will either delete them right away (if it's inactive), or it'll save the latest, and delete the older version, so as to keep the most recent in memory and yet at the same time handle all the housekeeping for the memory management of the created messages.

Most of the components that need this caching are things that create messages. Maybe it's coming in from a serialized data stream. In that case, the created message needs to be passed to all the registered listeners, and then deleted - or cached. It's similar for different "translation components", but the model fits very well. So I started coding it up.
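
To give a flavor of it, a stripped-down sketch of that component might look something like this - the names and the details are my own, and it ignores the thread-safety and "fast" issues for now:

  #include <stdint.h>
  #include <boost/unordered_map.hpp>

  struct Message { virtual ~Message() { } };   // stand-in for the real message type

  class MessageCache {
    public:
      MessageCache() : mActive(false), mCache() { }
      ~MessageCache() { clear(); }

      // turn the caching on or off - the parametric part
      void setActive( bool aFlag ) { mActive = aFlag; }

      // "eat" a heap-allocated message - either delete it right away,
      // or keep the latest per key and delete the one it replaces
      void eat( uint64_t aKey, Message *aMsg ) {
        if (!mActive) {
          delete aMsg;
          return;
        }
        Message *old = mCache[aKey];
        mCache[aKey] = aMsg;
        delete old;
      }

      // clean up everything we're still holding
      void clear() {
        boost::unordered_map<uint64_t, Message *>::iterator it;
        for (it = mCache.begin(); it != mCache.end(); ++it) {
          delete it->second;
        }
        mCache.clear();
      }

    private:
      bool                                         mActive;
      boost::unordered_map<uint64_t, Message *>    mCache;
  };

A component that creates messages would then just hold one of these as an ivar and hand every new message to eat() once all the listeners have seen it.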

I didn't get finished, but it's well on its way, and it appears to have beaten all the problems that were really dogging me with the other approaches. I know there's more to deal with - like the "fast" part, but I'll deal with that in the morning.

Finishing Up the New Client Library

Friday, September 3rd, 2010


This morning I finally finished up the client library I've been working on for a while. I finished up yesterday with it functional, but not the way I wanted it. I spent a little time this morning cleaning up the code - adding comments here and there, and clearing out the logging used only during the debugging phase.

Nothing very exciting so far...

Then I was beginning to think that I might be able to get the best of both worlds with the existing design. One of the problems with my one request - one socket design is that we can possibly end up with a ton of sockets in use. We can also be spending a lot of time creating them, only to tear them down right away. Not as efficient as it could be.

Yet I didn't want to have the one bad apple problem, so I think I've got a nice idea that I'm going to be putting into the codebase this morning - pooling of sockets. It's a simple thing - have a list of "unused" socket connections, and when a request comes in, grab one from the list, or if there are none there, create one. Then, as you use it, it's yours alone - everything is isolated. Then, when you're done with it, return it to the pool, and it'll get re-used the next time someone needs one.

Very simple, but the socket stays open, all the resources stay allocated, and in general, we have the best of both worlds. Very nice. That's going to be my morning - pretty excited about it.
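
A first sketch of that pool, assuming boost asio TCP sockets and leaving out the locking and error handling, might look something like this - the class and method names are just my own placeholders:

  #include <list>
  #include <string>
  #include <boost/asio.hpp>
  #include <boost/shared_ptr.hpp>

  class SocketPool {
    public:
      typedef boost::shared_ptr<boost::asio::ip::tcp::socket>   SocketPtr;

      SocketPool( boost::asio::io_service & anIOService,
                  const std::string & aHost, const std::string & aPort ) :
        mIOService(anIOService), mHost(aHost), mPort(aPort), mIdle() { }

      // grab an idle socket, or create and connect a new one
      SocketPtr checkout() {
        if (!mIdle.empty()) {
          SocketPtr s = mIdle.front();
          mIdle.pop_front();
          return s;
        }
        boost::asio::ip::tcp::resolver          resolver(mIOService);
        boost::asio::ip::tcp::resolver::query   query(mHost, mPort);
        SocketPtr s(new boost::asio::ip::tcp::socket(mIOService));
        s->connect(*resolver.resolve(query));
        return s;
      }

      // done with it - leave it open and hand it back for re-use
      void checkin( SocketPtr aSocket ) {
        mIdle.push_back(aSocket);
      }

    private:
      boost::asio::io_service &   mIOService;
      std::string                 mHost;
      std::string                 mPort;
      std::list<SocketPtr>        mIdle;
  };

A request then just becomes checkout(), do the work on that socket in isolation, and checkin() when the response is in.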

Tricking a C++ Union into Having Methods

Thursday, September 2nd, 2010


I've got an interesting problem that I sure wish I'd checked into the linux C libraries a lot more before getting too far into this, but so it goes... It's the old UUID problem. It's a 16-byte, unsigned number, but it's far too big to hold in one atomic data element, so it's often a union:

  union UUID {
    uint8_t   bytes[16];
    uint16_t  words[8];
    uint64_t  blocks[2];
  };

but the problem with this union is that you can't do something like this:

  UUID   a;
  UUID   b;
  a = b;

or:

  UUID   a;
  UUID   b;
  // ...
  if (a == b) {
    // do something...
  }

C++ just doesn't allow for the default operator=() or operator==() overloading like it does the default constructor, the default (shallow) copy constructor, etc. You have to build these things into a class.

However, there is one nice thing that these unions allow: it's the unbiased use of the union as the start of its member data. For example, if we have:

  UUID   a;
 
  // ...
  memcpy(&a, src, 16);

we can write the 16 bytes of data into the union's elements by simply referencing the UUID itself. This is not the case with a class that carries a vtable - that ends up throwing off all the offsets for the ivar data.

But what can a guy do to get these methods on a union?

The Answer is really the anonymous union:

  struct UUID {
    union {
      uint8_t   bytes[16];
      uint16_t  words[8];
      uint64_t  blocks[2];
    };
 
    // the constructor/destructor set...
    UUID();
    UUID( const UUID & anOther );
    UUID & operator=()( UUID & anOther );
    UUID & operator=()( const UUID & anOther );
    // the equality/inequality operators...
    bool operator==( const UUID & anOther ) const;
    bool operator!=( const UUID & anOther ) const;
    bool operator<( const UUID & anOther ) const;
    bool operator>( const UUID & anOther ) const;
    bool operator<=( const UUID & anOther ) const;
    bool operator>=( const UUID & anOther ) const;

where I added in the inequality operators so the UUID could be used in a std::map as a key. The problem now is that I'd really love to have the following still work:

  UUID   a;
 
  // ...
  memcpy(&a, src, 16);

but it won't. I have to reference the first part of the data to get past the vtable, etc.

  UUID   a;
 
  // ...
  memcpy(&a.blocks[0], src, 16);

It's not horrible, but it's no better than really building a class. I got the syntax I wanted, but the consequences weren't without their costs. Kind of bumming to get it all going and then remember the vtable, but that's life.
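
For what it's worth, the comparison operators fall out pretty naturally from the blocks[2] view of the data. This is just my sketch of how they might look on top of the struct above - any consistent ordering is enough to make std::map happy:

  bool UUID::operator==( const UUID & anOther ) const
  {
    return (blocks[0] == anOther.blocks[0]) &&
           (blocks[1] == anOther.blocks[1]);
  }

  bool UUID::operator<( const UUID & anOther ) const
  {
    // not a "numeric" ordering of the UUID, just a consistent one
    return (blocks[0] < anOther.blocks[0]) ||
           ((blocks[0] == anOther.blocks[0]) && (blocks[1] < anOther.blocks[1]));
  }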

Banging Away at a New Client Library

Wednesday, September 1st, 2010


Today I spent the entire day banging away hard at a new client library to a data system that another group in The Shop has created. They built a Java client, but I needed to have something in C++, and have it interface nicely to the variant data type that I've already created. Someone working with me started the work on this client, but then was called away on another project because it was having lots of problems in production. So it goes.

So I took the start he had on the client library and took a good, long look at the code to see what needed to be done, and what was already looking pretty good. As is so often the case, the start he'd gotten was OK, but it wasn't really good enough for the kinds of conditions that I'm used to seeing, and maybe that explains the problems in production a bit as well. So it was a total do-over.

The key points of the service, insofar as they impact the design, are that there are socket connections, and on a single socket connection there can be multiple simultaneous calls and subscriptions. The call is a one-time subscription, which returns a single result, whereas the subscription will send updates to the client with every change to the source data that generated the result.

It's a nice model, and the only problem I see with it is that boost asio is set up to have a single buffer of the incoming data, and so it's possible to really have that one buffer, or socket, be a bottleneck. Additionally, if there's a problem in my request, it's possible that the service will not know how to respond, and its only solution is to kill the socket. That's a bit drastic, but I can understand why - the service doesn't know how to respond to me, so its only recourse is to tell me "that's VERY bad" by killing the socket.

With that, all my pending calls and subscriptions are dead, and so it's possible for one bad apple to spoil the whole bunch. Not my idea of a great way to go.

So what I ended up with as a first cut on this is to have an "updater" object that has a socket and all the supporting data with it. Then, a single call is a single socket, a single buffer, and a target variant. This means that as data comes in, boost's asio can handle as many of these as necessary. Much cleaner, but there's a cost - multiple open sockets, and starting one up isn't fast.
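
Roughly speaking, the "updater" is little more than a bundle of those three things. This is a sketch with my own member names and an arbitrary buffer size, not the actual class:

  #include <vector>
  #include <boost/asio.hpp>

  class variant;   // the variant data type the results decode into

  class Updater {
    public:
      Updater( boost::asio::io_service & anIOService, variant & aTarget ) :
        mSocket(anIOService), mBuffer(64 * 1024), mTarget(aTarget) { }

      boost::asio::ip::tcp::socket & socket() { return mSocket; }

    private:
      boost::asio::ip::tcp::socket   mSocket;   // this call's own connection
      std::vector<char>              mBuffer;   // its own incoming byte buffer
      variant &                      mTarget;   // where the decoded result lands
  };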

Still... it is a good start, and I spent today building that. At the end of the day, I had something that was working, but needed to be cleaned up a little bit. Still... stayed late to get that going, and was very happy to see all the bytes line up and get back solid responses.

Working on Magic Map Implementation in C++

Tuesday, August 31st, 2010

Today I spent a good bit of time working on extending the variant class I have in my ticker plant to handle byte arrays and then adding in the map key space encoding and decoding. The idea is that map keys really don't need to be just any string - they can be a limited subset of the ASCII space, and in making that a limited set, we can pack more characters into fewer bytes, and then unpack them once they are on the receiver's machine.

Say we limit the ASCII space to 64 characters - any 64, really - just so long as there's a mapping from the full ASCII space to the 64 acceptable values, and back again. At that point, we know that we'll be able to store the mapped key space in 6 bits - 2^6 is 64. So if we look at a series of three bytes we can pack four of these characters into that 24-bit space:

   Byte 1          Byte 2          Byte 3
  |1 2 3 4 5 6|7 8 1 2 3 4|5 6 7 8 1 2|3 4 5 6 7 8|
  |  Char 1   |  Char 2   |  Char 3   |  Char 4   |

The code for the conversion is pretty simple - look at how many even 'blocks' of four characters there are to encode, map them into the limited key space, mash their bits, and pack them into three bytes. The remainder is a simple process of doing part of that.

The decode is simple as well - we just need to have some terminal condition - and that can be a simple 0x3f (all 1s) in the last (terminal) 'character'. So we look at the length of the byte stream, see how many "full sets" of three bytes there are - convert each into four characters, and then based on what's left, we have only a few options. It's pretty simple to decode, and you have the original string back.
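
The heart of both directions is just the bit-twiddling on one block of four 6-bit codes and three bytes. Here's a sketch of that piece, with helper names of my own choosing - the real code also has to map characters to codes and handle the 0x3f terminator on a partial last block:

  #include <stdint.h>

  // pack four 6-bit codes (each 0..63) into three bytes
  inline void pack4( const uint8_t c[4], uint8_t b[3] )
  {
    b[0] = (uint8_t)((c[0] << 2) | (c[1] >> 4));
    b[1] = (uint8_t)((c[1] << 4) | (c[2] >> 2));
    b[2] = (uint8_t)((c[2] << 6) |  c[3]);
  }

  // ...and unpack three bytes back into the four 6-bit codes
  inline void unpack4( const uint8_t b[3], uint8_t c[4] )
  {
    c[0] = (uint8_t)(  b[0] >> 2);
    c[1] = (uint8_t)(((b[0] & 0x03) << 4) | (b[1] >> 4));
    c[2] = (uint8_t)(((b[1] & 0x0f) << 2) | (b[2] >> 6));
    c[3] = (uint8_t)(  b[2] & 0x3f);
  }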

I needed to get this all in the variant class as the next thing I needed was to add in the update stream that will be sent by the data service in response to updates at the source. This is similar to the kind of updating I did in Java back in InfoShop: there was a HashMap that understood a 'path' concept, and transactions. You could put a map into a transaction, do things to it, and then commit the transaction and get a transaction log that could be sent to remote copies of the same map, and be applied to bring it up to date.

It worked very well for me at the time, and this is something very similar. There's no transaction; the updates are a stream of records, each with an action, a path, and an optional value. It's simpler, but for the data sets we're dealing with, it's sufficient.
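
In other words, each update record carries something like the following - this is my own illustration of the shape, not the service's actual wire format:

  #include <string>

  class variant;   // the ticker plant's variant type

  struct MapUpdate {
    enum Action { PUT, REMOVE, CLEAR };

    Action          action;   // what to do at the path
    std::string     path;     // where in the map the change applies
    const variant  *value;    // the new value for a PUT, NULL otherwise
  };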

I got a lot of the decoding of the update stream done, but I think it's time to put together a client and see how it functions to see what I'm really getting from these servers. There's nothing like the socket-level byte stream to remove all the questions and ambiguity in the specification. So that's going to have to wait for tomorrow.

Case Sensitivity and New Developers

Tuesday, August 31st, 2010


I believe, though have no factual data to support this belief, that most developers these days are learning to code on Windows with some IDE - like Eclipse, NetBeans, etc. It's the easiest way to learn to code, I'll grant you, and it's the cheapest platform to use. Heck, you can get a decent Windows laptop at BestBuy for under $800. Peanuts.

But there's a good contingent that seems to have fallen in love with the Free Software movement, and picked up linux to put on their $800 laptop, and are using Eclipse on linux. This is not bad. But it's the clash of these two worlds that often leads to problems.

Then again... it could just be that the developers aren't all that good, and need to be whipped into shape before they're set loose on the codebase.

So here's what happened today: I'm trying to clone this git repository onto a windows box because the way I have the monitors hooked up, it's easiest to see the code from the Windows screen. My first problem was with the git that ships with Cygwin. Turns out, if there's a problem with the repo (as there was - sort-of), the Cygwin git gets all messed up. Maybe in a subsequent release they'll get it fixed, but the consensus at The Shop is that the better solution is to get msysgit from Google Code.

(As an aside, I've also decided to upgrade my Cygwin to 1.7.7 as it seems I was pretty out of date, and that could possibly have contributed to the problems.)

So I clone this repository and I find that I've already got a changed file. Hmmm... that's odd... so I check that file out again... still changed. Very odd.

Turns out, the developer had created two files - one a Java source file, the other a shell script - one named MyTest.java, the other myTest.java. OK... that's got all kinds of wrong written all over it. Case is not the way to distinguish files - not in the multi-platform world. And who makes a shell script have an extension of .java?

When I pointed this out to him, he cleaned it up and there wasn't a problem any longer. But it reminded me that what I take for granted is not exactly universally understood.

Tough Day Full of Avoidable Problems

Monday, August 30th, 2010

Today was a day I'll be glad is over... there were just so many avoidable problems and delays that it makes for a day that really is best forgotten as soon as possible. It started out reasonably well - I was finishing up the work I'd started on Friday. I had really wanted to get it done on Friday, but it took another three hours (roughly), so it would have been silly to really stay and see it through. Also, there was no way I could have tested it Friday evening - and this morning I was able to use the live data to verify that things were working as planned.

But at this point, things were still going pretty well. The code was done, everything tested out, I checked it all in and pushed it to the central repo - pretty nice. But then the avoidable stuff started to bite me.

I needed two machines to test the throughput of ZeroMQ as a reliable multicast distribution system for my tick data. Nothing fancy, but I needed to have some way of replacing 29West as we weren't really using it as a solid middleware - just a multi-channel reliable multicast system. Given its limited usage, I looked at ZeroMQ and thought Hey, if this works, I'm in business! But in order to know if it'll work, I need to actually get the ticker feeds working, put the messages into ZeroMQ, and pick them up on another box. Hence the need for two boxes.

Well... they got a few boxes with 10Gb ethernet NICs in them to make sure that I didn't have to worry about the NIC being the bottleneck, and they were ready for me to check the boxes out. As per the way things are at The Shop, the standard mechanism for getting to these servers is NXMachine or SSH. Given that I'd be testing and building, I decided to go with NXMachine. It installs pretty easily, and with this simple fix, it should work just fine.

Silly me...

I spent a full morning trying to get the NX Server to work. I knew the client was working, and I knew the server could work, but it wasn't allowing me to get a complete connection. Well... I got the connection, but when I went to actually display the X session on my box, it disconnected me. Very odd.

I tried logging the NX Server - no luck even when following directions. I tried re-installing the software - no good, either. I tried different parameters for the client - no good. In the end, I went to my boss and asked him for help. He couldn't get in either, but then realized that maybe it was because GNOME wasn't installed. Specifically, the GNOME Desktop environment.

That was the problem.

When he did a simple:

  $ yum groupinstall "GNOME Desktop Environment"

he got some 124 packages that needed to be installed. It seems that they didn't put the full GNOME install on the box as it was a "server". My previous box had the GNOME Desktop installed prior to me using it - which is why it worked. Had I stuck with a bunch of SSH sessions into the box, it would have been fine. It was just the desktop login that was the problem.

After that was solved, I was able to install boost, log4cpp, ZeroMQ, and a few other things to get this new box to the point that I was able to verify that all the code worked, and that everything compiled and ran.

Lots of grief for something as simple as not having the login desktop stuff installed.

Pushing Forward with More Codecs

Friday, August 27th, 2010

Today was the final push for the last two codecs I needed to write. I was really hoping to get both of them done, but the amount of code required in the first was just too much to get both done in a day. I had to send an email saying that I had slipped on this guy, and that I'd get back to it on Monday.

Not really hard, just a lot of code to get the messages decoded. Bummer...

Google Chrome dev 7.0.503.0 is Out

Friday, August 27th, 2010

The second nice update on the browser front this morning is that Google Chrome dev is now at 7.0.503.0. This represents the jump of the major version number after the release of Chrome 6.x to "beta". While I'm not unhappy with 6.x, I think the major updates the Chrome Team puts into the browser will be in the 7.x branch now, so I'd like to stay on the 'dev' branch as long as it's even reasonably stable. It's good to see them pushing forward.