Archive for the ‘Coding’ Category

Handling Fast Market Data Efficiently – Hint: Go Lockless

Thursday, August 26th, 2010

Today I was doing some testing on my latest data codec in my new ticker plant, and I ran across some performance issues that I didn't really like. Specifically, the processing of the data from the UDP feed was not nearly fast enough for me. As time went on, we were queueing up more and more data. Not good. So let's see what we had in the mix that we needed to change...

First, the buffer I was using assumed that messages from the exchange could span UDP datagrams. That was a nice "luxury" - handling the general case - but it's not what actually happens on this feed, and it was costing us time in the processing. It's far better to assume that each UDP datagram is complete, and queue the datagrams up as complete units to process, than to have the buffer "squish" them together into one byte stream and then tokenize that stream by the ending data tags.

That was really quite helpful, because at the same time I decided it was a bad idea to use the mutex/condition variable I had set up to let the one producing thread and the one consuming thread share the data efficiently. Instead, I grabbed a very simple lockless circular FIFO queue off the web and cleaned it up to use for this UDP datagram buffering. It's easy enough to use - one thread moves the head, and the other moves the tail. Simple. As long as the head and tail indices are always read fresh, and never left stale in a CPU cache or register, it'll work without locking. Simple enough.
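
For reference, the shape of the thing is roughly this - not the exact class I pulled off the web, just a sketch of a single-producer/single-consumer ring buffer, written here with std::atomic indices so the memory ordering is explicit (the Datagram type and the queue size are placeholders):

  #include <atomic>
  #include <cstddef>

  // hypothetical payload - one complete UDP datagram, queued as a unit
  struct Datagram {
    char    data[1500];
    size_t  size;
  };

  /*
   * Single-producer/single-consumer ring buffer. Only the producer
   * writes mHead, and only the consumer writes mTail, so no locking is
   * needed - just acquire/release ordering on the two indices.
   */
  template <size_t N>
  class SPSCQueue {
  public:
    // producer thread only - returns false if the queue is full
    bool push( const Datagram &aPkt )
    {
      size_t  head = mHead.load(std::memory_order_relaxed);
      size_t  next = (head + 1) % N;
      if (next == mTail.load(std::memory_order_acquire)) {
        return false;     // full - the producer has to deal with it
      }
      mSlots[head] = aPkt;
      mHead.store(next, std::memory_order_release);
      return true;
    }

    // consumer thread only - returns false if the queue is empty
    bool pop( Datagram &aPkt )
    {
      size_t  tail = mTail.load(std::memory_order_relaxed);
      if (tail == mHead.load(std::memory_order_acquire)) {
        return false;     // empty - the consumer decides how to wait
      }
      aPkt = mSlots[tail];
      mTail.store((tail + 1) % N, std::memory_order_release);
      return true;
    }

  private:
    Datagram             mSlots[N];
    std::atomic<size_t>  mHead{0};    // next slot the producer will fill
    std::atomic<size_t>  mTail{0};    // next slot the consumer will drain
  };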

But when I get rid of the locking/waiting, I have to handle the case where the queue is empty and we need to try again. My solution there was to start simple and put in a flat 250 msec wait. When I started testing this, I saw significant pulses in the processing because a lot of datagrams had arrived while we were waiting. So I got a little smarter.

I added an expanding delay - starting small and building - so that we pick up new data quickly when the gap is short, but when the market close comes and the feed goes quiet, we do only a few checks before backing off to just a few times a second. That's very reasonable.

I did more tests and finally ended up with a variable scheme: no delay at all for the first few empty checks, then a delay that stretches out from there. Very nice.
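
A minimal sketch of that kind of consumer loop - the miss threshold, the step size, and the cap here are illustrative rather than the exact values I settled on, and processDatagram() is just a stand-in for the real codec work:

  #include <algorithm>
  #include <cstdint>
  #include <unistd.h>

  // declared elsewhere - whatever the codec does with a complete datagram
  void processDatagram( const Datagram &aPkt );

  // consumer loop - poll the lockless queue, backing off as the feed
  // goes quiet so we don't burn a core spinning on an empty queue
  void consumeLoop( SPSCQueue<4096> &aQueue )
  {
    Datagram  pkt;
    uint32_t  misses = 0;
    while (true) {
      if (aQueue.pop(pkt)) {
        misses = 0;                    // got data - reset the backoff
        processDatagram(pkt);
      } else if (++misses <= 10) {
        // the first few misses: spin right back around - no delay at all
      } else {
        // after that, stretch the sleep out, capped at about 250 msec
        uint32_t  delay = std::min(250000u, (misses - 10) * 1000u);
        usleep(delay);
      }
    }
  }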

In the end, I had a queue that empties far faster than the UDP data source can fill it, and that's critical for a ticker plant. There's plenty to slow things down later in the processing, so it's essential to start out as fast as possible.

Finally Finished Major Addition to Ticker Plant

Wednesday, August 25th, 2010

Well, it's taken me a few days, but I've finally finished the code in my ticker plant to handle the options data feed. It's a biggie because instead of using the same ASCII encoding the other exchanges do, OPRA switched some time ago to a FAST (FIX Adapted for STreaming) encoded stream to reduce the bandwidth needed to move the data from them to us. That added a new wrinkle: we had to incorporate their FAST decoder implementation (initially) just to get the data into a binary format we could do something with.

Then we had to adapt the code to allow for the fact that a single message from the exchange - specifically OPRA right now - can generate multiple messages flowing downstream. This wasn't hard, but the change touched all the codecs, so it took a little time to get it all right and working properly.
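
The shape of that change is roughly the following - these aren't the real class names, just an illustration of a decode call that appends however many downstream messages one exchange message produces:

  #include <vector>

  // placeholder types - stand-ins for the real codec classes
  struct RawMessage  { /* bytes as they came off the wire */ };
  struct TickMessage { /* normalized message sent downstream */ };

  class Codec {
  public:
    /*
     * One exchange message can now fan out into several downstream
     * messages, so the decoder appends everything it produces to the
     * caller's vector rather than returning a single message.
     */
    virtual bool decode( const RawMessage &aRaw,
                         std::vector<TickMessage> &aMessages ) = 0;
    virtual ~Codec() {}
  };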

I got it all finished, compiled correctly, and looking like it's ready to test. Time to commit it all to git and then get to the business of testing.

Fun with Exchange Codecs – FIX Adapted for Streaming

Tuesday, August 24th, 2010

Well, it turns out the ASCII-based protocol that NASDAQ and some of the other lower-volume exchange feeds use is fine as far as it goes, but OPRA decided it had pushed the limits of ASCII and adopted the FIX Adapted for Streaming - or FAST - protocol instead. In a sense, I can see why they'd adopt an existing protocol rather than write their own, but I've read enough on the net to know they really didn't adopt it 100% - just the data-compression part.

Basically, the FAST protocol is based on a few ideas:

  • Very little to no ASCII to decode - no longer will numbers be represented as ASCII digits. Most numbers are now simply integers. In fact, they only allow for three data types: 32-bit integer, unsigned 32-bit integer, and a string. With those, and a few decoder tables, you can handle anything an exchange needs.
  • Delta encoding - some fields are required in every message, but for others the value on the wire is just an increment from the last value, and in fact a field can be absent entirely, with the assumption that its value is simply incremented. This helps a lot. There are also fields that carry only the change from the previous value, so duplicates can be dropped altogether. It's small, efficient, and makes for a compact encoded data stream. (There's a rough sketch of the idea right after this list.)
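
Just to make that delta/increment idea concrete - this isn't the OPRA decoder, only an illustrative sketch where each field keeps its previous value between messages and a message may or may not carry a new one:

  #include <stdint.h>

  // hypothetical per-field state kept between messages
  struct FieldState {
    int32_t  previous;    // last decoded value for this field
  };

  /*
   * Decode one integer field. If the message carried a value, treat it
   * as a delta from the previous value; if the field was absent, assume
   * the value simply increments by one. Either way, update the state so
   * the next message decodes against it.
   */
  int32_t decodeDeltaField( FieldState &aState, bool aPresent, int32_t aDelta )
  {
    int32_t  value = aState.previous + (aPresent ? aDelta : 1);
    aState.previous = value;
    return value;
  }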

The problem, of course, is that there is now state in the decoder. In general that isn't bad, but it means I have to completely decode every message I get. The shortcuts I had that would extract just the sequence number, or just the flags for skipping a message - those are tossed out the window. I need to decode all the data, and then deal with it.

This took a little while to work into my application, but in the end I had the concept of a decoded message - one that carries the elements I had originally extracted as well as the actual message. Thankfully, this is still pretty fast; OPRA isn't messing around with a slow decoder, since the whole point of the switch is to get more data through the system.
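
In my code that boils down to something along these lines - the field names are just for illustration:

  #include <stdint.h>
  #include <string>

  /*
   * The pre-extracted header fields ride along with the fully decoded
   * payload, so the downstream code never has to reach back into the
   * FAST-encoded bytes.
   */
  struct DecodedMessage {
    uint32_t     sequenceNumber;   // used for gap detection
    uint8_t      flags;            // used to decide if we can skip it
    std::string  payload;          // the complete, decoded message body
  };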

I still need to do a lot of tests, and even finish writing my codec for the OPRA data, but at least I've got all the essentials of the FAST decoding working, and should be able to get moving forward again tomorrow with the messages.

Indiana Jones and The Legend of the Lost Codebase

Wednesday, August 18th, 2010

Well... I'm donning the old fedora again, and off in search of the Lost Codebase. It's really quite amazing the skill that some people have to hide code. I'm sure they don't think of it that way - they probably consider it to be exactly where they want it to be - the right spot. But if I can't find it after working in the repository for nearly two months, then it's time to call it "hidden". Yup... hidden. And that means I need to get out the fedora and get exploring.

The first thing I check is, of course, the most obvious - the name of the directory. Clearly, this is a trap, for who in their right mind would put the code in a clearly labeled directory? No, that's the location of some of the code. Maybe. Hard to tell, as the class files are nearly empty, and one has to wonder if the code even compiles. I'm not fool enough to fall for that trick - typing 'make' could end up wiping out my entire machine's drive. I'm no fool.

Next, I check the similarly named directories. No luck there either, though these aren't nearly as complex, and some of the traps aren't even well constructed. In one there's no Makefile - a dead giveaway, if ever there was one. In another, they foolishly included only a handful of files - easily scanned, and I can see what I'm looking for isn't there. In all, a minor detour, but I have no idea where to go next.

Next I have to bring out the big guns - I grep for a keyword in the entire source tree. As expected, this yields far too many hits, and I need to filter it down. Doggedly, I wrestle the filter on the grep to give me something I can work with. I struggle to weed out the false hits. I finally think I may be onto something, only to have my hopes dashed when it's a simple comment and not the real code I'm looking for.

It's frustrating, and in the end, I realize I've met my match. I have to back off, regroup, and hope that when the author(s) decide to come in for the day, they have some answers about where they hid the secret directory with the code.

Oh yeah... I even checked for the hidden directories... no luck.

Google Chrome dev 6.0.495.0 is Out

Wednesday, August 18th, 2010

It looks like they have fired up the 'dev' channel again as Google Chrome dev 6.0.495.0 was released this morning. I went back to the dev channel after moving to the beta when it was released a few days ago. I have to say, this has become an incredibly stable platform. It's fast, looks like a Mac app, and it just plain works. Nice.

The release notes indicate that, for the Mac at least, we're getting a fix for the download shelf, and a few fixes for CSS and plug-in handling. Looks good to me.

Swatting Flies is an Annoying Thing to Do

Tuesday, August 17th, 2010

I've been working (still) on getting more exchange feed codecs into the system, and while it's not really hard work, it takes a little thought and a lot of attention to detail. So when I get some kibitzing from those who would love to see me fail, but are too afraid to really stand up to this project, it's like swatting flies - not hard, and they aren't going to do me any harm, but it's annoying nonetheless.

When it gets bad, I just get up, take a little walk, get a pop, and clear my head. That usually does it. Oh... and getting another feeder done in less than a day makes me feel good. It shows the "flies" that they really might want to take notice of the different way I've put this together. But that's really hoping for too much, I suppose.

Time to get some bug spray.

Google Chrome beta 6.0.472.36 is Out

Tuesday, August 17th, 2010

This morning I noticed that Google Chrome beta 6.0.472.36 was out - still no word on a new 'dev' release, so it appears that for now, they are simply sticking with the 6.0.x branch and not starting anything new for the time being. It seems reasonable that if they aren't making major changes, they can keep the 6.0.x branch moving along from dev to beta to stable. It's only if they have great new ideas that it makes sense to open up the dev branch again.

So it's out there - a few little UI fixes - nothing major.

Once Again – Amazing Progress with a Good Design

Monday, August 16th, 2010

Today I spent all day working on getting two exchange feeders written and tested. This kind of speed isn't because I can copy/paste very fast; it's because I've got a solid design that lets me leverage the work I've already done and customize it very quickly and easily. Given that the last developers of these feeds took months to achieve what I've done in a day, there's a lot to be said for the power of the design. The previous one was particularly ill-suited to this task.

So it was a hard day, but I'm getting a lot closer to the point that I'm caught up with all the exchange feeds we have. At that point, I can look to data enrichment, and really start to add value to the data feeds.

Google Chrome beta 6.0.472.33 is Out

Friday, August 13th, 2010

Well... I'm a little surprised (again) at the Google Chrome guys... this time, the update to 6.0.472.33 didn't work from the application, and I had to get the update directly from the web site. In addition, the permissions on the existing app package made it impossible for a new user to replace the old with the new. Very odd. But in the end, I got what I needed, and I hope they get these updating issues fixed. It's amazing that they don't just use Sparkle - it's almost a de facto standard on the Mac.

UPDATE: I see the point... they promoted this to beta from dev, and that's the reason it wasn't updating. I'm going to have to go back to the dev channel when it's on the next major release. Makes sense now.

The Amazing Power of Really Good Design – And Hard Work

Thursday, August 12th, 2010

Today I was very pleased to see that I could add a second exchange feed to the codebase. Yeah... just one day. Pretty amazing. I know it's primarily due to a good design, because the amount of code I had to write was small - on the order of 600 lines - but there's still a little bit of good old hard work to attribute to it as well.

But really, it was the design. What a great design. This is something I'm going to enjoy over and over again as I keep working with this codebase. I need to add in at least six more feeds, but if they are only a day or two per feed, I'm still done long before I had expected to be. Amazing.

So after I had it all done, I looked at the code and realized that when I was "unpacking" the time data from the exchange into milliseconds since epoch, I was making a few system calls, and that was going to come back to bite me later as the loads got higher and higher. The original code looked like:

  /*
   * This method takes the exchange-specific time format and converts it
   * into a timestamp - msec since epoch. This is necessary to parse the
   * timestamp out of the exchange messages as the formats are different.
   */
  uint64_t unpackTime( const char *aCode, uint32_t aSize )
  {
    /*
     * The GIDS format of time is w.r.t. midnight, and a simple, 9-byte
     * field: HHMMSSCCC - so we can parse out this time, but need to add
     * in the offset of the date if we want it w.r.t. epoch.
     */
    uint64_t      timestamp = 0;
 
    // check that we have everything we need
    if ((aCode == NULL) || (aSize < 9)) {
      cLog.warn("[unpackTime] the passed in data was NULL or insufficient "
                "length to do the job. Check on it.");
    } else {
      // first, get the current date/time...
      time_t    when_t = time(NULL);
      struct tm when;
      localtime_r(&when_t, &when);
      // now let's overwrite the hour, min, and sec from the data
      when.tm_hour = (aCode[0] - '0')*10 + (aCode[1] - '0');
      when.tm_min = (aCode[2] - '0')*10 + (aCode[3] - '0');
      when.tm_sec = (aCode[4] - '0')*10 + (aCode[5] - '0');
      // ...and yank the msec while we're at it...
      time_t  msec = ((aCode[6] - '0')*10 + (aCode[7] - '0'))*10 + (aCode[8] - '0');
 
      // now make the msec since epoch from the broken out time - mktime()
      // gives us seconds since epoch, so scale to msec and add ours in
      time_t  when_secs = mktime(&when);
      if (when_secs == (time_t)-1) {
        // keep it to epoch - that's bad enough
        timestamp = 0;
        // ...and then log the error
        cLog.warn("[unpackTime] unable to create the time based on the "
                  "provided data");
      } else {
        timestamp = (uint64_t)when_secs * 1000 + msec;
      }
    }
 
    return timestamp;
  }

The problem is that there are two rather costly calls - localtime_r and mktime. They are very necessary, as the ability to calculate milliseconds since epoch is a non-trivial problem, but still... it'd be nice to not have to do that.

So I created two methods: the first was just a rename of this guy:

  /*
   * This method takes the exchange-specific time format and converts it
   * into a timestamp - msec since epoch. This is necessary to parse the
   * timestamp out of the exchange messages as the formats are different.
   */
  uint64_t unpackTimeFromEpoch( const char *aCode, uint32_t aSize )
  {
    // ...
  }

and the second was a much more efficient calculation of the milliseconds since midnight:

  /*
   * This method takes the exchange-specific time format and converts it
   * into a timestamp - msec since midnight. This is necessary to parse
   * the timestamp out of the exchange messages as the formats are
   * different.
   */
  uint64_t unpackTimeFromMidnight( const char *aCode, uint32_t aSize )
  {
    /*
     * The GIDS format of time is w.r.t. midnight, and a simple, 9-byte
     * field: HHMMSSCCC - so we can parse out this time.
     */
    uint64_t      timestamp = 0;
 
    // check that we have everything we need
    if ((aCode == NULL) || (aSize < 9)) {
      cLog.warn("[unpackTimeFromMidnight] the passed in data was NULL "
                "or insufficient length to do the job. Check on it.");
    } else {
      // now let's overwrite the hour, min, and sec from the data
      time_t  hour = (aCode[0] - '0')*10 + (aCode[1] - '0');
      time_t  min = (aCode[2] - '0')*10 + (aCode[3] - '0');
      time_t  sec = (aCode[4] - '0')*10 + (aCode[5] - '0');
      time_t  msec = ((aCode[6] - '0')*10 + (aCode[7] - '0'))*10 + (aCode[8] - '0');
      timestamp = ((hour*60 + min)*60 + sec)*1000 + msec;
      // sanity check - anything at or past a full day means the input
      // digits were bad
      if (timestamp >= 86400000ULL) {
        // keep it to midnight - that's bad enough
        timestamp = 0;
        // ...and then log the error
        cLog.warn("[unpackTimeFromMidnight] unable to create the time "
                  "based on the provided data");
      }
    }
 
    return timestamp;
  }

At this point, I have something that has no system calls in it, and since I'm parsing all these exchange messages, that's going to really pay off in the end. I'm not going to have to do any nasty context switching for these calls - just simple multiplications and additions. I like being able to take the time to go back and clean this kind of stuff up. Makes me feel a lot better about the potential performance issues.

Oh... I forgot... in the rest of my code, I handled the difference in these two by looking at the magnitude of the value. Anything less than "a day" had to be "since midnight" - the rest are "since epoch". Pretty simple.
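
In code, that check is just a comparison against the number of milliseconds in a day - the helper name here is mine, not something from the codebase:

  // 24 hours worth of milliseconds - the dividing line between the two
  static const uint64_t MSEC_PER_DAY = 24ULL * 60 * 60 * 1000;   // 86,400,000

  // anything smaller than a day has to be "msec since midnight" -
  // everything else is "msec since epoch"
  inline bool isSinceMidnight( uint64_t aTimestamp )
  {
    return (aTimestamp < MSEC_PER_DAY);
  }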

It works wonderfully!