Archive for the ‘Coding’ Category

Google Chrome dev 14.0.794.0 is Out

Friday, June 17th, 2011

This morning they 'jumped the version': the Google Chrome team put 13.x into 'beta' and started the dev series with Google Chrome dev 14.0.794.0. This guy is supposed to have the latest V8 JavaScript engine - 3.4.3.0 - and quite a few fixes on different platforms. It's the inevitable march of progress for Chrome, and it's getting a little boring, to be honest. There's nothing really new coming out of that group, and in a way, that's OK. Browsers are allowed to be boring - they're supposed to get out of the way and let the user do their thing. So OK... I'll give them boring.

Added New Calc Mode for my Greek Engine

Thursday, June 16th, 2011

This afternoon the GUI guy asked me for a couple of new features - specifically, the price-to-implied vol and implied vol-to-price calculations, where they just want a single calculation done and then forgotten. I hadn't planned on this in the code, so I had to think about it for a few minutes. Then it really hit me - I have the instruments getting the current prices and all the supporting data all the time. If I just copy that out and don't hook up the ticker plants, then I can do the calc and toss the copy away. It's pretty clean, as it's the same workspace and environment - it's just not ticking because it's not hooked up. But it doesn't need to be. It's only one calculation - done right then.

Sweet.
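
Hypothetically, the one-shot calc reduces to a root-find over the copied pricing state. Here's a toy sketch of the idea - none of these names or models are the engine's (the real instruments are far richer than textbook Black-Scholes), it's just to show how little machinery a single, non-ticking calculation needs:

  #include <cmath>

  // Standard normal CDF via the complementary error function.
  static double normCdf(double x) { return 0.5 * std::erfc(-x / std::sqrt(2.0)); }

  // Toy Black-Scholes call price - the inputs (spot, strike, rate, time)
  // stand in for the copied, non-ticking instrument data.
  static double callPrice(double s, double k, double r, double t, double vol)
  {
    double d1 = (std::log(s / k) + (r + 0.5 * vol * vol) * t) / (vol * std::sqrt(t));
    return s * normCdf(d1) - k * std::exp(-r * t) * normCdf(d1 - vol * std::sqrt(t));
  }

  // Price-to-implied-vol by bisection - run once, then everything's tossed.
  double impliedVol(double price, double s, double k, double r, double t)
  {
    double lo = 1.0e-4, hi = 5.0;
    for (int i = 0; i < 100; ++i) {
      double mid = 0.5 * (lo + hi);
      if (callPrice(s, k, r, t, mid) < price) {
        lo = mid;     // model price too low - vol must be higher
      } else {
        hi = mid;     // model price too high - vol must be lower
      }
    }
    return 0.5 * (lo + hi);
  }

The vol-to-price direction is even easier - it's just the pricing call itself on the copied data.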

I still needed to spend quite a bit of time refactoring the code to make it simpler to deal with these two new ways of getting the data. Pretty much the entire afternoon was spent doing this - but I got it. And it works perfectly. Very nice to see.

An added bonus is that I was able to clean up the code in the refactoring so it's easier to follow now than before. And that's always a good thing.

Added Start-of-Day Clearing Tools for Greek Engine

Thursday, June 16th, 2011

As my Greek Engine is progressing, it's time to think about running it 24x7 - and that means we need to handle the start-of-day issues with the caches and data. Basically, we need some kind of "clear and reload" of the data, but we don't want to do it all at once, and we certainly don't want to do it if we're not going to get data back into the system quickly. So we want to refresh the instrument data at a good time - say, in the early hours of the morning - and then refresh (clear out) the tick data right before the open.
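
In sketch form, the scheduling is nothing fancy - a couple of daily timers. The hook names and times below are made up for illustration; the real engine has its own entry points:

  #include <chrono>
  #include <thread>
  #include <ctime>

  // Hypothetical hooks into the engine - the real names differ.
  void reloadInstrumentData() { /* heavyweight: clear and reload instruments */ }
  void clearTickData()        { /* lightweight: flush ticks before the open */ }

  // Sleep until the next local hh:mm, run fn, and repeat daily.
  template <typename Fn>
  void runDailyAt(int hour, int minute, Fn fn)
  {
    for (;;) {
      std::time_t now = std::time(NULL);
      std::tm when = *std::localtime(&now);
      when.tm_hour = hour;
      when.tm_min = minute;
      when.tm_sec = 0;
      std::time_t target = std::mktime(&when);
      if (target <= now) {
        target += 24 * 60 * 60;    // already past today - run tomorrow
      }
      std::this_thread::sleep_until(
          std::chrono::system_clock::from_time_t(target));
      fn();
    }
  }

One thread runs the instrument reload in the early morning; another runs the tick clear just before the open.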

It's a lot of little housekeeping like this that needs to get done in order to have this ready for production. After putting these things into the code I needed to add them to the IRC client so I could test them and manually control the apps. It's all pretty standard stuff, but it's this stuff that makes a system really easy to use and enjoyable.

The Importance of Comments

Wednesday, June 15th, 2011

Today I've once again been shown the critical importance of good comments. To set up the problem and its resolution, let's look at what came before the problem, and see how a good set of comments (by me) could have avoided a complete day of frustration.

We start with the statement of the problem:

Each exchange feed arrives on two lines - A and B - with the same data on both, but routed to the datacenter over different paths, so that should one line be lost, the other has the same data to carry on.

The problem comes when you try to arbitrate these two lines in your code. You have to watch the sequence numbers and take whichever line delivers the next number in the sequence first, and then there are the special cases, like the exchange sending a sequence-reset message, that have to be handled too. So we need to carefully arbitrate between these two streams of messages to make sure we get the first copy of each message as soon as it's available.
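
Stripped down, the heart of that arbitration looks something like the sketch below. It's hypothetical - the production code is more involved, and gap recovery is deliberately left out:

  #include <stdint.h>

  // Hypothetical A/B line arbiter. Both lines offer every message; we keep
  // the first copy of each sequence number and drop the later duplicate
  // that shows up on the other line.
  class LineArbiter
  {
  public:
    LineArbiter() : mNextSeq(0) { }

    // True if this message should be processed; false if it's a duplicate
    // we already took from the other line.
    bool accept(uint32_t seq)
    {
      if (seq < mNextSeq) {
        return false;        // the other line beat us to it
      }
      mNextSeq = seq + 1;    // (gap detection/recovery handled elsewhere)
      return true;
    }

    // The exchange sent a sequence reset - start expecting from 'seq'.
    void reset(uint32_t seq) { mNextSeq = seq; }

  private:
    uint32_t mNextSeq;
  };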

The problem is that I want all this to be multi-threaded and lockless. It's that last part, combined with the fact that I don't want to waste threads, that brought me to the solution I arrived at. Basically, I had the 'primary' channel (A channel) processing thread doing all the arbitration, and the 'secondary' channel (B channel) processing thread sending its results to a queue that would be read by the 'primary' thread. This merging of the two data streams was a pretty nice idea - I just needed to work out how the primary thread would process the 'non-primary' feed's queue.

What I had done was to make a method to process all the pending messages:

  bool UDPExchangeFeed::processPendingMessages()
  {
    bool       error = false;
    Message    *msg = NULL;
    while (mQueue.pop(msg)) {
      if (msg != NULL) {
        if (!processMessage(msg)) {
          error = true;
        }
      }
    }
    return !error;
  }

This seemed fine and worked great for quite a while. Then another developer looked at the code and said "Hey, this is going to starve the primary channel!", and because I didn't have a really good comment as to why I chose to do it this way, I said "Wow... yeah, I can see that." and so we changed it to look like this:

  bool UDPExchangeFeed::processPendingMessages()
  {
    bool       error = false;
    Message    *msg = NULL;
    if (mQueue.pop(msg)) {
      if (msg != NULL) {
        if (!processMessage(msg)) {
          error = true;
        }
      }
    }
    return !error;
  }

so that we're doing a 1:1 on the primary and the secondary. That sounds fair, and like the better solution, but the problem is that it really isn't - and it takes a little real-world thinking to see why.

The two feed threads decode messages in about the same time - but then the primary has to fully process its message, while the secondary just pushes its message onto the queue. Clearly, the secondary is going to get through its messages a little faster than the primary. This means it may take 10 or 20 messages, but sooner or later there will be two messages in the queue, and only one of them gets processed. Repeat.

Pretty soon, the queue overflows. After all, we're talking upwards of 50,000 msgs/sec. It doesn't take too many to get out of hand. So how to fix it?

Always empty the secondary's queue. It's not unfair - it's the equalizer. Most of the time it's only going to have one message in it anyway. But when it has two (or three), it's best to clear them all out right then rather than let them sit there and build up. So my initial implementation was really the better one - but I hadn't put that level of documentation with it, so it was all too easy to see it as wrong.

Now there's a four-paragraph comment on that little method just so we're clear about why it's doing what it's doing, and that it's not a bug.
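
In spirit, the commented version reads something like this - my paraphrase of the reasoning, not the actual production comment:

  /*
   * Drain EVERYTHING in the secondary's queue - on purpose. The secondary
   * thread only decodes and enqueues, so it runs slightly faster than this
   * (primary) thread, which decodes AND processes. If we popped just one
   * message per call, the queue would slowly grow and - at upwards of
   * 50,000 msgs/sec - eventually overflow. Draining it completely each
   * time is the equalizer; most calls it holds a single message anyway.
   * This is NOT starving the primary channel, and it is NOT a bug.
   */
  bool UDPExchangeFeed::processPendingMessages()
  {
    bool       error = false;
    Message    *msg = NULL;
    while (mQueue.pop(msg)) {
      if (msg != NULL) {
        if (!processMessage(msg)) {
          error = true;
        }
      }
    }
    return !error;
  }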

Setting SQLAPI++/iODBC/FreeTDS for Minimal Impact

Wednesday, June 15th, 2011

This morning I spent a little time in the SQLAPI++ manuals looking for a way to make my queries minimal-impact on the SQL Server I'm hitting. I was hoping to find a way to set a timeout on a SQL statement's execution. What I'm seeing now is that every so often the act of reading from the database hangs the reading thread, and the hang lasts long enough that I end up restarting the process.

This isn't good.

So I wanted to put in a timeout without resorting to a boost ASIO timeout. What I found was that there isn't a timeout in the SQLAPI++ code, and there isn't really one in the iODBC layer, either. There is one in the server configuration on FreeTDS, but I'm not really keen on putting a timeout value there for all connections and queries to a database. I just wanted to be able to put one on this set of queries.

What I did find was that I could make the SQLAPI++ command quite a bit nicer to the database with a few options on the command:

  cmd.setCommandText(aSQL.c_str());
  cmd.setOption("PreFetchRows") = "200";
  cmd.setOption("SQL_ATTR_CONCURRENCY") = "SQL_CONCUR_READONLY";
  cmd.setOption("SQL_ATTR_CURSOR_TYPE") = "SQL_CURSOR_FORWARD_ONLY";
  cmd.Execute();

where the middle three lines are new this morning. The defaults for a command are to fetch only one row at a time - very bad - and to allow a more liberal read/update policy on the cursor than I need. These options get rid of all that, and make sure I'm about as lightweight on the database as possible.

With no timeout to fall back on, I'll have to just see if these changes are enough to make sure I don't get the lock-up again. Sure hope so...

Creating an Exchange Feed Recorder (cont.)

Tuesday, June 14th, 2011

A few days ago I started creating an exchange feed recorder, and today (finally), I got the time to finish things off. The outstanding issues were that I hadn't tested anything, and that I hadn't really worked out the start/stop scripts, etc. So today it was a simple matter of testing the code, fixing a few issues, and then setting up the infrastructure so it'd be easy to start/stop the recorder for daily use.

It wasn't all that hard, but it took an hour or so to get it all nailed down. I then spent a little time writing a reader app to allow people to see how you would read from the file, find the datagram, and then process it. Not bad, but it needed to be done in order to show that the recording process wasn't corrupting the data.
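
The reader app boils down to undoing the recorder's framing, record by record. Purely for illustration - assuming a framing of arrival time, payload length, then the raw datagram (the actual file format may differ) - it looks something like:

  #include <cstdio>
  #include <stdint.h>
  #include <vector>

  // Hypothetical reader for an assumed framing: 8-byte arrival time (usec
  // since epoch), 4-byte payload length, then the raw datagram bytes.
  bool readDatagram(FILE *fp, uint64_t &arrivalUsec, std::vector<char> &payload)
  {
    uint32_t len = 0;
    if (fread(&arrivalUsec, sizeof(arrivalUsec), 1, fp) != 1) return false;
    if (fread(&len, sizeof(len), 1, fp) != 1) return false;
    payload.resize(len);
    return (len == 0) || (fread(&payload[0], 1, len, fp) == len);
  }

Loop on that until it returns false, and hand each payload to the normal decoder.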

Nice to finally get this all nailed down.

Helping Folks Avert Disaster – It Could be a Full-Time Job

Tuesday, June 14th, 2011

This morning I had a nice, long talk with one of the developers in the shop about a project he was going to be working on. The problems I saw in his design were significant. The application is non-trivial, I grant you, but that should have been an indicator to him that the simple solution of a few structs in a trie wouldn't really be sufficient. Because of the way the feed comes from the exchange, we really need a far more complex data structure. It's going to make things much faster, but it's going to take a significant amount of time to build.

He was going to build a train wreck, and I hope we averted it. We'll see. There's a lot that needs to be built. I'm sure this isn't the last time we'll talk about it.

Problems are Solved by People that Show Up

Tuesday, June 14th, 2011

This morning I found myself sitting here with nothing at all to do because one of the systems I depend on wasn't up, and there was no one here who knew how to get it up. That's a common problem for me, and in the past, what I've done is to learn how to bring these things up, and then bring them up myself. It always reminds me of the line from The West Wing that I heard so many years ago: People... Problems are solved by people that show up.

I could not agree more. Show up. Solve problems. After the day I wasted yesterday on this horrible double-bug, I'm not one to say bad things shouldn't happen - they do. Period. But it's the showing up that's really the key. If there's a problem, then by showing up, you can help solve it.

And showing up after it's solved is like not showing up at all.

When Two Wrongs Make a Right – Finding a Nasty Bug

Monday, June 13th, 2011

Today I've spent all day tracking down the most devilish bug in my code - yup, right there in my code. The reason I didn't see it right away is that this code hasn't changed in several days, and it's been working perfectly for quite a while. But the trick was that it unknowingly depended on another bug - one that was fixed yesterday evening - and once that was fixed, my bug became a real bug. Figuring this out was a painful, laborious task.

The set-up for the way it used to 'work' was that I had two services, on two boxes, and each service was hosted by a Broker:

(Figure: The old, broken way)

The client would randomly contact one of the locator services - most likely the one on the same box it was running on, but there was no guarantee of this. For the sake of example, let's say it hits the red box. The locator service is asked "can you handle this symbol?", and if it can, it responds to the client immediately. If it can't, it asks the Broker to list all services that start with the same name, and then proceeds to ask each if they can handle the symbol.

The red locator hits the blue locator, and since it's got to be one or the other, it answers pretty quickly. So where's the bug? Well... the first piece of it is that if there are two services with the same name, we should 'prefer' the service on the same host as the client. This minimizes the network traffic and keeps things "local" as much as possible. You can see it coming, can't you?

With the preference set to local services, the red locator will ask the Broker for all similarly named services and get - you guessed it: itself! This places it in an infinite loop - but with boost asio, there's only one thread to process things, and that one thread can't receive and send at the same time, so we lock up.

(Figure: Just One Error)

So the fix was simple - don't ask for all similarly named services - make sure you exclude yourself! With this simple one-line fix, everything worked again. It just took a complete day of figuring out where the problem lived to find the one line that would do the trick. Ick.
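
In sketch form, the shape of that fix looks like this - all the names here are hypothetical stand-ins (the real code is asking the Broker for live services, not scanning a vector):

  #include <cstddef>
  #include <string>
  #include <vector>

  // Toy stand-in for a hosted locator service in this sketch.
  struct Service
  {
    std::string  name;
    bool         handlesIt;    // toy stand-in for the real capability check
    bool canHandle(const std::string &) const { return handlesIt; }
  };

  // Ask each similarly named service if it can handle the symbol -
  // excluding ourselves. The missing 'skip yourself' test was the bug.
  const Service *pickHandler(const std::vector<const Service *> &similar,
                             const Service *self, const std::string &symbol)
  {
    for (size_t i = 0; i < similar.size(); ++i) {
      if (similar[i] == self) {
        continue;              // never ask yourself - this breaks the loop
      }
      if (similar[i]->canHandle(symbol)) {
        return similar[i];
      }
    }
    return NULL;
  }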

Creating an Exchange Feed Recorder

Friday, June 10th, 2011

This afternoon, while I was watching my Greek Engine just hum along as pretty as you please, I decided it was time to get busy on some of the lesser projects that needed to be done. One of the first was an exchange feed recorder. Basically, we needed to just take all the UDP datagrams, tag them with the time they arrived, and write it all out in some manner that makes it not too horribly difficult to read and subsequently process.
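
Framing each record is the heart of it. Something along these lines - where the layout (arrival time in microseconds, payload length, raw bytes) is my illustration, not necessarily the exact format I settled on:

  #include <cstdio>
  #include <stdint.h>

  // Hypothetical framing for one recorded datagram: 8-byte arrival time
  // (usec since epoch), 4-byte payload length, then the raw UDP payload.
  bool writeDatagram(FILE *fp, uint64_t arrivalUsec,
                     const void *data, uint32_t len)
  {
    return fwrite(&arrivalUsec, sizeof(arrivalUsec), 1, fp) == 1
        && fwrite(&len, sizeof(len), 1, fp) == 1
        && fwrite(data, 1, len, fp) == len;
  }

With a fixed little header like that, a reader can skip, scan, or replay the day's traffic without any real parsing work.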

Interestingly enough, this wasn't all that hard. The hardest part was writing the file out. The basics were already there for me in what I'd done already, and I just had to spend a few hours getting everything cleaned up and ready to go.

Now all I need to do is test it. Pretty nice.