Archive for January, 2012

Firing Up C++ Functors – Beautiful Solution to the Problem

Tuesday, January 24th, 2012

This morning I've been battling a really nasty problem in my Greek Engine - at the start of the business day, I need to clear out the previous day's values for some daily summary values - like open/high/low. I do that fine in the instruments themselves, but there's a component in the engine that listens to all trades and creates this summary data. And it wasn't getting cleared out. So when I reloaded all the instruments at the beginning of the day, and flushed all the messages from the exchanges to them, these "lingering" values from the previous day were slipping in.

Ideally, we'd just clear out these values, and I could just clear out the cache in the component. But then I'd lose the last valid trade as well - and that I don't want to lose. So I really need to operate on all the messages in the component. This seemed to scream "Iterator!" to me, and I spent several hours on it. The problem is not with the basic iterator concept, it's that we're trying to iterate over a trie, and that's not really all that easy when your trie is lockless, and it's possible for someone to change the data as soon as you're sure it's there.

Yeah, it's nice to think of an iterator on the trie, but it's not really all that practical.

Sure didn't stop me from wasting several hours trying to get it to work, though.

What I really wanted was to be able to pass in some function to operate on all the messages in the trie, and then let the trie itself handle all the thread-safety issues. This seemed like a much better idea, so I started looking into something I haven't used up till now - functors.

The basic idea is just to have a simple class that has a few standard operators on it, and then use this base class to derive all the operations that you need. For example, in my case what I needed was to operate on Messages. Sure, I only needed to operate on one kind of Message - the Summary Message, but if I'm going to make this, I might as well make it capable of dealing with all the message functor usage I'm going to need.

So I create the following header:

  namespace msg
  {
  struct MessageFunction
  {
    // subclasses are used through this base, so keep destruction virtual
    virtual ~MessageFunction() { }

    // this is the overt method that can be called
    bool doIt( Message *aMsg );
    // …and this is the simple operator to make it look like a function
    bool operator()( Message *aMsg );
 
    // these are the methods to override for the different message types
    virtual bool doHello( message::Hello & aMsg );
    virtual bool doGoodBye( message::GoodBye & aMsg );
    virtual bool doQuote( message::Quote & aMsg );
    virtual bool doPrint( message::Print & aMsg );
  };
  }      // end of namespace msg

and I implemented all the specific message type methods so that if you don't need to implement it, you don't have to, and it'll just be a no-op on the message:

  namespace msg
  {
  // this is the overt method that can be called
  bool MessageFunction::doIt( Message *aMsg )
  {
    bool      error = false;
 
    if (aMsg != NULL) {
      switch (aMsg->getType()) {
        case eHello:
          error = !doHello((message::Hello &)(*aMsg));
          break;
        case eGoodBye:
          error = !doGoodBye((message::GoodBye &)(*aMsg));
          break;
        case eQuote:
          error = !doQuote((message::Quote &)(*aMsg));
          break;
        case ePrint:
          error = !doPrint((message::Print &)(*aMsg));
          break;
        default:
          error = true;
          break;
      }
    }
 
    return !error;
  }
 
  // …and this is the simple operator to make it look like a function
  bool MessageFunction::operator()( Message *aMsg )
  {
    return doIt(aMsg);
  }
 
  // these are the methods to override for the different message types
  bool MessageFunction::doHello( message::Hello & aMsg )
  {
    return true;
  }
 
  bool MessageFunction::doGoodBye( message::GoodBye & aMsg )
  {
    return true;
  }
 
  bool MessageFunction::doQuote( message::Quote & aMsg )
  {
    return true;
  }
 
  bool MessageFunction::doPrint( message::Print & aMsg )
  {
    return true;
  }
  }      // end of namespace msg

Now in my code, I only needed to implement a subclass that did the one thing I needed it to do - reset the values. In this case, I'll use the Quote as an example:

  struct QuoteClear :
    MessageFunction
  {
    virtual bool doQuote( message::Quote & aMsg );
  };

implemented as:

  bool QuoteClear::doQuote( message::Quote & aMsg )
  {
    aMsg.clear();
    return true;
  }

Then all I needed to do was to add in an apply() method to the trie, and have it run through all the values and on each one that is non-NULL, call the functor. The signature for the apply() method is simple, and it makes it clear how to use it:

  virtual bool apply( MessageFunction & aFunctor );

This was an excellent use of the idea, as it allowed me to pass in an arbitrarily complex function to be operated on a general Message, and the apply() method simply knows how to scan its internal data structure, and call this application as needed. Very sweet.
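To make the shape of this concrete, here is a minimal sketch of how an apply() like this might walk its values. It is not the real trie: a plain std::vector stands in for the lockless trie (so there is no thread-safety handling at all), the Message, Quote and Print types are cut-down stand-ins, and MessageStore is a hypothetical name. Only MessageFunction, doQuote, QuoteClear and apply() come from the code above.

```cpp
#include <cstddef>
#include <vector>

// Stand-in message types; the real msg::Message hierarchy is not shown here.
enum MessageType { eQuote, ePrint };

struct Message {
  virtual ~Message() {}
  virtual MessageType getType() const = 0;
};

struct Quote : Message {
  double bid;
  double ask;
  Quote(double aBid, double aAsk) : bid(aBid), ask(aAsk) {}
  virtual MessageType getType() const { return eQuote; }
  void clear() { bid = 0.0; ask = 0.0; }
};

struct Print : Message {
  double price;
  explicit Print(double aPrice) : price(aPrice) {}
  virtual MessageType getType() const { return ePrint; }
};

// Trimmed-down functor base: dispatch on type, no-op unless overridden.
struct MessageFunction {
  virtual ~MessageFunction() {}
  bool operator()(Message *aMsg) {
    if (aMsg == NULL) return true;   // nothing to do is not an error
    switch (aMsg->getType()) {
      case eQuote: return doQuote(static_cast<Quote &>(*aMsg));
      case ePrint: return doPrint(static_cast<Print &>(*aMsg));
    }
    return false;                    // unknown message type
  }
  virtual bool doQuote(Quote &) { return true; }
  virtual bool doPrint(Print &) { return true; }
};

struct QuoteClear : MessageFunction {
  virtual bool doQuote(Quote &aMsg) { aMsg.clear(); return true; }
};

// Hypothetical container standing in for the trie; it does not own the messages.
class MessageStore {
public:
  void add(Message *aMsg) { mSlots.push_back(aMsg); }

  // Visit every non-NULL slot; report failure if any single call failed.
  bool apply(MessageFunction &aFunctor) {
    bool ok = true;
    for (std::size_t i = 0; i < mSlots.size(); ++i) {
      if (mSlots[i] != NULL && !aFunctor(mSlots[i])) ok = false;
    }
    return ok;
  }

private:
  std::vector<Message *> mSlots;
};
```

The caller only ever writes the functor. Where the values live, and how they are protected while being visited, stays inside the container.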

While it's not as generally useful as an iterator on a trie, it's perfectly suited to what I needed. Love the idea.

Knowing What to Do is Harder than it Looks

Monday, January 23rd, 2012

I'm sitting at my desk (to call it a desk is an overstatement - it's a 6 ft. section of a 24 ft. table) this morning and trying to come to terms with what I should be doing next. And 'next' is not always defined to be the very next thing I'm working on. No, this is a more long-term 'next', like What am I doing here? and Is this place the right place for me? kind of 'next' questions. The Shop is undergoing a lot of changes lately, and I'm feeling more isolated than I've felt in a while. It's not comfortable, and I'm just not sure why I'm subjecting myself to this - long-term. After all, in this market, in this day, the skills I have are in high demand, and there's no shortage of places to work.

So I'm sitting here, looking at my code - very happy with it, but fundamentally wondering if this is the right place for me. What should I really be doing? I'm not really advancing my career here. Not that it's that important to me. But still… I'm helping them out, I'm getting more years of experience in this industry, but I'm not really getting myself any closer to what I want to be doing.

Or maybe I am?

I just want to do fun, interesting things with fun, interesting people. I believed this was the place when I came here almost two years ago, and I've certainly built some amazing systems while I've been here. So I've done the fun and interesting code, but the fun and interesting people have been pulled away from me. The guy that I came here to work with became the Head of IT, and I don't get to work with him any more. The guy I worked with for a few good months got pulled away to be the head of a large development group, and so I don't get to work with him either.

So it's like the fun and interesting people are here, and I did get to work with them, but as soon as I did, they seemed to get pulled away into management duties. I don't begrudge the organization pulling good devs out of the trenches, and making them management. It's what most organizations do - right or wrong. I support my friends that have moved away from development into these management roles because if they are happy, then I'm happy for them.

But with each move, I feel all that much more isolated. What's the point of staying here if there are a few guys I want to work with but will never again get the chance to work with?

That's my Big Next Question.

I have no idea. I sure wish I knew.

Sampling Intervals and the Dangers of Common Sense

Friday, January 20th, 2012

This afternoon I've spent quite a bit of time working on a few issues that popped up in today's testing of the greek engine. One was a bad copy constructor that was leading to bad calculated values, and another was the calculation of the high and low for a composite instrument. The guy doing QA looked at the formula for the composite:

  Value = (Comp1 * 0.700) + (Comp2 * 0.400) + 0.55

and thought that the high/low on the day should be the simple application of this formula to the individual high/low values for the components. But that is not the case. The computed value of this function is a time-based function and unless the components are perfectly correlated, the high of one will not coincide with the high of the other. This means that the computed high will be the largest value of the computed quantity, and will most likely be less than the "expected" high value.
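A small sketch of the difference, assuming the formula combines two distinct components (called comp1 and comp2 here) that are observed at the same instants. The function names and the tick values in the usage note are made up for illustration; only the weights 0.700, 0.400 and the offset 0.55 come from the formula above.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Composite formula with the weights and offset quoted in the post.
inline double composite(double aComp1, double aComp2) {
  return aComp1 * 0.700 + aComp2 * 0.400 + 0.55;
}

// Observed high: the largest composite value over paired observations.
// Returns 0.0 if there are no observations.
inline double observedHigh(const std::vector<double> &aComp1,
                           const std::vector<double> &aComp2) {
  const std::size_t n = std::min(aComp1.size(), aComp2.size());
  double high = 0.0;
  for (std::size_t i = 0; i < n; ++i) {
    const double v = composite(aComp1[i], aComp2[i]);
    if (i == 0 || v > high) high = v;
  }
  return high;
}

// Simplified high: the formula applied to each component's own high.
// Both vectors must be non-empty.
inline double formulaHigh(const std::vector<double> &aComp1,
                          const std::vector<double> &aComp2) {
  return composite(*std::max_element(aComp1.begin(), aComp1.end()),
                   *std::max_element(aComp2.begin(), aComp2.end()));
}
```

With comp1 ticking 10, 12, 11 and comp2 ticking 20, 18, 21, the composite takes the values 15.55, 16.15 and 16.65, so the observed high is 16.65. The formula applied to the component highs (12 and 21) gives 17.35, a value the composite never actually reached. Because both weights are positive, the formula-on-highs number can never be below the observed high.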

But it gets more interesting…

Because of the speed of ticks, we accurately track all trades for the determination of the high and low. However, downstream, we conflate the messages, so that when the calculation is actually performed, it may only be once every 100 trades (depending on volume). This means that the formula is not accurately calculating every value of the function - only those that it needs, and that, too, will affect the high/low as shown.
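The sampling effect can be sketched the same way. This is a simplification, not the engine's conflation logic: it just assumes the calculation sees every Nth value of the series, and sampledHigh is a hypothetical name.

```cpp
#include <cstddef>
#include <vector>

// High over every aStride-th value, starting with the first one.
// Models a calculation that only runs on some of the updates.
// Returns 0.0 for an empty series; a stride of 0 is treated as 1.
inline double sampledHigh(const std::vector<double> &aValues, std::size_t aStride) {
  if (aStride == 0) aStride = 1;
  double high = 0.0;
  bool   seen = false;
  for (std::size_t i = 0; i < aValues.size(); i += aStride) {
    if (!seen || aValues[i] > high) {
      high = aValues[i];
      seen = true;
    }
  }
  return high;
}
```

For the series 16.0, 16.9, 16.2, 16.1, 16.5, 16.3, a stride of 1 sees the true high of 16.9, a stride of 2 reports 16.5, and a stride of 3 reports 16.1. The sampled high can only match or undershoot the real one.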

All this is just too confusing for the traders to cope with. So I used the equation:

  Value(high) = (Comp1(high) * 0.700) + (Comp2(high) * 0.400) + 0.55

and decided it was better to have a clear, explainable value, than risk the confusion and effort of having to explain it to all the traders several times.

Pricing Systems are Details, Details, Details (cont.)

Friday, January 20th, 2012

OK, I just finished the latest fixes for today, for testing Monday. I know it's the way it goes, but it's still tough. I'd really like to know it's fixed, but the best I can do is run it through in my head and be sure that, at least there, it's right. I then have to hope that my mental picture of the system is accurate. Sometimes yes, sometimes no.

Today's issue was really pretty significant. It appeared as though the previous close and adjusted previous close weren't being loaded. But I'd tested that code, and I knew in my tests it was working. Something was amiss.

So I resorted to logging, and in that I realized that the problem was systemic. I was indeed loading them properly, but then the first message was clearing out the summary data, that contained these values. Therefore, it appeared that the values weren't set when they had been set. So what was the problem? I was resetting all parts of the summary data when all I wanted to clear were the open/close/high/low. Ah! Simple fix.

But what about the reloading of an instrument where we replayed the messages from the previous day and the last trade last night was then setting our new high/low/close? Hmmm… this is just as obvious, but it's a lot less clear how to fix it. We can't block the messages, they are needed. Also, we can't stop the resetting process, that's crucial.

What I came up with was a very surgical change: if the trade was yesterday (i.e. not today), then do not allow it to update the summary values of open/high/low/close. Simple and surgical. This is the complete context of the problem. I'm hoping that there are no unintended consequences on Monday, but we'll have to wait until then to see. I've run it through my head so many times I can't see a hole in the logic.
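Both fixes fit in a few lines. This is only a sketch of the idea, not the engine's code: Trade, DailySummary and their fields are hypothetical, dates are plain YYYYMMDD integers, and keeping the last trade price from a replayed trade is my assumption about what "needed" means for those messages.

```cpp
// Dates are plain YYYYMMDD integers here; a hypothetical representation.
struct Trade {
  int    date;
  double price;
};

struct DailySummary {
  int    today;
  bool   tradedToday;
  double open, high, low, close;   // reset every morning
  double lastTrade;                // kept across the reset
  double prevClose;                // kept across the reset

  explicit DailySummary(int aToday)
    : today(aToday), tradedToday(false),
      open(0.0), high(0.0), low(0.0), close(0.0),
      lastTrade(0.0), prevClose(0.0) {}

  // Start-of-day reset: clear only the daily values, nothing else.
  void resetDaily(int aToday) {
    today = aToday;
    tradedToday = false;
    open = high = low = close = 0.0;
  }

  void onTrade(const Trade &aTrade) {
    lastTrade = aTrade.price;           // the message itself is still used
    if (aTrade.date != today) return;   // replayed trade from a prior day
    if (!tradedToday) {
      open = high = low = aTrade.price;
      tradedToday = true;
    } else {
      if (aTrade.price > high) high = aTrade.price;
      if (aTrade.price < low)  low = aTrade.price;
    }
    close = aTrade.price;
  }
};
```

A replayed trade dated the previous session updates lastTrade but leaves open/high/low/close untouched, and resetDaily() leaves prevClose alone.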

I have my fingers crossed.

Updated the MFDS Codec for Ticker Plants

Friday, January 20th, 2012

This morning I needed to finish up just two new messages in the NASDAQ MFDS data feed - the two Corporate Action messages. I had built the MFDS codec for the Ticker Plants more than a year ago, but then sometime in the middle of 2011 they changed the feed and added six new messages and dropped four, I think. In any case, it was a big change, and I spent several hours with the docs and my code to put in the new messages, and chop out the old.

In the end, it was nice to know that it took less than a day to get the new messages in, and decoding happily. Getting mutual funds will be nice, too. Just don't have a need for them right this second.

Pricing Systems are Details, Details, Details

Thursday, January 19th, 2012

High-Tech Greek Engine

I've been here before, and yet it doesn't make this process any easier. When you get a new pricing/calculation system into users hands, the thing you spend a ton of time on is the testing of the prices. Open prices. Closing prices. Volumes. All are exceptionally important to traders, and when they have an existing system to compare to, it's even harder.

So it comes as no surprise to me that now that we're in that phase of my greek engine, that's what I'm doing. The really annoying part is that I can only test most things once a day. Opening prices? That's only one time. Closing prices? Again, once. It's hard to say "OK, I'm on this, have a change and you'll see it tomorrow". It can't be avoided, but it's annoying to me nonetheless.

Clearly, I'm an impatient person - I write code for a living. If I had patience, I'd be back in VLSI design. There, you test something a lot in simulation, and then build it once a month. No, that won't do at all.

So I'm slugging out changes with the open/close/high/low right now, and it's a big puzzle. Most of the code is working perfectly, but some edge cases are causing problems. You don't want to change too much, but you have to make a nice surgical cut and put in the code that will have the effect you need. It's a lot of thinking and going over possibilities in your head, and then writing two lines of code.

Slow going, but if it's progress, at least it's one less thing to hassle with.

iBooks Author and Book Publishing

Thursday, January 19th, 2012

IBooks Author

Today at their little press get together, Apple released iBooks Author and I have to say, I'm a little envious of my teacher friends that get to use this to make their notes and ideas come alive. When I was back at Auburn University, I taught Electrical Engineering, and every class I taught, I always made really extensive notes to make sure that my lectures all "worked", and fit in the time for each class. Sure, some were a little long, but not too many, and after the first few weeks of teaching, you get to know how much you can cover.

The beauty of iBooks Author is that I could do this same thing on my Mac, give it to the students, and then there's never a reason to take notes. It's all there. Sure, it's maybe not as polished as some books, but it's there, it's free, and students would eat it up because it means never making mistakes in the copying of examples, all the things I might skip through in class would be laid out there for everyone to see.

Plus, it's possible to put multimedia in the books, and I could put simulation results, visualization movies, all the things that I had to hope they'd get from doing this on their own.

I got iBooks Author today because I'm convinced that even if I don't do a lot with it, I want to do a lot with it, and that's the important thing. Who knows? Maybe it'll get me back in the classroom.

Fixing Tricky JSON Decoding Issues

Thursday, January 19th, 2012

This morning I've been fighting a problem that one of my web clients has been seeing. They are hitting the greek engine and trying to calculate implied vols for given option quotes (bid/ask). When they send me a value like 25.21, it works well, but 26.00 fails. Makes no sense to me, but then again, maybe I'm missing something.

So I make a little C++ test case and it checks out nicely. So far so good. I can put in just about any value and get reasonable values out. Nice. So what's the problem?

I ask the developer to send me exactly what he's sending me, just to make sure I know what it is, and that it's formatted properly. Here's what I got:

  { "values" : { "O:AAPL:20120121:400.00:C" : { "bid.price" : 26 }},
    "instruments" : ["O:AAPL:20120121:400.00:C"] }

and the second I saw the request, I knew what it was: JSON doesn't like unnecessary data in its encoding. The number 26 was sent as 26 and not 26.00. The Broker then looked at this from JSON as an integer and it sent it to me as such. I was expecting a double, and so I passed on the value.

Obvious!

To clear up the problem I changed my code from:

  if (key == "bid.price") {
    freeze();
    mQuote.bid.price = dbl2int64((double)val);
  }

where val is the variant, and we're using the casting operator of the variant, to:

  if (key == "bid.price") {
    freeze();
    mQuote.bid.price = dbl2int64(toDouble(val));
  }

where:

  static inline double toDouble( const msg::ng::variant & aValue )
  {
    // NAN is returned for any type we can't convert
    double    retval = NAN;
    if (aValue.isDouble()) {
      retval = (double)aValue;
    } else if (aValue.isInteger()) {
      retval = (int64_t)aValue;
    } else if (aValue.isString()) {
      retval = atof(((std::string &)aValue).c_str());
    }
    return retval;
  }

Now I am sure the casting is being done properly, and the user can send me integers, doubles, and even strings that format into integers or doubles, and I don't have to make them deal with it. This is the kind of bug I like to find. The solution makes it even better than before. Nice.

Great Description of SOPA/PIPA

Thursday, January 19th, 2012

Government and Laws

This morning Boing Boing had a link to this wonderful explanation of the creepiest aspects of the SOPA/PIPA legislation that's working its way through the House and Senate. As with the other Khan Academy videos I've seen, it's excellent - what's new is only the subject matter. And in this case, the essential points of this very bad legislation.

What amazes me is not that this was written - government is pushed by money, that's a given. No, it's that they made it like so much other legislation with regards to illegal activity - an accomplice is as guilty as the perpetrator. But in the world of computers, you can easily force an accomplice without their knowledge - posting to a blog, posting in general.

That's the thing that I think the law hasn't caught up to: The ability for someone to implicate another so easily. It's got to be understood that there's a difference between supporting the activity, and supporting the mechanism of the support (posting). But I have no belief that they'll get this right in the next 20 years. It's just not their technology.

So I'm sad to say that I'll live in a time when the law makers are hopelessly out of date with the technology I make a living from. Sad.

Got My Archive Server Working

Wednesday, January 18th, 2012

Building Great Code

Today I finally had time to devote to my archive server to get the final query form working the way I wanted. The server is really the reader part of a reader/writer pair, where the Feed Recorders are the writers of the data. The recorders are very simple little apps - they use the basic framework I've built for the exchange feed processing, and then instead of processing the datagrams, they simply buffer them and then every 30 mins or 10MB, they are written to disk in a directory structure that includes the feed name, the side, and the date. The point is that when we go to read the data, we don't want to have to look at thousands of files to get the few we want, so using directories is a very good plan.
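As a rough sketch of the recorder's write policy: the thresholds are the ones mentioned above, but reading "10MB" as 10 * 1024 * 1024 bytes, flushing on whichever limit is hit first, and the exact directory order are all assumptions, as are the names shouldFlush and archiveDir.

```cpp
#include <cstddef>
#include <string>

// Flush thresholds: 30 minutes or 10 MB of buffered datagrams.
const long        kFlushSeconds = 30 * 60;
const std::size_t kFlushBytes   = 10u * 1024u * 1024u;

// True once either limit has been reached.
inline bool shouldFlush(std::size_t aBufferedBytes, long aSecondsSinceFlush) {
  return aBufferedBytes >= kFlushBytes || aSecondsSinceFlush >= kFlushSeconds;
}

// Hypothetical layout: <root>/<feed>/<side>/<date>
inline std::string archiveDir(const std::string &aRoot,
                              const std::string &aFeed,
                              const std::string &aSide,
                              const std::string &aDate) {
  return aRoot + "/" + aFeed + "/" + aSide + "/" + aDate;
}
```

With a layout like this, a reader that knows the feed, side and date can go straight to one directory instead of scanning every file the recorders have ever written.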

Once these files are written, it's a simple matter of reading them and parsing the datagrams into the messages and then serving them up. By having this stored in smallish files, it makes it easy to cache these files-turned-messages in the archive server. The only key left is to make the server smart about the requests it gets.

The current format of the requests is pretty simple: feed name, side, starting and ending times, list of instruments to return, types of messages to return, and an optional sequence number. The main bulk of the requests won't use the sequence number, and that's OK - it's for special requests that I just got done finishing, but more on that later. The vast majority is really about a time range and a message type: "Give me all the Quotes for IBM from 10:30 to 10:45" - that kind of stuff. For this, the service was working pretty well.
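A request like that could be modelled roughly as below. The field list follows the description above, but the struct names, the string-typed message types, the seconds-since-midnight times, the half-open time range, and "empty set means all" are illustrative assumptions, not the server's actual wire format.

```cpp
#include <set>
#include <string>

// Times are seconds since midnight; a hypothetical encoding.
struct StoredMessage {
  std::string   instrument;
  std::string   type;
  long          time;
  unsigned long sequence;
};

struct ArchiveRequest {
  std::string           feed;         // feed and side pick the files to read
  std::string           side;
  long                  startTime;    // inclusive
  long                  endTime;      // exclusive
  std::set<std::string> instruments;  // empty means "all"
  std::set<std::string> types;        // empty means "all"
  bool                  hasSequence;  // the sequence number is optional
  unsigned long         sequence;

  ArchiveRequest()
    : startTime(0), endTime(0), hasSequence(false), sequence(0) {}
};

// Does a decoded message belong in the response to this request?
inline bool matches(const ArchiveRequest &aReq, const StoredMessage &aMsg) {
  if (aMsg.time < aReq.startTime || aMsg.time >= aReq.endTime) return false;
  if (!aReq.instruments.empty() &&
      aReq.instruments.count(aMsg.instrument) == 0) return false;
  if (!aReq.types.empty() && aReq.types.count(aMsg.type) == 0) return false;
  if (aReq.hasSequence && aMsg.sequence <= aReq.sequence) return false;
  return true;
}
```

"All the Quotes for IBM from 10:30 to 10:45" then becomes a request with startTime 37800, endTime 38700, one instrument, one type, and no sequence number.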

But there was a slight hitch - what if the request wasn't on the filesystem? What if it was sitting in the recorder buffered up, waiting to be written out? Well… then I had to put in a scheme where the recorders were actually services themselves. The recorders would then answer a simple request: give me your data. The archive server can then see if the request is fulfilled by the filesystem data, and if not, it'll go to the appropriate recorder service, ask it, and augment the response as necessary.
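The fallback logic can be sketched as follows. Two in-memory vectors stand in for the files on disk and the recorder's unwritten buffer, and the "flushed through" timestamp is a hypothetical way of knowing where the files end. The real thing is a service call to the recorder, which is not shown.

```cpp
#include <cstddef>
#include <vector>

struct Datagram {
  long time;
  int  payload;
};

// Everything in aSource with a time in [aStart, aEnd).
inline std::vector<Datagram> slice(const std::vector<Datagram> &aSource,
                                   long aStart, long aEnd) {
  std::vector<Datagram> out;
  for (std::size_t i = 0; i < aSource.size(); ++i) {
    if (aSource[i].time >= aStart && aSource[i].time < aEnd) {
      out.push_back(aSource[i]);
    }
  }
  return out;
}

// The files hold everything before aFlushedThrough; anything newer is still
// sitting in the recorder. Only ask the recorder when the request reaches
// past what has been written, then append its data to the response.
inline std::vector<Datagram> fetch(const std::vector<Datagram> &aOnDisk,
                                   const std::vector<Datagram> &aRecorderBuffer,
                                   long aFlushedThrough,
                                   long aStart, long aEnd) {
  std::vector<Datagram> out = slice(aOnDisk, aStart, aEnd);
  if (aEnd > aFlushedThrough) {
    const long from = (aStart > aFlushedThrough) ? aStart : aFlushedThrough;
    const std::vector<Datagram> tail = slice(aRecorderBuffer, from, aEnd);
    out.insert(out.end(), tail.begin(), tail.end());
  }
  return out;
}
```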

It was pretty neat.

It was also pretty fast, which is nice.

The final thing I needed was to have a time/sequence number request for restarting the feeds and greek engine. Basically, if the server goes down, even if it's got a saved state, there will be some time between the last save and the time it's back up and processing messages where it's lost the data and it doesn't have any way to get it.

Enter the time/sequence number request.

When the server gets back on its feet, it can look at the last data it has in the saved state, and then issue a request to the archive server and say "Hey, send me everything you have after this sequence number, which is about this time". Processing the returned messages means that the server will be able to catch up the lost messages, and if they aren't needed - no big deal, we'll throw them away. But if they are needed, then we have them.
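On the restarting side, the "throw them away if they aren't needed" step might look like this sketch. Replayed and catchUp are hypothetical names, and it assumes sequence numbers only ever increase within a feed, with no wrap-around or per-instrument numbering.

```cpp
#include <cstddef>
#include <vector>

struct Replayed {
  unsigned long sequence;
  long          time;
};

// Keep only messages newer than the last one already applied, and advance
// the last-applied sequence number as we go. Anything at or below it was
// already in the saved state and is dropped.
inline std::vector<Replayed> catchUp(unsigned long &aLastSequence,
                                     const std::vector<Replayed> &aReplayed) {
  std::vector<Replayed> needed;
  for (std::size_t i = 0; i < aReplayed.size(); ++i) {
    if (aReplayed[i].sequence > aLastSequence) {
      needed.push_back(aReplayed[i]);
      aLastSequence = aReplayed[i].sequence;
    }
  }
  return needed;
}
```

If the saved state ends at sequence 10 and the archive server returns 9, 10, 11 and 12, only 11 and 12 are applied and the last-applied number moves to 12.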

Well… today I finished the archive server part. I haven't worked in the feeds and engine requesting the data, but that shouldn't be too hard. I'm in the middle of trying to get a lot of little things fixed up for the greek engine in testing, so I'm liable to hold off a bit before pushing ahead with that feature. But it feels really good to get this part done and in the can.