Archive for the ‘Coding’ Category

Of Setbacks, Redirections and Fantastic Days

Wednesday, January 5th, 2011

Ringmaster

Today was one of those days that you remember for a long time. It has re-defined my ideas of flexibility and gifts presenting themselves as problems in disguise. It's been a big day, but it started out small, and got ugly before it turned around. And all this is a matter of perspective. Mine.

The day was going to be a continuation of the work done yesterday on the Broker by some of the guys in the group. It's undergone a massive re-write in order to give it better functionality and speed, and it's just not quite all back yet. So I thought it'd be a simple thing to get it all finished today. Unfortunately, I'm a user of the Broker, not one of its developers, and while I could possibly dig into the code, it's not what I should be doing. At least not today.

So I'm waiting for the lead developer on the Broker to get in, and talking to a few of the other developers in the meantime. It's still broken like it was late last evening, but it's not horrible... it'll get fixed today.

Well... maybe.

Cake.jpg

Turns out, it's the lead developer's birthday. Yeah... and he's not coming in. FANTASTIC! What am I going to do all day? I can't run any of my ticker plants as the Broker is an integral component of the way they load their configuration, etc. and without the Broker, it's impossible to run anything I have.

I'm dead in the water for an entire day. I hate being unproductive for an entire day. I might as well be home, but I can't leave... I have to stay and try to make the best of it.

Really crappy start to the day.

But then I realize that I might as well look at a few things to see what I can get done that has nothing to do with the running ticker plants. I look at my little notes and see that I might be able to have a look at my SecurityID code to see if there's possibly room in the 128-bit integer for a few more bits.

The problem I was facing was that I needed to have a conflation key for each message coming out of the ticker plant. This key needed to be message-specific, and include (at a minimum) the following components:

  • The security key, or complete name of the instrument. This has to include all the relevant facts about the instrument (in the case of an option) and it needs to be two-way - meaning lossless and easily recovered.
  • The type of the message - Quote, Print (Trade), etc.

and in the case of a Quote from certain feeds, we needed to include:

  • The Exchange the quote is coming from.

The point is that two quotes for the same instrument from different exchanges cannot conflate one another. They have to remain separate. This is the big reason for the ConflationKey to be defined in terms of the message, as the Quote message needs to add the Exchange to the key, and the other messages don't.
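To make that concrete, here's a minimal sketch of per-message conflation keys - the class names and type tags are made up for illustration, and the real SecurityID is 128 bits, not the 64-bit stand-in used here:

```cpp
#include <cstdint>
#include <cassert>

// Illustrative sketch only - not the actual ticker plant classes.
struct Message {
    uint64_t securityID;                         // stand-in for the 128-bit ID
    virtual uint64_t conflationKey() const = 0;
    virtual ~Message() {}
};

// A trade: the key is just the instrument plus a type tag...
struct Print : Message {
    uint64_t conflationKey() const override {
        return (securityID << 4) | 0x1;          // 0x1 = hypothetical Print tag
    }
};

// ...but a Quote also folds in the exchange, so quotes for the same
// instrument from different exchanges never conflate one another.
struct Quote : Message {
    uint8_t exchange;
    uint64_t conflationKey() const override {
        return (securityID << 8) | (uint64_t(exchange) << 4) | 0x2;  // 0x2 = Quote tag
    }
};
```

Since only the Quote knows it needs the exchange, defining the key per message type keeps the other messages' keys small.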

Anyway... I was starting to look at the key and saw that it seemed to be using up all 128 bits for even the simplest of stocks. This shouldn't be. So I looked at the constructor for the SecurityID class, and its default constructor was blank:

  SecurityID::SecurityID() :
    UUID()
  {
  }

but the default constructor for the UUID class is to fill it with a random UUID pattern:

  UUID::UUID() :
    UINT128()
  {
    // simple... fill in this bad-boy with 16-bytes of something
    fill();
  }

Lo and behold... that was the problem. I needed to simply clear out the 128-bit int for the SecurityID class, and things would be fine.

  SecurityID::SecurityID() :
    UUID()
  {
    // make sure we start with 0 - like any other integer you might use
    clear();
  }

Sure, we end up calling fill() and then clear(), but it's only on construction, and it's for the sake of the natural state of the two classes.

Now, at least, I was seeing the "empty" space in the SecurityID code. What I found was really amazing - there was tons of space left in even the most complex of instrument names. All I needed for the message type was 4 bits, and if there was space (and there was), another 4 bits for the exchange code.

smiley.jpg

This was a major bonanza! I had several nibbles to work with and I only really needed two! I was then able to go back into all my code and where I had used the SecurityID and Type as two separate values, I could drop the Type as it was now embedded in the SecurityID! This was going to make several things in the code a lot faster as I don't have to deal with std::pair of the two values.
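As a rough sketch of the packing - the nibble positions and the split into two 64-bit halves are assumptions for illustration, not the actual SecurityID layout - stashing the type and the exchange each takes one nibble of masking and shifting:

```cpp
#include <cstdint>
#include <cassert>

// Hypothetical layout: the two spare nibbles live at the bottom of the
// low 64-bit block. The real SecurityID's free bits may sit elsewhere.
struct SecID128 {
    uint64_t hi;   // upper 64 bits of the 128-bit ID
    uint64_t lo;   // lower 64 bits - assumed to have free space at the bottom

    void setType(uint8_t t) {
        lo = (lo & ~uint64_t(0xF)) | (t & 0xF);                        // nibble 0: message type
    }
    uint8_t type() const { return lo & 0xF; }

    void setExchange(uint8_t e) {
        lo = (lo & ~(uint64_t(0xF) << 4)) | (uint64_t(e & 0xF) << 4);  // nibble 1: exchange
    }
    uint8_t exchange() const { return (lo >> 4) & 0xF; }
};
```

With the type embedded like this, the SecurityID alone can serve as the conflation key - no std::pair needed.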

Then I could add in the Exchange to the ConflationKey for the Quote and now I could get the one most worrisome feature into the code - that of proper conflation of the Quote messages. But I didn't stop there, I was on a roll.

I ended up cleaning out a lot of the code based on the non-embedded values in the ConflationKey, and with a little extra work, packaged up the OPRA Quotes into multiple messages allowing for the additional BBO appendages. It's nasty, but rather than have an Option NBBO Engine, we can use the data coming from OPRA for the BBO and send out messages for it. It's not 100% accurate as the sizes aren't right, but it's faster and cheaper than putting up the hardware for the Option NBBO server.

In the end, I had made significant progress on some very nasty problems with the ticker plant, and all because I was forced to look at the SecurityID bits because I didn't have "real work" to do. I have a new respect for the subtle ways of fate and how I really need to stay far more open and accepting of what appear to me to be setbacks, as they are often the greatest of opportunities in disguise.

Great lesson to learn.

Great day.

You Just Gotta Love GCC

Wednesday, January 5th, 2011

GeneralDev.jpg

This afternoon I was working on my unsigned 128-bit integer class, UINT128, and realized that I needed to be able to swap byte orders in place, and so I wrote a little method on the class:

  #include <byteswap.h>
 
 
  void UINT128::byteSwap()
  {
    mBlocks[0] = bswap_64(mBlocks[0]);
    mBlocks[1] = bswap_64(mBlocks[1]);
  }

and the thing I love about GCC today is that bswap_64() (from glibc's <byteswap.h>) compiles down to a built-in byte-swap instruction. Could it be any better? I did a little googling on byte swapping in GCC, and sure enough, there was exactly what I needed. Previously, I had been assigning the bytes individually, and while the compiler could optimize that, I'm guessing this is way faster.
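A quick sanity check of what bswap_64() actually does, using a standalone stand-in for the class (note: depending on how the blocks are laid out, a full 128-bit endian swap might also need to exchange the two blocks - this mirrors the method as written):

```cpp
#include <byteswap.h>   // glibc's bswap_64 - GCC turns it into a single bswap instruction
#include <cstdint>
#include <cassert>

// Standalone stand-in for the UINT128::byteSwap() above: reverse the
// byte order of each 64-bit block in place.
struct U128 {
    uint64_t mBlocks[2];
    void byteSwap() {
        mBlocks[0] = bswap_64(mBlocks[0]);
        mBlocks[1] = bswap_64(mBlocks[1]);
    }
};
```

A handy property: swapping twice restores the original value, so the method is its own inverse.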

I need to remember to google first. It's almost always worth it.

iTerm2 Alpha 15 is Out

Wednesday, January 5th, 2011

iTerm2

Well, this morning I got another little treat - iTerm2 Alpha 15 is out and has a rather impressive set of features (scraped from the Sparkle update dialog):

Alpha 15
This release has one big feature: split panes! You can divide a tab into rectangular regions, each showing a different session. Also, you can now save your window arrangement and have it automatically restored when you start iTerm2.

Big new features
- Horizontal and vertical split panes. Use Cmd-D to split vertically, Cmd-Shift-D to split horizontally, and Cmd-Opt-W to close a pane. Navigate panes with Cmd-Opt-Arrow keys (but note that you may have conflicting key mappings in your bookmark, as this had been a default setting).
- Save your window arrangement including window positions, size, tabs, and split panes. You can open new windows with the saved arrangement with a keypress or use a new preference ("Open saved window arrangement at startup") to do it automatically.

Small Enhancements
- Make window title and tab labels more configurable so you can remove window number and job name.
- Add global keybindings for cmd-arrow, cmd-pageup/pagedown, cmd-home/end to scroll.
- Make opt-arrowkey send proper escape codes.
- Preferences and Bookmarks windows can now be closed with Esc key.
- When the "hotkey" is pressed, prevent the running app from getting the keypress.
- Made Find With Selection (cmd-E) behave like a normal cocoa app.
- Allow the OS to pick the initial window position if smart placement is off and no saved location exists.
- Improved wording of context menus.
- Fixed context menu's "send email" feature.

Bugs fixed
- Fix arrow key behavior in paste history and autocomplete windows (bug 407)
- Various memory leaks fixed.
- Improved behavior of full-screen tab bar display.
- Fixed applescript bug with setting window size before adding session.

While I'm not a huge fan of the split screens, I have a lot of friends that swear by them - and interestingly enough, they are emacs users. Hmmm... coincidence? Maybe not.

In any case, it's a nice update because I can get rid of the instance number in the window title and just get what I want. Also, the saved window positions are as nice as Terminal.app's, and I don't have to deal with the scroll bars any longer. Unfortunately, the window positions don't include the Space they were on. I'm really hoping that Apple fixes this soon as it's one of the most annoying things about Terminal.app (after the mandatory scroll bars).

Working on Lisp-Like Parser Functions (cont.)

Tuesday, January 4th, 2011

Whew! It's been a heck of a day... again. Today I was able to get a complete first cut of the lisp-like parser all done and checked-in. As with yesterday, the bulk of the work was really on the variant class and then just a little work in the functions of the parser. Today it was math operations and equality tests. I didn't think about it at the time, but my equality tests were more restrictive than I really wanted.

For example, originally, if I had the code:

  variant   v(25.0);
 
  if (v == 25) {
    ...
  }

it would fail. Why? Because the variant is a 'double', and the test is an integer. I had two different data types for these - as opposed to one "number" type. While that seems like a little thing, it's not - the rounding and the storage, not to mention the serialization size, all favor heavily the two distinct types.

So I had to go in and re-write all the equality and inequality operators to take this into account. Not horrible, but again, it was a few hours to get everything done right.
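The gist of the fix, in a stripped-down sketch - a two-type toy, not the real variant with its dozen-plus types - is to promote both sides to double before comparing, so an integer test against a double-valued variant succeeds:

```cpp
#include <cstdint>
#include <cassert>

// Toy variant holding either an int64 or a double. The real class has
// many more types; this just shows the cross-type numeric comparison.
class variant {
public:
    explicit variant(double v)  : mIsDouble(true),  mDbl(v), mInt(0) {}
    explicit variant(int64_t v) : mIsDouble(false), mDbl(0), mInt(v) {}

    // compare as doubles so variant(25.0) == 25 holds
    bool operator==(int aValue) const    { return asDouble() == double(aValue); }
    bool operator==(double aValue) const { return asDouble() == aValue; }
    bool operator!=(int aValue) const    { return !operator==(aValue); }

private:
    double asDouble() const { return mIsDouble ? mDbl : double(mInt); }

    bool    mIsDouble;
    double  mDbl;
    int64_t mInt;
};
```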

Thankfully, it's all done now, and the tests run great.

Working on Lisp-Like Parser Functions

Monday, January 3rd, 2011

Today I've been hard at work filling in the parser I'm creating with the obvious collection of functions that any script is going to need. Things like add, subtract, multiply, divide... plus the more advanced math calculations - they all need to be done, but what makes this all slow going is the fact that to do this right, we need to implement these features in the variant class first, and then use them in the parser. In fact, using them in the parser is the easy part. It's getting them all in the variant class in the first place.

The reason it's slow going is that it's not enough to make a simple inequality set:

  bool operator==( variant & anOther ) const;
  bool operator==( const variant & anOther ) const;
  bool operator!=( variant & anOther ) const;
  bool operator!=( const variant & anOther ) const;

we're going to need the complete complement of inequalities:

  bool operator==( variant & anOther ) const;
  bool operator==( const variant & anOther ) const;
  bool operator!=( variant & anOther ) const;
  bool operator!=( const variant & anOther ) const;
  bool operator<( variant & anOther ) const;
  bool operator<( const variant & anOther ) const;
  bool operator<=( variant & anOther ) const;
  bool operator<=( const variant & anOther ) const;
  bool operator>( variant & anOther ) const;
  bool operator>( const variant & anOther ) const;
  bool operator>=( variant & anOther ) const;
  bool operator>=( const variant & anOther ) const;

For the most part, this isn't too terribly hard, but you have to remember that I've got more than a dozen different types of values to check in a variant, and then there's the real work - all the convenience methods.

To make it easy to write:

  variant    v(25.0);
  if (v < 12) {
    ...
  }

I need to have a lot more operators as well. If I look at just the equality operator, I end up with something more like this:

  bool operator==( variant & anOther ) const;
  bool operator==( const variant & anOther ) const;
  bool operator==( varmap & anOther ) const;
  bool operator==( const varmap & anOther ) const;
  bool operator==( varlist & anOther ) const;
  bool operator==( const varlist & anOther ) const;
  bool operator==( uint8_t aValue ) const;
  bool operator==( int aValue ) const;
  bool operator==( int64_t aValue ) const;
  bool operator==( uint64_t aValue ) const;
  bool operator==( float aValue ) const;
  bool operator==( double aValue ) const;
  bool operator==( bool aValue ) const;
  bool operator==( std::string & aValue ) const;
  bool operator==( const std::string & aValue ) const;
  bool operator==( char *aValue ) const;
  bool operator==( const char *aValue ) const;
  bool operator==( uuid_t & aValue ) const;
  bool operator==( const uuid_t & aValue ) const;
  bool operator==( secID_t & aValue ) const;
  bool operator==( const secID_t & aValue ) const;
  bool operator==( error_t & aValue ) const;
  bool operator==( const error_t & aValue ) const;

So even if we are as clever as can be, this is a ton of code to write, and test. It's just brutal at times to make sure you haven't made any typos in the code.

I will say that I was very pleased with the one simplification I made. It's pretty easy to see that you can write the inequalities in terms of one another. However, that can take a while, and you might not end up with valid inequalities. For example, when you have two lists - which one is greater than the other? It's not easy.

My solution was to implement the:

  bool operator>=(...) const;
  bool operator<=(...) const;

methods, and then define the others in terms of them:

  bool variant::operator<(...) const
  {
    return !this->operator>=(...);
  }

That way I could be assured of at least having the "equality" in the code. Then, the others are logical combinations of these. It's not perfect, but it saves time and code, and it's a lot less headache than trying to implement a greater-than on a list.
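The pattern, sketched on a one-field toy (the real variant does this per contained type): implement only operator>= and operator<= for real, then derive the other four logically:

```cpp
#include <cassert>

// Toy stand-in for the variant: only >= and <= are "real" comparisons;
// the other four operators are logical combinations of those two.
struct variant {
    double value;

    bool operator>=(const variant &o) const { return value >= o.value; }
    bool operator<=(const variant &o) const { return value <= o.value; }

    bool operator<(const variant &o)  const { return !operator>=(o); }
    bool operator>(const variant &o)  const { return !operator<=(o); }
    bool operator==(const variant &o) const { return operator>=(o) && operator<=(o); }
    bool operator!=(const variant &o) const { return !operator==(o); }
};
```

Note that equality falls out as (a >= b) && (a <= b), so even for types where ordering is awkward, a sensible >= and <= buys you == for free.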

The Simplicity of Well-Done C++ — Can it Best Java? Maybe…

Thursday, December 23rd, 2010

java-logo-thumb.png

I'm to the point that I believe all my Broker re-write issues are solved. I can use no pooling on the sockets, and it works, and I can use pooling and it should work just fine. The guys working on the Java side of things aren't having as good a time of this as I am. Don't get me wrong, they're sharp guys, and I believe they understand the Java classes they are using, but the complexity of some Java libraries is really quite astounding.

Case in point: J2EE. It couldn't be harder to get something going on J2EE. Tomcat is hard enough - there should be something like JSPs or PHP that will allow you to place the java file in any directory and have things "just work", but I understand the "Java way", and while it's not perfect, Tomcat is awfully nice, and very capable - you just have to work within its paradigm.

But this work we're doing on the sockets... Holy Cow! I'm glad I'm not doing the Java side. The C++ side has had its problems, but still, it's very easy to see what's going on in the code, it's just a question of what's happening in the libraries. It's not a question of smarts, it's really a question of familiarity with the libraries. If you're new to the way it does things, as I was with boost asio in the beginning, it's a pain. But once you really know how to use the libraries, and how the library wants you to formulate a solution, then things get a lot better a lot faster.

I just feel for the guys doing the work on the Java side. At least one of them is new to this library, and this is a real trial by fire. The complexity is staggering to him, and it's clear to me that a really well designed C++ library can make a far simpler solution than a lot of the Java library/frameworks I've seen out there.

Tracking Down an Annoying Boost ASIO Problem

Thursday, December 23rd, 2010

Boost C++ Libraries

In the midst of working on the changes necessitated by the Broker's re-write, I found myself in a very nasty little problem. I am trying to do things quickly, and my test cases are often far worse than any real-world use is going to be, but they have served me well, and they were pointing out a problem I was having this morning.

If I created new updater instances for each request, and deleted them after they were no longer needed, I ended up with a very fast create-use-delete lifecycle. This led to a segmentation fault in boost's io_service - specifically, in its run() method. The core dumps were of little to no help whatsoever, and I was left trying to diagnose the problem from my end.

If I didn't delete them right away, but threw them into a pool, and still created new ones, only clearing out the pool at the end of the application, then everything was fine. It seemed like it was just the short-lifecycle connections that were the problem. Very nasty.

The seg faults weren't on anything related to boost asio, either. They were on the line right after the context switch after the closing of the socket connection. I spent hours debugging the code to find that guy.

I came to the conclusion that there was something in the io_service that wasn't getting a "chance" to handle the socket closing before I deleted it. So I changed my code ever so slightly. Originally I had:

  if (si->second.pool.size() >= MAX_UPDATERS_IN_POOL) {
    cLog.debug("[recycle] the pool is full - deleting the updater");
    delete anUpdater;
  }

to:

  if (si->second.pool.size() >= MAX_UPDATERS_IN_POOL) {
    cLog.debug("[recycle] the pool is full - deleting the updater");
    anUpdater->mChannelID.clear();
    anUpdater->disconnect();
    // we need to let the io_service have a go at it
    boost::system::error_code  err;
    mIOService.poll_one(err);
    // ...and now we can successfully delete the updater
    delete anUpdater;
  }

The difference was stunning. No more crashes, and the code was rock solid every time I ran it. Amazing. I'm going to have to remember this. It's like a little context switch for the io_service so it can detect the close of the socket before it's deleted.

Several things to finish on the re-write, but it's getting close now. Nice.

Not So Quick Detour for Broker Re-Write

Wednesday, December 22nd, 2010

Ringmaster

As I was doing a little testing this morning I realized that the Broker wasn't working like it should. When I pinged a few folks, it was clear that there had been several changes I hadn't been told about and so I had a very abrupt detour to update all my Broker-facing code to make sure it was all working.

Some things weren't too bad - the codec's code for a map went from 'M' to 'm'... the Dates are now passed as microseconds since epoch... simple things. But then I quickly found that there were some that weren't simple at all. Not in the least.

The protocol for service registration changed. The interaction with the registration service itself changed. The recycling of open sockets wasn't working - in short, it was a massive change to the codebase.

I've spent the bulk of the day getting things converted. It's touched a lot of the code, but I think I'm getting very close. There are still a few things I need to hammer out, and I'm hoping to get to them in the morning. The prickly one is that the 'close channel' message - the one that indicates that the socket is clear to be recycled - can, if the conditions are right, close the actual socket. When that occurs, it's happening too fast, and the object is getting dumped before its work is done.

I need to figure out how to fix this so the events happen in the right order.

Fleshing Out the Lisp-Like Parser with Functions

Wednesday, December 22nd, 2010

This morning I'm continuing to work on the functions for my lisp-like parser. What I'm working on now is really the functions themselves and not how they fit into the other classes of the parser to compute values. It's a little slow going at first because I want to duplicate as much functionality as possible from the Java-version, and as you'd expect, there are absolutely no comments in the Java code. So I'm ending up chatting to one of the developers on the project and he's giving me the details on the functions.

Nothing hard, but it's slow going. I'm hoping once I get a few done, it'll go a lot faster.

Heh... that's optimism for you...

Lots of Meetings Bringing New Features to Ticker Plants

Tuesday, December 21st, 2010

I've been in a lot of meetings all afternoon, but in the end, the good outweighed the bad. These other groups had their own biases, of course, and no one wanted to "just give in" and use the stuff I've been working on, but that's really understandable. After all, a few lifetimes ago, it wasn't until after my ticker plant was proven for the most demanding client (my own risk engine) that other people started to see the value in it. So it goes... I'm going to have to prove it, and I understand that.

Still... they had several good points:

  • If we conflate the messages by type and instrument, how will they know that any conflation happened? It seems reasonable that they should be given some kind of information about the conflation being done.
  • They'd like to be able to know what "stream" the message came from. There are 24 OPRA channels, and they'd like to know which one sourced this message. That's fair, but I have no intention of putting that in the message - that's going to be a client-side look-up method to keep the messages small and useful.
  • The exchange is an important part of the conflation - meaning they want to have one quote message for the AAPL Jul 350 C per exchange. Yikes! I hadn't planned for that. That's going to make the cache much larger, but so it goes.

Many of these aren't too bad to add, and it's reasonable to get them out quickly, but some of the others are a little trickier, and it might be better to think of the real solution to the problem as a completely different kind of answer. For example, what about having a message with all the exchange prices and sizes for a given instrument? Then, when it's shipped out, the user gets all the prices - at once. It's got possibilities.
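One way that idea might look - a purely hypothetical layout, where the exchange count and the per-slot fields are guesses for illustration - is a single message carrying a price/size slot per exchange:

```cpp
#include <cstdint>
#include <cassert>

// Hypothetical "all exchanges at once" quote message. The fixed count of
// 16 exchanges and the field set are illustrative assumptions only.
const int kNumExchanges = 16;

struct ExchQuote {
    double   bid;
    double   ask;
    uint32_t bidSize;
    uint32_t askSize;
    bool     valid;    // has this exchange published a quote yet?
};

struct ConsolidatedQuote {
    uint64_t  instrumentID;              // stand-in for the 128-bit SecurityID
    ExchQuote byExchange[kNumExchanges];
};
```

The trade-off is message size versus round-trips: one fatter message, but the client sees every exchange's picture at once instead of assembling it from per-exchange quotes.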

It's late and I need to catch a train, but it's a lot to do in the coming weeks.