Archive for the ‘Coding’ Category

A Letter to a Dear Friend

Thursday, February 2nd, 2012

This morning I was thinking about the particular situation I find myself in at work. Interestingly enough, the one guy that I thought could really give me great advice is one of my oldest friends - Bret from grad school. I've known Bret since 1980 - that's more than 31 years now. We've worked together, laughed together, and lived a long time together.

So this morning, I wrote to him to ask his advice:

I've been struggling here at work for the last few months - amid some massive re-orgs (yes, multiple massive re-orgs in that time), and in the midst of all this, I thought of the one person that I could really trust to give me some solid advice - you.

So here's what I'm struggling with: When I hired on here at The Shop about 2 yrs ago it was all about who I was going to be working with, and how we were going to be developing, and no more crap for HR… all the things that after a long stint at First Chicago, then UBS, I was happy to hear. It started out great, and my manager was just made partner, so it seemed like it was going to be great for a long time.

Then things changed. My manager, Clive, was put in charge of all IT for The Shop. Everything. And it's changed Clive. We no longer work together. For a while, I found someone that reminded me a lot of you - funny, easy to laugh, good coder, thoughtful. A really nice guy to work with. And while it was a little team of the two of us, it was great.

Then Clive decided that his view of IT needed to change, and that guy is now managing the group I'm in - a group of 14 people.

Out the window goes the "who" I work with. Now I'm working with regular (which is to say, junior) guys that are dolts in comparison.

Out the window goes the "how" I work. Now things can't be released unless we have a meeting about it, and it's perfectly acceptable to leave bugs in production until that time. There are times they will have to check to see if it's OK to fix a bug - priorities are important, after all.

Out the window goes everything that I once liked about this place.

And so I'm asking you: How do you do it?

How do you work with people, systems, organizations, etc. that are clearly more like Roman galleys than places for creative people to work? It's not that I mind hard work, it's the conditions under which it's produced. Maybe I'm just fooling myself that a place like this Shangri-La even exists, but I'd like to think it does. But maybe that's my problem.

Maybe I need to just accept that people that want my effort, my energy, my work really aren't interested in my best work - they would be happy with 80% - if they get to choose the terms under which it's given.

Anyway, I'm hoping that you have some words of advice for me. Something that I can use to re-adjust my thinking, to re-align my sights - to get to a place that I don't dread coming to work.

Anything you have would be really helpful.

I'm hoping he's got some good advice for me. Stay tuned.

[2/13] UPDATE: I wasn't disappointed… his letter was right on target and it got me to thinking about what I need to do:

Hmmm, well, I think I should tell you a story. This is how my thinking has changed during the last 6 months of my last job. It has to do with all that's happened before but took a form I could articulate last year.

I started working for Avocent in 2008. It was a new team building a pretty cool product. Long story short, it was the best team I'd ever been part of. Best in terms of mutual respect, fun, and actual quality and quantity of output. Then we were bought by a much bigger company. Things changed like black and white. One day when I was thinking about my options a light bulb went off. Every job I've ever had started out hopeful and for varying lengths of time was pretty rewarding. But something always happened to change that. What I realized was not that things always change. It was that *I* have been wrong every time about my estimation of the longevity of the job. Every time. On that day I made two decisions. Or rather two changes in my thinking. One is that I don't care one whit about the longevity prospects of a job opportunity I'm considering. Everyone tries to sell you on the vast potential of whatever they are selling. Now what I'm about to say will sound harsher than I really think in general (I mean I've not turned into a hopeless cynic, far from it) but to the job salesman I say bullshit. But really it's my desire to assume more than I should that I call bullshit on. Here's the deal. I've been wrong EVERY time. It's not that I didn't have educated assumptions, I believe I did. Doesn't matter. There are too many factors that can change. I NEVER saw the purchase coming by a company that was both large and insane at the same time. So, to be clear, I'm not jaded, I just don't consider longevity to be a factor. I just want to know if the work is interesting. If things change I'll look again. But I said I made two decisions. The second I'm still working out in real life. Since I can't count on others for long term job satisfaction, my goal has changed. I used to want to find a job that was "interesting" (there are many dimensions to what "interesting" means).
What I realized is that I could continue this path of going from job to job (really meaning from employer to employer) as things change, or I could seek to become independent of that rat race. The best word I have for what my goal is right now is independence. There are just way too many ways today to make your own path and divorce the work you want to do from a bunch of other factors (where you live, who you work with, etc.).

I guess in answer to your question of how I do it, I don't think I do really. I've always moved on. That takes time sometimes, but the mental switch flips pretty easy and hasn't ever flipped back. In the meantime, be yourself, advocate the quality you expect. THAT is hard and I've failed many times but that's the standard to measure against. Remaining true, that is. This has been a bit of a ramble. There's probably more to say so feel free to call anytime. I mean it. I'm living this out every day right now so talking this stuff through would be helpful to me too. It's been good for me to reflect on this as I've typed this much to you so far.

Take care and let me know how things go.

He's dead right, and I knew it before he even wrote back. The problem is me and my expectations. I need to lower them. Way, way, lower. When I was new here, and had lower expectations, things were a lot better, but as I started doing more work here, they rose on the hopes that things were really going to be great. Big mistake of mine.

Focus on the things that are important to me. That's the ticket. It's not important that I'm a convert to the cause, I just need to be a solid, good, hard worker, and that's always going to happen. It's when I think they have the same vision as I do that things go sour. I just need to keep a respectful distance. It's not easy for me, but it's important.

Thanks, old friend. I knew I could count on you!

The Realization that Things are Exactly as They Want

Wednesday, February 1st, 2012

Today has been a pretty big day for me. I've written a lot of code to try and get things up to date and operating at good speed, and there hasn't been a single thing that's stumped me. I got it all done, exactly when they needed it, and all in the branches they needed - merged into the right branches and ready to go. Can't complain about that at all.

But I've also been doing a lot of talking with my new manager, and yesterday I had a long talk with my old manager. I think I've come to the realization that this place is not in transition, at least not completely. It's evolving, and in that evolution, it's changing exactly how it wants to be changing, and my concerns about where it was versus where it is, and where it's going, are really all my fault. That is to say - all my issues.

This place is doing exactly what it wants. And more of it. It's moving away from the structure and people that I hired on to work with, and hired on to work like. I can write code anywhere, I came to this place to work with specific people. Now that's no longer possible. I came to work in a specific way - again, no longer allowed.

In short - everything I liked about this place in coming here is being pulled away, and it's very clear that it's not that my concerns aren't heard, it's that they don't see them as concerns! They see them as achievements!

So where does that leave me? Well… it's a place I no longer want to be, but it'll take me time to find a place that, once again, appears to be the kind of place I want to work. I'm hoping the search isn't too long, but no matter how long it is, I'm here, and I'll play their games, but I can't honestly pretend to like it. It's nothing like what I want to do.

But for now, it's my paycheck.

But not for long.

Fantastic Difference in Keys for boost::unordered_map

Wednesday, February 1st, 2012

Boost C++ Libraries

Today I spent most of the day on a few issues - the biggest of which was the problem that a restart lost those Option calculation results that I was working so hard to retain. What I needed to do was to persist them to the redis cache server. No way around it. The problem was, I had nothing written for this, and a few other little testing bugs came up and I had to fix them as well.

The persistence system is modeled after all the other save/load persistence methods I have in the code. I have a save state method, and a load state method, and all I needed to do was to put the save method where I was saving other state data, and put the loader after I'd loaded up the instruments and before I continued on with the initialization of the system. In theory, not too bad.

In practice, I had to do quite a bit of detail work, but that was expected. It's non-trivial to persist a class that's not been fitted for persistence, but it's all straightforward. It just took time. No… the problems were after I had that code working, and I went to test it.

The original data structure I used to hold these Option calculation results was a boost::unordered_map:

  struct Results {
    uint32_t         volatility;
    msg::greeks_t    greeks;
    uint32_t         impVolatility;
    msg::greeks_t    impGreeks;
    // …and more fields like this…
  };
  typedef struct Results results_t;
 
  typedef boost::unordered_map<msg::ng::secID_t, results_t> ResultsMap;

The secID_t is a SecurityID - a 128-bit number that packs all the critical instrument identification information into one nice little package. This makes sense as a key in the map because it's unique, easy to use, and the core of a lot of look-ups in the rest of the system. Sounds great.

When I made the deserialization method for the data coming out of the redis server, it looked very simple:

  bool unpack( const std::string & aBuffer, uint32_t & aPos, ResultsMap & aValue )
  {
    bool       error = false;
 
    // first, let's get the number of pairs in the map
    int32_t   sz = 0;
    if (!msg::ng::unpack(aBuffer, aPos, sz)) {
      error = true;
    } else {
      msg::ng::secID_t  id;
      results_t         res;
      // now, for each pair, read them in and set them
      for (int32_t i = 0; i < sz; ++i) {
        id.extractFrom(aBuffer, aPos);
        res.deserialize(aBuffer, aPos);
        aValue[id] = res;
      }
    }
    return !error;
  }

I get the size of the map, then read in each pair and save it in the map. Simple.

Well… simple - yes. Fast? No.

The initial tests had this at 3 minutes for 88,000 pairs. Those are horrible results. I did some profiling and it's in the one line:

        aValue[id] = res;

That makes no sense. Boost's unordered_map is the fastest around. I don't understand what's happening. So I did more digging. The next thing I found was that there wasn't an exposed hash function for boost to use. So I added it:

  namespace msg {
  namespace ng {
  size_t hash_value( const secID_t & anID );
  }
  }

and implemented it simply as:

  namespace msg {
  namespace ng {
  size_t hash_value( const secID_t & anID )
  {
    return anID.hash();
  }
  }
  }

I expected this to speed things right up. The actual results? Same slowness. Amazing!

Maybe it's the security ID? Maybe it'd be better with a Security Key - a std::string version of the same thing? Maybe boost is a lot more efficient dealing with string keys, even though they aren't more space efficient? Worth a try… we can't leave it at a 3 minute start-up.

So I went to:

  typedef boost::unordered_map<std::string, results_t> ResultsMap;

and then:

  bool unpack( const std::string & aBuffer, uint32_t & aPos, ResultsMap & aValue )
  {
    bool       error = false;
 
    // first, let's get the number of pairs in the map
    int32_t   sz = 0;
    if (!msg::ng::unpack(aBuffer, aPos, sz)) {
      error = true;
    } else {
      std::string  key;
      results_t    res;
      // now, for each pair, read them in and set them
      for (int32_t i = 0; i < sz; ++i) {
        if (msg::ng::unpack(aBuffer, aPos, key)) {
          res.deserialize(aBuffer, aPos);
          aValue[key] = res;
        }
      }
    }
    return !error;
  }

With these changes, the insertion of the values into the map went from 3 minutes to 61 msec! That's a factor of almost 3000x! Wow! OK… now I know. Don't use security IDs as keys in the boost::unordered_map. It's just not even close to fast enough.
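The numbers above don't say why the secID_t key was so slow. One thing that could be checked in a case like this is whether the hash spreads the keys across buckets, or piles them into a few long chains. Below is a minimal sketch of that check. It makes several assumptions that are not from the code above: std::unordered_map stands in for boost::unordered_map, SecID is a made-up two-word struct (the real secID_t layout isn't shown), and WeakHash and MixHash are hypothetical hashers written only for the comparison.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Hypothetical stand-in for a 128-bit security ID (assumption: two 64-bit words).
struct SecID {
  uint64_t hi;
  uint64_t lo;
  bool operator==(const SecID & o) const { return hi == o.hi && lo == o.lo; }
};

// A weak hash: ignores the low word, so IDs that differ only there collide.
struct WeakHash {
  size_t operator()(const SecID & id) const { return static_cast<size_t>(id.hi); }
};

// A mixing hash: folds both words together, in the style of the classic
// hash_combine formula.
struct MixHash {
  size_t operator()(const SecID & id) const {
    size_t seed = static_cast<size_t>(id.hi);
    seed ^= static_cast<size_t>(id.lo) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
    return seed;
  }
};

// Returns the number of entries in the fullest bucket of the map.
template <typename Map>
size_t longestChain(const Map & m)
{
  size_t worst = 0;
  for (size_t b = 0; b < m.bucket_count(); ++b) {
    worst = std::max(worst, m.bucket_size(b));
  }
  return worst;
}

// Inserts 'count' IDs that share the high word and reports the longest chain.
template <typename Hash>
size_t chainFor(uint64_t count)
{
  std::unordered_map<SecID, int, Hash> m;
  for (uint64_t i = 0; i < count; ++i) {
    SecID id = { 42, i };
    m[id] = static_cast<int>(i);
  }
  return longestChain(m);
}
```

With WeakHash every key lands in one bucket, so each insert has to walk the whole chain. With MixHash the chains stay short. If the real map showed long chains, the hash would be the place to look. If not, the cost is somewhere else, such as the key's comparison or copy.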

Google Chrome dev 18.0.1025.1 is Out

Wednesday, February 1st, 2012

Google Chrome

This morning I noticed that Google Chrome dev 18.0.1025.1 was out and it looks like a few nice things for me in the release notes. Specifically, the latest V8 JavaScript engine (3.8.9.0), and then a few nice Mac-specific things like fixed Lion gestures, fixed momentum scrolling in frames, and an issue about the devtools closing prematurely. All nice things to have. I'm really glad that they aren't focusing on Windows and leaving Mac OS X a version or two behind. That would be sad.

Added Reload Persistence of Option Calcs

Tuesday, January 31st, 2012

Building Great Code

One of the things the traders have complained about in their testing this week has been that the values of the calculations change when we restart the servers. Well… yes, they do. We load up the instrument data, flush in the ticks, and then calc the results. If the prices have changed, and they will in after-hours and pre-hours trading, then the values you see at 8:00 am the next morning are not the same as the ones you saw at 3:00 pm the day before. So it goes. The prices change, and I was thinking they'd want to see that.

Nope. Could not possibly have been more wrong.

OK, so what's the solution? Well… the easiest thing is just to not calculate the Option values outside of the trading day. The problem with that is: how do we get the last known good data back in after a reload or a restart? Ah… the plot thickens.

We don't want to save all the input values - that's just too much data. So let's just save the results. Thankfully, these are already confined in a simple object in the code, so that's all we have to deal with. Let's tackle the reload first. (In truth, I forgot about the restart at this point, and had to go back and put it in. Ick.)

Every morning, we drop all the instruments we have in memory and load them up from the local replica copy postgres database. It's a wonderfully efficient system we have built. So we load these guys up. If we're careful, and save off these option calculation results before we drop the instruments, and then re-apply them after the new instruments are loaded, then it'll appear that the values are persistent for all instruments that are in the reload.
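The save-then-re-apply step can be sketched roughly as below. This is only an illustration under assumptions: Results, Instrument, and the two map typedefs are hypothetical stand-ins (the real classes aren't shown here), and instruments are keyed by a plain string.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the option calculation results object.
struct Results {
  double volatility;
  double delta;
};

// Hypothetical instrument: a key plus the last calculated results, if any.
struct Instrument {
  std::string key;
  bool        hasResults;
  Results     results;
};

typedef std::unordered_map<std::string, Instrument> InstrumentMap;
typedef std::unordered_map<std::string, Results>    SavedResults;

// Before the drop: copy out the results of every instrument that has some.
SavedResults saveResults( const InstrumentMap & instruments )
{
  SavedResults saved;
  for (InstrumentMap::const_iterator it = instruments.begin();
       it != instruments.end(); ++it) {
    if (it->second.hasResults) {
      saved[it->first] = it->second.results;
    }
  }
  return saved;
}

// After the reload: re-apply saved results to the instruments that came
// back, and return how many matched. New instruments are left untouched.
size_t applyResults( const SavedResults & saved, InstrumentMap & instruments )
{
  size_t applied = 0;
  for (InstrumentMap::iterator it = instruments.begin();
       it != instruments.end(); ++it) {
    SavedResults::const_iterator hit = saved.find(it->first);
    if (hit != saved.end()) {
      it->second.results = hit->second;
      it->second.hasResults = true;
      ++applied;
    }
  }
  return applied;
}
```

The order is: saveResults(), drop, reload, applyResults(). Anything that was not in the old set simply has no saved entry and waits for ticks.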

Clearly, new instruments won't have any values, but they would not have had any the night before, either. Seems to be a reasonable trade-off. Not the ideal I'd like, but it's workable. I personally think it'd be better to look at those new instruments and see if they happen to have a last price in the database, and if so, then use that. But if it's really a new instrument, then we have to wait for ticks - there's just no two ways about it.

This took a good chunk of the afternoon for me to write up, but it tests out fine and should work OK too.

Fixed Up Auto-Flipping Transition

Tuesday, January 31st, 2012

This morning I realized that my auto-flipping of the Ticker Plants had a big hole in it. When I did the flip based on the size of the non-preferred versus preferred side queues, I wasn't properly draining the queue with messages in it. This would be a problem because the next message to arrive would appear later than all the messages in the queue, and I'd effectively drop all of them. Not good.

The solution was pretty easy - make a drainPendingMessages() method that I can call before flipping the sides, and use that to drain all the pending messages, and then do the flip. This way, we don't have the gap in the message stream even though we're clearly in trouble on one side of the feed.
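A minimal sketch of the drain-then-flip ordering follows. Only the name drainPendingMessages() comes from the fix described above. The TwoSidedFeed class, its members, and the use of bare sequence numbers as messages are assumptions made for illustration, and the sketch leaves out the threading.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Hypothetical two-sided feed; a message is just its sequence number here.
class TwoSidedFeed {
public:
  explicit TwoSidedFeed( uint64_t lastSent ) : mPreferred(0), mLastSent(lastSent) {}

  // Queue a message that arrived on side 0 (A) or side 1 (B).
  void enqueue( int side, uint64_t seq ) { mQueue[side].push_back(seq); }

  // Send everything waiting on the non-preferred side, oldest first,
  // skipping anything that was already delivered.
  void drainPendingMessages()
  {
    std::deque<uint64_t> & backlog = mQueue[1 - mPreferred];
    while (!backlog.empty()) {
      uint64_t seq = backlog.front();
      backlog.pop_front();
      if (seq > mLastSent) {
        mSent.push_back(seq);
        mLastSent = seq;
      }
    }
  }

  // Drain first, then switch sides, so the backlog isn't stranded
  // behind the next message to arrive.
  void flip()
  {
    drainPendingMessages();
    mPreferred = 1 - mPreferred;
  }

  int preferred() const { return mPreferred; }
  size_t pending( int side ) const { return mQueue[side].size(); }
  const std::vector<uint64_t> & sent() const { return mSent; }

private:
  std::deque<uint64_t>   mQueue[2];
  int                    mPreferred;
  uint64_t               mLastSent;
  std::vector<uint64_t>  mSent;
};
```

The point of the sketch is only the ordering inside flip(): the backlog goes out before the preferred side changes.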

Much better solution.

Added Auto-Flipping to Ticker Plants

Monday, January 30th, 2012

Building Great Code

Recently we had a serious problem with one of our exchange feeds. Basically, at 2:00 pm, we just stopped getting the 'A' side of several feeds. Because of the way I'd changed the Ticker Plants, this amounted to a complete halt on ticks. Very bad. But not a bug.

Interestingly enough, the system we have in the Ticker Plants is a consequence of trying to be as accurate as possible. If we have two sides of the same feed - A and B, and they are both supposed to be coming in at about the same time and rate, how do you arbitrate between the two to get the most complete feed and not send down any duplicates? Well… the first idea I had was to look at each side and take the most up-to-date message. Well… that works OK, until one of the feeds gets ahead of the other, or has a skip, and then one feed is showing message 100, and the other is showing 200.

I'd take the 200, and then look for 201 - totally skipping the fact that 100 through 200 is on the other side, just waiting to be used.

The solution was then to look at one side as the "preferred side" and use the other only to "fill in the gaps". This is great as it doesn't skip over message blocks, but the problem is that if your preferred side goes down, you're dead. Even if you have the message stream on the other side, it's going to be ignored as it's only there for filling in the gaps.
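Here is a rough sketch of the "preferred side plus gap filling" idea, and of why it stalls when the preferred side goes quiet. It is not the Ticker Plant code: GapFillArbiter, its members, and the use of bare sequence numbers as messages are all assumptions for illustration.

```cpp
#include <cstdint>
#include <deque>
#include <set>
#include <vector>

// Hypothetical arbiter; a message is just its sequence number here.
struct GapFillArbiter {
  uint64_t              expected;   // next sequence number owed downstream
  std::deque<uint64_t>  preferred;  // messages queued on the preferred side
  std::set<uint64_t>    backup;     // messages seen on the other side

  explicit GapFillArbiter( uint64_t first ) : expected(first) {}

  // Emit what the preferred side has, using the backup side only to fill
  // holes in front of each preferred message.
  std::vector<uint64_t> pump()
  {
    std::vector<uint64_t> out;
    while (!preferred.empty()) {
      uint64_t seq = preferred.front();
      preferred.pop_front();
      if (seq < expected) {
        continue;                   // duplicate of something already sent
      }
      // fill any gap before this message from the backup side
      for (uint64_t missing = expected; missing < seq; ++missing) {
        if (backup.erase(missing) > 0) {
          out.push_back(missing);
        }
      }
      out.push_back(seq);
      expected = seq + 1;
    }
    // anything on the backup side we've already passed is a duplicate
    backup.erase(backup.begin(), backup.lower_bound(expected));
    return out;
  }
};
```

With 1, 2, 5 on the preferred side and 1 through 6 on the backup side, pump() sends 1 through 5 in order with no duplicates. But once the preferred queue is empty, pump() sends nothing, no matter how much piles up on the backup side. That is the failure mode described above.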

As an aside, this brought to a head something that has been a great sore spot for me in the recent re-orgs at The Shop. That is the need to "check with the manager" to fix something. I estimated that this "auto-flipping" would take a few days. Not bad, and then it's smart enough to always pick the right side. A far superior product. But it wasn't seen that way. It was seen as something that needed to be scheduled, and planned, and blah, blah, blah. I wanted to scream!

This was my project, and I wanted to take a few days. That should be the end of it. Period. End of story. If I'm too late with too many things, then fire me. But after 35 yrs of doing this, I'm experienced enough to know what's really important, and what is something that's optional and can be scheduled.

Exceptionally frustrating.

So I ended up getting the "OK" and today was the coding.

The solution I came up with was to look at the number of queued messages on the non-preferred side when the preferred-side was empty. If that number was large, say 100,000, then it's pretty clear that the preferred side is in trouble, so let's switch the sides. I needed to do a little more than this to make it clean and as atomic as possible, given that multiple threads are involved in this code, but it wasn't too hard.
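The flip decision itself can be sketched as below. This is a simplified illustration, not the actual code: SideSelector and its members are hypothetical, and a single std::mutex stands in for whatever locking the real multi-threaded code uses.

```cpp
#include <cstddef>
#include <mutex>

// Hypothetical selector for which side of the feed is preferred (0 = A, 1 = B).
class SideSelector {
public:
  explicit SideSelector( size_t threshold )
    : mThreshold(threshold), mPreferred(0) {}

  // Given the current queue depths, flip if the preferred side is empty
  // while the other side has backed up to the threshold. The check and the
  // flip happen under one lock so other threads never see a half-done flip.
  bool checkAndFlip( size_t depthA, size_t depthB )
  {
    std::lock_guard<std::mutex> lock(mMutex);
    const size_t preferredDepth = (mPreferred == 0 ? depthA : depthB);
    const size_t otherDepth     = (mPreferred == 0 ? depthB : depthA);
    if (preferredDepth == 0 && otherDepth >= mThreshold) {
      mPreferred = 1 - mPreferred;
      return true;
    }
    return false;
  }

  int preferred() const
  {
    std::lock_guard<std::mutex> lock(mMutex);
    return mPreferred;
  }

private:
  mutable std::mutex  mMutex;
  const size_t        mThreshold;
  int                 mPreferred;
};
```

With a threshold of 100,000, a backlog of 99,999 doesn't flip, and neither does a large backlog while the preferred side still has messages. Only an empty preferred side together with a big pile on the other side triggers the switch.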

In the end, it was a nice change, and I really liked knowing it was going to protect itself from outages in the future.

Now if fixing the management issues were only this easy...

Slogging Through the Big Testing Cycle

Friday, January 27th, 2012

Today has been a lot of little things that came out of the continuing testing cycle with the traders. The greek engine is at the core of quite a few products that are switching over to it as the source of their information, and it's creating quite a mess. Lots of people sending in the same issues, not realizing that someone else is reporting the same thing, etc. Then we're purposefully not fixing things in the code they are testing because the management of this testing cycle didn't want to shift the testing underneath them. Truth be told, I'm not a fan of this scheme, but I can't say it's wrong either. The thought is that this is what they are testing, and if they rack up 100 bugs that are all the same thing, then when we fix that, it'll be easy to check those 100 things again and prove they are all working.

I'm more of a "fast waterfall" tester - find a bug, fix it, repeat as quickly as possible. This has the problem that you're always "starting" the testing process. The upside is that you aren't going to find 100 bugs with the same cause - only one.

So I'm trying to make sense of this and keep my head above water. It's hard, because work is becoming oppressive again, and my ability to post is getting beaten out of me. I have to fight the urge to just give up on posting, as it's about the only thing I can always look at and be happy about.

Adding New Data to the Greek Engine

Thursday, January 26th, 2012

High-Tech Greek Engine

Today, in addition to dealing with QA questions from a lot of the groups trying to test their apps against my new engine, I had to add in the last trade on each exchange to the system so that the clients could see that - in addition to the per exchange quotes and volume. I don't know why the guy who used to work on this didn't tell me this was needed - but he didn't, and that's just the way things go. Nothing to do today, but fit it in.

The problem is that the change really makes the serialized output format different, and that means that we need to have the Java and C# clients updated as well. I did the C++ client when I put in the change, and tested it out as a part of my codebase. But the other clients aren't in my codebase, and so I just have to send out the change, and hope they see the note and pick it up as soon as possible.

I'm sure there will be prodding on the part of the users, and it'll make them a little grumpy, but since I didn't know about it until today, I can't really have done anything about it until today. So it goes.

It was a bit of a hassle because of the way they wanted me to implement it - first in a different branch, then fold it into mine so I could really test it. Not horrible, but annoying. So it goes.

I'm saying a lot of that recently… Not a really great sign in my book...

Setting Up a Second Datacenter

Thursday, January 26th, 2012

Building Great Code

Today I got word that I had new boxes at our second datacenter - call it disaster/recovery, but it runs 100% all the time, so it's really just a secondary site that is fully used. There's a lot to get going on these boxes - postgres, The Broker, the user accounts, the right SSH keys, a lot of little things before I even really get to worry about the software and making it run on the second site.

Thankfully, there is no rush. I'll get it done when the sys admins get their stuff done, and then I'll fire up postgres and The Broker, and make the changes to the code so it deploys and runs as it should.

Throw in a few crontab entries and we'll be good to go. But it'll probably be tomorrow before that's all done. Just takes time.