Archive for September, 2010

Transmit 4.1.1 is Out

Tuesday, September 14th, 2010

Transmit 4

I got a tweet today from the great guys at Panic saying they had released Transmit 4.1.1 after a fitful 4.1 release that had a bug in its Sparkle updating. The 4.1.1 release has an impressive list of features, including a massive update to the Transmit Disk feature. It's now independent of MacFUSE - so I removed that from System Preferences - and it works with 64-bit kernels. It was always a good-to-great product, and it just keeps getting better for what I need. Amazing application.

Debugging Some Nasty Problems in Pair Programming Style

Tuesday, September 14th, 2010

This afternoon I had the most horrid debugging session in recent memory. It was really just that bad. The problem was that the code the new guy was writing was giving segmentation faults when my code wasn't. Additionally, there seemed to be a serious data problem in the information we were getting from another group's code. Not fun.

The first problem I ran into was the formatting of a timestamp into a human-readable string. The original code was:

  std::string formatTimestamp( uint64_t aTime )
  {
    char   buf[32];
    // see if it's w.r.t. epoch or today - we'll format accordingly
    if (aTime > 86400000) {
      // this is since epoch
      time_t  msec = aTime;
      struct tm   when;
      localtime_r(&msec, &when);
      // now make the msec since epoch for the broken out time
      msec = mktime(&when);
      // now let's make a pretty representation of those parts
      snprintf(buf, 31, "%04d-%02d-%02d %02d:%02d:%02d.%03d",
               when.tm_year, when.tm_mon, when.tm_mday,
               when.tm_hour, when.tm_min, when.tm_sec,
               (int)(aTime - msec));
    } else {
      // this is since midnight - let's break it down...
      uint32_t    t = aTime;
      uint8_t     hrs = t/3600000;
      t -= hrs*3600000;
      uint8_t     min = t/60000;
      t -= min*60000;
      uint8_t     sec = t/1000;
      t -= sec*1000;
      // now let's make a pretty representation of those parts
      snprintf(buf, 31, "%02d:%02d:%02d.%03d", hrs, min, sec, t);
    }
    // ...and return a nice std::string of it
    return std::string(buf);
  }

My problem typically is that I test some things, but not all the edge cases. In this case, I hadn't really tested the "since epoch" code, and there are a few doozies in that section.

First, localtime_r() takes seconds since epoch, not milliseconds, as I was assuming. Duh. That means we need to fix that up:

  std::string formatTimestamp( uint64_t aTime )
  {
    char   buf[32];
    // see if it's w.r.t. epoch or today - we'll format accordingly
    if (aTime > 86400000) {
      // this is since epoch
      time_t  sec = aTime / 1000;   // aTime is msec; localtime_r() wants sec
      struct tm   when;
      localtime_r(&sec, &when);
      // now make the sec since epoch for the broken out time
      sec = mktime(&when);
      // now let's make a pretty representation of those parts
      snprintf(buf, 31, "%04d-%02d-%02d %02d:%02d:%02d.%03d",
               when.tm_year, when.tm_mon, when.tm_mday,
               when.tm_hour, when.tm_min, when.tm_sec,
               (int)(aTime - sec*1000));
    } else {
      // this is since midnight - let's break it down...
      uint32_t    t = aTime;
      uint8_t     hrs = t/3600000;
      t -= hrs*3600000;
      uint8_t     min = t/60000;
      t -= min*60000;
      uint8_t     sec = t/1000;
      t -= sec*1000;
      // now let's make a pretty representation of those parts
      snprintf(buf, 31, "%02d:%02d:%02d.%03d", hrs, min, sec, t);
    }
    // ...and return a nice std::string of it
    return std::string(buf);
  }

The next one is about the broken out time struct. The year is offset by 1900, and the month offset by 0, so I had to fix those up:

  std::string formatTimestamp( uint64_t aTime )
  {
    char   buf[32];
    // see if it's w.r.t. epoch or today - we'll format accordingly
    if (aTime > 86400000) {
      // this is since epoch
      time_t  sec = aTime / 1000;   // aTime is msec; localtime_r() wants sec
      struct tm   when;
      localtime_r(&sec, &when);
      // now make the sec since epoch for the broken out time
      sec = mktime(&when);
      // now let's make a pretty representation of those parts
      snprintf(buf, 31, "%04d-%02d-%02d %02d:%02d:%02d.%03d",
               when.tm_year+1900, when.tm_mon+1, when.tm_mday,
               when.tm_hour, when.tm_min, when.tm_sec,
               (int)(aTime - sec*1000));
    } else {
      // this is since midnight - let's break it down...
      uint32_t    t = aTime;
      uint8_t     hrs = t/3600000;
      t -= hrs*3600000;
      uint8_t     min = t/60000;
      t -= min*60000;
      uint8_t     sec = t/1000;
      t -= sec*1000;
      // now let's make a pretty representation of those parts
      snprintf(buf, 31, "%02d:%02d:%02d.%03d", hrs, min, sec, t);
    }
    // ...and return a nice std::string of it
    return std::string(buf);
  }

Here, now, we finally have something that's right. The second big issue was with the encoding of integers with Google's varint system. In general, it's very effective, but I was seeing the timestamps coming from a service as twice what they should be. It made no sense whatsoever - until I saw the code.

In order to efficiently compress signed integers, Google came up with the idea of zig-zag encoding. Basically, you sort the integers by their absolute value, negatives first, and then pick the row in the sort as the coded value. Something like this:

  Value   Code
      0      0
     -1      1
      1      2
     -2      3
      2      4

From this table, it's easy to see that if you expect to see an encoded unsigned int, as I did, the signed int can look to be twice the size! When I saw they were using signed values where I was expecting an unsigned value, I knew it was the zig-zag encoding and fixed that right up.

These should have taken me about 30 mins to find. As it was, they took hours. Why? The developer who's new to the team wanted to "watch". Now by "watch" they meant "watch a little, but talk a lot", and there's the rub. When I'm debugging, I need quiet. I need to see what's really going on, why the code isn't acting as I expect, and which of my assumptions and ideas were wrong. Having a Chatty Kathy sitting next to me was, to say the least, unhelpful. To be accurate, a hindrance.

If I felt they learned anything in the process, I'd settle for that as a positive outcome. But I'm not at all sure they learned a thing. I think they just don't like working alone, and so saw this as an opportunity to have a "working buddy". Well... today is the last day for that. Never again.

Gotta nip this in the bud.

Cleaning Up Someone Else’s Code

Tuesday, September 14th, 2010

This afternoon I've spent a lot of time cleaning up someone else's code. This person is new to the group, and I've been asked to see if he will fit into the project. I do have my doubts...

I had him put in a few tests and write a few methods on an existing class. This wasn't meant to be hard, it was meant to be easy, but it turned out to be something that I needed to be more explicit about.

The code was an interesting combination of decent STL C++ code - but completely optimistic, and with no comments to speak of. The comments that were there weren't even complete sentences. While I'm not an English major, I think comments are the only real help an author gives to the person trying to figure out what they did. Many think of this as a weakness, but I know better. I'm often the one reading what I wrote 6 to 9 months after the fact, and I'd like to cut my future self a little slack.

So I not only had to re-arrange things and comment them, I also had to remove the optimism of the original code and, in many cases, improve the speed. Some STL idioms are fine when speed isn't an issue, but in a ticker plant, that's simply not the case.

What I have come to realize is that the new guy is not all that great a coder. With all the examples he had to work with, he wrote code that was completely out of context with the rest of the class. That shows a real lack of awareness to me. We'll see how he progresses in the coming days.

Skitch Beta 8.7 (v2716) is Out

Tuesday, September 14th, 2010

Skitch.jpg

Skitch 8.7 - the final Beta - is out, and they promise that very soon they'll have the 1.0 of the app and the website out with a dramatically new look. They'd have to charge an arm and a leg before I wouldn't pay for this app - it's just that good. The only difference seems to be the extension of the timeout, but hey, at least it keeps working.

Good news, and something to look forward to.

MarsEdit 3.1 is Out

Tuesday, September 14th, 2010

MarsEdit 3

I got a tweet this afternoon about an update of MarsEdit 3.1. It includes 64-bit support as well as Lightroom 3 compatibility. Looks nice. The whole change list is:

MarsEdit 3.1

  • 64-bit Compatible
  • New per-blog setting to constrain image media to pre-set size
  • Support for Lightroom 3 media libraries
  • Added new "Paste HTML Source" command for e.g. embedding YouTube easily in rich mode
  • Enabled plugins for rich editor so you can e.g. see embedded movies
  • "Send To Weblog" command now works when preview window is front-most
  • Optional support for Google Data API 2.0 with Blogger
  • Bug Fixes
    • Fix issue that prevented ejection of external media volumes
    • Fix some cases where the preview window could turn blank
    • Make sure changes to the preview template save to disk immediately
    • Restore functionality of "send post to blog" scripting command

It'll be interesting to see if the 64-bit change makes it a little more responsive at times. That's about my only complaint. It'll get there, and it's a great app.

[9/15] UPDATE: seems there was a single, nasty bug in 3.1, and so they released 3.1.1. Seems to be a lot of that going around these days (Transmit).

Cleaning Up Ticker Plant Client

Tuesday, September 14th, 2010

This morning I've been cleaning up my ticker plant client - making the ZeroMQ listener of messages smart enough so that when the last listener is removed, the ZMQ socket is torn down thus shutting off the reliable multicast stream of messages that no one is listening for any more. It's a bunch of little details like that - things that will make the code nicer, but not really any less functional for the initial testing and rollout.

It's the "second pass" on the code - adding all those touches you'd expect to see, but aren't required in the first workable cut of the code.

Finished Adding Conflation to all Necessary Endpoints

Friday, September 10th, 2010

GeneralDev.jpg

I'm relieved to know that I have the conflation queues and good byte-level queues in all the necessary endpoints in my communications library. The ZMQ receiver and transmitter have them, as do the TCP client and client proxy. This is pretty much all that's needed - adding them where they aren't necessary just loads the system with threads processing data as fast as possible. It gets inefficient.

But for now, I have them in and things are looking and running well. One step closer to a full-up test. Now I need to start pulling these together into more complex components and then testing box-to-box performance. It's getting there.

iTerm2 – Nice Fork of a Good Terminal App

Friday, September 10th, 2010

iTerm2

This morning I saw that iTerm had been forked and someone was working on it again. It's been stale for quite a while, and with the additions to Mac's Terminal.app, there's reason to believe they just stopped work on it. But others picked up the torch.

I like the changes they have made in the fork, and we'll see how it goes in the coming weeks. But good enough. Gotta love the open source possibilities.

Unifying Caching and Building Conflation Queues

Thursday, September 9th, 2010

GeneralDev.jpg

Today I wanted to just build a conflation queue and integrate it into a few of my endpoints in the message flow so that slow consumers, or very fast producers, don't overwhelm the system. It's a simple design: a queue and a map, linked so the map can quickly tell the insert routine whether the message already exists in the queue, in which case just the contents are replaced. It's standard stuff, but there are always a few tricks.

In my case, I wanted the uniqueness of the messages dictated by the type, an unsigned 8-bit integer, and a conflation key, a 64-bit unsigned integer returned by each message indicating its uniqueness within that family of messages. I decided to use a simple STL std::pair, as it could be used in the std::deque as well as in the std::map as a key.

When I got into the code, though, I realized that there were plenty of places I wasn't doing any kind of buffering - specifically, the TCP endpoints. So I had to go into those components and put in a simple byte-level buffer with an additional thread for de-spooling. That took time. Then I saw that there were a few more that needed some work, and pretty soon the entire morning was gone.

Finally, though, I got all the components working and testing just fine - which wasn't as easy as it used to be, with additional threads running around. In fact, it was a mess of segmentation faults until I changed the way I was terminating the unspooling threads. But I got it all going.

The afternoon was devoted to getting the conflation queue working and tested out. It's nice, and it should be pretty fast, but I'll have to wait and see how it tests out for performance when it's getting hit with some of these exchange feeds. But hey, that's the point - we need to be able to decouple the producers from consumers with this conflation queue.

Tomorrow is going to be putting these into some components.

Finished the Good, Fast, Message Cache

Wednesday, September 8th, 2010

Professor.jpg

This morning I was able to finish up the message cache I started work on yesterday afternoon. It was pretty straightforward right up until I got to the part about being thread-safe and fast at the same time. That meant lockless data structures, and that meant compare-and-swap operations.

Since I have only one producer and many consumers, you might think I didn't have to worry too much about this. Well... that's a mistake. I do. But thankfully, there's only one place I need to worry about the complexities of this map - in the "setter" of the individual value.

The code I ended up with looks like this:

  boost::unordered_map<uint64_t, Message *>   mCache;
 
  // try to swap in the new message; if a racer beat us to the slot,
  // pick up the value they left and try again
  Message   *oldMsg = mCache[key];
  while (!__sync_bool_compare_and_swap(&(mCache[key]), oldMsg, newMsg)) {
    oldMsg = mCache[key];
  }
  // now handle the old message we swapped out
  if (oldMsg != NULL) {
    delete oldMsg;
    oldMsg = NULL;
  }

where the value newMsg is the value to place in the map at the key of key.

Sure, this is greatly simplified, but the single core issue is there: we have to check whether the value has changed under us, and if it has, grab the new value and try to set ours again. The idea is that it's thread-safe - not that the element in the map is "right" in any absolute sense, but that it's the one that was put there last.

This is working great for me, and while I haven't had the chance to run tests against it, it's as fast as I'm going to be able to get, regardless of whether it's fast enough. I think it will be. I've used boost::unordered_map, which is supposed to be faster than the STL std::map with its red/black tree over the key space, and since my key space is a uint64_t, it's pretty easy to hash that guy - it's just the value.

I'm going to have to test this guy with the real data feeds, but I think I'm going to be OK. I feel good about it.