Debugging Some Nasty Problems in Pair Programming Style

This afternoon I had the most horrid debugging session in recent memory. It was really just that bad. The problem was that the code the new guy was writing was giving segmentation faults when my code wasn't. Additionally, there seemed to be a serious data problem in the information we were getting from another group's code. Not fun.

The first problem I ran into was the formatting of a timestamp into a human-readable string. The original code was:

  std::string formatTimestamp( uint64_t aTime )
  {
    char   buf[32];
    // see if it's w.r.t. epoch or today - we'll format accordingly
    if (aTime > 86400000) {
      // this is since epoch
      time_t  msec = aTime;
      struct tm   when;
      localtime(&msec, &when);
      // now make the msec since epoch for the broken out time
      msec = mktime(&when);
      // now let's make a pretty representation of those parts
      snprintf(buf, 31, "%04d-%02d-%02d %02d:%02d:%02d.%03d",
               when.tm_year, when.tm_mon, when.tm_mday,
               when.tm_hour, when.tm_min, when.tm_sec,
               (int)(aTime - msec));
    } else {
      // this is since midnight - let's break it down...
      uint32_t    t = aTime;
      uint8_t     hrs = t/3600000;
      t -= hrs*3600000;
      uint8_t     min = t/60000;
      t -= min*60000;
      uint8_t     sec = t/1000;
      t -= sec*1000;
      // now let's make a pretty representation of those parts
      snprintf(buf, 31, "%02d:%02d:%02d.%03d", hrs, min, sec, t);
    }
    // ...and return a nice std::string of it
    return std::string(buf);
  }

My problem typically is that I test some things, but not all edge cases. In this case, I hadn't really tested the "since epoch" code, for there's a few doozies in that section.

First, localtime_r() is seconds since epoch, not milliseconds, like I'm assuming. Duh. That means we need to fix that up:

  std::string formatTimestamp( uint64_t aTime )
  {
    char   buf[32];
    // see if it's w.r.t. epoch or today - we'll format accordingly
    if (aTime > 86400000) {
      // this is since epoch
      time_t  sec = aTime;
      struct tm   when;
      localtime(&sec, &when);
      // now make the sec since epoch for the broken out time
      sec = mktime(&when);
      // now let's make a pretty representation of those parts
      snprintf(buf, 31, "%04d-%02d-%02d %02d:%02d:%02d.%03d",
               when.tm_year, when.tm_mon, when.tm_mday,
               when.tm_hour, when.tm_min, when.tm_sec,
               (int)(aTime - sec*1000));
    } else {
      // this is since midnight - let's break it down...
      uint32_t    t = aTime;
      uint8_t     hrs = t/3600000;
      t -= hrs*3600000;
      uint8_t     min = t/60000;
      t -= min*60000;
      uint8_t     sec = t/1000;
      t -= sec*1000;
      // now let's make a pretty representation of those parts
      snprintf(buf, 31, "%02d:%02d:%02d.%03d", hrs, min, sec, t);
    }
    // ...and return a nice std::string of it
    return std::string(buf);
  }

The next one is about the broken out time struct. The year is offset by 1900, and the month offset by 0, so I had to fix those up:

  std::string formatTimestamp( uint64_t aTime )
  {
    char   buf[32];
    // see if it's w.r.t. epoch or today - we'll format accordingly
    if (aTime > 86400000) {
      // this is since epoch
      time_t  sec = aTime;
      struct tm   when;
      localtime(&sec, &when);
      // now make the sec since epoch for the broken out time
      sec = mktime(&when);
      // now let's make a pretty representation of those parts
      snprintf(buf, 31, "%04d-%02d-%02d %02d:%02d:%02d.%03d",
               when.tm_year+1900, when.tm_mon+1, when.tm_mday,
               when.tm_hour, when.tm_min, when.tm_sec,
               (int)(aTime - sec*1000));
    } else {
      // this is since midnight - let's break it down...
      uint32_t    t = aTime;
      uint8_t     hrs = t/3600000;
      t -= hrs*3600000;
      uint8_t     min = t/60000;
      t -= min*60000;
      uint8_t     sec = t/1000;
      t -= sec*1000;
      // now let's make a pretty representation of those parts
      snprintf(buf, 31, "%02d:%02d:%02d.%03d", hrs, min, sec, t);
    }
    // ...and return a nice std::string of it
    return std::string(buf);
  }

Here, now, we finally have something that's right. The second big issue was with the encoding of integers with Google's varint system. In general, it's vey effective, but I was seeing the timestamps coming from a service as twice what they should be. Made no sense whatsoever - until I saw the code.

In order to efficiently compress signed integers, Google came up with the idea of zig-zag encoding. Basically, you sort the integers by their absolute value, negatives first, and then pick the row in the sort as the coded value. Something like this:

Value Code
0 0
-1 1
1 2
-2 3
2 4

From this table, it's easy to see that if you expect to see an encoded unsigned int, as I did, the signed int can look to be twice the size! When I saw they were using signed values where I was expecting an unsigned value, I knew it was the zig-zag encoding and fixed that right up.

These should have taken me about 30 mins to find. As it was, they took hours. Why? The developer that's new to the team wanted to "watch". Now by "watch" they meant "watch a little, but talk a lot", and there's the rub. When I'm debugging, I need quiet. I need to see what's really going on, why the code isn't acting as I expect it, and what assumptions and ideas I had that were wrong. Having a Chatty Kathy sitting next to me was, to say the least, unhelpful. To be accurate, a hinderance.

If I felt they learned anything in the process, I'd settle for that as a positive outcome. But I'm not at all sure they learned a thing. I think they just don't like working alone, and so saw this as an opportunity to have a "working buddy". Well... today is the last day for that. Never again.

Gotta nip this in the bud.