Archive for September, 2010

Shifting Target Once Again

Thursday, September 30th, 2010


Today I did a git pull on the Broker, which another guy had significantly re-written, and ended up spending the bulk of the day updating my code so that the broker client and service parts are working again. It's not ideal, but I can understand all the reasons for the changes, and it's nice to see these kinds of things happening. Every system gets better with a re-write by talented people, and so the system, its protocol, and its capabilities are just getting better.

The consequence is that I have to spend a day doing a pretty significant re-write of my own. Thankfully, my code is pretty clean, but there were still a lot of changes due to the way the protocol changed. In the end, I found a few little issues with the Broker, but those should be easy for him to fix.

Tomorrow is back to testing the ticker plant query system.

C++ Cast Operator Overloading – Watch Out for const

Thursday, September 30th, 2010


I spent far too much time on this problem today, but I've finally gotten it solved, and it's worthy of writing up. No question about it. The problem is in the variant class and the casting operators, or more properly conversion operators, that I put in place for the class.

To start out, it's a simple variant where we have a simple union as the ivar, and as the value type of the variant changes, the different components of the union are set. Pretty standard. What I wanted was to be able to use simple cast operators to get the values out of the variant:

  // set the value to an int
  variant   v = 10;
 
  // use the value
  count += (int) v;

and all was going pretty well when I had the operators for this variant class defined as:

  operator varmap &() const;
  operator varmap *() const;
  operator varlist &() const;
  operator varlist *() const;
  operator int() const;
  operator int64_t() const;
  operator int64_t &();
  operator float() const;
  operator double() const;
  operator double &();
  operator std::string &() const;
  operator std::string *() const;
  operator uuid_t &() const;
  operator uuid_t *() const;
  operator bytes_t &() const;
  operator bytes_t *() const;
  operator error_t &() const;
  operator error_t *() const;

but when I added:

  operator uint8_t() const;
  operator uint8_t &();

things really started to fall apart. Specifically, with the additional casting for (uint8_t), I got compiler warnings about:

  // use the value
  value = (int) v;

saying the compiler could not figure out which one of the casting operators to use. There were several - both references and not. Very confusing. If I tried this:

  // use the value
  value = (int64_t) v;

everything worked fine. Very odd. But I thought, "Hey, it's clearer to have the size in there anyway, let's just get rid of the problem operator." But that's never the real end of the problem, is it?

I next had the problem with:

  // use the value
  value = (float) v;

Same thing. So I really had to solve this. Bummer.

I spent about 90 mins trying all kinds of things, only to be blown away at the real solution: It's the const-ness of the casting operators. Change that, and it's all OK:

  1. operator varmap &() const;
  2. operator varmap *() const;
  3. operator varlist &() const;
  4. operator varlist *() const;
  5. operator int();
  6. operator int64_t() const;
  7. operator int64_t &();
  8. operator uint8_t() const;
  9. operator uint8_t &();
  10. operator float();
  11. operator double() const;
  12. operator double &();
  13. operator std::string &() const;
  14. operator std::string *() const;
  15. operator uuid_t &() const;
  16. operator uuid_t *() const;
  17. operator bytes_t &() const;
  18. operator bytes_t *() const;
  19. operator error_t &() const;
  20. operator error_t *() const;

Note lines 5 and 10 - no const. This allowed the compiler to figure out what it needed and I could put back in the (int) cast operator. What I'm guessing is that the (int) and (float) are really methods that create values as opposed to simply returning references or pointers to the members of the union. As such, making it const was not quite in line with what was happening. By allowing it to be returned as-is, the compiler was happier.

At least that's how I'm figuring it.
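A plausible reading of the overload rules, offered as a sketch rather than a definitive diagnosis: when the variant object itself is non-const, the compiler first compares how well each operator binds to the object, and the non-const operators (the int64_t &, double & and uint8_t & ones) bind better than the const ones. Only when that comparison ties does the conversion from the operator's return type to the target type break the tie. Dropping const from operator int() puts it on equal footing with the reference operators, so its exact return type wins. The MiniVariant type and the two helper functions below are hypothetical stand-ins, not the real variant class.

```cpp
#include <cstdint>
#include <string>

// Hypothetical, cut-down stand-in for the variant class: just enough
// conversion operators to show the const / non-const interaction.
struct MiniVariant {
    int64_t i64;
    double  dbl;
    mutable const char *last;   // records which operator ran (illustration only)

    explicit MiniVariant(int64_t v)
        : i64(v), dbl(static_cast<double>(v)), last("none") {}

    // Non-const, as in the fix: for a non-const object it now ties with the
    // non-const reference operators, and its exact return type wins.
    operator int() { last = "int"; return static_cast<int>(i64); }

    operator int64_t() const { last = "int64_t const"; return i64; }
    operator int64_t &() { last = "int64_t&"; return i64; }
    operator double() const { last = "double const"; return dbl; }
    operator double &() { last = "double&"; return dbl; }
};

// Casts a non-const variant to int and reports which operator was chosen.
std::string which_int(int64_t value) {
    MiniVariant v(value);
    int n = (int) v;
    (void) n;
    return v.last;
}

// Casts a const variant to int64_t; only the const operators are viable.
std::string which_int64_const(int64_t value) {
    const MiniVariant v(value);
    int64_t n = (int64_t) v;
    (void) n;
    return v.last;
}
```

One thing worth checking with this approach: a non-const operator int() cannot be called on a const variant, so an (int) cast there has to go through one of the remaining const operators, or fails to compile if none is a best match. Declaring both a const and a non-const operator int() is one way to cover both cases.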

Building Query into Subscription of Ticker Plant

Wednesday, September 29th, 2010

Today was spent integrating the query capabilities from the Broker and ticker plant into the ticker plant client. The simple subscription is working fine, but when a client really subscribes to the ticker feed, they need to be given any of the messages that are cached on the ticker plants so that if there's a message out there to be seen, the subscriber will see it. Great for off-hours. Anyway, I needed to build this into the system by first putting in the plumbing for the second source of messages - the 'query', and then putting in the capability into the ticker plant to service these queries.

I got a good 90% of the code written today and tomorrow I'll be able to finish it and start testing. Lots of hard coding today.

Office Humor – The Authoritative Stats Quoter

Wednesday, September 29th, 2010


I had something just happen that made me giggle, and I'm betting it's been the target of a Dilbert, or ten: it's the Authoritative Stats Quoter. It goes something like this: You're sitting there, doing your work - like you do, and you overhear someone in a nearby cube or desk saying something like "You know... 82% of cell phones aren't cats." - or something equally funny, incredible, or just plain flat-out crazy.

I had to suppress a giggle. It was funny. But the last time I tried to suppress a giggle when I heard a whopper like this one I got yelled at by the story teller. He said that looping was faster done from higher index numbers to lower index numbers because the subtraction operator is faster than the addition operator. I tried not to laugh, I even brought my water cup to my face to hide behind it, but I think my eyes gave me away.

It's stuff like this that both aggravates me and makes me laugh - it depends on what mood I'm in. Really. It's funny. I assume they really must believe what they are saying, but I can't imagine how. Anyway... I had another one, and it made me giggle.

Good times.

Google Chrome dev 7.0.517.24 is Out

Wednesday, September 29th, 2010

I noticed this morning that Google Chrome dev is now at 7.0.517.24 and while the sum total of the release notes is:

This release focused on resolving minor bug fixes or crashes. More details about additional changes are available in the svn log of all revisions.

it's something that makes sense upgrading. I'm just curious when the Mac client is going to get the hardware acceleration that they are putting into the Windows version? It'd be nice, but it's not bad now... just would love to see more widespread use of the GPU in software... it's a great untapped resource.

Merging Live and Historical Message Streams

Tuesday, September 28th, 2010


Today I've been working on the problem of merging the current tick data stream with the last known ticks for a given message type and instrument. The reason for the merge is that when a user of my ticker plant client (TPClient) subscribes to a certain set of messages, we need to be able to provide him with the last known versions of those messages as well. If the market is closed, this represents his only data. If the market is ticking, then chances are he's going to get a more recent message, which is why we have to properly filter them out.

I've got the filtering worked out because it's pretty easy: look at the type of the message, and then keep a simple std::map of conflation key to timestamp. If a message comes in and we have filtering turned 'on', check the last timestamp we saw for a message like this. If the incoming message is newer, send it. If not, don't. If there's no filtering, send it regardless.
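A minimal sketch of that timestamp check, under a few assumptions: the TickFilter class and shouldSend() method are hypothetical names, the conflation key is taken to be a string that already encodes message type and instrument, timestamps are plain 64-bit integers, and a message with the same timestamp as the last one seen is treated as a duplicate.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical filter: remembers the newest timestamp seen per conflation key.
class TickFilter {
public:
    explicit TickFilter(bool enabled) : mEnabled(enabled) {}

    // Returns true if the message should be passed on to the subscriber.
    bool shouldSend(const std::string &conflationKey, uint64_t timestamp) {
        // filtering turned off: send everything
        if (!mEnabled) {
            return true;
        }
        std::map<std::string, uint64_t>::iterator it = mLastSeen.find(conflationKey);
        if (it == mLastSeen.end()) {
            // first message for this key: remember it and send it
            mLastSeen[conflationKey] = timestamp;
            return true;
        }
        if (timestamp > it->second) {
            // strictly newer than anything seen so far: update and send
            it->second = timestamp;
            return true;
        }
        // same age or older (assumed to be a stale cached copy): drop it
        return false;
    }

private:
    bool                            mEnabled;
    std::map<std::string, uint64_t> mLastSeen;
};
```

With this shape, the cached "last known" message and the live stream can both be pushed through the same shouldSend() call, and whichever copy is older is dropped.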

This allows the client to decide if they want to see the messages, and filter them on their end, or allow the TPClient to handle it for them. It's all pretty simple.

What's proving to be a touch more difficult is supplying the cached messages to the filter. It makes the most sense to have the messages pulled from the cache when a subscription is done. It also makes sense to have the criteria be the same. It's also been established that this request will be a 'call' to an MMD Service that will be the QuickCache within the ticker plant's exchange feeder. What's not so clear to me now is how I'm going to map the request to the MMD service, and how I'm going to get that data back into the TPClient's filter's onMessage() method. But I know I have to do just that.

I'll tackle that tomorrow.

VoodooPad Pro 4.3 is Out

Tuesday, September 28th, 2010

I got a tweet this afternoon that said VoodooPad Pro 4.3 is Out. I hit the 'update' button and started the download. I use it all the time now for holding all kinds of data on projects and things I document. It's a wiki, sure, but it's a Mac Wiki and not a generic one. It's far slicker than the web ones I've used, and that makes all the difference in ease of use to me.

The release notes are pretty extensive, and while I didn't see anything earth shaking to me, I'm sure all the memory leaks and general fixes are going to be things I appreciate over the weeks and months to come. Sweet.

Limitations on the STL std::map and Finding Keys

Tuesday, September 28th, 2010

I've been working with the STL std::map for quite a while, but recently I used it as the core of a data structure where I wasn't finding an exact match - I was looking for a "lower bound" on the key to see where I needed to start looking for a match to the data. It's probably easier to start from the beginning. The data I'm dealing with is a std::map where the key is a uint32_t and the value is a boost::tuple of a uint32_t, another uint32_t, and a std::string. Like this:

  typedef boost::tuple<uint32_t, uint32_t, std::string> Channel;
  typedef std::map<uint32_t, Channel> ChannelMap;

Where the data looks something like this:

  Key         Value
  0x00000000  0x00000000, 0x01ffffff, "first"
  0x02000000  0x02000000, 0x02ffffff, "second"
  0x03000000  0x03000000, 0x03ffffff, "third"
  0x04000000  0x04000000, 0x04ffffff, "fourth"

where the 'key' is the first value of the tuple, and the two numeric values in the tuple form an arithmetic range for a uint32_t. What I need to do is to take an arbitrary uint32_t value and find the string that it fits with. If it's less than the least range there's nothing, and if it's greater than the last, it's nothing.

The problem was that I was using STL's lower_bound() and assuming that it was going to give me what I thought was the "lower bound" of the value in the keyspace. But it doesn't. What lower_bound() returns is:

Finds the first element whose key is not less than the argument

Simply put, it finds the first key that is greater than or equal to the argument. So this is almost the "upper bound" in my book. But there's more.

The upper_bound() method returns:

Finds the first element whose key greater than the argument

which is no better. What I wanted was something like: the largest key less than or equal to the argument. Put that way, I'm not really surprised that I didn't find it. So how do I make it out of the methods I have?

What I needed to do was to look at values of both of these functions and try to make some sense of it. So I made the following test code:

  #include <iostream>
  #include <string>
  #include <map>
  #include <stdint.h>
 
  int main(int argc, char *argv[]) {
    std::map<uint32_t, uint32_t>   m;
    for (uint32_t i = 10; i <= 100; i += 10) {
      m[i] = i + 1;
    }
    std::map<uint32_t, uint32_t>::iterator    it;
    // print out the entire map
    for (it = m.begin(); it != m.end(); ++it) {
      std::cout << "m[" << it->first << "] = " << it->second << std::endl;
    }
    // now check the lower_bound and upper_bound methods
    it = m.lower_bound(45);
    std::cout << "lower_bound(45): " << it->first << " == " << it->second << std::endl;
    it = m.upper_bound(45);
    std::cout << "upper_bound(45): " << it->first << " == " << it->second << std::endl;
 
    it = m.lower_bound(50);
    std::cout << "lower_bound(50): " << it->first << " == " << it->second << std::endl;
    it = m.upper_bound(50);
    std::cout << "upper_bound(50): " << it->first << " == " << it->second << std::endl;
 
    return 0;
  }

which returns:

  m[10] = 11
  m[20] = 21
  m[30] = 31
  m[40] = 41
  m[50] = 51
  m[60] = 61
  m[70] = 71
  m[80] = 81
  m[90] = 91
  m[100] = 101
  lower_bound(45): 50 == 51
  upper_bound(45): 50 == 51
  lower_bound(50): 50 == 51
  upper_bound(50): 60 == 61

From this, it seems like the two really aren't all that different - and they aren't. What's important to see, though, is that the greatest key less than or equal to the argument is simply the element just before the first key that is strictly greater. With that, I tried using the upper_bound() and then backing off one:

  #include <iostream>
  #include <string>
  #include <map>
  #include <stdint.h>
 
  int main(int argc, char *argv[]) {
    std::map<uint32_t, uint32_t>   m;
    for (uint32_t i = 10; i <= 100; i += 10) {
      m[i] = i + 1;
    }
    std::map<uint32_t, uint32_t>::iterator    it;
    // print out the entire map
    for (it = m.begin(); it != m.end(); ++it) {
      std::cout << "m[" << it->first << "] = " << it->second << std::endl;
    }
    // now check the lower_bound and upper_bound methods
    it = m.lower_bound(45);
    std::cout << "lower_bound(45): " << it->first << " == " << it->second << std::endl;
    it = m.upper_bound(45);
    std::cout << "upper_bound(45): " << it->first << " == " << it->second << std::endl;
 
    it = m.lower_bound(50);
    std::cout << "lower_bound(50): " << it->first << " == " << it->second << std::endl;
    it = m.upper_bound(50);
    std::cout << "upper_bound(50): " << it->first << " == " << it->second << std::endl;
 
    it = m.upper_bound(45);
    --it;
    std::cout << "--upper_bound(45): " << it->first << " == " << it->second << std::endl;
    it = m.upper_bound(50);
    --it;
    std::cout << "--upper_bound(50): " << it->first << " == " << it->second << std::endl;
 
    return 0;
  }

which returns:

  m[10] = 11
  m[20] = 21
  m[30] = 31
  m[40] = 41
  m[50] = 51
  m[60] = 61
  m[70] = 71
  m[80] = 81
  m[90] = 91
  m[100] = 101
  lower_bound(45): 50 == 51
  upper_bound(45): 50 == 51
  lower_bound(50): 50 == 51
  upper_bound(50): 60 == 61
  --upper_bound(45): 40 == 41
  --upper_bound(50): 50 == 51

If I was careful and checked for the limits, I think I'd have something. So that's exactly what I did.

My final code for finding the string in the tuple looks something like this:

  const std::string ZMQChannelMapper::getURL( const MessageMapCode aCode )
  {
    std::string     url;
 
    if (!mChannelMap.empty()) {
      // find the first range that starts above the code...
      ChannelMap::iterator  itr = mChannelMap.upper_bound(aCode);
      // ...and back off one to the range that starts at or below it.
      // This also covers itr == end(), where the last range in the
      // table is the candidate. If itr == begin(), every range starts
      // above the code and there's nothing to match.
      if (itr != mChannelMap.begin()) {
        --itr;
        // from this starting point, check the range for a match
        Channel & tupleInfo = (*itr).second;
        if ((tupleInfo.get<0>() <= aCode) &&
            (aCode <= tupleInfo.get<1>())) {
          url.append(getBaseURL());
          url.append(tupleInfo.get<2>());
        }
      }
    }
 
    // all done - return what we have
    return url;
  }

It works, and it's OK, but it's clear why they didn't make something like this in STL - far too specialized. But I have it now.
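For what it's worth, the "largest key less than or equal to" lookup can be pulled out into a small generic helper. This is only a sketch: floorEntry(), the Range struct and lookupName() are hypothetical names, and a plain struct stands in for the boost::tuple so the example has no Boost dependency.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Returns an iterator to the entry with the largest key <= `key`,
// or m.end() if every key is greater (which includes the empty map).
template <typename K, typename V>
typename std::map<K, V>::const_iterator
floorEntry(const std::map<K, V> &m, const K &key) {
    // upper_bound() gives the first key strictly greater than `key`
    typename std::map<K, V>::const_iterator it = m.upper_bound(key);
    if (it == m.begin()) {
        // nothing at or below `key`
        return m.end();
    }
    // step back one; this is also valid when `it` is end()
    --it;
    return it;
}

// Hypothetical range table entry: the map key is the start of the range.
struct Range {
    uint32_t    last;   // inclusive end of the range
    std::string name;
};

// Finds the name of the range containing `code`, or "" if there is none.
std::string lookupName(const std::map<uint32_t, Range> &table, uint32_t code) {
    std::map<uint32_t, Range>::const_iterator it = floorEntry(table, code);
    if (it != table.end() && code <= it->second.last) {
        return it->second.name;
    }
    return std::string();
}
```

The case to watch is a code that sits past the last key but still inside the last range: upper_bound() returns end() there, and stepping back from end() lands on the last entry, which is exactly the candidate to check.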

Fantastic Advice for All Developers – Probably Everyone

Tuesday, September 28th, 2010

I found this on a tweet I got this morning and I love this guy's approach to using standards. In fact, I think it can be applied to a far greater range of things than standards. It's probably really good advice for life:

  1. If it hurts when you do it, stop doing it.
  2. Shut up and eat your vegetables.
  3. Assume people have common sense.

Golden Advice, people. Can't get much better than this.

If it hurts, then stop doing it. Figure out a better way. Realize that the pain is telling you something. Don't be a lemming. Find a better solution, a better way. This is the beginning line of virtually every great success story. Just do it.

Shut up and eat your vegetables. Realize there's no free lunch, and you're going to have to do the work. You're going to have to pound the keys. Practice. Effort. That's the only thing that's really going to pay off in the end. Just pipe down and get to it.

Assume people have common sense. If not, don't worry about them - they have far bigger problems than you're going to be able to solve.

Sage advice indeed. Love it.

Finished up the Cache Data Service for Ticker Plant

Monday, September 27th, 2010


Today was a great day for making progress on publishing the cache through the data service. In fact, I got it all done. It's pretty slick and should fit into the other data services of the broker nicely.

I create a subclass of the MMDServiceHandler which is spawned off of the MMDService for each call to bind() that the MMDService receives. It's a classic controller/worker breakdown where I've put a lot of the smarts of the workers in the abstract base class - MMDServiceHandler, and then create subclasses for the 'basic' data handler and the 'cache' handler. I'd tackled the first one earlier, so today it was time to hit the latter.

The cache is a lockless single-producer, multiple-consumer cache of the last tick message the feed has produced for that message type and a particular conflation key to make sure we keep "unique" values, but not duplicates of those "unique" values. It's standard stuff for a ticker plant. The point is that we need to scan the cache - no two ways about it.

So I created the different query schemes and then implemented them, realizing that we don't have to worry as much about performance here: this is going through the Broker, so it's meant for the slower data consumers. Our big concern is not to slow down the feed by locking anything. Good enough.

I found out that I needed to return the data in two ways: objects and a map of ivar names/values for the cross-platform crowd. To accomplish this, I had to add a getMapData() method to all the messages and build it up from the bottom. It just took a little time, but in the end it's a solid way of allowing easier access to the Java and Python clients as they all have maps and I don't have to worry about making real objects of the messages. (Note: I am making them in Java, but it's nice not to have to mess with the Python client)
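One way to read "build it up from the bottom" is that the base message contributes its own fields and each subclass adds to the map it gets back. The sketch below assumes that reading. The Message and Trade types, their fields, and the toText() helper are all hypothetical, and values are flattened to strings only to keep the example self-contained; the real code presumably uses the Broker's own map and variant types.

```cpp
#include <cstdint>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: renders any streamable value as text.
template <typename T>
std::string toText(const T &value) {
    std::ostringstream os;
    os << value;
    return os.str();
}

// Hypothetical base message: contributes the fields every message has.
struct Message {
    uint64_t timestamp;

    virtual ~Message() {}

    virtual std::map<std::string, std::string> getMapData() const {
        std::map<std::string, std::string> data;
        data["timestamp"] = toText(timestamp);
        return data;
    }
};

// Hypothetical subclass: starts from the base map and adds its own ivars.
struct Trade : public Message {
    std::string symbol;
    double      price;
    uint32_t    size;

    virtual std::map<std::string, std::string> getMapData() const {
        std::map<std::string, std::string> data = Message::getMapData();
        data["symbol"] = symbol;
        data["price"]  = toText(price);
        data["size"]   = toText(size);
        return data;
    }
};
```

A client that only understands maps can then read the keys it cares about without needing a matching class on its side.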

The last thing I had to do was to glue this service into the Ticker Plant such that the feed exposed its cache so the service could bind() it to the Broker. Not bad at all, and very slick. Really nice day today.