Archive for the ‘Coding’ Category

Google Chrome dev 21.0.1180.11 is Out

Tuesday, June 26th, 2012

Google Chrome

This morning I noticed that yesterday, while I was at the interviews, the Google Chrome team released 21.0.1180.11 to the dev channel. The changes are sounding pretty routine these days: the new V8 javascript engine 3.11.10.12, more Retina (HiDPI) enhancements for the new MacBook Pros, and several other crash fixes. Not bad for an update. I'm pleased that they are keeping the speed up after those few sluggish releases, so we'll see what they have planned for the 22.x series.

Getting ZeroMQ 3.2.0 Compiling on Mac OS X 10.7 Lion

Wednesday, June 20th, 2012

ZeroMQ

This afternoon I decided that maybe it was time to see if I could get ZeroMQ built and running on my MacBook Pro running OS X 10.7 as well as on my Ubuntu 12.04 laptop. I'm thinking it might be nice to write a few little test apps again with the latest ZeroMQ APIs between the machines to make sure that I have everything I need - should it come to the point that I need to implement a little ZeroMQ action in DKit, or some other library.
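
The kind of test app I'm thinking of is nothing fancy - just a little REQ-side ping to a REP socket on the other box, using the plain C API. Something like this rough sketch (the address is made up, and this isn't anything out of DKit):

  #include <zmq.h>
  #include <stdio.h>

  int main()
  {
    // connect a REQ socket to a REP socket listening on the other machine
    void    *ctx = zmq_ctx_new();
    void    *req = zmq_socket(ctx, ZMQ_REQ);
    zmq_connect(req, "tcp://192.168.1.50:5555");

    // send a tiny request and wait for the echo to come back
    zmq_send(req, "ping", 4, 0);
    char    buf[16];
    int     n = zmq_recv(req, buf, sizeof(buf), 0);
    printf("got %d bytes back\n", n);

    zmq_close(req);
    zmq_ctx_destroy(ctx);
    return 0;
  }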

The first step is downloading it from the ZeroMQ site. I picked the POSIX tarball as it's the one that ships with the ./configure script already generated, and I needed that in order to get things kicked off.

Next, we try to build it on OS X 10.7 and Ubuntu 12.04. There are a few changes that have to be made to the OpenPGM code in order for it to compile on OS X. They basically amount to adding the includes it needs and not allowing duplicate definitions of a few structures.

In ./foreign/openpgm/build-staging/openpgm/pgm/include/pgm/in.h :

Replace:

  /* sections 5 and 8.2 of RFC 3768: Multicast group request */
  struct group_req
  {
    uint32_t gr_interface;                /* interface index */
    struct sockaddr_storage gr_group;     /* group address */
  };

  struct group_source_req
  {
    uint32_t gsr_interface;               /* interface index */
    struct sockaddr_storage gsr_group;    /* group address */
    struct sockaddr_storage gsr_source;   /* group source */
  };

with:

  #ifndef __APPLE__
  /* sections 5 and 8.2 of RFC 3768: Multicast group request */
  struct group_req
  {
    uint32_t gr_interface;                /* interface index */
    struct sockaddr_storage gr_group;     /* group address */
  };

  struct group_source_req
  {
    uint32_t gsr_interface;               /* interface index */
    struct sockaddr_storage gsr_group;    /* group address */
    struct sockaddr_storage gsr_source;   /* group source */
  };
  #endif // __APPLE__

In ./foreign/openpgm/build-staging/openpgm/pgm/sockaddr.c :

Replace:

  #include <errno.h>
  #ifndef _WIN32
  # include <sys/socket.h>
  # include <netdb.h>
  #endif

with:

  #include <errno.h>
  /* Mac OS X 10.7 differences */
  #ifdef __APPLE__
  # define __APPLE_USE_RFC_3542
  # include <netinet/in.h>
  #endif
  #ifndef _WIN32
  # include <sys/socket.h>
  # include <netdb.h>
  #endif

In ./foreign/openpgm/build-staging/openpgm/pgm/recv.c :

Replace:

  #include <errno.h>
  #ifndef _WIN32

with:

  #include <errno.h>
  /* Mac OS X 10.7 differences */
  #ifdef __APPLE__
  # define __APPLE_USE_RFC_3542
  # include <netinet/in.h>
  #endif
  #ifndef _WIN32

The final change is to the ZeroMQ source itself:

In ./src/pgm_socket.cpp :

Remove lines 88-92 to make this:

  pgm_error_t *pgm_error = NULL;
  struct pgm_addrinfo_t hints, *res = NULL;
  sa_family_t sa_family;

  memset (&hints, 0, sizeof (hints));
  hints.ai_family = AF_UNSPEC;
  if (!pgm_getaddrinfo (network, NULL, &res, &pgm_error)) {

look like this:

  pgm_error_t *pgm_error = NULL;
  if (!pgm_getaddrinfo (network, NULL, addr, &pgm_error)) {

At this point, we can get ZeroMQ to compile on Mac OS X 10.7 as well as Ubuntu 12.04. But there's a slight wrinkle… while I'm fine with the linux library being a 64-bit only architecture:

  drbob@mao:~/Developer/zeromq-3.2.0$ file src/.libs/libzmq.so.3.0.0
  src/.libs/libzmq.so.3.0.0: ELF 64-bit LSB shared object, x86-64, version 1 (GNU/Linux),
    dynamically linked, BuildID[sha1]=0x71a160f17833128c864811b25942cdacdb54f6d0, not stripped

I'd really like the Mac OS X 10.7 dynamic library to be Universal, with both 32-bit and 64-bit architectures in it. Currently, it's 64-bit only:

  peabody{drbob}82: file src/.libs/libzmq.3.dylib
  src/.libs/libzmq.3.dylib: Mach-O 64-bit dynamically linked shared library x86_64

Hmmm… it's really amazing how some folks choose to write Makefiles in a way that doesn't allow you to use multiple -arch arguments. There really aren't all that many ways to mess this up, but it seems the OpenPGM and ZeroMQ guys have managed it pretty thoroughly. I can't simply remove the offending compiler flags and add in the necessary -arch i386 -arch x86_64. So I have to build it twice: once for i386 and again for x86_64, and then use lipo to stitch them together.

In the Makefiles, I added the -arch i386 flag to the CFLAGS and CPPFLAGS variables after successfully building the 64-bit version of the libraries (and copying those 64-bit .dylibs up out of .libs so they wouldn't get clobbered). I then did a simple:

  $ make clean
  $ make

and when I looked at the resulting libraries they were 32-bit. I then created the Universal binaries with a couple of lipo commands, stitching the saved 64-bit copies together with the freshly built 32-bit ones in .libs:

  $ lipo -create libzmq.3.dylib .libs/libzmq.3.dylib -output .libs/libzmq.3.dylib
  $ lipo -create libpgm-5.1.0.dylib .libs/libpgm-5.1.0.dylib -output \
    .libs/libpgm-5.1.0.dylib

Now I can copy these away and install them in the right location for usage. Typically, I'll do a make install to place everything into /usr/local/, then copy these Universal binaries over the installed single-architecture ones, and be ready to roll.

UPDATE: I can get ZeroMQ to work with the simple transports, but the OpenPGM encapsulated transport fails miserably. I'm looking at the code and I'm becoming convinced that no one is really using or testing this code. It's a mess and there's no way some of these method calls are going to work. It's all in the PGM portion, so if they stay away from that, it's fine. But that's not good enough for me. So I'm giving up for now. I have asked a few things in IRC, but there aren't any responses (yet), and I don't expect to get any about the OpenPGM part.

I think it's dead, and I need to just move on. Sad to see, but it happens.

[6/21] UPDATE: after a lot of work trying to get the OpenPGM 5.1.118 code to work, I see that the developer knows about the problems, and is just not ready to support Mac OS X 10.7 at this time. I can respect that, but the things he had to have done to make his code this non-standard must be pretty wild. So there's nothing I can do at this point. OpenPGM is a deal-breaker, and that's the entire reason I wanted ZeroMQ.

Appreciating Ubuntu 12.04

Wednesday, June 20th, 2012

Ubuntu Tux

This morning I was once again updating my Ubuntu 12.04 laptop and realized that were it not for the crummy hardware platform - the trackpad is horrible and the display is very cheap - this would be a really nice laptop/development box. It's got all the tools I could ask for, it's got some wonderful fonts for the Terminal and such, it's got Google Chrome for all the web stuff… it's really pretty slick.

I gotta tip my hat to the Ubuntu folks. This is a much better distro than RedHat/Fedora was. I'm guessing Fedora is getting better, and it's probably as nice as you need, but having updates that are easy to run - and on by default - is really nice. It's all right there.

Certainly, I'm not giving up my MacBook Pro anytime soon, but I've looked at BestBuy, and you can get decent Wintel hardware for $700 to run this on… all of a sudden, it's something I might actually carry from time to time. It would certainly be nicer to work with than this machine's crummy trackpad and display.

It's a great complement to the MacBook Pro. Nice to have.

Google Chrome dev 21.0.1180.0 is Out

Wednesday, June 20th, 2012

Google Chrome

This morning I noticed that Google Chrome dev 21.0.1180.0 was out with several changes for the Retina MacBook Pro as well as the latest V8 javascript engine. These are really nice updates, but what I noticed right off is that the rendering mistake in the previous version left a slight tell-tale horizontal line in the background every "page" or so. It wasn't horrible, and wasn't even directly repeatable all the time, but it was somewhere on the page, and it was enough that it made me wonder if my machine was bad.

So it seems to be gone, and that's great news - as are all the other fixes. These HiDPI changes are nice for a lot of people getting the new MacBook Pros, and one day I'm sure I'll have one too… but it's not where I'd hoped Chrome was adding features right now. But that's OK… it'll all work out, I'm sure.

UPDATE: Spoke too soon:

Chrome Render Bug II

Interestingly enough, these lines disappear if I scroll this section out of view, but they return in another location on the page if I just keep scrolling. Nasty bug, but it's in Chrome, and I'm not going to worry too much about them fixing it. They'll get to it - just like they did with the zoom-related rendering of the Google Finance page.

[6/22] UPDATE: this morning I see 21.0.1180.4 is out, and this time they say they fixed several "alignment issues" - I'm hoping that means these lines are gone. They also put in the V8 javascript engine 3.11.10.10, which is nice.

Installing JDK 6 on Ubuntu 12.04

Monday, June 18th, 2012

java-logo-thumb.png

This afternoon I wanted to get a few things going on the new Rackspace Cloud Server Joel had reconfigured to run Ubuntu 12.04. Specifically, I wanted to get the latest JDK 1.6.0 installed on the box as we need that for Mingle, which is the point of all this reconfiguration and such.

As it turns out, it's not that bad - you just have to know where to go, what to get, and what to do with it. Isn't that the same thing with all new software for linux? Yes.

OK, first, get the latest JDK 6 download from the Oracle site. Then it's a simple matter of unpacking it:

  $ chmod +x jdk-6u32-linux-x64.bin
  $ ./jdk-6u32-linux-x64.bin

Then you need to move that to someplace useful. For me, that's /usr/local, and then I make a few symlinks to make upgrading easy:

  $ sudo mv jdk1.6.0_32 /usr/local/
  $ cd /usr/local
  $ sudo ln -s jdk1.6.0_32 jdk1.6
  $ sudo ln -s jdk1.6.0_32 jdk

Then we can put it into the path using the alternatives system on Ubuntu 12.04:

  $ sudo update-alternatives --install /usr/bin/javac javac /usr/local/jdk1.6/bin/javac 1
  $ sudo update-alternatives --install /usr/bin/java java /usr/local/jdk1.6/bin/java 1
  $ sudo update-alternatives --install /usr/bin/javaws javaws /usr/local/jdk1.6/bin/javaws 1

and then set the default JDK (if needed):

  $ sudo update-alternatives --config javac
  $ sudo update-alternatives --config java
  $ sudo update-alternatives --config javaws

Finally, we can verify that the JDK was installed properly:

  $ java -version
  java version "1.6.0_32"
  Java(TM) SE Runtime Environment (build 1.6.0_32-b05)
  Java HotSpot(TM) 64-Bit Server VM (build 20.7-b02, mixed mode)

Added Lots of Docs to SyncKit Project at GitHub

Monday, June 18th, 2012

GitHub Source Hosting

Today I moved my SyncKit project from Codesion (who wanted more than $1000/yr) to GitHub (less than $100/yr), and with that move it made a lot of sense to add the standard GitHub README.md file so that the main page of the repo has some nice introductory documentation. While I was at it I wrote a lot of documentation - including how to set up the server-side box, how to verify the set-up, what the organization of the project is, and how it's all wired up to work. It was a lot of docs for a single day, but it's important to make sure we're ready for any kind of documentation review.

I'm glad to have it on the private GitHub side as well - there are just so many nice things about GitHub, I like to support it and use it when I can.

Built Templated Conflation Queue in DKit

Tuesday, June 12th, 2012

DKit Laboratory

This morning I finished up work on a conflation queue for DKit. The idea is pretty simple - take a queue and a trie, and based on the key value of the elements in the trie, allow updates to values still in the queue while keeping their relative placement, so that when an element is popped off, the latest value is pulled and the element is considered removed from the queue. It's a very common thing to have when processing market data where you don't need every tick, but you do need the very latest information when you can get it. That makes it great for risk systems, but not so great for execution systems - depending on the strategy.

Anyway, the key to all this was really that I had all the elements of a conflation queue in DKit - I just needed to bring them together. For instance, since it's a FIFO queue, we based it off the FIFO superclass, and its template signature supports this:

  namespace dkit {
  /**
   * This is the main class definition. The parameters are as follows:
   *   T = the type of data to store in the conflation queue
   *   N = power of 2 for the size of the queue (2^N)
   *   Q = the type of access the queue has to have (SP/MP & SC/MC)
   *   KS = the size of the key for the value 'T'
   *   PN = the power of two of pooled keys to have in reserve (default: 2^17)
   */
  template <class T, uint8_t N, queue_type Q, trie_key_size KS,
            uint8_t PN = 17> class cqueue :
    public FIFO<T>
  {
  };
  }  // end of namespace dkit

The idea is simple: you have to know the following (there's a little instantiation sketch after the list):

  • What to store - this is the type of data you're going to place in the queue, and if it's a pointer, then the conflation queue is going to automatically destroy the old copies when new values come in and overwrite the old
  • How many to store - this is a maximum number, as we're using the efficient circular queues. In practice, this isn't a real limitation, as memory is pretty cheap and a queue meant to hold 2^17 elements is not all that expensive
  • The type of access - this is so you can control how many producers and how many consumers there are going to be. This is important as you can get higher performance by limiting the producers and consumers.
  • The key size of the trie - this is really what you're going to use to uniquely identify the values you are putting in the queue. If you know how you'll identify them, then you can choose the correct sized key to make that as efficient as possible.
  • (Optionally) The size of the pool of keys - this implementation allows for a set of pooled keys for the queue. This is nice in that you don't have to worry about getting keys into or out of the queue, but in order to be as efficient as possible, it makes sense to have a pool of them around. This optional parameter allows you to specify how many to hold onto at any one time.
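
Putting those template parameters together, a minimal instantiation might look something like the sketch below - the header name and the queue_type value (mp_sc) are my guesses from the SP/MP & SC/MC description, so treat this as illustrative rather than definitive:

  #include <stdint.h>
  #include "cqueue.h"      // wherever the DKit conflation queue header lives

  // a hypothetical element type, keyed on a 64-bit security id
  struct quote {
    uint64_t   security_id;
    double     bid;
    double     ask;
  };

  // 2^17 slots, multi-producer/single-consumer access, 64-bit conflation
  // keys, and the default pool of 2^17 spare keys
  dkit::cqueue<quote *, 17, dkit::mp_sc, dkit::uint64_key>   tickQueue;

With something like this, pushing a second quote for the same security_id simply updates the value already sitting in the queue rather than adding a second element.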

Things went together fairly well because I had all the components: the different queues, even how to use the different access types in the pool. I had the pool, and so it was just a matter of putting things together and testing them out.

One thing I did find out was that when I call key_value() I'm not sure what exactly I'm going to be getting back. If we assume that we're using a key structure like this:

  struct key_t {
    uint8_t    bytes[KS];
  };

(the reason being that we want to be able to create and destroy these without having to use the array form of the delete operator), then we can't simply do this:

  key_t      *key = _pool.next();
  if (key == NULL) {
    throw std::runtime_error("Unable to pull a key from the pool!");
  }
  // copy in the value for this element
  *(key->bytes) = key_value(anElem);

because the compiler is going to think that the LHS is just a uint8_t and not a series of bytes, capable of holding whatever is returned from key_value(). We also can't do this:

  // copy in the value for this element
  memcpy(key->bytes, key_value(anElem), eKeyBytes);

because the return value of key_value() is a value and not a pointer. So we have to be a little more involved than this. What I decided on was to use the fact that the compiler will pick the right overload of a method or function, so I added setters to the key:

  struct key_t {
    uint8_t    bytes[KS];
    // these are the different setters by size
    void set( uint16_t aValue ) { memcpy(bytes, &aValue, 2); }
    void set( uint32_t aValue ) { memcpy(bytes, &aValue, 4); }
    void set( uint64_t aValue ) { memcpy(bytes, &aValue, 8); }
    void set( uint8_t aValue[] ) { memcpy(bytes, aValue, eKeyBytes); }
  };

and with this, I can say:

  key_t      *key = _pool.next();
  if (key == NULL) {
    throw std::runtime_error("Unable to pull a key from the pool!");
  }
  // copy in the value for this element
  key->set(key_value(anElem));

and the compiler will pick the right set() method based on the return type of the key_value() function the user provides. This is not as nice as simply copying bytes, as there's a call here, but it's not horrible, either. I need it to keep the code simple and make it work.
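
To make that concrete, the key_value() the user provides is just an overloaded free function for their element type - something along these lines for the hypothetical quote struct sketched earlier (the exact signature DKit expects is the bit to double-check):

  // returning a uint64_t means key->set(key_value(anElem)) resolves to the
  // set(uint64_t) overload above and copies 8 bytes into the key
  inline uint64_t key_value( const quote *aQuote )
  {
    return aQuote->security_id;
  }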

Other than that, things went together very well. The tests are nice, and it's all ready to be hammered to death in a real setting. I'm so happy to have gotten to this point in DKit. These things took me a long time to get right before, those earlier versions aren't nearly as nice as what I've got now, and these will be out there for me to use no matter what. That's a very nice feeling.

iPhoto, iMovie, Thunderbolt – Lots of Updates from Apple

Tuesday, June 12th, 2012

Software Update

This morning I saw that there were updates to iPhoto, iMovie, Thunderbolt drivers, and even the AirPort stuff… so of course I had to update all that. This is nice, but I'm not really using iPhoto all that much, and I haven't had the time or material to really get into iMovie, so the updates are really just so-so to me. But what wasn't so-so was the reboot.

For the first time ever, when I logged back in, BBEdit was back in the right place on Spaces. Xcode was too. The only exception was MacVim. Everything else was back in place and just where I left it. This was awesome!

The updates in OS X Lion that started this "clean reboot" were really nice, but a few apps never really worked with it, and BBEdit was one of them. It'd restart, but it wouldn't put the windows back where they "came from". It wasn't a horrible issue, but there were just a few apps that didn't do it right. Well… no more.

Now BBEdit and Xcode are back where they were, and the only thing I have to do is get back into the directories I was in with my Terminal.app sessions. Too bad they don't save that, but who knows? Maybe they will in 10.8?

Google Chrome dev 21.0.1171.0 is Out – Fixes Finance Rendering

Tuesday, June 12th, 2012

Google Chrome

This morning I saw that there was an update to Google Chrome dev, to 21.0.1171.0, and I needed to get it and check whether they had gotten to my bug report on the rendering of the Google Finance page. As I pulled it up, I was very happy to see that they had, indeed, fixed the problem. It seemed to be related to the zoom-out level, but I can't be sure. In any case, it's nice to see the page rendered back to the way it was.

The release notes don't say a lot, but that's OK. The proof is in the rendering.

Adding a Flexible Sized Key Templated Trie to DKit

Saturday, June 9th, 2012

DKit Laboratory

Today I spent a good chunk of time writing a 64-bit key trie as the next basic component in re-writing the exchange feeds with a much better component library than I had previously written. The purpose of the trie is to have a lockless storage mechanism with very fast insertion and access methods. The last version I'd written was based on a specific element being stored - a pointer to a Message. In this version I wanted to be able to store any template type - including pointers - and handle the disposal as needed.

I've made adjustments like this for the dkit::pool where I handled plain-old-datatypes as well as pointers, and disposed of the pointer values properly. I needed to be able to do the same thing for the trie.

The second improvement this time around is to stop trying to make the trie do everything associated with processing its contents. In the past, I'd had the trie scan the Messages for symbol limits, values, etc. This was very inefficient and very specific, since it was all baked into the trie. I needed to move away from that and use functors that could be passed into the trie and applied to each of the valid elements. This is a far better scheme, as it lets the user run whatever processing they like on the Nodes in the trie - much more flexible than trying to implement those methods in the trie itself.

The solution was to define a class within the templated trie so that I could create functors like this:

  class counter : public dkit::trie<blob *, dkit::uint64_key>::functor
  {
    public:
      counter() : _cnt(0) { }
      virtual ~counter() { }
      virtual bool process(
          volatile dkit::trie<blob *, dkit::uint64_key>::Node & aNode )
      {
        ++_cnt;
        return true;
      }
      uint64_t getCount() { return _cnt; }
    private:
      uint64_t     _cnt;
  };

They are easy to create, and you can apply them to all the elements in the trie with a simple call:

    dkit::trie<blob *, dkit::uint64_key>   m;    // the trie we're walking
    counter    worker;
    m.apply(worker);

The final improvement was to switch from loop-based indexing into the trie to recursion-based access. There are a few reasons for this, but the biggest are flexibility and code simplicity. When you have a fixed key length and you use looping to traverse the trie, you have a fixed series of nested for loops. This seems all well and good, but it makes for a very rigid structure, and the code isn't all that easy to read.

By going with a better class structure for building the trie, we're able to take advantage of recursion, and then the only parameter controlling the "depth" of the trie - and the corresponding size of the key - is the number of branches used in constructing the tree. If we put that parameter in the creation method, then there's a single point where we stop creating branches and instead make the leaf nodes. From that point on, the same code that traverses a 64-bit key trie works for a 128-bit key trie.

At least insofar as the movement in the trie.

Once I got it all working, the next thing was to compare the recursive design against the looping design - I wanted to make sure I had the most performant one. What I found was that the recursion was about 25% faster than the looping structure. I'm guessing it's due to tail-recursion optimization by the compiler, but I'm not positive. I repeated the tests, and the difference is real - that's good enough for me.

Once it was all done, I wondered how hard it might be to make the key size another parameter of the template. Well… since we're using recursion, and therefore the size of the key space is only used in one place, the solution was pretty simple. Start by defining the different sizes we'll accept:

  namespace dkit {
  enum trie_key_size {
    uint16_key = 2,
    uint32_key = 4,
    uint64_key = 8,
    uint128_key = 16,
  };
  }      // end of namespace dkit

and then we convert this to the test we need in the branch creation:

  enum {
    // the step at which we stop creating Branches and create the Leaf instead
    eLastBranch = (N - 2)
  };

In the code, we then have:

  virtual volatile Node *getOrCreateNodeForKey( const uint8_t aKey[],
                                                uint16_t aStep )
  {
    volatile Node   *n = NULL;
 
    // get the index we're working on (re-used a few times)
    uint8_t     idx = aKey[aStep];
    Component   *curr = __sync_or_and_fetch(&kids[idx], 0x0);
    if (curr == NULL) {
      // create a new Branch or Leaf for this part of the trie
      bool    createBranch = true;
      if (aStep < eLastBranch) {
        curr = new Branch();
      } else {
        curr = new Leaf();
        createBranch = false;
      }
      // throw a runtime exception if we couldn't make it
      if (curr == NULL) {
        if (createBranch) {
          throw std::runtime_error("[Branch::getOrCreateNodeForKey] "
                   "Unable to create new Branch for the trie!");
        } else {
          throw std::runtime_error("[Branch::getOrCreateNodeForKey] "
                   "Unable to create new Leaf for the trie!");
        }
      }
      // see if we can put this new one in the right place
      if (!__sync_bool_compare_and_swap(&kids[idx], NULL, curr)) {
        // someone beat us to it! Delete what we just made...
        delete curr;
        // ...and get what is there now
        curr = __sync_or_and_fetch(&kids[idx], 0x0);
      }
    }
 
    // now pass down to that next branch the request to fill
    if (curr != NULL) {
      n = curr->getOrCreateNodeForKey(aKey, (aStep + 1));
    }
 
    // return what we have dug out of the tree
    return n;
  }

This little test allows us to make the key size a parameter of the template, and that makes it very easy to create different sized keys. I also added convenience methods for building the different sized keys from the contained values. It's not strictly necessary, but it'll make using the template class a lot nicer.
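
I haven't shown those convenience methods here, but they're roughly this kind of thing - overloads that copy an integer value into the byte array the trie actually indexes on. The names below are mine, just for illustration:

  #include <stdint.h>
  #include <string.h>    // for memcpy()

  // hypothetical helpers to build a trie key from a contained value -
  // each one just copies the integer's bytes into the key array
  inline void key_bytes( uint16_t aValue, uint8_t aKey[] ) { memcpy(aKey, &aValue, 2); }
  inline void key_bytes( uint32_t aValue, uint8_t aKey[] ) { memcpy(aKey, &aValue, 4); }
  inline void key_bytes( uint64_t aValue, uint8_t aKey[] ) { memcpy(aKey, &aValue, 8); }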