Now that I have good results with the UDP-based delivery protocol, I thought I'd take the time to go back and see if the master branch of ZeroMQ on GitHub was working yet. The problem I've had in the past is that some significant changes in ZeroMQ after the 2.1.0 release broke the OpenPGM protocols. I've been working with the maintainers to try to fix these things, but the best I can do is test, as I haven't dug into the code and unraveled all the changes they made.
Last time I checked, it was still broken, and then I started on the UDP-based transport and haven't looked back - until now. Updating and rebuilding is pretty easy:
$ git pull
$ ./autogen.sh
$ ./configure --with-pgm
$ make clean
$ make
and then in src/.libs I have all the libraries I need. I simply put that directory into LD_LIBRARY_PATH ahead of the installed libraries and I'm in business.
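For anyone following along, putting the fresh build ahead of the installed libraries might look something like the sketch below. The checkout location in ZMQ_SRC is purely an assumption for illustration; point it at wherever your ZeroMQ source tree actually lives.

```shell
# Hypothetical checkout location (an assumption, not from the original setup);
# override by setting ZMQ_SRC before running this.
ZMQ_SRC="${ZMQ_SRC:-$HOME/src/zeromq}"

# Prepend the freshly built libraries so the dynamic linker searches them
# before any installed copy. The ${VAR:+...} form only adds the separating
# colon when LD_LIBRARY_PATH already has a value, which avoids a stray
# trailing colon.
export LD_LIBRARY_PATH="$ZMQ_SRC/src/.libs${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

# Show the resulting search path.
echo "$LD_LIBRARY_PATH"
```

On Linux, running ldd against the application binary in the same shell should then list the libzmq from src/.libs rather than the installed one, which is a quick way to confirm the right build is being picked up.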
I'm very pleased to say that as of this morning, it's working again! Yup, working like a champ. I'm not positive what all the changes include, but they are significant, and possibly much needed. There's still the issue of everything being so "hidden" in ZeroMQ, but that's nothing new. It is, however, a downside now that we have a competing transport we can use that's not so heavily encapsulated.
Still... I haven't made the UDP transport reliable, and I know that's needed. So the question will be, what's the effect of swapping out the UDP-based transport for the latest ZeroMQ transport? I'll have to wait until everyone is happy with the UDP-based transport, and then swap the ZeroMQ one back in and see the difference.
If I had to bet, I'd say the difference will be minimal. I believe all the problems were really mine. Humbling to admit, but I believe it's the case.