Archive for February, 2011

Twitterrific 4.0.1 is Out

Monday, February 28th, 2011

I got a tweet that Twitterrific 4.0.1 was released to the Mac App Store as well as to us direct customers, and it includes several really nice features and fixes. There are a few more settings and the ability to make the dock icon go away. Not bad. I have to say, it's the Twitter client I'm using, and it's significantly nicer than the official Twitter client, where something about the Growl notifications is broken. Maybe it's the lack of a persistent visual notification of tweets.

I'm glad they are making updates - the old version wasn't nearly so responsive to changes.

ZeroMQ master at GitHub has OpenPGM Working Again

Monday, February 28th, 2011

Now that I have good results with the UDP-based delivery protocol, I thought I'd take the time to go back and see if the master branch of ZeroMQ at GitHub was working yet. The problem I've been having in the past is that there were some significant changes in ZeroMQ post-2.1.0 release that broke the OpenPGM protocols. I've been working with the maintainers to try and fix these things, but the best I can do is test as I haven't dug into the code and unraveled all the changes they made.

Last time I checked, it was still broken, and then I started on the UDP-based transport and haven't looked back - until now. It's pretty easy:

  $ git pull
  $ ./autogen.sh
  $ ./configure --with-pgm
  $ make clean
  $ make

and then in src/.libs I have all the libraries I need. I simply put them into the LD_LIBRARY_PATH before the installed libraries and I'm in business.

I'm very pleased to say that as of this morning, it's working again! Yup, working like a champ. I'm not positive what all the changes include, but they are significant, and possibly much needed. There's still the issue of everything being so "hidden" in ZeroMQ, but that's nothing new. It is, however, a downside when we have a competing transport that we can use that's not so heavily encapsulated.

Still... I haven't made the UDP transport reliable, and I know that's needed. So the question will be, what's the effect of swapping out the UDP-based transport with the latest ZeroMQ transport? I'll have to wait until everyone is happy with the UDP-based transport, and then swap back in the ZeroMQ one and see the difference.

If I had to bet, I'd say the difference will be minimal. I believe all the problems were really mine. Humbling to admit, but I believe it's the case.

Finally a Win for the Good Guys!

Monday, February 28th, 2011

Well... it's been a while getting here, but I honestly think that this morning was a breakthrough for my ticker plants. The maximum delay was considerably less than in the past (thanks to finding the silly second consumer thread) and even far better than I remembered from the other group's tests. It's a big day for the good guys, and it's been a long time coming.

To be sure, a lot of the problems I came across were because of the changes in the target market. But it took me quite a while to get all the details worked out. But Holy Cow! It's a great feeling.

Multiple Transports for Ticks – Political Issues Abound

Friday, February 25th, 2011

Well... with the last threading bug figured out for my new UDP transport, I have to wonder if ZeroMQ is really just fine. There are several reasons to use it - OK, one: reliability, but that's a biggie. Still, it might be nice to try it again. The UDP transport, if I can add in the necessary reliability, will be better, for sure, because it's less "black box" and more transparent to all involved. It's certainly already suppressed a few nagging naysayers because they feel that PGM is not reliable at all, and they'd rather have straight UDP. Giving that to them allows them to feel like they have a victory in this "battle", and so adoption will go a little easier.

In short: Politics.

Every time I come to a crossroads like this where a technical decision is given a back-seat to a political decision, I've always felt the wrong decision was made. Always. I'm hoping to give them their UDP transport, and make it technically better than what they have, and so not have to come to this decision.

Alternatively, since they have been so negative in this adoption, it would be nice to simply do all the work myself and then point out to management how we can "re-task" these "resources" to "different efforts". Problem with not being part of the solution is that after the solution is done, you've proven yourself to be unnecessary.

Right out of a job.

Anyway... for now, I have one good transport, and I think I have two. But I'll have to run with these changes for a bit to convince people it's as good as what they have. Gotta prove they aren't needed any more. Sad but true.

So Amazingly Easy to Make Threading Mistakes

Friday, February 25th, 2011

Today I've been fighting a lot of issues in my new UDP-based transport for my ticker plants. I've tried changing queues (multiple times), adding more logging, and making structural changes - all kinds of things that seemed to be the issue - but the real issue seems to be that I was not being careful enough starting my threads. Yup... threading is hard, and it's easy to make mistakes.

The setup is that I have a lot of "single-consumer" queues in my system. If you try to hit them with multiple consuming threads, you're going to be very sorry. Couple that with the fact that I was trying to be clever in my thread initialization and processing code, "covering all my bases" by putting redundant code in both the initialization and processing blocks. The problem is, once the initialization completes, the next thing is the first trip through the processing code, and if you duplicate the code there, but have an off-cache flag set, it might appear as though you haven't properly initialized the threads.

So you start more.

Ouch.

The obvious answer is don't try to be cute. Have code in one place and one place only and then make sure it's successful. When I did that, all the other problems "magically" disappeared. Amazing.

Not really. It's obvious that if you have multiple consumers on a single-consumer queue you're going to get into trouble. Also, it'll appear as though you are losing data, when you really aren't. It's a mess.
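
The "one place and one place only" rule can be sketched as a small guard that owns the consumer thread. This is a minimal illustration, not the code from the ticker plant: the class name `SingleConsumer` and its methods are hypothetical, and it assumes C++11 or later with `std::thread`.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <thread>
#include <utility>

// Hypothetical helper: owns the one and only consumer thread of a
// single-consumer queue. All names here are illustrative.
class SingleConsumer {
public:
    SingleConsumer() = default;
    SingleConsumer(const SingleConsumer&) = delete;
    SingleConsumer& operator=(const SingleConsumer&) = delete;

    // Joins the consumer so the thread object is never destroyed while running.
    ~SingleConsumer() { join(); }

    // The only place a consumer thread is ever created. Returns true for the
    // call that actually started it and false for every later call.
    bool start(std::function<void()> body) {
        assert(body && "consumer body must be callable");
        bool expected = false;
        if (!started_.compare_exchange_strong(expected, true)) {
            return false;  // already started: never spawn a second consumer
        }
        worker_ = std::thread(std::move(body));
        return true;
    }

    // Waits for the consumer body to return (no-op if it never started).
    void join() {
        if (worker_.joinable()) worker_.join();
    }

    bool started() const { return started_.load(); }

private:
    std::atomic<bool> started_{false};  // flipped exactly once, atomically
    std::thread worker_;
};
```

The atomic compare-and-exchange is the single source of truth for "has the consumer been started", so a stale view of a plain flag cannot lead to a second thread being launched.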

Well... it seems to have cleared up and I'm glad. It's been a major pain.

Making a New Transport for my Ticker Plants (cont.)

Thursday, February 24th, 2011

Today I've been at it all day - writing more code for my new UDP-based transport. There's a lot that I need to get going, and it's not at all trivial to get these guys working properly when you aren't locking anything. It's been a nasty day of a lot of tests and tweaks, but in the end, I have something that delivers bursts of 320,000 msgs and it does it all pretty efficiently.
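
For a sense of what "not locking anything" can look like, here is a minimal single-producer/single-consumer ring buffer built on `std::atomic`. It is a generic sketch under stated assumptions, not the queue used in the transport: the name `SpscRing` is hypothetical, the capacity must be a power of two, `T` must be default-constructible, and it needs C++17 for `std::optional`.

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <optional>
#include <utility>

// Hypothetical lock-free FIFO that is safe for exactly ONE producer thread
// and ONE consumer thread. A second consumer breaks its guarantees.
template <typename T, std::size_t Capacity>
class SpscRing {
    static_assert(Capacity >= 2 && (Capacity & (Capacity - 1)) == 0,
                  "Capacity must be a power of two");

public:
    // Producer thread only. Returns false when the ring is full.
    bool push(T value) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t tail = tail_.load(std::memory_order_acquire);
        if (head - tail == Capacity) return false;
        slots_[head & (Capacity - 1)] = std::move(value);
        // Release: the slot write becomes visible before the new head does.
        head_.store(head + 1, std::memory_order_release);
        return true;
    }

    // Consumer thread only. Returns an empty optional when the ring is empty.
    std::optional<T> pop() {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        const std::size_t head = head_.load(std::memory_order_acquire);
        if (tail == head) return std::nullopt;
        T value = std::move(slots_[tail & (Capacity - 1)]);
        // Release: the slot is free for reuse only after it has been read.
        tail_.store(tail + 1, std::memory_order_release);
        return value;
    }

private:
    std::array<T, Capacity> slots_{};
    std::atomic<std::size_t> head_{0};  // next index to write (producer-owned)
    std::atomic<std::size_t> tail_{0};  // next index to read (consumer-owned)
};
```

Each index is written by only one thread, which is what removes the need for a lock and also why a second consumer thread on the same ring corrupts it.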

Tomorrow, I'll start the live testing.

Upgraded to WordPress 3.1 at HostMonster

Thursday, February 24th, 2011

This morning I noticed that a significant new upgrade had been made to WordPress to bring it up to version 3.1 - so I had to go to HostMonster and upgrade my installs. Since my last upgrade, SimpleScripts has adopted the "update all at once" idea for multiple installs of the same piece of software. I haven't had any problems with SimpleScripts to date, but I didn't want to start this morning. Still... I knew they made backups, so I went ahead and upgraded them all at once.

Worked out just fine.

The new WordPress has a menu bar at the top of the page when you're logged into the site. It's not bad, but it's not the same style as the blog, and I'm sure it's only a matter of time before the artists get things going and make that, too, in-sync with the theme of the site.

I spent a few minutes looking over the new features - some are nice, others have a few bugs in them, but I'm sure those will get worked out with the release. In all, it's a nice improvement, and while I really just use this for my journal, it's nice to know it's got the room and capabilities to be much more.

Making a New Transport for my Ticker Plants

Wednesday, February 23rd, 2011

This afternoon I started work on a new UDP-based delivery system to replace ZeroMQ in my ticker plants. It's going to be as simple as I can make it while still giving me the level of service that I need for my users. Basically, I'm sitting in nice data centers, and the switches are nice, so I shouldn't have a lot of drop. But there will possibly be some, so I need to be able to plan for that. I'm not exactly sure how to implement the reliability in this system. I've got a few ideas that all could work, but I'll for sure need something.
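
The post leaves the reliability scheme open, so the following is only one common starting point and not the author's design: stamp every datagram with a sequence number and have the receiver detect gaps. The names `Gap` and `GapDetector` are hypothetical, and what to do with a reported gap (re-request, snapshot, ignore) is deliberately left out. Assumes C++17.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical inclusive range of sequence numbers that never arrived.
struct Gap {
    std::uint64_t first;
    std::uint64_t last;
};

// Hypothetical per-channel tracker: feed it the sequence number of each
// datagram as it arrives, and it reports any range that was skipped.
class GapDetector {
public:
    std::optional<Gap> onDatagram(std::uint64_t seq) {
        if (!next_) {                 // first datagram sets the baseline
            next_ = seq + 1;
            return std::nullopt;
        }
        if (seq < *next_) {           // duplicate or late arrival: ignored here
            return std::nullopt;
        }
        std::optional<Gap> gap;
        if (seq > *next_) {           // jumped ahead: everything between is missing
            gap = Gap{*next_, seq - 1};
        }
        next_ = seq + 1;
        return gap;
    }

private:
    std::optional<std::uint64_t> next_;  // next sequence number expected
};
```

In a data center with good switches the detector should fire rarely, which makes it cheap to run on every datagram while still telling you exactly what was dropped.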

What I want to start off with is a simple UDP broadcaster and antenna. These will take the messages, serialize them, put them in the UDP datagrams, and send them out on the different multicast channels I'd been using for ZeroMQ. I'll use boost ASIO for all this, so it shouldn't have too much overhead - it works for the incoming data from the exchanges.

On the antenna, I'll have one boost udp socket for each multicast channel, and then have a single io_service thread reading them off the socket and into "byte buffers" - one per socket. Then we'll have a thread pulling the datagrams off the queues and deserializing them to place them into the conflation queue. Pretty simple model.

Lots to write. Better get at it.

Real Problems with ZeroMQ

Tuesday, February 22nd, 2011

It really breaks my heart to see problems with ZeroMQ that aren't being addressed. For instance, I've been trying to work with the guys on IRC about the fact that the GitHub master doesn't receive messages when using OpenPGM. It seems that some of the more recent changes really messed this up, and there were no checks on the code to make sure it was still working. Sad, but true.

I've tried several times to jump-start this, but I didn't get very far on any occasion. This makes it hard to use, as I can't use it without OpenPGM, but I'd like to get some of the newer features they are talking about.

But that's not the worst.

I think ZeroMQ is either delaying, re-ordering, or retrying with excessive delay, some of my messages - and it's only when I'm really hammering the data rate. For example, I know that the ZeroMQ send() method is asynchronous. So it buffers up the data and then sends it. But what if it gets messed up?

The delivery seems to get worse and worse as the day goes on, and it seems to be based on a zmq::socket_t getting into a "bad state" and never getting itself out. I believe that it's in the receiver, because two apps on the same box have different reception profiles after a time.

In any case, I can't trust it and I can't build the latest code. It's just no longer a really workable solution. I have to find some kind of solution that doesn't include ZeroMQ.

When a Stack Variable Gets Too Big – Go to the Heap!

Tuesday, February 22nd, 2011

This morning I noticed that when I had subscribed to a large portion of the option market data feed, I got into a position where the conflation queue could exceed 131,072 (128k) messages. It's not impossible for a slow client to have this problem, and the safest thing to do is to increase the size of the FIFO queue in the conflation queue to cover all possible cases. So I bumped it up by a factor of 4 to 512k (2^19), and then all of a sudden I started getting seg faults. What?!

I started digging into this and was just plain stunned to see that it really didn't matter what the use case was for the conflation queue - if I had a conflation queue as a stack variable, I was going to seg fault. Right at the beginning of the app.

When I changed it to allocate one on the heap, everything was fine. OK... looks like a limit imposed by the shell - but nothing I could find pointed out what to change. Also, this was something that everyone would have to know, and that's not a very user-friendly experience.
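
One plausible explanation, which the post does not confirm, is the per-process stack limit: on Linux it commonly defaults to 8 MiB and can be inspected with `ulimit -s`. The arithmetic below shows how a 4x deeper FIFO can cross that line. The 16-byte `Slot` layout, the `InlineFifo` type, and the 8 MiB figure are all assumptions for illustration; the real queue's slot size is not given.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical 16-byte slot; the real conflation queue's layout is unknown.
struct Slot {
    std::uint64_t handle;
    std::uint64_t sequence;
};

// Hypothetical FIFO whose storage lives inside the object itself, so a
// stack variable of this type puts every slot on the stack.
template <std::size_t N>
struct InlineFifo {
    std::array<Slot, N> slots;
};

constexpr std::size_t kOldDepth = std::size_t{1} << 17;  // 131,072 (128k)
constexpr std::size_t kNewDepth = std::size_t{1} << 19;  // 524,288 (512k)

// Assumed limit: a common Linux default, not a value taken from the post.
constexpr std::size_t kAssumedStackLimit = 8u * 1024u * 1024u;

// Strictly less than the limit, since other stack frames need room too.
constexpr bool fitsOnStack(std::size_t objectBytes,
                           std::size_t limit = kAssumedStackLimit) {
    return objectBytes < limit;
}

static_assert(kOldDepth == 131072, "128k is 2^17");
static_assert(kNewDepth == 4 * kOldDepth, "512k is four times 128k");
```

Under these assumptions the old queue occupies 2 MiB and fits, while the new one occupies the full 8 MiB and cannot, which would match a crash right at the beginning of the app.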

So I decided to make all the uses of the conflation queue in my code heap-based. It wasn't too hard - change the variable to a pointer, all the 'dotted' references to 'arrows' and then you're almost there. The last big thing was to make sure that they were initialized and cleaned up properly.
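
The dot-to-arrow change can be sketched as follows. Note the swap: the post describes pointers with explicit initialization and cleanup, while this example uses `std::unique_ptr` (C++14 for `std::make_unique`) so the cleanup happens automatically. `BigQueue` and `FeedHandler` are hypothetical stand-ins, not classes from the ticker plant.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <memory>

// Hypothetical stand-in for the large queue: 2^19 inline slots, about 4 MiB.
class BigQueue {
public:
    static constexpr std::size_t kDepth = std::size_t{1} << 19;

    // Appends a value; returns false once all slots are used.
    bool push(std::uint64_t value) {
        if (count_ == kDepth) return false;
        slots_[count_++] = value;
        return true;
    }

    std::size_t size() const { return count_; }

private:
    std::array<std::uint64_t, kDepth> slots_{};
    std::size_t count_ = 0;
};

// Hypothetical owner class. Before the change the member would have been
// `BigQueue queue_;`, making every FeedHandler as large as the queue.
class FeedHandler {
public:
    // Initialization: the queue is constructed directly on the heap.
    FeedHandler() : queue_(std::make_unique<BigQueue>()) {}

    // 'Dotted' references become 'arrows'.
    bool onMessage(std::uint64_t value) { return queue_->push(value); }
    std::size_t backlog() const { return queue_->size(); }

    // Cleanup: unique_ptr deletes the queue when the handler is destroyed.

private:
    std::unique_ptr<BigQueue> queue_;
};
```

With this layout a `FeedHandler` is only pointer-sized, so it is safe as a stack variable no matter how deep the queue is made later.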

Thankfully, I didn't have all that many places that I had used the conflation queue - three classes in total. It was pretty easy to look at each and come up with a plan for creating these guys and cleaning them up properly. The whole thing probably took me 45 mins - and of that, more than half was spent trying to come up with a way around the seeming limitation.

Important thing to note when creating large items - you may have to go to the heap just because.