Tracking Down Nasty Memory Issue – Patience is a Virtue (cont.)


This morning has been very enlightening on ZeroMQ. Very exciting stuff. Before leaving yesterday, I had put together a test app for the ZeroMQ guys to check out, and I posted the following test results as I varied the value of ZMQ_RATE:

Rate       ZMQ_RATE   Initial mem   Final mem
10 Mbps    10000      7 MB          18 MB
50 Mbps    50000      7 MB          73 MB
200 Mbps   200000     7 MB          280 MB

The data was pretty compelling. The effect ZMQ_RATE had on the memory footprint of the same data source was staggering. Thankfully, I put it all together in a nice email to the mailing list and I got a great hit from Martin S.:

Isn't it just the TX buffer? The size of PGM's TX buffer can be computed as ZMQ_RATE * ZMQ_RECOVERY_IVL. The messages are held in memory even after they are sent to allow retransmission (repair) for the period of ZMQ_RECOVERY_IVL seconds.
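
Plugging his formula into my numbers lines up pretty well. Here's a quick back-of-the-envelope sketch (my own, not something from the list); ZMQ_RATE is in kilobits per second, and I'm assuming the default ZMQ_RECOVERY_IVL, which I believe is 10 seconds, was in effect for the runs above:

  // Rough TX buffer estimate per Martin's formula: ZMQ_RATE * ZMQ_RECOVERY_IVL.
  // The 10 sec recovery interval is my assumption of the library default.
  #include <cstdio>

  int main() {
    const double recovery_sec = 10.0;                     // assumed default
    const long   rates[]      = { 10000, 50000, 200000 }; // ZMQ_RATE values tested
    for (int i = 0; i < 3; ++i) {
      double mb = rates[i] * 1000.0 / 8.0 * recovery_sec / 1e6;
      std::printf("ZMQ_RATE=%ld -> ~%.0f MB held for repair\n", rates[i], mb);
    }
    return 0;
  }

That works out to roughly 12 MB, 62 MB, and 250 MB of retained data on top of the ~7 MB baseline, which is right in the neighborhood of the growth in the table above.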

So I added the following to the ZMQ transmitter's code:

  static int64_t     __rate = 50000;      // ZMQ_RATE: rate cap, in kilobits/sec
  static int64_t     __recovery = 1;      // ZMQ_RECOVERY_IVL: repair window, in seconds
  static int64_t     __loopback = 0;      // ZMQ_MCAST_LOOP: 0 disables multicast loopback

  // cap the PGM rate, shrink the recovery window, and disable multicast loopback
  top->socket->setsockopt(ZMQ_RATE, &__rate, sizeof(__rate));
  top->socket->setsockopt(ZMQ_RECOVERY_IVL, &__recovery, sizeof(__recovery));
  top->socket->setsockopt(ZMQ_MCAST_LOOP, &__loopback, sizeof(__loopback));
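
For context, here's roughly how those calls sit in the transmitter's setup, written against the 2.x C++ binding. This is a minimal sketch, not my actual code: the interface, multicast group, and port are placeholders, and I've left ZMQ_MCAST_LOOP out of it (see the update below). As I understand it, these options only apply to subsequent connects, so they have to go in before the connect:

  // Minimal PGM publisher sketch - placeholder endpoint, not my real config.
  #include <zmq.hpp>
  #include <cstring>

  int main() {
    zmq::context_t ctx(1);
    zmq::socket_t  pub(ctx, ZMQ_PUB);

    static int64_t __rate     = 50000;   // ZMQ_RATE in kilobits/sec
    static int64_t __recovery = 1;       // ZMQ_RECOVERY_IVL in seconds
    pub.setsockopt(ZMQ_RATE, &__rate, sizeof(__rate));
    pub.setsockopt(ZMQ_RECOVERY_IVL, &__recovery, sizeof(__recovery));

    // options only take effect on subsequent connects, so connect last
    pub.connect("epgm://eth0;239.192.1.1:5555");

    const char *tick = "hello";
    zmq::message_t msg(std::strlen(tick));
    std::memcpy(msg.data(), tick, std::strlen(tick));
    pub.send(msg);
    return 0;
  }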

Then I started running the tests again.

The results were amazing:

Rate       ZMQ_RATE   Initial mem   Final mem
50 Mbps    50000      7 MB          11 MB
200 Mbps   200000     7 MB          32 MB

This was exactly what I was looking for! ZMQ_RECOVERY_IVL can't go below 1 sec, but for me even that's too much. If a receiver isn't already up and ready to get ticks, a one-second recovery window means holding onto several hundred, if not several thousand, messages. I'd be fine making it 0.5 sec, but Martin says one second is the underlying resolution of OpenPGM.
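
For what it's worth, the same back-of-the-envelope with a 1 second window squares with the new numbers (again my own sketch, not anything from the list):

  // Retained repair data with ZMQ_RECOVERY_IVL = 1 sec at the 200 Mbps rate.
  #include <cstdio>

  int main() {
    const long rate_kbps = 200000;                      // ZMQ_RATE
    double mb = rate_kbps * 1000.0 / 8.0 * 1.0 / 1e6;   // one second of data
    std::printf("~%.0f MB held for repair\n", mb);      // prints ~25 MB
    return 0;
  }

About 25 MB on top of the ~7 MB baseline, which is right where the 32 MB in the table above landed.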

Not bad. I'll take it. What a great morning!

[12/7] UPDATE: the option:

  static int64_t     __loopback = 0;
 
  top->socket->setsockopt(ZMQ_MCAST_LOOP, &__loopback, sizeof(__loopback));

is a massive red herring. It's not about the loopback interface (my reliable multicast URLs are all targeted at specific NICs); it's about being able to receive on the same box as the sender. I was trying to figure out why things "broke", and it was only when I took this call out that things worked again. The docs on this one are dangerously worded... leave it out.
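
For completeness, the same-box receiver I'm testing with is nothing more than a SUB socket on the same epgm endpoint, with ZMQ_MCAST_LOOP left alone at its default. A minimal sketch (placeholder endpoint again, 2.x C++ binding):

  // Minimal same-box subscriber sketch - note there is no ZMQ_MCAST_LOOP call
  // at all; leaving it at the default is what keeps same-box receive working.
  #include <zmq.hpp>
  #include <cstdio>

  int main() {
    zmq::context_t ctx(1);
    zmq::socket_t  sub(ctx, ZMQ_SUB);
    sub.setsockopt(ZMQ_SUBSCRIBE, "", 0);          // take every message
    sub.connect("epgm://eth0;239.192.1.1:5555");   // same group as the sender

    zmq::message_t msg;
    while (sub.recv(&msg)) {
      std::printf("got %zu bytes\n", msg.size());
    }
    return 0;
  }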