More Evils of Optimization and Performance Tuning

This morning I came in to find that the CPU usage on the server's kernel was at an all-time high! OK, thankfully, it was still running, but not the way I had expected it to be running this morning. I had done too much optimization, and that led to backups in the production system. Unfortunately, there's no way to test the server the way it's used in production other than to put it into production. There are clients all over the globe, and it's just not possible to load the dev server to the same level. So what looked like it was working very well in development wasn't working so well in production.

Not a big surprise, but it's something I need to go back and mess with. Basically, the system was balanced - no one component was that much faster than the others. When I made the pub/sub system faster, that allowed the price ticks to move faster, and the clients had to keep up. Well... what if they're on the other side of the globe and the WAN line just isn't fast enough to give them the bandwidth to the server that they need to keep up? Yup, backups.
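To put rough numbers on that kind of backup, here is a back-of-the-envelope sketch. The function name and the rates are made up for illustration and are not measurements from the real server; it only assumes that a client's queue grows by the difference between how fast ticks are published and how fast the client can drain them.

```python
def backlog_after(seconds, publish_rate, drain_rate, start=0):
    """Ticks waiting in one client's outgoing queue after `seconds`.

    publish_rate and drain_rate are in ticks per second. Both are
    hypothetical inputs, not values taken from the production system.
    """
    growth = (publish_rate - drain_rate) * seconds
    # A queue can't hold a negative number of ticks.
    return max(0, start + growth)


# A client that drains 800 ticks/sec while the server publishes 1000 ticks/sec
# falls 200 ticks further behind every second.
one_minute = backlog_after(60, publish_rate=1000, drain_rate=800)
```

With those made-up rates the client is 12,000 ticks behind after a minute. Speeding up the publisher without speeding up the slowest link just makes that gap grow faster.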

It's not horrible, as the server becomes a self-regulating system, but it's annoying that there's so little I can do to check these kinds of interactions.

So today I'm backing out a few of the changes to the pub/sub system and putting in a few more subtle ones. We'll put in an index on the primary feeder queue so it's faster to look up duplicates (if it's configured that way)... we'll also take a little better look at the outgoing client queues now that they hold the instruments again (back to a few large queues)... all of this will make the code more understandable and not quite as high performance, but in the end, it'll balance the system better.
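The index idea on the feeder queue can be sketched roughly like this. It's only an illustration, not the server's actual code: the class and method names are made up, and it assumes that a "duplicate" means a tick for an instrument that is already waiting in the queue, and that the newer price simply replaces the older one.

```python
from collections import deque


class FeederQueue:
    """FIFO of pending ticks with an optional index for duplicate lookup.

    Hypothetical sketch: a duplicate is assumed to be a tick for an
    instrument that is already queued, and the newer price wins.
    """

    def __init__(self, drop_duplicates=True):
        self._queue = deque()   # pending [instrument, price] entries, oldest first
        self._index = {}        # instrument -> its pending entry in the queue
        self._drop_duplicates = drop_duplicates

    def push(self, instrument, price):
        """Queue a tick. Returns False if it was folded into a pending entry."""
        if self._drop_duplicates:
            entry = self._index.get(instrument)   # dict lookup, no queue scan
            if entry is not None:
                entry[1] = price                  # overwrite the stale price in place
                return False
        entry = [instrument, price]
        self._queue.append(entry)
        if self._drop_duplicates:
            self._index[instrument] = entry
        return True

    def pop(self):
        """Remove and return the oldest pending (instrument, price)."""
        instrument, price = self._queue.popleft()
        self._index.pop(instrument, None)         # no longer pending
        return instrument, price

    def __len__(self):
        return len(self._queue)
```

Without the index, finding a pending tick for an instrument means walking the whole queue. With it, the check is a single dictionary lookup, and turning `drop_duplicates` off skips the index entirely.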

I'm still going to try to get the tick rate up, and I may be able to do that even with these changes installed. We'll have to see how the tests go.

UPDATE: OK... I'm getting the code back in balance. There are a lot of things I can do to make it do less work, and that's what I'm focusing on at this time. It's looking much better in development, but the acid test is, as always, production tomorrow.