Archive for October, 2004

Too much Dependence

Monday, October 25th, 2004

OK, I'm the first person to admit that I enjoy the fact that I understand many (but certainly not all) of the systems in use at work. It makes it easier to understand where the ones I build fit in and what jobs are best done by other systems. But there's a dark, ugly side of this... the 1:30 am phone call.

In the past week I've been called three times at 1:30 am to help the operators figure out problems they were having with systems. Every time, it was an upstream system that had the problem; it only became clearly apparent once it reached my system. So they called me. Unfortunately, all I was able to do was point them to the systems that had problems, which they already knew about from other sources. So I'm not really helping a lot, but I'm certainly wide awake, and that's no fun.

I have put together web pages to assist in figuring out which system is at fault and what might be done to clear up the problem. Of course, these aren't foolproof documents, as different kinds of problems keep popping up and we have to adapt to new data sources, etc. Still, it's a darn good start, and far more than most developers give their support staff. But if they don't read it, it's not going to help them, and it's not going to stop the 1:30 am phone call.

So now I can tell them to read the web page and call me back if it isn't solved. But that's still not stopping the calls. Once they are used to me having the answers for them, they are going to skim the docs and call me anyway. It's a little disappointing and very tiring.

Stability for BBGServer

Friday, October 15th, 2004

It's taken months, but the final trick to getting stability into the BBGServer has been to take all references to the Bloomberg API out of the main server and place them into a simple, small, single-threaded C app that can be loaded and run from the server. The reason is that Bloomberg's API is itself not thread-safe. It can't handle several requests coming and going, and everything I tried to do with locking didn't change that basic fact.

So I created a simple C app that would open up a socket connection back to the server and wait for requests to process. When it received a request, it sent it to Bloomberg via its own connection and then waited for the response. When it got the response, it sent it back to the server and waited for another request on the socket. This was the ticket... get Bloomberg in its own application with only one request pending at a time.
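The request loop described above can be sketched roughly as follows. This is a minimal sketch, not the actual BBGServer helper: the length-prefixed framing, the MAX_MSG limit, and the names serve_one, serve_forever, request_fn and echo_handler are all assumptions made for illustration, and echo_handler is only a stand-in for the blocking call into the vendor API (no real Bloomberg calls appear here).

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <sys/types.h>
#include <sys/socket.h>

#define MAX_MSG 4096  /* assumed upper bound on one message body */

/* Hypothetical hook: takes one request, blocks until the answer is ready,
 * writes it into resp and returns its length (at most resp_cap). */
typedef size_t (*request_fn)(const char *req, size_t req_len,
                             char *resp, size_t resp_cap);

/* Receive exactly n bytes; 0 on success, -1 on EOF or error. */
static int recv_full(int fd, void *buf, size_t n) {
    char *p = buf;
    while (n > 0) {
        ssize_t r = recv(fd, p, n, 0);
        if (r < 0 && errno == EINTR) continue;
        if (r <= 0) return -1;
        p += r;
        n -= (size_t)r;
    }
    return 0;
}

/* Send exactly n bytes; 0 on success, -1 on error. */
static int send_full(int fd, const void *buf, size_t n) {
    const char *p = buf;
    while (n > 0) {
        ssize_t w = send(fd, p, n, 0);
        if (w < 0 && errno == EINTR) continue;
        if (w <= 0) return -1;
        p += w;
        n -= (size_t)w;
    }
    return 0;
}

/* Handle exactly one request on a connected stream socket:
 * read it, process it, reply. Returns 0 on success, -1 when the
 * peer has gone away or the message is malformed. */
int serve_one(int fd, request_fn handle) {
    char req[MAX_MSG], resp[MAX_MSG];
    uint32_t len_net;

    assert(handle != NULL);
    if (recv_full(fd, &len_net, sizeof len_net) != 0) return -1;
    uint32_t len = ntohl(len_net);   /* 4-byte big-endian length prefix */
    if (len > MAX_MSG) return -1;
    if (recv_full(fd, req, len) != 0) return -1;

    /* Only one request is ever in flight: nothing else runs until
     * the handler returns. */
    size_t out = handle(req, len, resp, sizeof resp);
    if (out > sizeof resp) return -1;

    uint32_t out_net = htonl((uint32_t)out);
    if (send_full(fd, &out_net, sizeof out_net) != 0) return -1;
    return send_full(fd, resp, out);
}

/* The whole helper is this loop, run on a single thread. */
void serve_forever(int fd, request_fn handle) {
    while (serve_one(fd, handle) == 0) {
        /* wait for the next request */
    }
}

/* Stand-in handler that echoes the request back unchanged. */
size_t echo_handler(const char *req, size_t req_len,
                    char *resp, size_t resp_cap) {
    size_t n = req_len < resp_cap ? req_len : resp_cap;
    memcpy(resp, req, n);
    return n;
}
```

In a real helper, the process would connect back to the server, pass that socket to serve_forever, and supply a handler that makes the one blocking vendor call. Since the loop never reads the next request until the current reply has been sent, the non-thread-safe library only ever sees one call at a time, without any locking.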

As soon as I put the helper app into place, things stabilized. No more Bus Errors. No more crashes. It was a wonderful sight.