The Financial Industry Does Have Its Personalities


There's no denying that the financial industry has its share of personalities. And I'm being kind here, as you will soon see. For the most part, I take it as part of the job. They are millionaires and are making (and losing) millions a day, so they are (somewhat) allowed to be high-strung and a bit temperamental. But today was an experience that I get only once in a great while, and it's worth writing about.

One of the users of my market data provider chatted me saying he was seeing vastly different response times for a 300-symbol request of some historical data. I had helped this person just yesterday get another data request under control, so I was wondering what it could be that we didn't get covered yesterday. Well... I got the request and tried it on two machines - the one that he was running it on, and one of my machines. I've learned enough in this business to know that there are all kinds of unseen issues on some boxes - wild processes, memory hogs, etc. - and they all need to be factored out to make sure you're looking at your problem and not someone else's.

What I found was that he was right - there was a significant difference in the response times. More importantly, the data provider was getting the data to the server reasonably quickly; it was getting the data from there to the client that seemed to be the problem. I told the user what I thought the problem was, and my attack plan. I always like to be as transparent as possible with the users as it lets them know what I'm doing even if they don't understand it all.

He didn't agree with my attack plan. He wanted me to determine why the variance existed in the first place. I tried to tell him that I believed that the gathering/decoding logic in the client API was incredibly inefficient and the resulting CPU-bound process was wildly varying because the load on the box was wildly varying. He wouldn't listen to it.

So I bit my tongue and went on to prove to him that what he thought was faster wasn't really any faster: it was a sample size of one with a large variation in the load. He tried to tell me that I was all wrong. That he had done the tests properly and the sample size didn't matter. I tried to say what I thought was the problem and what my plan of attack was going to be. He only got madder.

We'll move along through this part of the day quickly because it's really not going to help to cover it in any detail. Suffice it to say that I was treated very unprofessionally, but finally was told to keep him apprised of any updates.

I then took my time and put in tests to see where the real time was being spent. I was surprised to see that the vast majority of the time was in reading data off the socket. In fact, the protocol between the client and the server has all datasets ending in a CRLF combination, and so the data that's read in needs to be checked for this combination.

What I realized was that in reading from the socket, I had limited each read to a 'chunk' of about 2kB. Each chunk was then added to the result set, and the result set was checked for the terminal data condition. But imagine if the data was going to be 2MB? That's 1000 chunks, and the first chunk is going to be checked 1000 times for the terminal data. There's the problem. So I changed the socket reader to read in everything that's available into the buffer as opposed to a small chunk. What happened was not surprising: all the data was read in at one time, the check was done once, and the result was that the reception of the data was far, far faster than before.
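To make the arithmetic concrete, here is a minimal Python sketch of the effect. It is not the actual client API code: the function name `read_until_terminator`, the in-memory stream standing in for the socket, and the assumption that the terminator check walks the whole accumulated buffer after every read are all mine, for illustration only.

```python
import io

TERMINATOR = b"\r\n"  # the protocol ends every dataset with CRLF


def read_until_terminator(stream, chunk_size):
    """Read at most chunk_size bytes at a time until the accumulated data
    contains TERMINATOR. Returns (data, bytes_scanned), where bytes_scanned
    models the total work done by the terminator checks."""
    buf = bytearray()
    scanned = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            raise EOFError("stream closed before the terminator arrived")
        buf += chunk
        # Assumed behavior: the check looks at everything read so far,
        # so its cost grows with the size of the buffer on every pass.
        scanned += len(buf)
        if TERMINATOR in buf:
            return bytes(buf), scanned


# Hypothetical 2MB response, read in ~2kB chunks vs. one large read.
payload = b"x" * 2_000_000 + TERMINATOR
data_small, small = read_until_terminator(io.BytesIO(payload), 2_000)
data_large, large = read_until_terminator(io.BytesIO(payload), 4_000_000)

print(f"2kB chunks: {small:,} bytes scanned")
print(f"one read:   {large:,} bytes scanned")
```

With small chunks the scanning work grows roughly with the square of the response size, while a single large read scans the data once. Checking only the tail of the buffer after each read would probably be another way to avoid the repeated scans, though that's not what I did here.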

When I tested this out on his requests I saw 10- to 40-minute requests fall to 2.5 minutes and stay there. Very little variability now because the process isn't as CPU-bound and the transfer takes priority. Very nicely done, in my mind.

When I gave him the results, he wanted to test it, naturally. When he found out how fast it was, he said "Why haven't we done this before now? We've wasted a lot of time. This should have been done LONG ago!" How nice.

I pointed out that when we put this system together it was an order of magnitude faster than the system it replaced. In fact, it was just fine in the performance department for everyone - even him. That I didn't investigate every possibility for a performance improvement is because I had other things to do as well, and they were all very happy with the speed as it was. He admitted that was true. How nice.

So in the end, it's much faster thanks to a little thought that he didn't want to have me consider, and then was mad that I hadn't fixed the problem that didn't exist until today. Amazing person. Truly one of a kind.