Archive for the ‘Coding’ Category
Write… Compile… Run… What a Dream
Wednesday, November 17th, 2010
I was coding this morning - adding a new synchronous calling scheme to my existing asynchronous library, and really enjoying the capabilities of boost's asio library. I knew what I wanted to write, and I had the basics of the skeleton there for the asynchronous versions, but there were a few differences that arose simply because it was synchronous.
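The synchronous flavor really just layers on top of the async machinery. Here's a minimal sketch of the pattern - with a deadline_timer standing in for the real async operation, and all the names made up:

    #include <boost/asio.hpp>
    #include <boost/bind.hpp>
    #include <iostream>
    #include <string>

    // Sketch: a synchronous call layered over async machinery. The timer
    // stands in for the real async operation - it's the shape that matters.
    class Service {
    public:
      Service() : io_(), timer_(io_) {}

      // kick off the async operation, then pump the io_service ourselves
      // until its completion handler has fired - that's the sync call
      std::string request() {
        reply_.clear();
        timer_.expires_from_now(boost::posix_time::milliseconds(100));
        timer_.async_wait(boost::bind(&Service::onReply, this));
        io_.reset();
        while (reply_.empty()) io_.run_one();
        return reply_;
      }

    private:
      void onReply() { reply_ = "the reply"; }  // would parse the real response

      boost::asio::io_service io_;
      boost::asio::deadline_timer timer_;
      std::string reply_;
    };

    int main() {
      Service s;
      std::cout << s.request() << std::endl;    // blocks ~100 msec, then prints
      return 0;
    }

The trick is just pumping the io_service until the one completion handler you care about has fired.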
So I was writing this up, got done, compiled it and ran the test app to verify that I hadn't changed any of the behavior, and BINGO! It just worked.
Now, this wasn't splitting the atom - it was only about 100 lines of code, but still... it's pretty rare that I've been able to write, compile, and run code and have it just work. It was a very fun feeling. Just type it in and run it - it's pretty wild.
Good times...
Bug Hunt 101
Tuesday, November 16th, 2010
There comes a time when you're done with the major issues, start the full-up tests, and run into little bugs that take all of about 15 minutes each to solve - and you spend hour after hour fixing these little nagging problems. It's the polish in the project. You could leave these in and have the system restart, but that's got no class. You need to spend the 15 minutes and just fix it.
That's what I've been doing all afternoon - fixing these issues. I've solved a bunch of them, but I think there are a few more that I need to handle. Certainly when it comes to communicating with the Broker. That guy needs some help when it comes to massive hits from all my ticker plants starting up. I'm not sure exactly how I'm going to deal with them, but I know I'm going to have to.
It's all a part of the process. At least I'm no longer writing bash scripts. That was also necessary, but wasn't nearly as fun.
Hardening the Ticker Plant for Production
Monday, November 15th, 2010
Today I'm spending a lot of time hardening the TickerPlant for production. It's not the most glamorous work... in fact, it's kind of mind-numbing, but it's as important as getting the rest of the core right, as the program isn't going to be run by me - it's going to be run by operations, and that means it needs to be a lot more secure than I might normally make it for myself.
I started with a little better application shell - it's about 50 lines of C++ code that starts things off after reading the command line arguments. I added in the standard 'usage' function to act as the "help" of the application. I then added a lot more comments in the code and it was looking pretty nice.
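The shell is nothing fancy - something along these lines, with the flags and names made up for illustration:

    #include <iostream>
    #include <cstdlib>
    #include <string>

    // the standard 'usage' function - the "help" of the application
    void usage() {
      std::cout << "usage: tplant [-c config] [-v] [-h]\n"
                << "  -c config   name of the configuration to load\n"
                << "  -v          verbose logging\n"
                << "  -h          show this message\n";
    }

    int main(int argc, char *argv[]) {
      std::string config = "default";
      bool        verbose = false;
      for (int i = 1; i < argc; ++i) {
        std::string arg(argv[i]);
        if (arg == "-h") { usage(); return EXIT_SUCCESS; }
        else if (arg == "-v") { verbose = true; }
        else if (arg == "-c" && (i + 1) < argc) { config = argv[++i]; }
        else { usage(); return EXIT_FAILURE; }
      }
      if (verbose) std::cout << "using config: " << config << std::endl;
      // ... build the TickerPlant from 'config', start it up, and wait ...
      return EXIT_SUCCESS;
    }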
Then it was off to the shell scripts. I need to make the start/stop/restart scripts so that the user doesn't need to worry about the environment or the location - it's all just automatic. I've used scripts like these in the past and they have been really helpful, but getting them all into bash will be a little work - they were a mixture of bash and C shell in the past.
It's not glamorous, but it's necessary.
UPDATE: I still need to add in the sending of the ticker plant's stats via SNMP and add in the Jabber client so that we can communicate with the TickerPlant easily.
Really Hammering on the Unit Tests
Monday, November 15th, 2010
I have been looking at the memory footprint of my best bid/offer server and thinking that there might be a memory leak in there. I've done a lot of unit testing, but it's very hard to tell exactly what's happening when you hit it with 128,000 symbols and more than 30 ticks a symbol. After all, I expect the memory usage to rise as I put in new instruments. But I was worried that I was still creating data structures for existing instruments.
I looked at the code, and traced it for a few cases, and it seemed to work, but that wasn't nearly as satisfying as I had hoped. I really wanted to know for sure. So something finally came to me: Hammer it! I mean, really hammer it.
So I added a few loops to the test app and used the existing 128,000 instruments. I ran through another dozen ticks that would very predictably affect the best bid/offer - ten times for each instrument - and let it run.
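The hammer itself is just nested loops. Here's a sketch of the structure with a trivial stand-in for the real engine, reading the resident size from /proc/self/status (Linux) after each pass:

    #include <cstdio>
    #include <fstream>
    #include <iostream>
    #include <map>
    #include <string>
    #include <vector>

    // read the resident set size (KB) from /proc/self/status so each
    // pass can report whether memory is still climbing or has gone flat
    long residentKB() {
      std::ifstream in("/proc/self/status");
      std::string key;
      long kb = 0;
      while (in >> key) {
        if (key == "VmRSS:") { in >> kb; break; }
        in.ignore(4096, '\n');
      }
      return kb;
    }

    // trivial stand-in for the real engine: it creates a structure on
    // the first sight of a symbol, and should only reuse it after that
    struct BBOEngine {
      std::map<std::string, double> best;
      void onTick(const std::string &sym, double px) { best[sym] = px; }
    };

    int main() {
      BBOEngine engine;
      std::vector<std::string> instruments;
      for (int i = 0; i < 128000; ++i) {
        char buf[16];
        std::sprintf(buf, "SYM%06d", i);
        instruments.push_back(buf);
      }
      for (int pass = 0; pass < 10; ++pass) {
        for (size_t i = 0; i < instruments.size(); ++i) {
          for (int t = 0; t < 12; ++t) {          // a dozen ticks apiece
            engine.onTick(instruments[i], 100.0 + t);
          }
        }
        std::cout << "pass " << pass << ": RSS " << residentKB() << " KB" << std::endl;
      }
      return 0;
    }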
What I saw was that the memory climbed during the initial phase, and then during the "re-run" phase, it was rock solid. I mean it was amazing.
I'm satisfied. The small-scale tests are good, and the massive scale is as well. There aren't any leaks I can see. What I have now is what the system requires.
BBEdit 9.6.1 is Out
Monday, November 15th, 2010
This morning I saw that BBEdit 9.6.1 was released. It's got an impressive list of fixes, and while I can't say that I've hit one of them, it's nice to see that someone is really kicking the tires, and they are fixing these issues quickly. Good enough.
Refactoring the Serialization Scheme for a Component
Friday, November 12th, 2010
I was looking at the memory usage of my NBBO server and noticed that after a little bit it was hovering on the high side of 4GB in memory. While that may not seem like much, it's far too big for what this process was doing, so I decided to dig into what that number was all about today. The end result was that I really needed to change the serialization scheme I was using for my engine component.
In order to have everything all "fit" together, I have a Broker service that holds my configuration data. I've been placing the current running state in that service for each instance, and it's been working well for me. This time, however, I realized that while it works for me, it's not really very efficient, and that's what's killing me.
What I was doing was placing all the values into lists and maps and then bundling them all together and letting that be the payload that was sent to the configuration service. While this works, it leads to data structures that are pretty involved to deal with, and really no more readable than the byte-level encoding I'm doing with my messages. So I decided to go back to that scheme, as it should be easier for someone to understand since it's also used in the message serialization code.
The wrinkle to this problem was that I had several objects/structures, and each one needed to be serialized and deserialized properly. While that's not hard, it was time-consuming. Each one needed to have its own format, and then I had to write it up, and finally test it. Slow and sure is the best way to do all this stuff.
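Each one boils down to the same basic pattern - shown here as a hedged sketch with a made-up struct and fields, assuming the writer and reader share endianness:

    #include <stdint.h>
    #include <string.h>
    #include <string>

    // made-up structure - the real components carry a lot more than this
    struct InstrumentState {
      uint32_t id;
      double   bid;
      double   ask;
    };

    // append the fields to a flat byte buffer in a fixed, documented order
    void serialize(const InstrumentState &s, std::string &buf) {
      buf.append((const char *)&s.id,  sizeof(s.id));
      buf.append((const char *)&s.bid, sizeof(s.bid));
      buf.append((const char *)&s.ask, sizeof(s.ask));
    }

    // pull the fields back out in the same order; returns the new
    // offset so the caller can walk a buffer holding many of these
    size_t deserialize(const std::string &buf, size_t pos, InstrumentState &s) {
      memcpy(&s.id,  buf.data() + pos, sizeof(s.id));  pos += sizeof(s.id);
      memcpy(&s.bid, buf.data() + pos, sizeof(s.bid)); pos += sizeof(s.bid);
      memcpy(&s.ask, buf.data() + pos, sizeof(s.ask)); pos += sizeof(s.ask);
      return pos;
    }

The point is that the payload is a flat run of bytes, not a pile of nested lists and maps.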
In the end, the serialization footprint went down from 2GB to about 50MB - a significant reduction in the transient memory usage of the process. At the same time, the time to serialize and send the data dropped as well - not bad, either.
Good News for Java on the Mac
Friday, November 12th, 2010
Looks like Oracle and Apple have come to terms about supporting Java on the Mac. Now it looks like we'll have a standard distribution of Java from Oracle (Sun) just like the Windows and Linux builds. That's nice.
I have no doubt that it'll be more current than the versions Apple ships - Apple tries to stay pretty stable on a single version of the OS, but Oracle (Sun) really moves regardless of the underlying OS version. I'm just not sure about the overall quality, but I guess it'll be as good as the Linux port. Which is to say, not bad.
Good enough. Glad to hear it.
The Beauty of Tuning a Solution
Thursday, November 11th, 2010
Yesterday afternoon I got the first really good cut of one of my servers done. It was really nice to see it running for more than five minutes, and it was a great relief. However, the stats on the delivery times of the messages weren't what I was hoping for. In fact, it was a little on the pokey side of things - not really all that much better than the existing code. The numbers I had for the delay from the receipt of the UDP datagram to the client actually receiving an NBBO quote message were pretty bad:
    Max      | Min     | Avg
    200 msec | 10 msec | 70 msec
But I knew that I'd be able to make it faster... I just needed to figure out where the delay was and what I needed to do to fix it.
This morning in the shower I was thinking about the problem, like you do, and realized that I was probably hitting the sleep intervals for processing data off the queues. Because I have lockless queues (for the most part), I don't have the ability to use a condition variable on a mutex to be alerted when something is there to process. The pop() methods will return a NULL when there's nothing to return, and it's up to my code to wait a bit and try again.
These waiting loops are pretty simple, but I don't want them to spin like crazy when the market is closed. So I have a variable sleep value for the loop - the longer you go without getting something from the queue, the bigger the sleep interval to make it less of a load on the system. So if things are coming fast and furious, there's no wait, and after the close, you don't crush the box with your spinning loops.
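The shape of one of these loops is roughly this - a sketch with made-up names, where pop() is assumed to return NULL on an empty queue, and where the first cut of the schedule doubled the wait up to a ceiling:

    #include <stddef.h>
    #include <time.h>

    // wait for data on a lockless queue: pop() returns NULL when empty,
    // and the sleep grows the longer we go without getting anything so
    // a closed market doesn't turn into a spinning core
    template <typename Queue>
    void *popWithWait(Queue &q) {
      long delayUS = 0;                        // no wait while data flows
      for (;;) {
        void *item = q.pop();
        if (item != NULL) return item;
        if (delayUS > 0) {
          struct timespec ts = { 0, delayUS * 1000L };
          nanosleep(&ts, NULL);
        }
        // first cut: double the wait each time, up to a 100 msec ceiling
        delayUS = (delayUS == 0 ? 100 : delayUS * 2);
        if (delayUS > 100000) delayUS = 100000;
      }
    }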
But there were problems - specifically, if you waited a little bit, you might very quickly get into a 100 msec sleep. If you happened to hit that once, you're likely to have to wait another 100 msec before checking the queue again. All of a sudden, the 200 msec maximum delay was understandable. So how to fix it?
The first thing was to pull these waiting loops into the queues themselves so they were a lot easier to control and modify. The code got a lot cleaner, and the timing loops became part of the queues themselves. Much nicer.
Then I needed to tune the delay parameters so that I was being careful to be responsive, but at the same time, not overly hoggish of the CPU. When I looked at the delays I had, it was clear I was increasing them far too fast. Growing the wait in many smaller steps gave really nice results, and the delays came down to the much more acceptable:
    Max    | Min       | Avg
    8 msec | <0.5 msec | 3 msec
Sweet.
Get it right, and then get it fast. Works every time.
Finally Able to Nail Down the Best Bid/Offer Server
Wednesday, November 10th, 2010
Today has been a long day of working on getting a few details done and then focusing back on the Best Bid/Offer server. I have been struggling with a boost asio crash in the code, and I started removing things until there was just about nothing left. What I found was that it was in the exchange feeds - but they have been tested in isolation pretty well. What's up?
Well, I started to look at the code again and then it hit me - stack versus heap allocation and the copy operations. With STL's map, operator[] creates an empty value and then copies in the contents. If I've been lazy in the least about the copy semantics, I could really have messed myself up there. So I decided that it wasn't worth the worry and switched to pointers and the heap. Now I wasn't copying anything, and I just had to add NULL pointer checks in a few places.
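The pointer flavor looks something like this - just a sketch with a made-up Quote type standing in for the real values:

    #include <cstddef>
    #include <map>
    #include <string>

    // made-up value type - the real one carries much more state
    struct Quote {
      double bid;
      double ask;
    };

    // heap-allocated values: the map only ever copies a pointer around
    std::map<std::string, Quote *> quotes;

    void update(const std::string &sym, double bid, double ask) {
      Quote *&q = quotes[sym];     // operator[] value-initializes to NULL
      if (q == NULL) {
        q = new Quote();           // first sight of the symbol: heap-allocate
      }
      q->bid = bid;                // mutate in place - nothing gets copied
      q->ask = ask;
    }

With values in the map, operator[] default-constructs one and then you copy-assign into it; with pointers, those copies simply never happen.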
When I made those changes, everything worked really well. Wonderful!
I then took the time to add some stats to the client so I could see the delay from the datagrams hitting the initial server to the final client. A simple min, max, average should be sufficient, and it was pretty easy to build.
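The accumulator is about as simple as it sounds - a minimal sketch, assuming the delays are measured in microseconds:

    #include <stdint.h>
    #include <limits>

    // running min / max / average of the datagram-to-client delay
    struct DelayStats {
      int64_t  minUS, maxUS, totalUS;
      uint64_t count;

      DelayStats() : minUS(std::numeric_limits<int64_t>::max()),
                     maxUS(0), totalUS(0), count(0) {}

      void add(int64_t us) {
        if (us < minUS) minUS = us;
        if (us > maxUS) maxUS = us;
        totalUS += us;
        ++count;
      }

      double avgUS() const { return count ? (double)totalUS / count : 0.0; }
    };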
Very nice work.
I'm beat.