Archive for the ‘Coding’ Category

Google Chrome dev 18.0.1025.7 is Out

Wednesday, February 8th, 2012

Google Chrome

This morning I saw that Google Chrome dev 18.0.1025.7 was out, and the release notes say there are a few bug fixes around the settings pane. Good enough. I like reading the praise in the comments for how quickly these issues get fixed. It shows me that the folks finding the issues are really happy with the team fixing them in a reasonable time frame. That says a lot, because if I had a problem, I'd like to know it was being addressed. Just nice to see.

[2/11] UPDATE: Interestingly, this morning I saw that the beta channel was updated to 18.0.1025.25 - a jump over the dev channel. This makes no sense, as the dev channel is supposed to be the more frequently updated and less stable of the channels. So something must be going on… like maybe the dev channel is going to 19.x.x.x, or something. In any case, it's an odd move that certainly means something… I'm just not sure what it is.

The Victim of Bad Device Drivers

Tuesday, February 7th, 2012

bug.gif

I've been trying to deal with a dodgy disk array for a few weeks. Because of the recent floods in Thailand, we were unable to get the high-capacity drives to build the 2TB array for the server, so they pressed an old email server's drive array into use, and it's been a bit dodgy, to say the least.

To be fair, I'm glad we had the old array to press into service. If I had been forced to wait for the estimated 3 months, that would certainly have been worse. But I still have to say that bad device drivers are a pain, and I would really like them fixed.

So here's what's been happening… I come in in the morning and I see the mount point for this drive array is there in the filesystem, but all the jobs referencing it are failing. Wonderful. So I try to take a look at it:

  $ cd /plogs/Engine/dumps
  $ ls
  ls: cannot open directory .: Input/output error

No amount of un-mounting and re-mounting will work as the OS simply cannot see the drive array. We have to reboot the box and then it comes back online.

The problem with this approach is that I've got a ton of exchange feed recorders running on this box, and it's the only backup we have to production. If we miss recording one of these feeds, then it's gone as the exchanges aren't in the business of replaying their entire day just because we had a hardware problem.

So I'm trying to get a few things done - the first is getting a real backup to the recorders in a second datacenter. The second is getting this drive array working properly on Ubuntu 10, hopefully with a kernel update that's in the offing. It is a decent array. I like it. But it's got to work first, and then I'll be happy.

Finished the Sync Start to the Greek Engine

Tuesday, February 7th, 2012

High-Tech Greek Engine

This afternoon I've put the final touches on the sync start to the greek engine. Basically, when we restart the greek engine, it's possible that we are going to miss messages from the exchange because we're down/restarting. This option allows the app to recognize that it might have missed messages, and to hit the archive server and ask it for any possible messages for that time frame. If there are some, we'll work them into the message stream.
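
To make that concrete, here's a minimal sketch of the idea. None of the engine's real classes appear in this post, so the names here (ArchiveClient, GreekEngine::syncStart(), and so on) are mine, and the archive call is just a stub:

  #include <chrono>
  #include <deque>
  #include <string>
  #include <vector>

  // Hypothetical message type - the real wire format isn't shown in the post.
  struct Message {
      std::chrono::system_clock::time_point when;
      std::string payload;
  };

  struct ArchiveClient {
      // Ask the archive server for everything it saw in [from, to).
      // Stub: the real client would make a request to the archive service.
      std::vector<Message> fetch(std::chrono::system_clock::time_point from,
                                 std::chrono::system_clock::time_point to) {
          (void)from; (void)to;
          return {};
      }
  };

  class GreekEngine {
  public:
      // Called once on restart, before draining the live feed.
      void syncStart(ArchiveClient &archive) {
          auto now = std::chrono::system_clock::now();
          // Anything between the last message we processed and "now" may have
          // been missed while we were down/restarting - go get it.
          for (const auto &msg : archive.fetch(mLastSeen, now)) {
              mPending.push_back(msg);   // worked into the stream ahead of live data
          }
          mLastSeen = now;
      }

  private:
      std::chrono::system_clock::time_point mLastSeen;
      std::deque<Message> mPending;
  };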

It's a very nice feature to have, as it means that a mid-day crash or reboot is not going to lose anything. But it puts a huge load on the archive server, which really hasn't been hit all that hard before, so testing is going to be a really big part of this. Fair enough - it's going to take some time to work out the kinks, but at least now we have what we think we need, and it's up to testing to either confirm or deny those beliefs.

It'll be nice to get this tested and into the main codebase. It's been a ton of work to get all the pieces working and working well.

Scraping Logs vs. Exposed Stats APIs

Tuesday, February 7th, 2012

Ringmaster

I spent the morning today exposing another Broker service for my greek engine - this one for stats on the running process. In the last few days, the operations folks, who have had months to decide what support tools they need, have put a halt to the production deployment of my greek engine because they now need to have these stats for monitoring. Currently, they run a script against the output of an IRC bot that's hitting the engine, but that parser depends on getting data in a specific format, which is brittle and doesn't allow us to expand the logging on IRC. So I built the better solution this morning.

It's all based on maps of maps, and I just put the data in what I felt made sense. It's organized by feeds and then the general engine, and within feeds, there are the stock feeds and the option feeds, and so on until you get all the data as values of leaf nodes in the maps. It's pretty simple; the only real issue was that there were several metrics they wanted to see that I hadn't put in the code, and there weren't proper getters for the data, which meant I had to write those before I could get at it.
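
As a rough sketch of that shape - the metric names and numbers below are made up, since the post doesn't list the real ones - the structure is just nested maps walked down to the leaves:

  #include <iostream>
  #include <map>
  #include <string>

  // Leaf values keyed by metric name; the real engine's value type is richer.
  using StatMap = std::map<std::string, double>;

  int main() {
      std::map<std::string, std::map<std::string, StatMap>> stats;

      // Feed stats, split into stock feeds and option feeds.
      stats["feeds"]["stocks"]["messages"]  = 1250000;
      stats["feeds"]["options"]["messages"] = 8700000;

      // General engine stats sit alongside the feeds.
      stats["engine"]["memory"]["resident_mb"]  = 2048;
      stats["engine"]["timing"]["last_calc_ms"] = 18;

      // A monitoring client just walks the maps down to the leaf values.
      for (const auto &top : stats)
          for (const auto &mid : top.second)
              for (const auto &leaf : mid.second)
                  std::cout << top.first << '.' << mid.first << '.'
                            << leaf.first << " = " << leaf.second << '\n';
      return 0;
  }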

Not bad, but it took time.

The testing went really well, and they should be able to gather the stats they want at their convenience. Not bad.

As a personal aside, it really makes me wonder why this is coming up right now, and why it's a show-stopper. I mean, if it's a show-stopper, why wasn't it stated months ago at the beginning of testing? I think the reality is that it's not that critical, but the folks are starting to panic a bit, and are looking for the usual suspects to slow things down, or trying to make this new system fit the same mold as the previous one.

It's kinda disappointing.

Smartest Way to Speed Up: Just Do Less

Monday, February 6th, 2012

High-Tech Greek Engine

I spent the vast majority of my day today trying to make this one client application of my greek engine a lot faster. I mean a lot faster. Friday afternoon, I was running some tests on this usage pattern, and realized that the client really was seeing some massive delays in getting data from my engine when dealing with very large, very active families. Using SPY as the example, there are some 2500 derivatives on SPY, and calculating their data and returning it to the caller was taking from 1800 to 2200 msec. That's a long time. The problem was magnified because all they wanted were three of the 2500 options, and they had to wait for all 2500.

Not good.

So Friday I jotted down a few ideas to try today and spent the first few hours doing just that. Each one was a little better, but I was still looking at 1300 msec, and that's just too long. I needed to chop out an order of magnitude or two. So I started doing the profiling. What was it that was taking so long?

Well… it's the calculations. That's no surprise, but it's a real bottleneck too. We can't really afford to let the calculations tie up multiple threads. That'd kill the box with some 50 clients each needing multiple threads for their calcs. Not good. I tried to look at other things, but in the end, it always came back to the calculations.

Along the way, however, I did come up with a few really fun optimizations. I was able to look at a continually updating profile of the instrument and use those values to 'seed' the request, but the updates from the market were just so frequent that it was impossible to stay ahead of them. It was a real problem.

So I did what I should have done first - go and talk to the coders writing the client app.

I found out that all they really wanted were the implied vols and they only wanted two or three options in each call. Well… now that's very interesting. That's a use-case that I hadn't expected. The reason it's very interesting is that the implied vols can be calculated independently of each other, which means that by telling me you're interested in only the implied vol calculations, I can look at the three options you're asking for, and calculate just them. Sweet.

I had to work into the API the idea of the type of calculation, but we had something pretty much like that already in the API - it just needed a simple extension. And then I had to get the different type handled in the code. In the end, it wasn't too bad, and the time savings were amazing!
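
Just to illustrate the shape of that change - the request and option types below are made up, since the real API isn't shown here - the win comes from the fact that an implied vol needs only its own option, so a request tagged as "implied vol only" can skip the other ~2500 members of the family:

  #include <map>
  #include <string>
  #include <vector>

  // Hypothetical types - the engine's real API isn't shown in the post.
  enum class CalcType { Full, ImpliedVolOnly };

  struct Option  { double price; double strike; double yearsToExpiry; };
  struct Request { CalcType type; std::vector<std::string> symbols; };

  // Stand-in for the real solver; the point is it needs only this one option.
  double impliedVol(const Option & /*opt*/) { return 0.20; }

  std::map<std::string, double>
  handle(const Request &req, const std::map<std::string, Option> &family) {
      std::map<std::string, double> out;
      if (req.type == CalcType::ImpliedVolOnly) {
          // Implied vols are independent of one another, so only touch the two
          // or three options the client asked for - not all ~2500 in the family.
          for (const auto &sym : req.symbols) {
              auto it = family.find(sym);
              if (it != family.end()) out[sym] = impliedVol(it->second);
          }
          return out;
      }
      // The original (slow) path: run the calculations for the whole family.
      for (const auto &kv : family) out[kv.first] = impliedVol(kv.second);
      return out;
  }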

The 1800 msec went to 20 msec. That's something that's more than fast enough for what we need. All because I listened to what the client specifically needed. Simple way to be faster? Just do less.

Excellent.

Updated Git to 1.7.8.4 on My MacBook Pro

Monday, February 6th, 2012

gitLogo.gif

This morning I thought that git on my MacBook Pro might be a little behind the times. I don't honestly think there's a huge difference from 1.7.4.1 to 1.7.8.4, but you never know, and it's simple using the Mac OS X installer. Just download it, double-click, and it's ready to go.

It's nice to see:

  $ git --version
  git version 1.7.8.4

Nice. Love it when things "just work".

Interesting… I just noticed that Mac OS X 10.7.3 comes with git - and it makes perfect sense that it does. Xcode uses git now, so the OS - or at least the developer tools - has to include it. So it's not necessary for me to worry about updating this any more. It's nice to have a secondary source, should Apple decide to drop its support, but I'm guessing that's not going to happen anytime soon.

Interesting stuff…

Exchange Timezones Hit Me Again

Friday, February 3rd, 2012

bug.gif

This morning it was brought to my attention that my ticker plants were showing the open on VIX, quoted out of the CBOE, as something different than the legacy feeds were showing. All my different feeds (dev, staging and prod) showed the same number, so I let the QA guy find out what the problem was. I had no idea where it was coming from. I wasn't even sure I was wrong.

So he asked one of the legacy developers, and sure enough, his numbers matched Yahoo! Finance. So it looked like there was something to this. So I started looking into the trade feed. Thankfully, I have a nice, stable, feed recorder and query service already going, so it was just a matter of giving it the right parameters and it would pull up the files, uncompress them, decode them, search them, and deliver me the results.

I have to admit, this isn't the first time I've wished I had a web interface for this, but alas, I don't.

So I looked at the feed, and realized that at 8:30 am, the trades arrived, but they were not marked as 'valid' trades. This is odd because all Index "trades" are valid if they arrive after the open. And it was, after all, after the open.

So I looked at the code - and sure enough, there was the problem. The CBOE is the one exchange I listen to that's located in CST as opposed to EST. That means that the time of the "open" for the CBOE is 8:30 am, and not 9:30 am, like the NYSE, PHLX, etc. This is something I'd planned for, but hadn't remembered to use in this particular exchange codec.

The fix was simple - use the CST open, and all would be fine. Unfortunately, that means that the data for today for the indexes from CBOE is messed up, but at least it'll be right for Monday. Just all the little data things that need to be fixed up… they're getting fewer, but I'm sure there's still a lot to find.
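
The check itself boils down to something like the sketch below. The real codec isn't shown in this post, so the names and the minutes-past-midnight representation are mine - the point is simply that the open has to be expressed in the exchange's own timezone:

  #include <string>

  struct Exchange {
      std::string name;
      int localOpen;   // minutes past midnight in the exchange's own timezone
  };

  // NYSE, PHLX, etc. open at 9:30 am Eastern; the CBOE sits in Central time,
  // so its local open is 8:30 am, not 9:30 am.
  const Exchange NYSE{"NYSE", 9 * 60 + 30};
  const Exchange CBOE{"CBOE", 8 * 60 + 30};

  // Index "trades" are only marked valid if they arrive after the exchange's
  // local open - comparing a CBOE trade against a 9:30 open is the bug.
  bool isValidIndexTrade(const Exchange &ex, int tradeTimeLocalMinutes) {
      return tradeTimeLocalMinutes >= ex.localOpen;
  }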

Google Chrome dev 18.0.1025.3 is Out

Friday, February 3rd, 2012

Well… it's only been a few days since 18.0.1025.1 was released, but I guess the Google Chrome team realized that there were a few outstanding issues that warranted a new release and a few ticks in the version number. Nice that they all appear to be fixes for crashing bugs… way to keep on the crashers, guys.

This is Why Codecs Have to be Strictly Controlled

Thursday, February 2nd, 2012

bug.gif

OK… so I'm working on some nice little features on the greek engine today - adding a few IRC commands to the system and making things just a little bit easier for the support staff - and we start getting these odd problems. In some cases, the response time from the engine varies wildly, and in other cases, the memory footprint is far too big. All very odd, seemingly unrelated, but all timed to happen today.

So I started looking at yet another problem - one of the clients of the engine was sending a Close Channel message when it wasn't needed. That, in and of itself, is not the problem, but on closer inspection, the contents of the message are alarming:

[asyncRead] a close channel for unknown channel was received
[asyncRead] 58 af cb a6 3b 49 db 41 03 84 4d 11 c4 5e 55 d1 05 45 12
            X…;I.A..M..^U..E.

The second line of the error message is the binary contents of the close channel message, which should contain an 'X', followed by the 16-byte channel ID, followed by an encoded variant value. In this case, the 'E' means it's an error, and by definition that means a varint-encoded number follows, and after that, another variant that is the "reason". The value after the 'E' is intended to be the numeric error code, and the variant is meant to hold the message or messages that accompany it.
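
For reference, a well-formed message of that shape would be built up something like the sketch below. The helper names and the string encoding of the reason are my assumptions - the post only specifies the 'X', the 16-byte channel ID, the 'E', the varint error code, and a trailing reason variant:

  #include <array>
  #include <cstdint>
  #include <string>
  #include <vector>

  // Base-128 varint: 7 bits per byte, high bit set on all but the last byte.
  void appendVarint(std::vector<uint8_t> &buf, uint64_t v) {
      while (v >= 0x80) { buf.push_back(uint8_t(v & 0x7F) | 0x80); v >>= 7; }
      buf.push_back(uint8_t(v));
  }

  std::vector<uint8_t> makeCloseChannelError(const std::array<uint8_t, 16> &channelId,
                                             uint64_t errorCode,
                                             const std::string &reason) {
      std::vector<uint8_t> msg;
      msg.push_back('X');                                        // close channel marker
      msg.insert(msg.end(), channelId.begin(), channelId.end()); // 16-byte channel ID
      msg.push_back('E');                                        // variant type: error
      appendVarint(msg, errorCode);                              // numeric error code
      msg.push_back('S');                          // assumed: reason as a string variant
      appendVarint(msg, reason.size());            // assumed: length-prefixed bytes
      msg.insert(msg.end(), reason.begin(), reason.end());
      return msg;
  }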

But as you can see, there's the 'E', and a varint-encoded value, and then nothing. In my decoder, I look at the next byte and try to decode it. If that happens to be a String, or a Map, I can go off into lala land and decode a GB or two. Not good.

The solution? Well… there are two: we have to get the app (or app writer) that generates this malformed error to correct their mistake, and until this person can be identified, we have to protect our decoder against this kind of problem and put in a simple test:

  if (aPos < aCode.size()) {
    // only decode the "reason" variant if there are actually bytes left to read
    mErrorValue->get<1>().deserialize(aCode, aPos);
  }

The real problem with this is that someone created a codec that encodes data improperly. To what extent? I have no idea, but this kind of thing is capable of bringing down a whole lot of servers and clients. I'm lucky it's not been a lot worse. But it underscores the need to have a group of people that control these critical components, and not allow just "anyone" to fiddle with them. The risk and consequences are just too great.

I wish I had faith that this will be the catalyst to stop all this, but I have serious doubts about it. I'm certainly going to try to get it to create some change. It needs to happen, and it needs to happen now.

Bloomberg Gets a Pretty New Face

Thursday, February 2nd, 2012

GeneralDev.jpg

Yesterday, I heard about Bloomberg's new Open API initiative. It's a new .Net, C++, Java, and C API that is "Open" for all to use. The catch is that all the data you'd want to get is really still exceptionally expensive, but that's Bloomberg, eh? The last time I used a Bloomberg API it was the Bloomberg Server API, which was a mild modification of the old Bloomberg Terminal API that came with every Bloomberg Terminal - Windows and Solaris, going far, far back into the past.

I've just briefly scanned these docs, and it's a new API alright. Much easier to deal with, and hopefully far easier to decode the data once it's returned from Bloomberg. I like that they are trying to really make it easier to use - both in the pub/sub and the req/resp modes. It's an improvement.

Heck, almost anything is an improvement.

Still, the kicker is the cost of the data. When last I looked, it was still some of the most expensive data around. I mean outta sight prices. I don't think it's gone down in the last two years, but I could be wrong.

Yet I can't blame them. They have a nice gig - they have a great reputation on the street for their data, and so they can charge a ton and use that to keep away the riffraff. It's working for them, and who am I to give them grief. Sure… I'd love to build a system off this for the Mac and build in all the bells and whistles, but that's a really hard sell as the data is so expensive and all the online brokerages are giving their data away - with decent tools.

Still… if I hit the lotto, I'm all over this.