Archive for November, 2008

Figured Out a Work-Around for the VantagePoint Second Y Axis Bug

Monday, November 10th, 2008

I've been wrestling with a VantagePoint bug concerning the placement of Variables on the secondary Y axis, and this morning I got another email from Gordon with an example application that showed the secondary Y axis working perfectly. It took me a little bit to get his example running: Gordon had put a setLicense() call into his code, I had put mine in at another place, and his trumped mine so nothing worked. That was a "Duh!" moment that was easy enough to figure out.

No, the real issue was that a simple graph like Gordon's example, where everything was set up and ready to go before the first call to unlock() on the graph, didn't show the problem. My code, by contrast, has all kinds of lock() and unlock() calls, and in general the construction is far more dynamic - by design.

I was playing around with a lot of things and then came across this little nugget that was the beginning of the end for the problem. When I did the initial setting of the Variables onto the secondary Y axis, the code looked pretty familiar:

  for (int i = 0; i < cols.length; ++i) {
    getGraph().getVariableAttributes(cols[i])
        .setOption(VariableAttributes.voiSecondary, true);
  }

As you'd expect, this runs through the array of int values for the column numbers and sets those that need to be set. Pretty simple. Or so I thought.

When I did this I found that setting cols[2] actually also set cols[0] and if I cleared out cols[0] it also cleared out cols[2]. It's like they were linked somehow. I verified this by logging the status of each getOption() call for each of the columns on the graph after each call to setOption(). I was stunned.

I also was unable to reproduce this in the example Gordon gave me. I wanted to be able to send him back something but it wasn't looking good. Then I got the idea that maybe this was an order of operation issue, like some of the others have been with VantagePoint. Basically, for showing points and labels, it seems there's a necessary order to respect. This isn't required when you build the graph atomically, like Gordon's example, but it is in my code.

So I started messing with the order.

The first thing I did was to move the setting of the secondary Y axis to after the visibility of the points and labels. Immediately, it worked. Wow. That was fast. There wasn't much else to do but to document the heck out of the placement of the code, and then write it up to Gordon. In the end, I'm not sure if he's going to write back with requests for more data. If he does, I'll certainly try to supply it, but it's going to be tough. This bug just doesn't show up in simple cases.

But it's nice to have it working as it's supposed to:

[Image: Fixed Second Y Axis]

In a Good Organization, Communication is the Key

Monday, November 10th, 2008

This morning I walked in to find that over the weekend about a dozen of my machines around the globe had been rebooted. Normally, this is OK because the Unix Engineers tell me this and I get online on Sunday evening and fix things up so that everything is ready to go in the morning. But because they didn't tell me, I had no idea, and was playing catch-up for a few hours this morning to get things up and going.

Now I realize that this isn't the end of the world. It impacted a few users for an hour tops, but the idea is that the preventative medicine here was a 10-sec email. That's it. When the effort to make things "good" is that little, I'd think it's worth doing. Keep the lines of communication open. Make sure people understand what's happening. That's all I'm asking for.

Well... this time it was more than they did. I was very disappointed, but things got going. It's just that I don't feel the need to keep them up to date on my changes if they aren't going to keep up their end. It's degenerating into an every-man-for-himself environment, and that never works out well.

Apple’s Amazing Customer Service Blows Me Away

Monday, November 10th, 2008

I've been cleaning up my office - OK, more than 'cleaning', I've been gutting it: dropping about five computers (Sun, SGI, NeXT, HP, and Canon) and reorganizing the whole room to simplify it and clean up all the wiring and power distribution I have. It's taken me several weekends, and it's required me to throw away computers that I've had longer than I've had kids. It's been an emotional journey.

Several months ago, my iMac G5 died and the cats at the Genius Bar in OakBrook said that it would be $900 to replace the mainboard. Yikes! That's nearly the cost of a new Intel iMac. So I got the Intel iMac and put it into service. I didn't get rid of the iMac G5 because I really wanted to get it working. I just didn't know what it'd take to get it going.

Well, in the cleanup, it was one machine that I just could not part with. I think it's the emotional attachment to the first iMac I bought myself. It was a G5, after all! That was a smoking processor! So I kept it. Well... this weekend, I decided to try and see if I could get to the bottom of the problem. All my Google searches said "Mainboard replacement", and my tests showed that I had to do the same. No way around it.

So I realized that it was worth $900 to me, and I went back to the OakBrook Genius Bar and told the story of my iMac. It boots, but then locks up. The gal took it and looked at it behind the bar (it was very crowded) and after a little bit came back to me and said "Yeah, it looks like the mainboard, and it's out of warranty." This, I knew. But I was ready. "But we're going to fix it for you for free."

What!?

Free?!

Yes, free. She took down a little more information from me and went to write up the ticket. It was going to take about 5 days to get the board and replace it. I told Liza about the news and she was blown away for about half a second. "Bob", she said, "You bought a MacBook Pro for your Mom here, my MacBook Air, the three Kids' MacBooks, your iMac... they're going to throw you a bone because it's good advertising."

OK, I had to agree, she's right. After all I've spent in that store, the cost of the mainboard is really nothing to them, but to me, it's incredible customer service. I'll be telling this story for a long time to come - and buying Apple computers for even longer.

So in about a week I'll be able to go back and get my new iMac G5 and put it in my new office and have it back up and running. That's just the most incredible news I've had in weeks. I'm so very glad I didn't ditch it in the clean-up. This is going to be great.

The Problem With Agreeing to Help Another Project

Friday, November 7th, 2008

I've been suckered into helping a non-developer pretend to be a developer and write this critical piece of code for trade processing. It's a bad idea from the start, but I knew the probability of him getting it right without any help was so near to zero as to be indistinguishable from zero. Some may think me cruel; I think of it as an honest assessment. This guy is a good-natured, decent guy, but has no formal training in development at all. Everything he's learned is from hacking around on the job. While that's nice, it's no basis for making a critical component. It just isn't.

So I agreed to help. And help I have. I've been sucked into cleaning up the code many times - each time because he's gone off from where we left off and started to add in a new batch of functionality. There's lots of copy-n-paste, and there isn't even a clear and concise set of requirements from the functional standpoint. What's supposed to happen on these inputs? What about these? Some he knows, and some he doesn't. So I have to help there too.

But today added a new dimension to this: I got called into the office of his manager, who clearly didn't know I'd helped this guy to the level that I have. To be honest, the code is 80% mine, 10% the guy before me, and 10% his. He really doesn't know what's going on, and it's not because he's dumb, it's because he didn't code it and didn't spend any time figuring out what the code is doing.

So we're in his manager's office and he's saying things like "We developed this." Hold on there, Professor... Let's be careful about the use of the pronoun "we" here... I wrote a bunch of it, based on his somewhat dodgy specs. If there's an issue about what it should be doing, talk to him - if it's about how it's getting done, then that's me. But this manager is not known for his precise language. He's considered somewhat of a micromanaging developer-wannabe, and as such he's throwing around "we" way too much for my taste.

But when I say "Watch the 'we', I wasn't in it at this point" he gets angry and says "It's our code - that means we built it." Yikes! That's a leap I'm not willing to take at this point in time. So we go back and forth. Finally, we get to the point where he's telling me the specs that I didn't get from the guy I was helping. I put them into the code in all of 15 mins and the tests work wonderfully. Surprise.

It's this getting suckered into this project and then yelled at for not making it perfectly without any specs that makes me a little leery of helping these clowns ever again. After all, it's not my job if this guy can't deliver anything that works. It's his. At some point, if this place doesn't start to turn around, it'll be every man for himself, and at that point I'll be fine, but he's going to be looking at the door.

Perl and Regular Expressions are Pretty Amazing

Friday, November 7th, 2008

Late yesterday I was working on enhancing a feature of my fast-tick risk server where I wanted to be able to take research portfolios and load them into the system as if they were real positions - just tagged a little differently so they aren't confused with real positions. As I was doing this, I realized that I needed to parse the option symbol and remove a single component.

In my server, the IBM Dec 2008 85.00 Put is symbolized as IBM:IBM.U:20081220:85.0000:0 where the components are separated by colons (:) and the first is the underlying (many times the symbol for the underlying is not the option symbol), the second is the option symbol, a dot, and the exchange the option is traded on, the third is the expiration, the fourth the strike, and the last is 0/1 for Put/Call. Pretty simple. But for technical reasons of the file formats, I needed to have:

  IBM:20081220:85.0000:0

essentially stripping out the option symbol and exchange. I knew it was possible in Perl, but at the time I was on the train trying to work this out on my way home. Thankfully, OS X has a complete perl reference built-in.
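For readers who don't live in Perl, the symbol layout above can also be handled by simply splitting on the colons. What follows is only a minimal C++ sketch, not code from the server: the `OptionSymbol` struct and the `parseOptionSymbol()` and `toFileFormat()` names are made up for illustration, and it assumes every symbol has exactly the five fields described.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical holder for the five colon-separated fields described above.
struct OptionSymbol {
  std::string underlying;   // e.g. "IBM"
  std::string optionCode;   // e.g. "IBM.U" (option symbol, dot, exchange)
  std::string expiration;   // e.g. "20081220"
  std::string strike;       // e.g. "85.0000"
  bool        isCall;       // last field: 0 = Put, 1 = Call
};

// Split the symbol on ':' and insist on exactly five fields.
OptionSymbol parseOptionSymbol(const std::string &symbol) {
  std::vector<std::string> parts;
  std::istringstream       in(symbol);
  std::string              field;
  while (std::getline(in, field, ':')) {
    parts.push_back(field);
  }
  if (parts.size() != 5) {
    throw std::invalid_argument("expected 5 fields in: " + symbol);
  }
  if (parts[4] != "0" && parts[4] != "1") {
    throw std::invalid_argument("put/call flag must be 0 or 1 in: " + symbol);
  }
  return OptionSymbol{parts[0], parts[1], parts[2], parts[3], parts[4] == "1"};
}

// Rebuild the symbol without the option symbol/exchange field.
std::string toFileFormat(const OptionSymbol &s) {
  return s.underlying + ":" + s.expiration + ":" + s.strike + ":" +
         (s.isCall ? "1" : "0");
}
```

Calling `toFileFormat(parseOptionSymbol("IBM:IBM.U:20081220:85.0000:0"))` gives back the four-field form shown above. It's a lot more typing than a one-line substitution, which is rather the point of the rest of this post.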

I started assuming that the symbol was given to me. I knew I had it in the script, I just needed to mangle it to the proper form.

  my $symbol = "IBM:IBM.U:20081220:85.0000:0";

and if I did the simple regex on it, I almost got what I wanted:

  my $symbol = "IBM:IBM.U:20081220:85.0000:0";
  print $symbol . "\n";
  $symbol =~ s/(^.*)\:.*\:(.*$)/$1\:$2/;
  print $symbol . "\n";

I got:

  IBM:IBM.U:20081220:85.0000:0
  IBM:IBM.U:20081220:0

and as soon as I saw this, I knew it was because the first wildcard was being 'greedy' in its matching, and I was deleting the second-to-last, not the second, component of the symbol. So I looked up the perl docs on my Mac, and there in a wonderful example was the way to make it a stingy (non-greedy) match: put a ? after the quantifier.

  my $symbol = "IBM:IBM.U:20081220:85.0000:0";
  print $symbol . "\n";
  $symbol =~ s/(^.*?)\:.*?\:(.*$)/$1\:$2/;
  print $symbol . "\n";

With this, I was able to match the first part properly and the results were what I wanted:

  IBM:IBM.U:20081220:85.0000:0
  IBM:20081220:85.0000:0

While I knew there was a key to regexes that would make the normally greedy match a stingy match, I'm still amazed at the power of a language like Perl with its very powerful regex system built in. I put in the code this morning and it worked like a charm. It's really pretty neat that a half-dozen lines of a perl script can add all this functionality. Sweet.
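The greedy versus stingy behavior isn't Perl-specific. As a rough illustration, here are the same two substitutions in C++ using std::regex (whose default ECMAScript grammar supports the same `.*?` lazy quantifier), plus a third variant that sidesteps the question by using `[^:]*`, which can never run past a colon. The function names are made up for this sketch.

```cpp
#include <regex>
#include <string>

// Greedy: the first .* grabs as much as it can, so the field that gets
// dropped is the second-to-last one, not the second.
std::string stripGreedy(const std::string &symbol) {
  static const std::regex re("^(.*):.*:(.*)$");
  return std::regex_replace(symbol, re, "$1:$2");
}

// Lazy: .*? takes as little as it can, so the first group stops at the
// first colon and the dropped field is the second one.
std::string stripLazy(const std::string &symbol) {
  static const std::regex re("^(.*?):.*?:(.*)$");
  return std::regex_replace(symbol, re, "$1:$2");
}

// Explicit: [^:]* cannot cross a colon, so greediness no longer matters.
std::string stripExplicit(const std::string &symbol) {
  static const std::regex re("^([^:]*):[^:]*:(.*)$");
  return std::regex_replace(symbol, re, "$1:$2");
}
```

With the IBM symbol from above, `stripGreedy()` reproduces the wrong answer and the other two produce the four-field form. The negated character class is often the more robust choice, since it says exactly what a "field" is rather than relying on how the quantifier backtracks.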

Pushing More Ticks Through the Fast-Tick Server Safely

Thursday, November 6th, 2008

This morning I had a problem with my fast-tick server where it appeared that one of the 'sidekick' threads was not able to successfully process its run-loop. I noticed this because it started to run the 'purge' of stale greek values, but it never completed the task. When this happened I had to restart the complete app because the code was locked up and yet wasn't crashing. No fun, and it happened twice in one morning!

The relevant portion of the code responsible for this basically did the following:

    /* 
     * Interesting problem... when this guy runs and there is a lot of 
     * work to do, he'll effectively lock out the CalcEngineWorkers and 
     * nothing will get done with the CalcNodes. All because this guy 
     * is too fast and keeps locking other processes out. So... I'm going 
     * to put a few 'breaks' into the processing flow. A hundred of them 
     * to be exact, and each time, we're going to see if someone else 
     * needs a turn. Then we'll continue. This is just a cooperative 
     * way to get the job done without locking other threads out. 
     */ 
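    // never let this drop to zero - it's the divisor in the modulo below 
    int   blockSize = InstrumentManager::numberOfInstruments()/100; 
    if (blockSize < 1) { 
      blockSize = 1; 
    } 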
    // lock up this guy for a read to make sure he doesn't change on us 
    __lockRead(); 
    try { 
      // log what we're going to be doing 
      getLog() << l_status.setErrorId("InstrumentManager") 
               << "taking the time now to purge old data from " 
               << "the instruments" << endl; 
      // next, we need to get an iterator for all the instruments 
      int                   cnt = 0; 
      int                   pass = 0; 
      tIterator< void * >   iter = __allInstruments(); 
      while (iter.hasNext()) { 
        INSTR_BASE      *inst = (INSTR_BASE *) iter.getNext(); 
        if (inst != NULL) { 
          inst->retain(); 
          cnt += inst->purge(); 
          inst->release(); 
        } 
        /* 
         * Check and see if we should yield a bit and see if the system 
         * has something else that needs to get done. With only the read 
         * lock, this isn't a bad place to pause. 
         */ 
        if (++pass % blockSize == 0) { 
          sched_yield(); 
        } 
      } 
      // log what we did 
      getLog() << l_status.setErrorId("InstrumentManager") 
               << cnt << " unnecessary data elements purged from " 
               << "the instruments" << endl; 
    } catch (...) { 
    } 
    __unlock();

The code essentially locks up the list of all instruments for a read, goes through each instrument, telling it to purge any stale data, and then reports on the results and releases the lock. Seems pretty simple. But it's got issues.

First, why maintain the read lock for the entire process? Well... if we don't, then someone can add or remove instruments and we'll not have the first clue about it and the instruments may actually be deleted and that's going to cause us a world of hurt.

Second, why are we waiting to put the retain() on the instrument until we get to it in the iterator on the list? It seems a better idea would be to put the retain() on the instrument as soon as possible, so we know it'll be around for us to use when we get around to checking it. Good point, and that was one of my concerns about this code when looking at it today.

But the real kicker is the hidden issue: other threads/processes waiting for the lock to be removed so they can modify the list. This may not seem like a lot of work, but with in excess of 400,000 instruments, it takes about 10 seconds (wall clock time) to run through this section of the code. That's a lot. When we are in times of a lot of changes - like the open of the US markets - this is a real issue. This lock causes us to pause the processing of ticks and greeks and that's no good. So... what can we do to fix this?

Answer: do the obvious: copy the instruments and then process them. More correctly, copy the pointers to the instruments and then run through the list processing each.

Why is this a big difference? Well, first off, it means that the lock is only going to be on the main instrument list for the time required to copy about 400,000 pointers. That's essentially nothing. Then the lock is removed and the other threads and processes are free to do what they need.

Second, it allows us to put the retain() on the instrument at the time of the pointer copy, and that means that it's "safe" as soon as possible, and that means it's nearly impossible to have an instrument killed out from under us. Much nicer.

In the end, the code now looks like this:

    /* 
     * We need to make a copy of the pointers to all the active 
     * instruments right now. As we do this, we're going to retain() 
     * each so it doesn't go away. When we're done with each, we'll 
     * release() it to be nice. 
     */ 
    tVector<INSTR_BASE *>   instruments(instrCnt); 
    // lock up this guy for a read to make sure he doesn't change on us 
    __lockRead(); 
    try { 
      tIterator< void * >   iter = __allInstruments(); 
      while (iter.hasNext()) { 
        INSTR_BASE      *inst = (INSTR_BASE *) iter.getNext(); 
        if (inst != NULL) { 
          inst->retain(); 
          instruments.addBack(inst); 
        } 
      } 
    } catch (...) { 
    } 
    __unlock(); 
    // update the instrument count to what we actually have in hand 
    instrCnt = instruments.size();
 
    // next, we need to run through all the instruments 
    int     cnt = 0; 
    for (int i = 0; i < instrCnt; ++i) { 
      INSTR_BASE        *inst = instruments[i]; 
      if (inst != NULL) { 
        cnt += inst->purge(); 
        inst->release(); 
      } 
    } 
    // log what we did 
    getLog() << l_status.setErrorId("InstrumentManager") 
             << cnt << " unnecessary data elements purged from " 
             << "the instruments" << endl;

My initial tests show that the clean-up is being done just as before, but the pauses in the processing are gone - which is good, and expected. I'm very pleased with this because I think it's quite likely that the long-held lock had something to do with the deadlock, and it shouldn't be an issue any longer.
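Stripped of the server's own classes, the copy-then-process pattern can be sketched in standard C++. This is only an illustration under a few assumptions: `Instrument` and `InstrumentManager` here are small stand-ins invented for the sketch, `std::shared_ptr` plays the role of retain()/release(), and a plain `std::mutex` stands in for the server's read lock.

```cpp
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for INSTR_BASE: purge() reports how many stale
// elements it dropped.
class Instrument {
public:
  explicit Instrument(int stale) : mStale(stale) {}
  int purge() {
    std::lock_guard<std::mutex> guard(mMutex);
    int dropped = mStale;
    mStale = 0;
    return dropped;
  }
private:
  std::mutex mMutex;
  int        mStale;
};

// Hypothetical manager owning the master list of instruments.
class InstrumentManager {
public:
  void add(std::shared_ptr<Instrument> inst) {
    std::lock_guard<std::mutex> guard(mMutex);
    mInstruments.push_back(std::move(inst));
  }

  int purgeAll() {
    // Step 1: hold the list lock only long enough to copy the pointers.
    // Copying a shared_ptr bumps its reference count, which is the
    // equivalent of calling retain() at copy time.
    std::vector<std::shared_ptr<Instrument>> snapshot;
    {
      std::lock_guard<std::mutex> guard(mMutex);
      snapshot = mInstruments;
    }
    // Step 2: do the slow work with the list lock released, so other
    // threads are free to add instruments in the meantime.
    int cnt = 0;
    for (const auto &inst : snapshot) {
      if (inst) {
        cnt += inst->purge();
      }
    }
    // The snapshot going out of scope is the equivalent of release().
    return cnt;
  }

private:
  std::mutex                               mMutex;
  std::vector<std::shared_ptr<Instrument>> mInstruments;
};
```

The trade-off is the same as in the real code: an instrument added while the purge is running won't be seen until the next pass, and one removed from the master list stays alive until the snapshot lets go of it. For a periodic clean-up, that's a price well worth paying.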

Decided to Try Firefox 3.1 Beta 1

Thursday, November 6th, 2008

This morning I was wondering if the heavy load I was seeing on my MacBook Pro when viewing Intrade in Firefox (approx. 30% CPU) was fixed in the Firefox 3.1 Beta 1 - so I downloaded it and tried it. Turns out, that's not it. But I did notice a few nice things about the beta that may be really slick when they finish them.

First, the 'smoothness' (hard to put it any other way) of the GUI is much improved. It feels slicker, more polished. I know, very subjective, but still, it's a nice addition. Also, the tabs have been worked over. They are nicer as well. But there are still a few issues with the updating of the GUI that make me feel like they have a little bit of work to do before it's 'final'.

Nice, and in the right direction, but not ready for me to use. Not yet.

Updated the Zoom Reset on the Scatter Graph for Z-Axis Changes

Wednesday, November 5th, 2008

The developer that asked for the 'Zoom Reset' feature on the axes changes was testing this feature today and noticed that when changing the z-axis selection the data was all visible, but the zoom was not reset. Fair enough... any change means any change. So I added in the zoom reset for any change in the z-axis pick list.

Not too bad, and it's what he wanted. Given that I did this for him, it makes sense that he gets what he wanted.

Significantly Improving the BKit Graph GUI Widget Sync Code

Wednesday, November 5th, 2008

Today I spent some time removing the unnecessary calls in the 'simple' BKit graphs that synchronize the graph to the GUI widgets that let the user configure it. There were two biggies. First off, when adding a column label to the graph, the standard procedure is to sync the GUI widgets at the end of the change. But if we're setting all of the column headers - as we would be at the onset - then we're calling this a lot more than we need to. The fix is to set all but the last column header directly on the underlying graph, and then do the last one like the original so that it syncs the GUI widgets at the end. This way, (n-1) are done quickly, and the last one is done completely.

The next issue was the updating of the secondary Y pick list based on the addition of columns of data to the secondary Y. Again, this makes sense for the addition of a single column of data, but for the initial set-up, we need to pause the listener handler while adding them, and then resume it when we're done. Simple enough, but to find it took a little bit of time.
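Both fixes boil down to the same idea: don't notify the widgets on every step of a bulk change. Here's a minimal sketch of that idea in C++. It is not the BKit code (which isn't shown here), and `GraphModel`, `PauseGuard`, and the callback are all invented for the illustration. It also uses a slight variant of what I described: rather than treating the last item specially, it pauses notifications for the whole batch and then syncs once.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical graph model: every label change normally triggers a
// (possibly expensive) widget sync through the registered callback.
class GraphModel {
public:
  explicit GraphModel(std::function<void()> onChange)
      : mOnChange(std::move(onChange)) {}

  // Single change: update the label, then sync right away.
  void setColumnLabel(std::size_t col, const std::string &label) {
    if (col >= mLabels.size()) {
      mLabels.resize(col + 1);
    }
    mLabels[col] = label;
    notify();
  }

  // Bulk change: pause notifications, apply everything, then sync once.
  void setColumnLabels(const std::vector<std::string> &labels) {
    {
      PauseGuard pause(*this);
      for (std::size_t i = 0; i < labels.size(); ++i) {
        setColumnLabel(i, labels[i]);
      }
    }
    notify();
  }

  const std::vector<std::string> &labels() const { return mLabels; }

private:
  // RAII helper: notifications stay paused for as long as a guard is alive,
  // and resume even if the bulk update throws.
  class PauseGuard {
  public:
    explicit PauseGuard(GraphModel &model) : mModel(model) { ++mModel.mPaused; }
    ~PauseGuard() { --mModel.mPaused; }
  private:
    GraphModel &mModel;
  };

  void notify() {
    if (mPaused == 0 && mOnChange) {
      mOnChange();
    }
  }

  std::function<void()>    mOnChange;
  std::vector<std::string> mLabels;
  int                      mPaused = 0;
};
```

Setting three labels through `setColumnLabels()` fires the callback once instead of three times, while a later single `setColumnLabel()` call still syncs immediately.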

I'm not fooling myself into thinking that this is going to make the graphs faster or more responsive - this is a quicker set-up, that's all. But as it was a linear problem, the larger the data set, the more improvement we'll notice. So it's not bad. Plus... don't do something twice if you don't have to. Makes good sense.

Amazing Accounts of an Amazing Event

Wednesday, November 5th, 2008

Part of me is stunned, part expected it to happen just as it did. In the end, today is the first day of a new feeling about America by Americans for Americans. Facts haven't changed all that much - staggering debt, horrible foreign relations, poor international opinion... but things have changed - attitudes.

I've read time and again this morning how nothing's changed today, but people feel vastly different. There is hope, optimism, and a feeling that these problems are going to be tackled with compassion, fairness, and honor. I don't think there are many people that think the change will be overnight, or even in a year or two. But the change is that the citizenry believes it's time to take back the running of this country.

I am so excited about the future, it's hard to believe.