Archive for June, 2010

Getting on a Decent Version of Boost

Wednesday, June 30th, 2010

Boost C++ Libraries

In the past, I've gone the "roll your own" route with C++ libraries, and there's a lot to like about the model. First, if it's in the C++ compiler (GCC), then it's free and automatic, and you use it - unless there's a compelling reason not to. Second, if there's an RPM for the system, use that: it's easy to place on every box, and it's nearly as good as being in the compiler since it's installed in the obvious locations for the compiler to pick up. Third, if you need to compile it from source, then do that, but it means you either have to make a tarball to place in /usr/local, manually install it on all the boxes, or ship it in the delivery package with your code. Finally, you can write your own.

But there are times when it makes sense to skip straight to that last step. I can remember doing performance tests on the STL vector - not impressive in speed. So I wrote a vector template class that was a ton faster because it was a simple array of instances. It wasn't as generic as the STL version, but for all the things I needed, it was far better. So there are reasons, sometimes.
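Something in the spirit of that class - a toy sketch from memory, not the original code - shows where the speed comes from: one flat array of instances, no allocator indirection, no growth policy:

  #include <stddef.h>

  // A bare-bones, fixed-capacity vector: just a flat array of
  // instances and a count. Not generic, but very fast.
  template <class T, size_t N> class SimpleVector {
  public:
      SimpleVector() : mSize(0) { }
      bool push_back(const T & aValue) {
          if (mSize >= N) return false;
          mData[mSize++] = aValue;
          return true;
      }
      T & operator[](size_t i) { return mData[i]; }
      const T & operator[](size_t i) const { return mData[i]; }
      size_t size() const { return mSize; }
  private:
      T      mData[N];
      size_t mSize;
  };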

But in this new rewrite I'm doing at The Shop, I decided to try to use as much Boost as possible. There are a ton of libraries in Boost, a good chunk of it is headed into the next C++ standard, and that means it's going to be in GCC - so there are plenty of reasons to use it. I looked, and for CentOS 5, there are RPMs for Boost! What a break. Problem is, they're for version 1.33.1, which is several years old and missing some very key features for me.

Further complicating the problem was how to approach the inclusion of Boost, should I choose to upgrade to the latest version for this project. What I wanted was an RPM of the i386 and x86_64 versions of the libraries and the headers. It's certainly possible to make one from the source, but that's a ton of work that just doesn't seem worth the effort. While it might be nice to build it and make it available to the world as an option for folks in my shoes, having no experience in making an RPM puts me at a distinct disadvantage here.

Putting it in /usr/local means there's something I need to put onto each machine, and if a box already has the stock RPMs installed, there's a real possibility of some serious conflicts. Additionally, The Shop doesn't have a nice NFS mount point for all this open source code - organized by project, version, etc. - that would make it naturally available on all machines.

I'm left with the icky realization that the easiest and safest method is to package it with the app. I really hate this, but there's really no other solid alternative.

So how do I get the latest Boost?

It's really a lot simpler than the Boost web site describes:

  cd /usr/local/src
  wget http://sourceforge.net/projects/boost/files/boost/1.43.0/boost_1_43_0.tar.gz
  tar zxvf boost_1_43_0.tar.gz
  cd boost_1_43_0
  ./bootstrap.sh
  ./bjam
  sudo ./bjam install

That's it. The install step puts it into /usr/local by default. I didn't do that last step here, but had I been doing this on machines I really controlled, I probably would have. I just think packaging a library with the project is wasteful. Too many people might want to use it, and that's the point: get a recent version on the box and use it.
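As a quick sanity check that the build is usable, a trivial program against a header-only piece of Boost does the trick (the paths assume the default /usr/local install):

  // boost_check.cpp -- compile with:
  //   g++ -I/usr/local/include boost_check.cpp -o boost_check
  #include <boost/shared_ptr.hpp>
  #include <iostream>

  int main()
  {
      boost::shared_ptr<int> answer(new int(42));
      std::cout << "Boost says: " << *answer << std::endl;
      return 0;
  }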

I'm still toying with the idea of building an RPM, but I need to get this project going first and hassle with the libraries later. The only differences would be in the Makefiles and the deployment strategy, and those can wait for now. It's built, it works, and I can write code.

Good enough.

I sure do wish they'd make the RPMs for a reasonably recent version available on a web site, though. That would be really nice.

Incredibly Funny iPhone 4 vs HTC Evo Video

Wednesday, June 30th, 2010

OK, I'm as big an Apple fanboy as they come - I think it's for good reason, but that doesn't mean other people see things my way. I was reading Daring Fireball this morning and came across this iPhone vs HTC Evo cartoon that's just about the funniest thing I've seen in months. It takes a little bit to get going, but the guy's response to the near-mindless desire for an iPhone 4 by someone who doesn't have the first real clue why, is just priceless.

Give it a watch...

The Value of Good Code Layout

Monday, June 28th, 2010

I've been trying to get a handle on what's in the current version of The Magic Schoolbus, and it's hard. I mean it's a lot harder than it has to be. There are complete non-template implementations in header files, and there are classes grouped into the same files - some logically, some not... it's a mess. Trying to see what's happening - a really important thing in OO design - is next to impossible.

Many people have criticized me for making my code too verbose. Maybe it is. But not one of those people was ever trying to understand it. They were trying to shy away from it as a coding standard, and simply write less. Hey, I understand lazy. It's easy to understand: you want to do less than you have to. Easy. But it's always going to cost you in the end.

Take this codebase... if they had taken the time to make header and implementation files for each class, it'd be a lot easier to see what's happening. I wouldn't have a 277,000-line header file that's really all the implementation inlined into a header. I'd have a good set of headers for use with a pre-compiled binary library, and people would be able to use it like they should.
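The split is nothing exotic - a declaration in the header and the body in an implementation file that's compiled once into the library. The names here are invented just to illustrate:

  // Message.h - declaration only; cheap to include everywhere
  class Message {
  public:
      explicit Message(int aType);
      int type() const;
  private:
      int mType;
  };

  // Message.cpp - the implementation, compiled into the library
  #include "Message.h"

  Message::Message(int aType) : mType(aType) { }

  int Message::type() const { return mType; }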

But that wasn't the path chosen.

I want to choose the better path. So I'm writing the entire thing over from scratch. Using Boost every single time I can to make it portable while not sacrificing capability and speed. I'm going to make this project something to be proud of, and I hope, I really hope that it catches on.

If not, I'm still going to do it.

That's just the way I roll. Baby.

Google Chrome dev 6.0.447.0 is Out

Friday, June 25th, 2010


This morning I noticed that Google Chrome dev 6.0.447.0 was out and they did a few nice little things this time:

  • PDFs are now centered
  • Ctrl-Click a link opens it to the right of the current tab
  • Unified the Page/Wrench menus on the upper-right
  • Fixed a few crashes with tabs and bookmarks

In all, a nice update. I really like the unified Page/Wrench menus, as I was constantly getting confused about which one held the thing I needed. This is far better in the long run.

Making a Quad-FAT Universal Binary for CKit

Friday, June 25th, 2010


I saw something like this on Apple's Developer website the other day, while looking for help getting the PPC build of CKit working. It was a way to combine two binaries into a single universal (or FAT) binary. So you'd compile a PPC library and an Intel one, and then merge the two together. Pretty slick. It's a simple command:

  $ lipo -create one.dylib two.dylib -output final.dylib

This creates a single final.dylib file from the two inputs. Very nice.

So I decided to try my hand at creating a single quad-FAT CKit library: 32/64-bit, PPC/Intel. All I needed to do was to change the lines in the Makefile from:

  ifeq ($(shell uname),Darwin)
  all: $(LIB_FILE) $(LIB64_FILE)
  else

to:

  ifeq ($(shell uname),Darwin)
  all: $(LIB_FILE) $(LIB64_FILE)
	lipo -create $(LIB_FILE) $(LIB64_FILE) -output $(LIB_FILE)
  else

Where the original 'all' target just required the two library files to be generated, the new one does that and then uses lipo to stitch the two libraries together into one. This leaves the 64-bit PPC/Intel library alone, but glues its slices onto the 32-bit version, so that first file ends up holding all four architectures. What an amazingly simple thing to do!

I do love these tools!

Identifying, Sorting, Classifying a Ton of Messages

Thursday, June 24th, 2010


Today I started the process of trying to consolidate the 300+ messages in The Magic Schoolbus into a few reasonable categories: OPRA messages (tons of them, space is critical, data format very rigid), Price messages (a little looser, but still important and small), and everything else. The remaining messages are really suited to self-describing message formats like JSON, or more likely BSON, as they are very flexible - they have a variable number of components - and don't need to get shot around the network all the time.

The Really Wasteful

Take, for instance, the Holiday Calendar. This is just like every other holiday calendar I've ever seen: give it a date (or default to today), and it'll give you all the trading holidays for the next 'n' months. Very simple data structure. Even simpler when all you're talking about are US equities and their options - you don't even need to say which exchange you're asking about, as they're all the same.

But here's what The Magic Schoolbus does: every minute, it publishes a list of all holidays for the next ten years, and everyone registered for this data receives it. Over and over again. Every minute. The format is pretty simple as well. There's the basic header of the message (far too verbose and general), but the payload looks like this:

  struct {
    uint16_t       modifiedBy;    // trader ID
    char           today[9];      // YYYYMMDD
    uint8_t        numHolidays;   // # of holidays
    Holidays_NEST  holidays_nest[];
  } HolidayCalendar;

where Holidays_NEST looks like:

  struct {
    char      holidayDate[9];   // YYYYMMDD
    uint8_t   holidayType;      // 1=no trading; 2=half day
  } Holidays_NEST;

The problems with this content start with a date that takes 9 bytes when 2 would do (packed into a uint16_t). In fact, we could compress the entire message to look like this:

  struct {
    uint16_t    modifiedBy;    // trader ID
    uint16_t    today;         // YYYYMMDD
    uint8_t     numHolidays;   // # of holidays
    uint16_t    holidays[];    // tYYYYMMDD
  } HolidayCalendar;

where the 't' is the type of day and the packed date fills the remaining bits. A simple mask gets us what we need, and the size comparison (assuming packed structs) is:

  old size = 12 + n * 10
  new size = 5 + n * 2

and for a typical year we have, say, 7 holidays - and ten years, so n = 70:

  old size = 12 + 70 * 10 = 712
  new size = 5 + 70 * 2 = 145
  savings: 79%
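As for the mask, here's a minimal sketch of the encode/decode. The bit layout is my own invention - one type bit (0 = no trading, 1 = half day), a six-bit year offset from 2000, four bits of month, five bits of day - but any consistent assignment works:

  #include <stdint.h>

  // Pack a holiday into one 16-bit word: [t:1][year-2000:6][month:4][day:5]
  inline uint16_t packHoliday(uint16_t year, uint8_t month,
                              uint8_t day, uint8_t halfDay)
  {
      return (uint16_t)(((halfDay & 0x01) << 15) |
                        (((year - 2000) & 0x3f) << 9) |
                        ((month & 0x0f) << 5) |
                        (day & 0x1f));
  }

  // Pull the fields back out with shifts and masks
  inline void unpackHoliday(uint16_t word, uint16_t & year,
                            uint8_t & month, uint8_t & day,
                            uint8_t & halfDay)
  {
      halfDay = (word >> 15) & 0x01;
      year    = 2000 + ((word >> 9) & 0x3f);
      month   = (word >> 5) & 0x0f;
      day     = word & 0x1f;
  }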

It's just stunning how bad some of these messages are.

The Horrible Congestion

Look again at the Holiday Calendar - it's sending this data out every minute. Why? Because the designers believed that was the only way the data would get delivered to the client. What about a data cache or data service? They even have a cache server in the architecture - but it holds every message sent, and as such, it's not nearly as efficient as a more customized data service.

So I need to do something here - basically, stop the insanity of sending all this data all the time. The client should get it when it requests it, and when it fundamentally changes. This means something a lot more intelligent and flexible than read from the database, build a monster message, send it, repeat.
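In rough terms, I'm picturing a pull interface with a change notification - purely a sketch with invented names, but it shows the shape:

  #include <stdint.h>
  #include <vector>
  #include <boost/function.hpp>

  // The compact calendar from above, recast for the sketch
  struct HolidayCalendar {
      uint16_t              modifiedBy;
      uint16_t              today;
      std::vector<uint16_t> holidays;   // packed type+date words
  };

  // Clients pull the calendar when they need it; the service calls
  // back only when the data actually changes - no more republishing
  // the full ten years every minute.
  class HolidayService {
  public:
      virtual ~HolidayService() { }
      virtual HolidayCalendar fetch() = 0;
      virtual void subscribe(
          const boost::function<void(const HolidayCalendar &)> & aCallback) = 0;
  };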

The Task

It's huge. I have to look at all the messages actually in use, figure out what can be combined into a nice, compact format for sending at high speed to a lot of clients, and what can be more free-form and possibly skip being sent over 29West in the first place.

It's a monster job. But it's gotta be done. The reason this is in such a horrible state is that no one has taken it upon themselves to do this until now. It's ugly, and it's painful, but it's got to be done.

To Move On, or Not…

Tuesday, June 22nd, 2010


I've been updating some C++ code this afternoon to use Xcode 3.2.2 and the new LLVM compiler that comes with it, and I ran into some interesting new warnings and one real issue. The biggest source of the new warnings was the use of a string literal as a char *. The warning complains about converting a string constant to a char *. For example, the following generates a warning in Xcode 3.2.2:

  char *name = "Bob";

but the fix is pretty easy:

  const char *name = "Bob";

There was another problem with calling methods that take optional parameters: the interface needed to declare the argument const. That was easy enough to fix as well.
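It was the same story as the string literals - a default value for a char * argument. A hypothetical example of the shape of the fix:

  // warns: the string literal default binds to a plain char *
  void logMessage(char *prefix = "INFO");

  // fixed: declare the argument const in the interface
  void logMessage(const char *prefix = "INFO");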

The biggie was that when I installed Xcode 3.2.2, I didn't include the complete 10.5 compatibility package - including the PowerPC code. And that was a real issue. I spent a ton of time trying to understand what the error message was saying. It said __Unwind_Resume was undefined in all the modules - which, on the face of it, had nothing to do with the PPC libraries not being installed.

Yet, it did.

I read in a few places that the link problem came from a bad compiler, or something like that, and so I got to thinking: what if Xcode 3.2.2 wasn't PPC-aware? And so I checked. Its libraries are all a single CPU type. It's gotta be the case that in one of the installs of Xcode, I didn't ask it to install everything for 10.5, and that dropped the PPC libraries. The compiler generated the code OK, but the link failed.

OK... now I just need to download Xcode 3.2.2 again and re-install it to get the 10.5 libraries for PPC back. I hope it's easier than it sounds.

The alternative is to drop PPC support for the code and just "move on". I'm not really ready to do that if I don't have to. There's nothing holding me to PPC and 10.5, but at the same time, there's nothing that says I need Intel-only or 10.6. So for now, I think I'll try to get it back.

[6/23] UPDATE: It's not nearly as easy to build for PPC and Intel in Snow Leopard. I can still do it, but it's getting to be more and more of a pain. I can see the legacy systems just dropping off. While I'll bet I can still build apps for PPC from within Xcode, the command-line args are getting tougher to master.

For now, when I used to make my shared library, I had the following arguments in the Makefile:

LDD_32 = -Wl,-syslibroot,/Developer/SDKs/MacOSX10.5.sdk -arch ppc -arch i386 \
         -install_name libCKit.$(SO_EXT) -current_version 1.0.0 \
         -compatibility_version 1.0.0
LDD_64 = -Wl,-syslibroot,/Developer/SDKs/MacOSX10.5.sdk -arch ppc -arch x86_64 \
         -install_name libCKit.$(SO_EXT) -current_version 1.0.0 \
         -compatibility_version 1.0.0

but to get it working with Xcode 3.2.3 and Snow Leopard - even with all the old development tools, I had to change this to:

LDD_32 = -lgcc_s.1 -Wl,-syslibroot,/Developer/SDKs/MacOSX10.5.sdk -arch ppc \
         -arch i386
LDD_64 = -lgcc_s.1 -Wl,-syslibroot,/Developer/SDKs/MacOSX10.5.sdk -arch ppc64 \
         -arch x86_64

Interestingly, I didn't need the current_version or compatibility_version arguments, but it's nice to have them if you want that information in the library. In my Makefile, I leave the two values in just to be complete:

LDD_32 = -lgcc_s.1 -Wl,-syslibroot,/Developer/SDKs/MacOSX10.5.sdk -arch ppc \
         -arch i386 -install_name libCKit.$(SO_EXT) -current_version 1.0.0 \
         -compatibility_version 1.0.0
LDD_64 = -lgcc_s.1 -Wl,-syslibroot,/Developer/SDKs/MacOSX10.5.sdk -arch ppc64 \
         -arch x86_64 -install_name libCKit.$(SO_EXT) -current_version 1.0.0 \
         -compatibility_version 1.0.0

I also changed the defines for the 64-bit build from:

CXX_64 = -isysroot /Developer/SDKs/MacOSX10.5.sdk -arch ppc -arch x86_64

to:

CXX_64 = -isysroot /Developer/SDKs/MacOSX10.5.sdk -arch ppc64 -arch x86_64

And then for the tests that only need to work on the one architecture:

CXX_DEFS = -D_REENTRANT -O2 -isysroot /Developer/SDKs/MacOSX10.5.sdk \
           -arch ppc -arch i386
OS_LIBS = -Wl,-syslibroot,/Developer/SDKs/MacOSX10.5.sdk

to:

CXX_DEFS = -D_REENTRANT -O2 -arch ppc -arch i386
OS_LIBS = 

as I was getting problems linking against libcrt.1.10.6, and the only way to resolve it was to let the system default to its own libraries. It's not ideal, but it's working.

The next step is just to remove PPC support and be Intel-only.

[6/24] UPDATE: To get better universal binaries out of Xcode, I'm going to have to switch to Xcode as the build system. Why? The LLVM compiler. Using GCC, I'm not going to be able to make 10.4u binaries, as that was GCC 3.3, and Xcode 3.2.3 is on GCC 4.2. Pretty soon GCC will move past that, and I'll be left with nothing to easily compile with - except the LLVM compiler. Unfortunately, the documentation on it is still a little sketchy, and the best way to really use it is through Xcode.

I'll probably just switch off PPC generation, as it's unlikely to come back anytime soon.

How Best to Describe The Magic Schoolbus? Convoluted… Inconsistent

Tuesday, June 22nd, 2010


I've been struggling to come up with a way to describe the codebase of The Magic Schoolbus - and it's not all that easy for me. The code has good parts - the guys who wrote it are not without understanding. They use the atomic operations from Boost, and in places it's clear they've been trying to get this codebase up to a very respectable level - and in some places, they've done a good job.

But the real problem is that they haven't been consistent. It's a single codebase with multiple projects - many of which are no longer used by anyone - and there's no consistency in the code. OK, that's not 100% accurate: there's plenty of copy-n-paste reuse, where they've taken whole applications in the codebase, copied them, and then just replaced a few letters in the names to make new classes. It's the worst kind of consistency: to make any change across the board, you have to change everything in the codebase.

There's precious little in the way of real re-use. There's precious little even in the consistent use of types. In many places they use their own unsigned integer types, and in others they use the ones in stdint.h. I'm fine with either, as there are distinct advantages to both, but you really need to pick one and stick with it. No matter what.

That's the thing that really gets to me: the lack of consistency.

In many projects I've been on, I didn't like the way the original developer started writing the code, but for the sake of the project's consistency, I stuck with it. The goal was to have my changes look no different from the original code. If I succeeded, there's only one style to understand - and that's far, far easier to grok.

But this codebase is like Rube Goldberg gone amok. There are sub-projects where the includes are in the source directory, and others where they aren't. Some use custom types, others use system types. Some use header files and implementation files, and some use massive structs in header files with the implementations included. There's just no consistency.

And it's all very, very big. Like the 277,000+ line message header file.

So what to do?

If we try to clean it up - to really use a well-designed object model - then we're gutting everything. And I mean everything. If we're going to do that, then we might as well start over and make something far better - with a client library in multiple languages - and forget 100% backward compatibility. We'll do our best to make a transition plan, but it's a new era, with better information, better performance, etc.

If we do that, are they going to be willing to go with me? Hard to say. I think they have serious doubts about whether it can be done. And because of that, I have serious doubts about whether they could do it: if they have no faith that it can be done, there's no chance they'll actually be able to pull it off.

We need a more consistent codebase. It's essential to monitoring, stability, low maintenance, etc. But to get there, I may have to Just Do It and then hand it off to them. That's not really ideal, but it may be the only option.

It's a tough place to be. But I'm glad that I have a good handle on the code and can vocalize the issues for those who have asked me to look into this. It's not an easy decision, but it's one that needs to be made.

Tutorial on Moving a Time Machine Backup Drive

Tuesday, June 22nd, 2010


I've got a 1TB external drive for my Time Machine backups, but I know that someday soon I'll need to upgrade to something larger. When I do, I'll probably go with a very large drive system - say, 4TB or more - and I'll want to move all my old Time Machine data to the new disk. I've done this a few times in the past, but I wanted a single place with all the instructions so I don't have to worry about forgetting which sites I used, etc.

So here it is:

1. After attaching, formatting (Mac OS Extended, Journaled), and naming the new drive, launch Time Machine and switch it off.
  2. Unmount the current Time Machine drive by dragging it to the Trash. Remount it by either turning it off and then on again, or mounting it with Disk Utility. You do this so it loses its Time Machine icon and appears with the typical orange external drive icon.
  3. Launch Disk Utility, select the old drive, and click on the Restore tab.
  4. Drag the Time Machine partition from the old drive to the Source field in the Restore tab. Drag the new drive’s partition (assuming there’s just the one partition) to the Destination field.
  5. Enable the Erase Destination option and click the Restore button.
  6. Wait while Disk Utility does its job (this can take several hours).
7. When the job is done, you’ll see two identical volumes on the Desktop. Unmount and disconnect the old drive (the one showing the smaller capacity).
8. With the new drive mounted, open Time Machine and switch it on. Click Select Disk and point Time Machine at the new drive that contains your copied Time Machine backups.

Time Machine should do the right thing and use that drive for its backups. Should you need to restore, you’ll find that all your old backed-up data is on this new drive.

Google Apps Auto-Lock with iOS 4 Upgrade

Tuesday, June 22nd, 2010


Since I upgraded my iPhone 3GS to iOS 4, I've been trying to see if it's acting better, the same, or worse than before. I have to say that the Mail app and syncing to GMail with Exchange did not work well at all yesterday. Messages weren't getting to my upgraded phone, and for several hours I wondered if it was going to stay broken until we got an iOS 4.0.1. It was bad.

Then the camera app wasn't working really well last night, and I was convinced this wasn't all that great an update for the 3GS hardware. But I'm coming around this morning. I power-cycled the phone, and things are a little better. The camera is better, and email is OK. But there's a wrinkle - the upgrade allowed Google to set the "auto-lock" on my phone to "1 Minute".

Not good. I just needed to go into Settings, then General, and change it back - but by default, the upgrade is going to auto-lock your phone. It's simple to fix, but if you didn't know, it could really fluster you. Glad I read about this. I'll know what to do when upgrading Liza's phone.