OmniGraphSketcher 1.1 Final is Out
Tuesday, December 15th, 2009
Well, today I got a tweet from OmniGraphSketcher saying that 1.1 Final was out and gave a reference to the notice on the Omni web site.
Looks good.
I have been running some tests all morning, and they'll continue throughout the day, but as I'm watching these tests and gathering data, it occurred to me that I wanted to know which of my data injectors was giving me the most data, and by how much. It's not hard to get the counts with SQL:
SELECT portfolio, COUNT(*) AS hits
  FROM PortfolioData
 WHERE acquired > '2009-12-15'
 GROUP BY portfolio
 ORDER BY hits DESC
and I get a table that looks a lot like this:
portfolio | hits |
Gas_NGUNG | 114150 |
NG MM | 111106 |
OIL_CLUSO | 95886 |
Oil MM | 91320 |
S&P | 28696 |
L_NRML | 22820 |
ED MM | 21957 |
Oil Indexes | 20268 |
Rho Hedge | 18579 |
ER_NRML | 17974 |
...and I've truncated the table because the effect I was looking for is clear - there are a few of the portfolios that are contributing the vast majority of the rows to the table. The problem is, I can't see the real percentage each contributes to the total. And while I've shown the top 10, there are really more than 30, so it's not easy to do the percentage calculation in my head.
I googled a bit and came up with a really simple solution: include the subquery as the divisor in the select statement:
SELECT portfolio,
       COUNT(*) AS hits,
       COUNT(*)*100.0/(SELECT COUNT(*) FROM PortfolioData WHERE acquired > '2009-12-15') AS percentage
  FROM PortfolioData
 WHERE acquired > '2009-12-15'
 GROUP BY portfolio
 ORDER BY hits DESC
and to this I get what I was looking for:
portfolio | hits | percentage |
Gas_NGUNG | 114150 | 15.07 |
NG MM | 111106 | 14.67 |
OIL_CLUSO | 95886 | 12.66 |
Oil MM | 91320 | 12.06 |
S&P | 28696 | 3.79 |
L_NRML | 22820 | 3.01 |
ED MM | 21957 | 2.89 |
Oil Indexes | 20268 | 2.67 |
Rho Hedge | 18579 | 2.45 |
ER_NRML | 17974 | 2.37 |
With this, I can now see that my top 4 (out of more than 30) contribute more than 54% of the rows in the table. That's significant, and it's nice to know. If I throttle back these four, I have a great deal of control over the total number of rows inserted in a day.
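Here is a minimal sketch of the same percent-of-total query, run from Python against an in-memory SQLite database so it can be tried without the real server. The table and column names follow the query above. The sample rows and the `share_by_portfolio` helper are made up for illustration, and the cutoff date is passed in as a parameter instead of being typed twice.

```python
import sqlite3

def share_by_portfolio(conn, since):
    """Return (portfolio, hits, percentage) rows, largest contributor first."""
    sql = """
        SELECT portfolio,
               COUNT(*) AS hits,
               COUNT(*) * 100.0 /
                   (SELECT COUNT(*) FROM PortfolioData WHERE acquired > ?) AS percentage
          FROM PortfolioData
         WHERE acquired > ?
         GROUP BY portfolio
         ORDER BY hits DESC
    """
    # The same cutoff feeds both the subquery (the divisor) and the outer filter.
    return conn.execute(sql, (since, since)).fetchall()

# Hypothetical sample data, not the real table contents.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PortfolioData (portfolio TEXT, acquired TEXT)")
rows = ([("NG MM", "2009-12-16")] * 6 +
        [("Oil MM", "2009-12-16")] * 3 +
        [("S&P", "2009-12-16")] * 1 +
        [("S&P", "2009-12-01")] * 5)   # older than the cutoff, filtered out
conn.executemany("INSERT INTO PortfolioData VALUES (?, ?)", rows)

shares = share_by_portfolio(conn, "2009-12-15")
for portfolio, hits, pct in shares:
    print(f"{portfolio:<10} | {hits:>4} | {pct:6.2f}")
```

The important detail is that the subquery uses the same WHERE clause as the outer query. If the two filters drift apart, the percentages no longer add up to 100.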
Cool.
I usually don't spend time writing about the funny things I read on the net - face it, there's just too much of it to really write about, but this morning there are a few things from Daring Fireball that are just too funny to not pass on.
The first is from Fake Steve on a conversation he had with the CEO of AT&T. In it, he says this about AT&T's upcoming data-limiting plans:
He launches into a mumbling spiel about how Ralph de la Vega didn’t really say what all the papers are saying he said, and he was misquoted, and it was taken out of context, but I’m like, Bitch, please, guys at our level don’t get taken out of context, we write the shit out in advance and we know exactly what we’re saying when we say it and every goddamn word has been vetted and gone over by a team of flacks. So please don’t sit there like a zoo monkey throwing your own feces at me through the bars of your cage, bokay?
I laughed out loud at that one. It goes on to say:
I stopped, then. There was nothing on the line. Silence. I said, Randall? He goes, Yeah, I’m here. I said, Does any of that make sense? He says, Yeah, but we’re still not going to do it. See, when you run the numbers what you find is that we’re actually better off running a shitty network than making the investment to build a good one. It’s just numbers, Steve. You can’t charge enough to get a return on the investment.
I hope that eventually, AT&T will come to their senses and make a decent network for the load. If it takes them losing a ton of customers to balance out the demand, or regulation, or something... I'm hopeful that things will turn around for AT&T.
The next funny was about the fake Walt Mossberg on the CrunchPad-cum-JoJo. If even half of what the sock puppet said was true, then it's an interesting turn of events. I'm not one to think it's OK for folks to go back on their word, but I'm a bigger believer in karma, and you reap what you sow.
These both made me laugh a good deal. Just what I needed this morning.
Well, this morning I went to have a look at frosty, the iMac G3 I have in my home office that I use for hosting my CVS repos, Git repos, web services (including SSL and WebDAV). Well... it wasn't good. The drive had died, and rather than mess with getting another 160GB drive and building it up again, I decided to bring a Mac Mini that used to be the kid's computer out of retirement, get a new 500GB drive for it, get a monitor, and install Snow Leopard on it to make it up to date and a little easier to maintain.
What follows are the details of all the little things I had to do to get all the services running on this Mini. It's not too bad, but it's different enough from the old 10.3 install on the iMac that it warrants detailing the differences.
The first thing I needed to do after installing Snow Leopard on the new 500GB drive was to use the Time Machine drive from my iMac to pull in the basic apps and my accounts to the new box. Interestingly enough, this seemed to pull over the Developer Tools (Xcode, etc.), but upon closer inspection, some critical command-line tools like rlog and rcsdiff were missing, so I had to install Xcode 3.2 from the Snow Leopard disk, and then use Software Update to get it to Xcode 3.2.1. This got me the command-line CVS tools as well as the RCS tools I needed for CVSweb.
The next thing I needed to do was to update the box with the version of Git from the Google Code project. This was currently at 1.6.5.2. I have since seen that it's at 1.6.5.5, but so it goes... these minor updates come fast and furious. I'm sure in another few weeks, I'll update to the latest version at the Google Code site, but for now, this was the latest, and more than sufficient.
The first thing I needed to get going was my CVS repository - with the pserver running on the box. This is so critical to all the code I have that it had to be the first thing I got working.
The first step wasn't too bad. I had good backups of the CVSROOT directory - which just happened to be /usr/local/CVSroot on my old server. So I put it back there, made sure that the CVSROOT environment variable was defined on my account on the new box, and then started to work on getting the pserver going. Since all my old experience had been with xinetd, and I knew that wasn't on Snow Leopard, I needed to create a launchd configuration file for the CVS pserver.
Taking the example I had for PostgreSQL, I came up with the following that I placed in /Library/LaunchDaemons/cvspserver.plist:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN"
  "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Disabled</key>
    <false/>
    <key>GroupName</key>
    <string>wheel</string>
    <key>InitGroups</key>
    <true/>
    <key>Label</key>
    <string>com.apple.cvspserver</string>
    <key>UserName</key>
    <string>root</string>
    <key>Program</key>
    <string>/usr/bin/cvs</string>
    <key>ProgramArguments</key>
    <array>
        <string>/usr/bin/cvs</string>
        <string>-f</string>
        <string>--allow-root=/usr/local/CVSroot</string>
        <string>pserver</string>
    </array>
    <key>Sockets</key>
    <dict>
        <key>Listeners</key>
        <dict>
            <key>SockPassive</key>
            <true/>
            <key>SockServiceName</key>
            <string>cvspserver</string>
            <key>SockType</key>
            <string>SOCK_STREAM</string>
        </dict>
    </dict>
    <key>inetdCompatibility</key>
    <dict>
        <key>Wait</key>
        <false/>
    </dict>
</dict>
</plist>
Of course, the /usr/local/CVSroot is supposed to be the location of the CVS repository, and it just happens to be mine, but if you have something different, use it. I've seen examples where folks create a 'CVS' user, and then use the directory /Users/cvs/CVSROOT, or something like that, so it's all in a "user", which does make it easier to move things around.
For me, it's Old School - the old BSD/Solaris/Linux background that makes me think in terms of services and not necessarily users. But to each his own.
Once this is done, load it up with:
sudo launchctl load /Library/LaunchDaemons/cvspserver.plist
and then you should be good to go.
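Since a typo in one of these hand-written plist files is easy to make, here is a small sketch of a sanity check that can be run before loading the job. It uses Python's standard plistlib to parse the XML and looks for the two things every job in this post has: a Label and something to run. The `check_launchd_plist` helper and the trimmed-down sample are illustrative only, and the check is nowhere near a full validation of what launchd accepts.

```python
import plistlib

def check_launchd_plist(data):
    """Parse launchd plist bytes and return a list of problems found."""
    try:
        job = plistlib.loads(data)
    except Exception as exc:          # malformed XML or plist structure
        return [f"not a valid plist: {exc}"]
    if not isinstance(job, dict):
        return ["top-level element is not a dict"]
    problems = []
    if not job.get("Label"):
        problems.append("missing Label")
    if "Program" not in job and "ProgramArguments" not in job:
        problems.append("missing Program/ProgramArguments")
    return problems

# A trimmed-down version of the cvspserver job, just for the check.
sample = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.apple.cvspserver</string>
    <key>ProgramArguments</key>
    <array>
        <string>/usr/bin/cvs</string>
        <string>-f</string>
        <string>--allow-root=/usr/local/CVSroot</string>
        <string>pserver</string>
    </array>
</dict>
</plist>
"""

problems = check_launchd_plist(sample)
print(problems if problems else "looks OK")
```

To check a real file, read it in binary mode and pass the bytes to the same function.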
The key to getting CVSweb working is to make sure that the cvsweb.cgi file is correctly configured. Therein lies a tale.
The original CVSweb I had was for NT. There was a great CVS pserver for NT that I used for a very long time. It was wonderful. It allowed me to use a 180MHz Pentium II with 144MB of RAM and three disk drives totaling less than 30GB to be a CVS pserver, a Sybase Database server, an Apache server, and a few other little things. It was really impressive. So I had the CVSweb configured for NT.
Then I moved it to the iMac G3, and had to convert it to Mac OS X. No big deal, just a bunch of trial and error. Finding odd paths, improper commands, etc. and fixing them one by one. Not really hard, but not trivial, either. The problem was I didn't save it! So I had to do it all again. Thankfully, this time, /Library/WebServer/CGI-Executables/ is backed up with Time Machine so I won't lose it again.
There's a configuration file that needed to be put in a logical place for the CGI script to pick it up. I mistakenly put it in the /etc/apache2/other/ directory under the default name - cvsweb.conf. The problem is that the main httpd.conf has an include of all the *.conf files in other/ and, as a consequence, Apache was trying to read the CVSweb configuration file as an Apache2 configuration file. That was an interesting development. In the end, I placed the cvsweb.conf file in the /etc/apache2 directory with all the other high-level config files.
In the end, it was a simple CGI file and a few images to throw in the /Library/WebServer/Documents/Images/ directory. Not bad, just took a little time.
The next thing I wanted to get going was Git service for my Git repos. This needed to support the two methods of access: the git@git.themanfromspud.com method and the git daemon (git://) method. The first was basic configuration with a git account, which I had as a backup from the iMac G3. So I just had to make the account on the new machine, un-tar the backup from the external drive, and then it's ready to go. Well... almost.
The PATH for the git user needed to include the path to all the git executables. For the Google Code install, that's /usr/local/git/bin/, and once I added that, things were a lot better.
I made sure the SSH service was turned on in System Preferences, and then I could:
git clone git@git.themanfromspud.com:project.git
To get the git-daemon going, I needed to make another launchd config file, and this one I called /Library/LaunchDaemons/git.plist and it contained:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN"
  "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Disabled</key>
    <false/>
    <key>Label</key>
    <string>com.apple.git</string>
    <key>UserName</key>
    <string>git</string>
    <key>GroupName</key>
    <string>_www</string>
    <key>Program</key>
    <string>/usr/local/git/bin/git</string>
    <key>ProgramArguments</key>
    <array>
        <string>/usr/local/git/bin/git</string>
        <string>daemon</string>
        <string>--base-path=/Users/git/repositories/</string>
        <string>--export-all</string>
        <string>--inetd</string>
    </array>
    <key>Sockets</key>
    <dict>
        <key>Listeners</key>
        <dict>
            <key>SockPassive</key>
            <true/>
            <key>SockServiceName</key>
            <string>git</string>
            <key>SockType</key>
            <string>SOCK_STREAM</string>
        </dict>
    </dict>
    <key>inetdCompatibility</key>
    <dict>
        <key>Wait</key>
        <false/>
    </dict>
</dict>
</plist>
and once it's loaded with:
sudo launchctl load /Library/LaunchDaemons/git.plist
it's ready to go. You can then clone a repo with:
git clone git://git.themanfromspud.com/project.git
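Both the CVS pserver and the git daemon are socket-activated by launchd, so a quick way to see whether the jobs loaded is to try to open a TCP connection to each service port. This is only a sketch: the `port_is_open` helper is my own, and the port numbers are the registered defaults for the cvspserver (2401) and git (9418) service names, which is what SockServiceName resolves to unless /etc/services has been changed.

```python
import socket

def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Registered default ports: cvspserver is 2401/tcp, git is 9418/tcp.
SERVICES = {"cvspserver": 2401, "git": 9418}

if __name__ == "__main__":
    for name, port in SERVICES.items():
        state = "listening" if port_is_open("localhost", port) else "not reachable"
        print(f"{name:<10} port {port}: {state}")
```

A successful connection only shows that launchd is listening on the socket. It says nothing about whether the repository paths or permissions behind it are right, so a real clone or checkout is still the proper test.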
GitWeb is a little different in that it's in the Git source package. In order to get it, I ended up pulling down the 1.6.5.2 source tarball from the source:
curl -o git.tar.gz http://www.kernel.org/pub/software/scm/git/git-1.6.5.2.tar.gz
tar xzf git.tar.gz
cd git-1.6.5.2
./configure
make
There's no reason to install it, but it's nice to get it made, just to be sure things are working OK on the server. Then, in the git-1.6.5.2 directory, there's a gitweb directory that has the information you need to install gitweb. There's a CGI file, a Perl file, a couple of PNG images, and a CSS file. Get it all configured and installed, and it's not too bad. Just took a little time.
Getting SSL working on Apache2 was something I wanted, and badly, but it wasn't critical. I had everything I needed, and now I was onto the icing on the cake. SSL isn't critical, but it's something I've used in the past for building other sites, so I wanted to have it on this rebuilt server. Thankfully, this article about getting it going on Leopard (10.5) is still accurate enough to get me home.
I created a 100-year certificate. Sure, it's not signed by anyone other than me, but that's good enough for me. It's all working and that was a nice load off my mind.
I decided a while back to stop using Marc's PHP builds - they're just too infrequent, and Apple ships a good PHP build, it just doesn't have PostgreSQL support in it. Thankfully, I had this solved already for my Intel iMac, and there was very little I needed to do to get this working. Having all these detailed instructions in the blog is a really nice thing.
When I moved off Marc's PostgreSQL builds, the one that I found that was the most successful for me was the KyngChaos wiki build. This has the 64-bit and 32-bit versions in the same binaries, and that's great for the libraries as well as the database engine itself. I decided that it'd be worth it to get PostgreSQL working on this guy - even if I didn't do a lot of heavy lifting with it, it'd be nice to have. Hey... if I need a bigger machine at a later date, I'll just get a Mac Pro and be done with it.
The launchd config file I used was placed in /Library/LaunchDaemons/org.postgresql.postgres.plist, and contains:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
  "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>PostgreSQL</string>
    <key>UserName</key>
    <string>postgres</string>
    <key>RunAtLoad</key>
    <true/>
    <key>EnvironmentVariables</key>
    <dict>
        <key>PGDATA</key>
        <string>/usr/local/pgsql/data</string>
    </dict>
    <key>ProgramArguments</key>
    <array>
        <string>/usr/local/pgsql/bin/postgres</string>
        <string>-e</string>
        <string>-i</string>
    </array>
    <key>StandardOutPath</key>
    <string>/usr/local/pgsql/var/logfile</string>
    <key>StandardErrorPath</key>
    <string>/usr/local/pgsql/var/logfile</string>
    <key>ServiceDescription</key>
    <string>PostgreSQL Server</string>
</dict>
</plist>
and again, once it's loaded with:
sudo launchctl load /Library/LaunchDaemons/org.postgresql.postgres.plist
it's ready to go. You can check the databases with:
psql -l
The last thing I wanted to get going was WebDAV with SSL. I found several articles about WebDAV on Snow Leopard as it's pretty much built-in. All we need to do is to configure it. The one I used as my primary reference was this guy, and it's close - but not exact. The problem is that the config is using Basic authentication and I wanted to use Digest, as the Apple file suggests. In general, I copied a lot more of Apple's example config than I did the article, but it helped put a lot of it in perspective.
The /etc/apache2/extra/httpd-dav.conf I got working was:
DavLockDB "/Library/WebServer/WebDAV/DavLock"
Alias /webdav /usr/local/davroot
DavMinTimeout 600

<Directory /usr/local/davroot>
    Dav On
    Order Allow,Deny
    Allow from all
    AuthType Digest
    AuthName frosty
    AuthUserFile "/Library/WebServer/WebDAV.passwd"
    AuthDigestProvider file
    <LimitExcept GET HEAD OPTIONS>
        require user drbob
    </LimitExcept>
    <Limit GET HEAD OPTIONS>
        require valid-user
    </Limit>
</Directory>
and then created the necessary password file as stated in the article. In the end, Cyberduck didn't work because it only understands Basic authentication. Transmit understands the Digest authentication, and so that's working just fine. I'm going to see if there's something I can do to get Cyberduck working, but even if I don't it's better to be hitting it with non-plaintext passwords than passing any passwords in the clear. Ever.
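For the curious, here is a sketch of what goes into a digest password file of the kind the htdigest tool writes for the file provider. Each line is the user, the realm, and the MD5 hash of "user:realm:password", so the password itself is never stored. The realm has to match the AuthName in the config (frosty here). The `htdigest_line` helper and the password are placeholders of mine, not what's on the server.

```python
import hashlib

def htdigest_line(user, realm, password):
    """Build one line of an Apache digest password file."""
    ha1 = hashlib.md5(f"{user}:{realm}:{password}".encode("utf-8")).hexdigest()
    return f"{user}:{realm}:{ha1}"

# The password here is a placeholder for illustration only.
line = htdigest_line("drbob", "frosty", "not-my-real-password")
print(line)
```

Because the realm is baked into the hash, changing AuthName later means regenerating every entry in the file.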
When it was all said and done, including a multi-hour initial Time Machine backup, I had a new server that had everything I needed and I didn't lose a thing from the dead machine. Nice.
When I added a bunch of additional values to my web app, I noticed that the updating of the Google AnnotatedTimeLine was taking a lot longer than before. I mean it was pausing for a good 3 sec. before updating the view. I looked at the CPU usage, and it was all in the Flex component. It was bad. Very bad. So I knew I had to do something about it.
In the original version of the page, I had code that was run when the data was received and the graph redrawn. This was a standard hook for the AnnotatedTimeLine:
/**
 * This function is called when the ATL is done updating the graph
 * from the data and the "draw()" method.
 */
function graphReady() {
  // hide all the unchecked data sets
  for (var i = 0; i < portfolioChecks.length; ++i) {
    if (!portfolioChecks[i].checked) {
      updateBackgroundVisibility(portfolioNames[i], false);
    }
  }
  // ...finish up with more processing
}

/**
 * This is called to update the visibility of the named portfolio
 * to the provided state in the background graph.
 */
function updateBackgroundVisibility(name, state) {
  var colCnt = graphData.getNumberOfColumns();
  // I have to find the column name in the table
  for (var i = 1; i < colCnt; ++i) {
    if (graphData.getColumnLabel(i) == name) {
      // the dataset number is one less than the column number
      if (state) {
        chart[bg].showDataColumns(i-1);
      } else {
        chart[bg].hideDataColumns(i-1);
      }
    }
  }
}
When I originally built the code, I wanted to have something that allowed me to change the visibility of the data sets in the graph(s) either way, and that's really nice. But what I didn't expect was the performance penalty I would pay for such a design.
As it turns out, there's a form of the hideDataColumns() method on the ATL that allows the user to send an array of dataset indexes. I didn't know what to expect, but I thought it had to be better than this, so I recoded this as:
/**
 * This function is called when the ATL is done updating the graph
 * from the data and the "draw()" method.
 */
function graphReady() {
  // hide all the unchecked data sets
  var cols = [];
  for (var i = 0; i < portfolioChecks.length; ++i) {
    if (!portfolioChecks[i].checked) {
      // assumes checkbox i lines up with dataset index i (data column i+1)
      cols.push(i);
    }
  }
  // one call with all the indexes instead of one call per column
  chart[bg].hideDataColumns(cols);
  // ...finish up with more processing
}
Without the call to updateBackgroundVisibility(), I knew it'd be a little faster - no need to look up all the column headers, etc. But I didn't expect to see what I saw.
The resulting code took the refresh time from 2-3 sec. to under the blink of an eye. Really. It was a slight flicker, but that's about it. Amazing. In retrospect, it makes sense... I was individually hiding about 30 columns each update. That's a lot. To do it right, there had to be some type of lock, update, refresh/redraw, and then unlock. All that added up. Big time.
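To make the arithmetic concrete, here is a toy model of the difference. It is not Google's implementation, and I don't know what the ATL really does inside. It just assumes, as I guessed above, that every hide call costs one redraw no matter how many columns it covers. The `ToyChart` class and the checkbox states are invented for the illustration.

```python
class ToyChart:
    """Stand-in for a chart widget that redraws after every hide call."""

    def __init__(self, n_columns):
        self.visible = [True] * n_columns
        self.redraws = 0

    def hide_data_columns(self, cols):
        # Accept a single index or a list of indexes.
        if isinstance(cols, int):
            cols = [cols]
        for c in cols:
            self.visible[c] = False
        self.redraws += 1            # one redraw per call, however many columns

checked = [i % 3 == 0 for i in range(30)]   # hypothetical checkbox states
unchecked = [i for i, on in enumerate(checked) if not on]

one_by_one = ToyChart(30)
for i in unchecked:
    one_by_one.hide_data_columns(i)

batched = ToyChart(30)
batched.hide_data_columns(unchecked)

print("one at a time:", one_by_one.redraws, "redraws")
print("batched:      ", batched.redraws, "redraw")
```

Both charts end up with the same columns hidden. The only difference is how many times the expensive part runs.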
It's taught me a big lesson about the efficiency of Google's code: if in doubt, look for a better way, there's probably one there, or you can always ask the visualization team. I know I've learned my lesson.
Today I found another bug in the Google Visualizations AnnotatedTimeLine widget. Basically, if you set the graph setting legendPosition to sameRow, the legend at the top of the widget will start on the same row as the date/time of the point you're currently highlighting. If you have it set to newRow, the legend will start on the line below the date/time. The sameRow looks like this:
and the newRow looks like this:
What you can see is that with sameRow, the legend starts out right, but it never wraps to the next line. With newRow, it wraps nicely, but you lose a complete row, and in the case of large legends, that row is important.
So I posted a question to the Visualization group and got this answer:
Hi,
Please open a feature request from the link at the left side menu, and we will try to get to this.
Regards,
VizGuy
So that's exactly what I did. I'm hoping that they get to this as soon as possible.
This morning there were some updates for my MacBook Pro - specifically, the DVD Drive is supposed to be "making noise" coming out of sleep, or so they say. The EFI fix is necessary for the SuperDrive fix that is the second update in the cycle. There's also a fix for AirPort clients to get better connection stability and the ability to shut it down under all conditions.
Gotta get these, even if I don't see a real problem with the system.
I'm not a big Flash fan, in fact, I use ClickToFlash to keep from seeing it displayed on most of my web usage, but there is one notable exception: the Google Visualization widgets. I use these extensively in my web work to assist in the visualization of the data. In order to run these guys, I need Flash. So it makes sense to keep up with Flash for this reason alone.
This morning, Adobe updated the Flash player for Mac OS X to 10.0.42.34, and I needed to pick it up. Not thrilled, but until Google moves from Flash to something else, this is what I have to do.
This morning I noticed that BBEdit 9.3.1 was out with an impressive list of fixes and features for a minor release update. Had to get that, I use it every single day.
With all the work I'd been doing the last few days with the conversion of the Alerts from a Java Properties file-based system to a database-driven system, I needed to make an editor page for the bulk of the data. It used a ton of AJAX, but it works very nicely and I'm able to put this behind me. Very nice to get this conversion done.