Archive for the ‘Coding’ Category

Messing around with CSS for Blog

Friday, October 12th, 2012


I've spent far too much time messing with the CSS for the preview in MarsEdit for my journal - partly because of the caching of the files, and partly because of silly mistakes I made in the CSS itself. I really like the preview in MarsEdit, but the caching of the CSS makes it really hard to know whether what you're doing is actually having the effect you think it is.

I spent several hours - really - on this. To minimize the problem, read the CSS from an http:// URL - that seems to have far fewer problems with caching - and then switch back to the file:// URL once things are stable.
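
If the CSS isn't already on a web server, one easy way to get an http:// URL is to serve the directory locally. This is just a sketch - the path is made up, and it assumes Python is on the box:

  # Serve the directory holding the CSS over HTTP so the preview
  # fetches it fresh instead of reusing a cached file:// copy.
  cd ~/Sites/journal-css        # made-up path to the stylesheet
  python -m SimpleHTTPServer 8000

  # ...then point the preview template at:
  #   http://localhost:8000/style.css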

In the end, I was able to get it all fixed up, and then copy the CSS changes up into the stylesheet for the theme I'm using. Not easy, that's for sure, but worth it.

Conducting Timing Tests

Friday, October 12th, 2012


So this morning I'm doing some timing tests to see what the effect is of trying to run multiple jobs at the same time. Initially, we thought the boxes would be CPU-bound because of all the logic in the pinning and calculations. But that's not the case. I can use top to see that the vast majority of the time the CPUs are idle (there are 2 CPUs in this box).

So what's taking so long?

The monitoring (NewRelic) says it's all I/O - seems reasonable, but then can we shorten the total time for all six runs by putting them all in the run queue at the same time? If we're I/O choked, then no, it won't help. But if we're latency-bound, then it will - so long as the services on the other end are fast enough to handle multiple requests.
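
A quick way to tell the two apart is to watch the CPUs and the disks while a run is going. This is just a sketch - the iostat part assumes the sysstat tools are installed:

  # Snapshot of CPU usage - a high value in the idle (%id) column
  # means the CPUs aren't the bottleneck.
  top -b -n 1 | head -n 5

  # Watch the disks - sustained %util near 100 suggests we're I/O
  # choked; low %util with idle CPUs points at waiting on external
  # services (latency).
  iostat -x 5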

It's a lot of questions, so that's why I'm running the tests.

Timing a particular run is easy - just use time. But if I want to parallelize the runs and measure them the same way, I need to be a little more clever:

  #!/bin/bash
  # Kick off all three division runs in the background...
  philadelphia &
  central-jersey &
  baltimore &

  # ...and block until every background job has finished.
  wait

This is a great little script that runs all three in parallel, and then waits for them all to be done before returning. If I put this into a time call:

  $ time ./parallel

then I'll get exactly what I want to get.

Now it's just a matter of doing all the runs and gathering all the data.

  Division         Total    real       user       sys
  philadelphia     414.68   3:13.863   2:02.340   0:08.889
  central-jersey   109.00   2:17.305   1:29.034   0:05.768
  baltimore        214.04   4:03.828   1:53.211   0:07.500
  cincinnati       121.80   3:09.080   1:33.894   0:06.716
  cleveland        264.36   2:56.955   1:48.855   0:07.872
  austin           194.46   4:45.738   2:06.456   0:08.581

OK, this is all a nice baseline, but what we're really interested in is how they work together. So we ran successively larger sets - starting at the top of the list, adding one more job each time, and timing them:

  Job Count   real        user        sys
  2            7:37.395    4:22.084   0:16.545
  3            7:40.657    6:45.033   0:24.438
  4           12:14.561    8:05.870   1:54.751
  5           16:09.544   12:06.461   0:31.694
  6           18:53.886   14:59.040   0:37.562

This looks puzzling, so I'm going to run similar tests - this time serially running the first 2, 3, 4, 5, and 6 jobs - and see how they compare:

  Job Count   real        user        sys
  2            9:00.463    4:10.892   0:16.337
  3           12:21.413    5:59.158   0:23.101
  4           15:49.010    7:32.160   0:27.686
  5           20:13.591    9:39.248   0:37.150
  6           26:44.176   11:47.096   0:44.891
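
For what it's worth, the whole sweep can be scripted the same way as the little parallel script above. This is just a sketch - it assumes the division commands are on the PATH:

  #!/bin/bash
  # Time the first N jobs run in parallel, then the same N jobs run
  # back-to-back (serially), for N = 2..6.
  JOBS=(philadelphia central-jersey baltimore cincinnati cleveland austin)

  for n in 2 3 4 5 6; do
    echo "== $n jobs, parallel =="
    time ( for job in "${JOBS[@]:0:$n}"; do "$job" & done; wait )

    echo "== $n jobs, serial =="
    time ( for job in "${JOBS[@]:0:$n}"; do "$job"; done )
  done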

All this data looks like:

[Chart: Parallel Tests]

Which tells me that it really doesn't matter if we try to run these in parallel - they simply don't scale that way. There's some other limiting factor in the system, and it's not the CPU. This isn't terribly surprising, but it means we really need to start looking at these external systems and see where we can limit our exposure to them.

UPDATE: it took 4:10 - just over four hours - to do all 40 of the divisions that we need to do nightly. I guess in that light, it's OK, as there were plenty of retries in the run and we still had time to do the Top 40. The problem, I fear, is that we still have a global choke point that we need to work through, but we'll run into that soon enough, I guess.

Today is Just One of Those Days…

Thursday, October 11th, 2012

I'm sitting waiting for some runs to complete on the UAT box for The Group, and there's not a lot I can do other than to wait. This has been one of those days where I just want to scream - but can't. I've been trying to make progress to feel like I've really accomplished something today, but it seems that I'm getting set-backs at the same rate that I'm removing them. Every "Finally…" is quickly followed by "What's this?!?", and a diversion or blocker that keeps me from actually getting anything really interesting done today.

Looking back at the commit log, I know I did some good stuff this morning, but that's long been forgotten. Now I'm just sitting here, waiting for someone's runs to complete. This is what I really liked about Finance - or maybe just working in small groups: you spend very little time waiting for folks because there are so few things you need from other people.

Not so here, and so I have to wait. I'm not very good at it, but I guess this is going to make me better at it.

Google Chrome dev 24.0.1290.1 is Out

Wednesday, October 10th, 2012

This morning I noticed that Google Chrome dev 24.0.1290.1 was out with a respectable set of release notes. It appears they've dealt with some crashing issues, but not a lot else. I suppose that's OK - it's still an improvement, and that's a step. I'm not sure where they're really going, but it's nice to see them still moving forward.

Loads of Bug Fixes Today – Slugging it Out

Tuesday, October 9th, 2012


Today was another really slow day. There were a few bug reports I was able to knock out this morning, but the rest of the day was slugging through problems that really did need to be addressed. One of the "bugs" from this morning got turned into a feature request, and that was an interesting exercise in frustration.

I've had the talk with a lot of the Rubyists that comments are non-optional, but I've gotten quite a bit of push-back - even from those I didn't expect it from. I waited, because I knew a day like today would come.

I was working on a feature, looking at a particularly obtuse piece of Ruby code, and trying to figure out what the author was doing. One of the long-time Ruby devs stopped by to lend a hand, and even to him, this was bad code. Far too "tricky" - just for the sake of being "tricky". Far too compact. No readability - and of course, no comments.

The long-time Ruby dev even said "I hate it when he does this".

I could not have had a better opening for the importance of comments. And he totally agreed.

'Nuff said.

Slugging through this bug was worth it.

Making Support Tools – Slow, but Necessary

Monday, October 8th, 2012


This afternoon I had to build a lot of tools for supporting the app I've been working on. I'm getting bug reports, and I need to look into the data we're generating, and the only place it all really sits is in CouchDB. So I had to write some scripts to hit some views and then load up the values I want with some key restrictions. None of it was all that hard, but when I needed to change one of the views on the production database, I had to wait upwards of a couple of hours for Couch to regenerate the view, due to the amount of data we have and the limitations of the EC2 instance we're currently running on.
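
The scripts themselves are nothing fancy - mostly curl against CouchDB's view API, using startkey/endkey to restrict the rows. Here's a sketch of the idea; the database, design doc, and view names are made up:

  # Pull the rows for a single division out of a view, restricting the
  # keys to that division. The high Unicode escape \ufff0 serves as an
  # "end of range" marker for the endkey.
  curl -s 'http://localhost:5984/appdata/_design/runs/_view/by_division?startkey="philadelphia"&endkey="philadelphia\ufff0"&include_docs=true'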

Soon enough, we'll be on our own hardware in our own datacenter, but it's hard to beat the spin-up time of an EC2 box - that part is pretty nice. But the performance isn't all that great, and that's what we're really running up against: I/O to the disk, and CPU to process. I'm hoping that as soon as we get the new machines, things like this will be a lot faster.

All told, I got what I needed built, and now it's just a matter of time before I can put it to use.

[10/9] UPDATE: worked perfectly! Once I got in this morning, I was able to use these views and tools to look into the bug reports and prove that they aren't bugs - the code is working exactly as designed.

Surprising Fact: Cron Jobs aren’t in a Shell

Monday, October 8th, 2012


Today I was surprised to learn that cron jobs are not run in a normal login shell. That was really a surprise. The reason it shocked me was that I had the following in a crontab:

  15 05 * * mon-fri  $HOME/bin/startQL | $HOME/bin/campfire -p

where I'd written a nice little bash script to send a piped message to our Campfire room, and I wanted to pipe the output of the startQL command to that room so we'd be able to see what happened without having to check our email.

The script takes stdin and passes it through, with a copy being sent to Campfire. It's a pretty nice little script that just uses the Campfire curl API.
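
The gist of it is simple: read each line, echo it back out, and POST a copy to the room. Here's a minimal sketch of that idea - the token, subdomain, and room ID are all placeholders, and a real version would need to escape quotes in the message body:

  #!/bin/bash
  # Pass stdin through untouched, but POST a copy of each line to a
  # Campfire room via the curl API. TOKEN, SUBDOMAIN, and ROOM are
  # placeholders - not the real values.
  TOKEN="your-api-token"
  SUBDOMAIN="yourcompany"
  ROOM="123456"

  while IFS= read -r line; do
    echo "$line"     # the pass-through copy
    curl -s -u "$TOKEN:X" \
         -H 'Content-Type: application/json' \
         -d "{\"message\":{\"body\":\"$line\"}}" \
         "https://$SUBDOMAIN.campfirenow.com/room/$ROOM/speak.json" \
         > /dev/null
  done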

What I found was that the pipe wasn't working because the command wasn't being run in a real shell. It was easy enough to fix - just put the commands in a new file and call that, or use a subshell:

  15 05 * * mon-fri  ($HOME/bin/startQL | $HOME/bin/campfire -p)

Learn something new every day!

Really Amazing Working Experience

Saturday, October 6th, 2012

So I've been working at home this weekend, and I have to say, it's an amazing experience. I can take a little time Saturday morning and fix up a new data import table for the business user and not sweat about it Monday morning. Liza and the kids are asleep, and I'm awake, and after a nice bowl of Raisin Bran, I can work for an hour and get some stuff done without having to worry about what I'm missing with the family, or making someone angry that I'm too involved with work.

I can pop open the laptop, write a little code, do a little magic, check it in, do a run, and then call it done. That's pretty sweet!

This is what I'd hoped for when I thought about developing. Something as effortless as just doing the work. Work a little when you have the time, and then during the week, pound out some major hours. But always be flexible enough to take a trip to the grocery store if needed.

When I think of the joys of self-employment, that's it. If I could do that with the place I am now, I'm not sure I'd ever leave. Ever.

It's just that great to me.

Details, Details, Details – It’s all in the Details

Friday, October 5th, 2012


Today I was trying to get some work done (from home!) when I got a chat from one of the Data Sci guys I work with in Palo Alto. He was saying that he couldn't find the data I'd just generated in CouchDB. Because Couch is a NoSQL database, we needed to enforce some kind of structure on the documents ourselves, and what we chose was the idea of an execution tag. Currently, it's the date and time of the run - to the millisecond - plus the name of the division it's running for. This is calculated once at the start of the run and then used in all logging and persistence stores, so that it's easy to correlate all the data for a given run.
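
In script form, the discipline is simple: compute the tag exactly once, up front, and treat it as read-only everywhere else. A minimal sketch (assuming GNU date for the millisecond format; the division name is just an example):

  #!/bin/bash
  # Compute the execution tag once. GNU date's %3N gives milliseconds;
  # the division name here is only an example.
  export EXEC_TAG="$(date +%Y%m%d-%H%M%S.%3N)-philadelphia"
  readonly EXEC_TAG

  # Every later step - logging, persistence - uses $EXEC_TAG as-is,
  # never recomputing it.
  echo "[$EXEC_TAG] starting run"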

But today, it seems, the execution tag wasn't constant after all. That's a big problem.

So I started to track this down, and I very quickly came to the realization that the massive refactoring a co-worker had done to try and streamline logging in the application wasn't preserving the essential nature of the execution tag. It was generating several of them - all a few seconds apart.

Now, don't get me started on the need for logging. It's what I think of as essential to serious, commercial, production apps. You just gotta have it. But there seems to be a group of folks that wants to get back to the APL days where there's no logging, no comments, and the strong belief that the code speaks for itself.

I can't possibly disagree more with all of that, but that's a topic for another day, as I said. Not respecting something as critical as the immutability of the primary correlation factor in the data, though, is kind of a big deal. But it's really a detail, isn't it?

The tests all passed - and the code ran, so it's got to be good, right?

Unfortunately, that's far too often the case with a lot of the younger developers I'm meeting these days. Nice guys, and they are capable of writing good code, but I think they just don't want to put forth the effort it takes to pay attention to all the details. After all, there are a lot of them in even the most simplistic of business apps.

I've talked to a lot of younger coders, and they are self-confessed "Lazy coders". Now I know they aren't really lazy, they work hard, but they aren't interested in paying attention to the details, and that's where everything of importance is to be found.

I just wish I could teach them the need, but I think the only one who can teach them that is Time. She has a funny way of catching up to even the smartest of folks.

Working from Home Done Right

Friday, October 5th, 2012


So today, for the first time since being at The Shop, I did a Work from Home. There have been a lot of times in the last three months that other guys in the group have done it, but today was a first for me. I have to say, the IT guys at The Shop know how to do it right. I've been at a lot of places with VPN access, and it's always the same: you either control a desktop with RDC/VNC, or you have full VPN - and it's Cisco, and it locks you out of all other network activity save that which goes through the VPN.

The Shop does it right - it's all SSH tunneling, and it's all automatic. I open up my MacBook Pro at home and it's exactly the same as being at work. There are no configuration changes, nothing to do or worry about. Email is clean, calendar is there… it's all working just like I was sitting at my desk at work.
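
Under the hood, that kind of setup boils down to SSH port forwards to a bastion host - something along these lines, though the hostnames and ports here are invented for the example:

  # Forward local ports to internal services through a bastion host so
  # local apps behave as if they were on the office network. Hostnames
  # and ports are invented.
  ssh -N \
      -L 8080:internal-app:8080 \
      -L 1143:mail.internal:143 \
      bastion.theshop.example.com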

Of course, this means working at Starbucks or Panera is also as nice.

Holy Cow! I'd dreamed that this could be possible, but I've never really been at a place that pulled it off this nicely. It's effortless and seamless. You can't ask for anything better than that.