Archive for the ‘Coding’ Category

After a 24hr Set-Back, Testing Resumes

Thursday, October 18th, 2012

This morning I was finally able to get back to the performance testing that I was hoping to finish yesterday. The problem with today's data is that it's significantly slower than before the Salesforce refresh. They clearly put us on less powerful hardware or less bandwidth, because we're seeing a three- to five-fold increase in the access times for the data and for writes. It's really slow. However, that's got its upsides as well - we won't likely see much slower, and we'll see how the data stacks up.

Additionally, today I compared the EC2 instance with a physical box in the datacenter that will be the ultimate host for the app. This will again be the comparison of serial running versus a number of parallel processes. All in hope of finding what impact the runs have, and how the time grows with workload.

Hope things all work out.

Note to self: Never Stop Playing Music!

Wednesday, October 17th, 2012

For the last several hours I've been really dragging… feeling like what I'm doing is just a waste of time - no one cares, etc. Really getting bummed out.

Then I got sick and tired of listening to a guy near me yammer on about something that had nothing to do with work, and picked up the ear buds for some tunes. Amazing what a great song will do.

It's really almost magic.

Gotta remember this.

Nothing is Ever Easy

Wednesday, October 17th, 2012

I'm sitting here trying to get all my tests restarted, and I realize a simple fact:

Nothin's ever easy.

Nope. It just isn't.

I can say that the best things in life are free - and to a point, in a way, they are. But then again, the best relationships may be free, but they take work. Again, not easy.

Getting these tests going… should not have even been an issue… and it's turned into a complete and horrible diversion. Not easy.

Ugh.

Struggling with Salesforce Sandbox Refresh

Wednesday, October 17th, 2012

I knew as soon as it happened, it was going to eat up hours of my time. I first noticed it as a failure of one of the tests I was running this morning. Nothing back from Salesforce, and when I looked at the log files, everything was failing. I had heard that they would be refreshing the staging sandbox from production, but I had no idea it'd be this morning.

So first things first… Wait.

Let them finish the refresh and then try to get the Remote Access going again, given that I've never really gotten it going in the first place. You have to create an application in Salesforce, and then make sure that's all OK to get the first two pieces of the authentication tokens. Then you have to have the new user send its security token back to you, and now you just need to battle with the uncompiled, or mis-permissioned, classes.

It's enough to make you want to drink. Heavily.

I had some help, and that's nice, but it was still an amazing waste of time to get things back to where they were. But hey… I'm sure someone had a good reason for this. I just can't possibly imagine what it is.

Fixed Bad Merge and Changing APIs

Tuesday, October 16th, 2012

This afternoon I needed to fix up a huge merge that a co-worker did right before leaving for a two-week vacation. He did all he could with Git, and he tried, to be sure, but his codebase had a lot of problems in it, and they needed to be fixed up before I released it to UAT and production this evening. While I can certainly understand misnamed methods, I was more than a little miffed at the API changes that I got as well.

Specifically, because of the code paths, there was an API change that was only hit in development, and only when retrying saves. The code used to require an array of merchants - even if there was just one - and the new method worked on a single merchant without the array. Since this was really an API, the normal code still worked when passing it the array - but not when in development. Then again, it was only hit on the retries.

It was a really frustrating time because I've been on the "business end" of a lot of folks leaving systems to me, and the carnage that ensues. It's not pretty. And in this codebase, where comments are seen as a weakness, it's orders of magnitude worse.

But in the end, I got it all squared away and deployed. I just wish it weren't so much effort a lot of the time.

Really Dumbfounded About Ruby Devs and Comments

Monday, October 15th, 2012

Yesterday a co-worker re-tweeted this tweet from someone I don't know - but I've heard this attitude so many times in the past three months, it's really quite shocking:

This is what I've come to call the Rubyist's Aversion to Comments.

I had a discussion with another co-worker today just before he left, and his points were these:

  • Ruby coders don't like comments - they see them as "unmaintained fluff" in the code files. That it's possible to maintain the comments as well as the code seems completely foreign to them, and when I pointed that out, they seemed to simply pooh-pooh it as a silly notion.
  • Ruby coders like pairing - his argument is that pairing is better than comments because more people understand the code. When I mentioned that well-written documentation serves the same purpose, it was again pooh-poohed.

I think the problem is that for the most part, Ruby coders like good docs - like the Ruby Docs, but they don't like to write them. I think this is just because they want to be lazy, and come up with all kinds of excuses to justify the lack of comments. In short, I think it's just immaturity.

When I responded:

his response was:

I get it… there was a time I didn't write comments - but that was when I was writing games for myself and I was a teenager. When I got serious about this profession, I wrote comments. Always.

But that's not the world I'm in these days. I'm dealing with a lot of folks that see comments as a bad thing - certainly something a good coder can do without. After all, if the code is really well done then it's self-documenting (an impossibility for any complex system). So the Holy Grail of a Ruby coder seems to be to never write a line of comments.

I wonder how many of them will think this in 15 years?

Another Tough Day of Slugging Through Testing

Monday, October 15th, 2012

I've spent most of this morning slugging through tests to see how the performance of the process is affected by the different things it has to do. For example, writing results to Salesforce… and writing processed data to CouchDB… It's not good news, that's for sure, but it's data, and you have to respect what it's trying to tell you - even if you don't like what it's saying.

There's a story here, and figuring it out is the real challenge.

I've got a few ideas on what to test next, but the overall scheme isn't looking good. We use a lot of data, and reading and writing it is just a huge part of the process. The reasons for writing it back to Salesforce are obvious, but a case could be made that this is the wrong direction to take, and we should remove ourselves from a dependency on Salesforce and just make everything stand-alone.

There's a lot I like about this idea - it still has only one copy of the data, it's just no longer in a third-party application that we can't really access efficiently. It does mean that all the supporting tools (report builders, query tools, etc.) that are in Salesforce would have to be duplicated outside in the system we built, but that's a one-time cost as opposed to the continual cost of moving the bits from there to here and back again.

I know that in Finance this would not have been a long-term solution. But we're not in Finance anymore, and I'm trying to go with the flow here and see what plays outside of Wall Street. So we'll see… maybe it stays, maybe it doesn't.

Messing around with CSS for Blog

Friday, October 12th, 2012

I've spent far too much time messing with CSS for the preview in MarsEdit for my journal. This took way too much time because of the caching of the files and silly mistakes I made in the CSS file. I really like the preview in MarsEdit, but the caching of the CSS makes it really hard to know if what you're doing is really having the effect you think.

I spent several hours on this - really. To minimize the problem, read the CSS from an http:// URL - this seems to have far fewer problems with caching - and then switch back to the file:// URL once things are stable.

In the end, I was able to get it all fixed up, and then copy up the CSS changes to the CSS for the theme I'm using. Not easy, that's for sure, but worth it.

Conducting Timing Tests

Friday, October 12th, 2012

So this morning I'm doing some timing tests to see what the effect is of trying to run multiple jobs at the same time. Initially, we thought that the boxes would be CPU-bound because of all the logic in the pinning and calculations. But that's not the case. I can use top to see that the vast majority of the time the CPUs are idle (2 CPUs in this box).

So what's taking so long?

The monitoring (NewRelic) says it's all I/O - seems reasonable, but then can we speed up the time to make all six runs by putting them all in the run queue at the same time? Maybe it's I/O choked, and if that's the case, then no, it won't help. But if we're latency bound, then it will - so long as the services on the other end are fast enough to handle multiple requests.

It's a lot of questions, so that's why I'm running the tests.

Timing of a particular run is easy - just use time. But if I want to run them in parallel and see the effect in the same manner, I need to be a little more clever:

  #!/bin/bash
  # start each division's run in the background
  philadelphia &
  central-jersey &
  baltimore &

  # block until all the background runs have finished
  wait

This is a great little script that runs all three in parallel, and then waits for them all to be done before returning. If I put this into a time call:

  $ time ./parallel

then I'll get exactly what I want to get.
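As a minimal sketch of how those numbers could be collected for later comparison, assuming bash: the time keyword honors the TIMEFORMAT variable, so the real, user, and sys values can be captured on one line. The function name timed_run and the TIMING_LOG variable are made up for illustration and are not part of the original setup.

```shell
#!/bin/bash
# Hypothetical helper (not from the original scripts): run a command and
# append "label real user sys" (in seconds) to a log file.
timed_run() {
  local label="$1"; shift
  # TIMEFORMAT controls what the bash `time` keyword prints:
  # %R = elapsed (real), %U = user CPU, %S = system CPU
  local TIMEFORMAT="%R %U %S"
  local times
  # `time` reports on the shell's stderr, so the braces let us capture the
  # report while the command's own output is discarded (a choice made for
  # this sketch only).
  times=$( { time "$@" >/dev/null 2>&1; } 2>&1 )
  printf '%s %s\n' "$label" "$times" >> "${TIMING_LOG:-timings.log}"
}
```

With that in place, something like timed_run parallel-3 ./parallel would add one line per run to the log, which is easier to gather into a table than copying the output by hand.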

Now it's just a matter of doing all the runs and gathering all the data.

Division         Total    real       user       sys
philadelphia     414.68   3:13.863   2:02.340   0:08.889
central-jersey   109.00   2:17.305   1:29.034   0:05.768
baltimore        214.04   4:03.828   1:53.211   0:07.500
cincinnati       121.80   3:09.080   1:33.894   0:06.716
cleveland        264.36   2:56.955   1:48.855   0:07.872
austin           194.46   4:45.738   2:06.456   0:08.581

OK, this is all a nice baseline, but what we're really interested in is how they work together. In that case, we ran successively larger sets, starting at the top of the list and running more jobs and timing them:

Job Count   real        user        sys
2           7:37.395    4:22.084    0:16.545
3           7:40.657    6:45.033    0:24.438
4           12:14.561   8:05.870    1:54.751
5           16:09.544   12:06.461   0:31.694
6           18:53.886   14:59.040   0:37.562

This looks puzzling, so I'm going to do similar tests - this time serially running the first 2, 3, 4, 5, and 6 jobs and see how they compare:

Job Count   real        user        sys
2           9:00.463    4:10.892    0:16.337
3           12:21.413   5:59.158    0:23.101
4           15:49.010   7:32.160    0:27.686
5           20:13.591   9:39.248    0:37.150
6           26:44.176   11:47.096   0:44.891
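The two kinds of runs can be sketched as a pair of bash functions that take the job commands as arguments, so the same list can be timed both ways. The names run_parallel and run_serial are hypothetical; the jobs are assumed to be commands on the PATH, as in the script above.

```shell
#!/bin/bash
# Hypothetical helpers (names are illustrative, not from the original setup).

# Start every job in the background, then wait for all of them to finish.
run_parallel() {
  local job
  for job in "$@"; do
    "$job" &
  done
  wait
}

# Run the jobs one after another, in the order given.
run_serial() {
  local job
  for job in "$@"; do
    "$job"
  done
}
```

Because time is a shell keyword in bash, it can wrap a function call directly, for example time run_parallel philadelphia central-jersey baltimore, and the same job list can then be handed to run_serial for the comparison.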

All this data looks like:

[Figure: Parallel Tests]

Which tells me that it doesn't matter, really, if we try to run these in parallel. They simply don't scale that way. There's some other limiting factor in the system and it's not the CPU. This isn't terribly surprising, but it's something that means we really need to start looking at these external systems and see where we can limit our exposure to them.
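To put a number on that, here is a small sketch that converts the m:ss.mmm values from the tables into seconds and divides the serial wall-clock time by the parallel one. The helper names to_seconds and speedup are made up for this example, and it assumes awk is available.

```shell
#!/bin/bash
# Hypothetical helpers for comparing the "real" columns of the two tables.

# Convert a m:ss.mmm time (as printed by `time`) to seconds.
to_seconds() {
  awk -v t="$1" 'BEGIN { split(t, p, ":"); printf "%.3f\n", p[1] * 60 + p[2] }'
}

# Ratio of serial wall-clock time to parallel wall-clock time.
# Arguments: serial time, parallel time (both as m:ss.mmm).
speedup() {
  local serial parallel
  serial=$(to_seconds "$1")
  parallel=$(to_seconds "$2")
  awk -v s="$serial" -v p="$parallel" 'BEGIN { printf "%.2f\n", s / p }'
}
```

For the six-job rows, speedup 26:44.176 18:53.886 prints 1.41, and for the two-job rows, speedup 9:00.463 7:37.395 prints 1.18. So running in parallel does shave some wall-clock time, but nowhere near in proportion to the number of jobs.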

UPDATE: it took 4:10 - just over four hours to do all 40 of the divisions that we need to do nightly. I guess in that light, it's OK, as there were plenty of retries in the run and we still had time to do the Top 40. The problem, I fear, is that we still have a global choke point that we need to work through, but we'll run into that soon enough, I guess.

Today is Just One of Those Days…

Thursday, October 11th, 2012

I'm sitting waiting for some runs to complete on the UAT box for The Group, and there's not a lot I can do other than to wait. This has been one of those days where I just want to scream - but can't. I've been trying to make progress to feel like I've really accomplished something today, but it seems that I'm getting set-backs at the same rate that I'm removing them. Every "Finally…" is quickly followed by "What's this?!?", and a diversion or blocker that keeps me from actually getting anything really interesting done today.

Looking back at the commit log, I know I did some good stuff this morning, but that's long been forgotten. Now I'm just sitting here, waiting for someone's runs to complete. This is what I really liked about Finance - or maybe just working in small groups: You spend very little time waiting for folks because there are so few things you need from other people.

Not so here, and so I have to wait. I'm not very good at it, but I guess this is going to make me better at it.