Archive for the ‘Coding’ Category

Getting Tools for Clojure Project Going

Thursday, December 6th, 2012

Clojure.jpg

I've been asked to start full-time on a new phase of the project I've been on, and the first thing I want to get working is the infrastructure - the machines, the databases, the deploy scripts, etc. These make the rest of the coding a lot nicer. Since Leiningen already handles the creation of a deployment jar, all I needed was something that made it simple to build and deploy those jar files. The original project had a similar scheme, so it made sense to carry that same theme through to this new project. The original used ruby and rake for the deployment scripts, but the syntax was easy to reproduce with make and bash scripts.

The makefile was pretty simple as it just called the bash script(s), and while the bash scripts aren't too bad, there were plenty of things to work out because most of the interesting work was done on the remote host. The most interesting part was building the /etc/init.d script to stop and start the application. Again, it's not too hard, but it took a little time to work out the details.

In the end, we have a nice init.d script for the project that goes out with each deployment of the application. We can then use this and the makefile to deploy, start, and stop the application on the two datacenter hosts. Not bad at all.

Tired of Waiting for People – Finishing Teradata Pull

Wednesday, December 5th, 2012

Building Great Code

After waiting on a few other folks in another group, I decided that there was no reason to wait any longer. A co-worker in Palo Alto has been waiting on some data for weeks now, and there's no reason for it. I had the ruby code to pull data from Teradata and put it into JSON structures for use in the main code base, and I had some time today, so there just wasn't a good reason to hold off any more.

I got the code out of storage and refreshed the SQL query with my co-worker and then started summarizing the data as per his requests. Thankfully, it was all pretty straightforward - I needed to collect all deals for a merchant, and take the median of a few values and count up the occurrences of a few others. Nothing horrible, and a few helper methods made pretty quick work of it.
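
Nothing fancy was needed - the helpers were along these lines (just a sketch; the names and shapes are illustrative, not the actual project code):

  # rough sketch of the summarizing helpers - names are illustrative only
  def median(values)
    sorted = values.compact.sort
    return nil if sorted.empty?
    mid = sorted.size / 2
    sorted.size.odd? ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0
  end

  def count_by(deals, field)
    deals.inject(Hash.new(0)) { |counts, deal| counts[deal[field]] += 1; counts }
  end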

After I got it all generated, it was time to work the data into the Merchant model in the existing code. The final destination for this data is the sales value calculation - the Merchant's quality score gets updated based on previous deals. I needed to put it in the ETL for the raw merchant data, merge the new data in with the existing data, and then it's ready to be used in the calculator.
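
The merge itself is about as dull as it sounds - something along these lines (the names here are hypothetical; the real ETL has more going on):

  # hypothetical sketch - fold the Teradata deal history into the raw merchant data
  def merge_deal_history(raw_merchant, deal_history)
    raw_merchant.merge('deal_history' => deal_history[raw_merchant['id']] || {})
  end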

Not bad. And it didn't take more than an hour or two. No need to wait for the other group any longer. Now they can write their code, and then we can make a simple REST client for it and fold in the data the same way. Easy to update and simple to retrofit. Nice.

Default Encodings Trashing Cronjobs

Wednesday, December 5th, 2012

bug.gif

This morning, once again, I had 500+ error messages from the production run last night. It all pointed to the JSON decoding - again - but this time I was ready: the script now fails fast, so it didn't try to do anything else with the bad runs, and I could simply retry them this morning. So I did.

Interestingly, just as with the tests yesterday, when I run it from the login shell, it all works just fine. So I fired off the complete nightly run and then set about trying to see what about the crontab setup on these new boxes was messed up and didn't allow the code to run properly. Thankfully, based on yesterday's runs, I knew I could get them all done before the start of the day.

So when I started digging, I noticed this in the logs:

  Input length = 1 (Encoding::UndefinedConversionError)
    org.jruby.RubyString:7508:in 'encode'
    json/ext/Parser.java:175:in 'initialize'
    json/ext/Parser.java:151:in 'new'
    ...

so I did a little googling, and it brought me back to encodings - what I expected. That reminded me of the issue I had with reading the seasonality data in the first place. Then I looked at our code, and we are using a standard reader method to get data for both CSV and JSON:

  def self.read_file(filename)
    contents = ''
    what = project_root + '/' + filename
    File.open(what) do |file|
      contents = file.read
    end
    contents
  end

which is all very standard stuff.

What the hits on google were saying was that I needed to think about the encodings, and so I changed the code to read in iso-8859-1 and then transcode it to utf-8:

  def self.read_file(filename)
    contents = ''
    what = project_root + '/' + filename
    File.open(what, 'r:iso-8859-1') do |file|
      contents = file.read
    end
    contents.encode('utf-8', 'iso-8859-1')
  end

Then I saw, in another post about encodings in ruby, that I could collapse this into one step:

  def self.read_file(filename)
    contents = ''
    what = project_root + '/' + filename
    File.open(what, 'r:iso-8859-1:utf-8') do |file|
      contents = file.read
    end
    contents
  end

which simplifies the code as well as the understanding: The file is iso-8859-1, but I want utf-8. Perfect! I put this in and I should be good to go.

But the question then is really: why does it work from the login shell? After all, if both failed, that would make sense - but they don't both fail. That got me looking at what's defined in the login shell that's not in the crontab pseudo-shell. As soon as I scanned the login shell's environment, it was clear:

  LANG=en_US.UTF-8

and that explained everything.

The crontab 'shell' doesn't define this, and you can't put it in the crontab file like you can the SHELL and MAILTO variables. So the solution was simple: put it in my main script right after the PATH specification:

  export LANG="en_US.UTF-8"

and all the problems should just go away! That would be nice. I'll have to check when the runs are finished this morning.

Updating Metrics App for Couch Changes

Tuesday, December 4th, 2012

WebDevel.jpg

Most of my day was spent struggling with the 'metrics' app - a simple web app that we use to present the metrics for all the runs we do. Now that we're running all of North America, the next most important task is adding a few columns to some of the CSV exports from this web app. But as I soon found out, this was far more involved than adding a column or two.

The reason the columns needed to be added was to give the users investigating the data more information for spotting problems. But what I soon found was that the change we had made to how we write data to Couch - four separate documents as opposed to one document plus three (server-side) updates to that document - had a far greater impact than we knew. Most clearly, a lot of the reports simply didn't work.

So I needed to go back and check every function on the page. Thankfully, most of the changes were in the javascript or the backing ruby service code, but it was still a lot of work as there wasn't a ton of documentation on it, and I had to bop back and forth to the Couch web viewer to see what I had available to build with.

But the real kicker was when we needed to relate documents to one another: the output of one process doesn't have any way to tie itself to the output of another. The best we've got is the loose relationship of time: one process starts pretty soon after the other.

So I had to add quite a few views, and complicate the logic, in order to get what we needed from what we were given and from the timing relationship between the phases. It's not ideal, but for all the crud I had to go through, it seems to work.
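
The pairing logic boils down to something like this (a sketch only - the field name and the helper are made up, but the idea is the same): sort both phases' documents by start time and match each document to the first one in the other phase that starts at or after it.

  # hypothetical sketch - relate two phases' documents by their start times,
  # since nothing else ties them together
  def pair_by_time(phase_a_docs, phase_b_docs)
    bs = phase_b_docs.sort_by { |d| d['started_at'] }
    phase_a_docs.sort_by { |d| d['started_at'] }.map do |a|
      # the matching phase-b doc is the first one starting at or after this one
      [a, bs.find { |b| b['started_at'] >= a['started_at'] }]
    end
  end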

I'm glad it's over.

Lots of Little Tasks Add Up to Lots of Progress

Monday, December 3rd, 2012

Building Great Code

Today I've spent a lot of time doing a lot of little things that have added up to some significant changes for the application. We're already running all of North America, except the account reassignment, so that's a major goal already reached, but there's still plenty that needs to be done to get us to the next level.

From this morning's runs, it was clear I needed to put in a little time making the code a lot more robust to bad data. We were getting some nil class exceptions, and that's just being careless with the code. You have to check whether something is nil before you assume it isn't.
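
Nothing clever about the fix - just guard before you use the data. Something like this (a made-up example, not the actual code):

  # hypothetical guard - skip records that don't have what we need
  deals.each do |deal|
    next if deal.nil? || deal['locations'].nil?
    deal['locations'].each { |loc| process_location(loc) }
  end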

I also fixed the encoding on the CSV by:

  CSV.foreach(manual, :headers => true, :encoding => 'iso-8859-1') do |rec|
    # ...process the record
  end

In a very similar manner, we got a new file from the users for the seasonality data, and this guy had plenty of non-UTF-8 characters. Rather than edit them out, I chose to use a different encoding to properly handle them.

Finally, I updated the logging on the reassignment phase so that we could really see what's happening on the unassignment and assignment phases - including a very easily extractable 'undo' text for those times that we may need to undo the changes we've made. This has been a problem for a while, and it really just needed to get punched out.

I had a few more, but they were even less exciting than these. All told, however, I cleared a lot of issues in the system, and that's what really counts.

Fixed for Canadian Postal Codes – Again

Monday, December 3rd, 2012

bug.gif

Once again, I had a report of a bug in the system and I started tracking it down. This particular report said that the number of closed deals used to adjust the demand was being seriously under-reported. Like major under-reporting. So I started looking at the fetching code, and then at how the closed deals were being matched up against the demand, and it literally popped off the screen at me.

Canadian postal codes.

I've seen this before.

Thankfully, I knew just what to do. The problem is that Canadian postal codes are six characters with a space in the middle, and only the first three are significant to the spatial location data we use. That means we needed to look at the country and then deal with the postal code accordingly.

The code I came up with was very similar to what I'd used in the past:

  all_zips = recent_close['locations'].map do |loc|
    # for Canada, only the first three characters of the postal code matter
    loc['country'] == "CA" ? loc['zip_code'][0,3] : loc['zip_code']
  end

and then we can use them just like we do with the Merchant to Demand pinning. It makes perfect sense why we weren't seeing a lot of matches with the previous code - the postal codes were far too specific.

That was a nice one to get out.

Slick Scheme to Efficiently Process a Queue in Bash

Friday, November 30th, 2012

Building Great Code

In the beginning of this project, we created a very simple bash script to run the jobs we needed run in a crontab. It's just a lot easier, I've found, to run things out of a simple bash script than to try and put them in the crontab itself. The crontab stays cleaner, and since cron isn't a real shell, it's just better all-around.

But as the project got more complex, it was clear that I was beginning to test the limits of what could be easily done in bash. The problem, then, was that a vocal contingent of the guys on this project don't really know bash - and have no desire to learn it. Interestingly enough, their argument for using complex and difficult things like meta-programming in ruby is that there's a "floor full of people that understand it". But when bash comes up, it's not even really checked against that same "floor full of people" to see if they know it as well.

It's Code Monkeys, what can you say?

Anyway, as things progressed, I needed a way to run many jobs at the same time, but ensure that all jobs of a single kind are done before moving on to the next phase of processing. The solution I came up with was pretty straightforward, but not exactly efficient: loop, start n background processes, and then wait for all of them to finish before continuing the loop and starting more.

This is pretty simple, but it means that the speed with which we can process things is determined by the slowest (or longest-running) job in each batch. Therefore, a relatively small number of well-placed slow jobs in the queue can really spread things out.

While this has been OK for a few weeks, we really needed something cleaner, so I came up with this far simpler plan:

  function worker {
    for i in $list; do
      # mkdir is atomic - whoever creates the lock directory owns the job
      # (the 'locks/' location and do_work are placeholders for the real script)
      if mkdir "locks/$i" 2>/dev/null; then
        do_work "$i"
      fi
    done
  }

The idea being that if I launch n of these with a simple:

  for (( i=0; i<n; i++ )); do
    ( worker ) &
  done
  wait

then we'll have these workers running through the list of things to do - each picking up the next available job, and doing it, but skipping those that have been locked by the other workers.

Really pretty slick. The trick was finding out that mkdir is atomic, so it's simple to use it to make a directory tagging the job: if we're able to create it, we do the work, and if we can't, then someone else is doing it - or already has.

This is super cool!

I was thinking I'd need a queue, or something, and all I really needed was a list and a filesystem. That's sweet. Really. That's one of the coolest things I've seen in a long time.

Interestingly enough, the code is now a lot simpler - just the worker function and the little launcher loop above (the full script went up as GitHub Gist 4178656).

I still need to test it all, and that will be Monday, but there's no reason to think it won't work. This way, we have n workers all doing the best they can until all the work that needs to be done is done, and then everything stops and we can move on to the next phase.

Very cool!

Big Step Forward – No Thanks to Me

Friday, November 30th, 2012

trophy.jpg

This morning it looks like we were able to run all of North America through the UAT system with a co-worker's changes to Salesforce, and that's a huge step forward - but it wouldn't have happened if it had been up to me. I was thinking it was too big a jump to try and take - to go from 27 divisions to 170+ in one night. I would have done it in a couple of nights. But it worked. In spite of me.

Good experience for me - to be vocal and wrong. I've already apologized to the guy running the test and he laughed… good for me. I'll mention it again in stand-up, and say I was wrong. I want to do it as it drives home the idea that I don't know everything.

But it's a big step. About 5 hours and all of North America. We should be able to get it down from there, but even if we can't, it's workable, and that's a huge win.

Fixed up Metrics Web Pages for CouchDB Changes

Thursday, November 29th, 2012

WebDevel.jpg

Recently, in order to get the kind of performance we needed from CouchDB, we had to drop all sense of updating the existing documents in Couch with the data from subsequent runs. This "insert only" mode turned out to be vastly superior to the update scheme - even after we had made the updates server-side and sent just what we needed, updating was simply too slow. So now we are going to have four documents where we previously had one.

Things had to change.

The biggest concern was the metrics web page and widgets that showed a lot of the different results of the runs - all hitting Couch for their data. In the previous version, we had the one document to look at for all the data, but now we had to be careful about what we were looking at, and gathering up for display.

Thankfully, the views in Couch could be adjusted to make very few code changes, and where there were code changes, we didn't have to change the views in Couch - so it was pretty easy to get things figured out. Not bad, really.

At least now, come Monday, we'll have good data in the metrics app, and that's very important.

Created New Tools for Mid-Day Prioritization Fixes

Thursday, November 29th, 2012

Building Great Code

I had long suspected that we needed to have tools for correcting problems associated with the reassignment and prioritization phases of the process, and today I finally decided to just make some. There are several interesting pieces in this story, but let's talk about the actual need and how I worked that into the process first.

It's not surprising that again today we had a slight issue with the prioritization of a single sales rep - not due to the code, but due to the incoming data. It would have been really nice to have simply re-run that one sales rep through the prioritizer, and fix them up. But we didn't have any tools to do that. So we had to say "Sorry, it'll be fixed tomorrow".

So after stand-up, I decided that we needed to have tools to:

  • Re-prioritize a single sales rep - this will pull all the accounts (merchants) for a single sales rep and then cleanly rank them for their daily call list. This is basically what we do nightly, except that the nightly run starts by getting all the sales reps in a division.
  • Clear all the priorities on a single sales rep - this is something that I think is going to become more important as things slip through the cracks and we need to clear out account call list priorities en masse. This will simply pull in all the accounts for a single sales rep and then clear out their call list priorities.
  • Clear all the priorities on a single sales rep within a division - this is like the last one, but in the case of the House Account, which is the same for all divisions, we might want to confine the clearing to a single division for safety's sake.

With these three tools, we should be able to do all the quick fixes that have come up since we started showing this to sales reps and their city managers. Thankfully, the code for all this is pretty simple - even battle-tested. If we look at the existing code that gets all the sales reps for a division and then prioritizes them one by one, we can simply make that inner block the prioritize_rep() method, and move the code such that it's simple to call either method - the division-level one, or the per-sales rep one, and get what we need.

Finally, it's simple to copy that method and create clear_rep(), where we don't prioritize the accounts for a rep, but simply get them and clear out the requisite fields. It's not bad at all. Pretty simple, really.
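
The shape of it ends up roughly like this (a sketch only - prioritize_rep() and clear_rep() are the names from above; sales_reps_for, accounts_for, rank_call_list and clear_priorities just stand in for the real code):

  # nightly path - prioritize every rep in a division
  def prioritize_division(division)
    sales_reps_for(division).each { |rep| prioritize_rep(rep) }
  end

  # mid-day fix-up path - re-rank the daily call list for a single rep
  def prioritize_rep(rep)
    rank_call_list(accounts_for(rep))
  end

  # clear a single rep's call-list priorities, optionally within one division
  def clear_rep(rep, division = nil)
    accounts = accounts_for(rep)
    accounts = accounts.select { |a| a['division'] == division } if division
    accounts.each { |a| clear_priorities(a) }
  end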

But that's where the fun ends. In order to do this, I had to change a lot of specs and other code simply because it wasn't written to be flexible. This is what I hate most about these unit tests: they really aren't written to be as reusable and flexible as the code they're testing, but they need to be. I spent probably 30 minutes changing the code, and about another hour fixing the tests. That's messed up.

But the real story of the day is that when I was talking about doing this, some of the other guys in the group didn't necessarily want to help do it, but they certainly wanted to make sure that their $0.02 was listened to, and done. It's like the unwritten evil side of Agile, or maybe it's just the Code Monkeys, but it's perfectly natural to have a design discussion - even if it's completely one-sided. It's considered "helpful", and "nice". But really, it's about wanting to control the decision without having to shoulder the responsibility of it being right.

I can work in Agile, and I see some of the benefits, but I think it's like any other evangelical movement - the downsides are completely dismissed by the "faithful" as fringe, and extreme - and certainly not representative of what they do. But it is.

I really long to be on a project where I don't have Code Monkeys. I like the people, just not how they act a lot of the time.