What's it all about, Alfie?

Archive for the ‘Coding’ Category

Code Should be Simple – Not Hidden

Wednesday, September 12th, 2012

I was talking to a couple of guys in The Group today, and I heard a few of them talk about extracting the logging and timing metrics from the code itself, and having them simply meta-programmed into Ruby such that all methods of a certain class would be logged and timed. Now I'm all for simplification - to a point - but this is really, in my mind, going way too far. There's something about minimalism that I think is attractive to the Math types in a group, and we have them, but it's not at all realistic. Code needs to function in the real world, which means that it has to check inputs, log lots of intermediate state, and in general do all those things that rookie coders don't do, which is what gets them into trouble when their code doesn't perform well in production.

Simple is one thing. Hidden is another.

Don't hide the complexity of logging. It's in your face, and it's meant to be. There's no way some meta-programmed log system is going to know where I want to put every one of the log messages I want. Timings are a little easier, but it's still a mistaken assumption that simply timing method calls is sufficient, as if I'll never need to sub-divide a method call.

Fiddlesticks.
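To make the sub-dividing point concrete, here's a minimal sketch of what I mean by hand-placed logging and timing. Everything in it is an assumption for illustration: `process_batch`, the shape of the rows, and the `LOG` constant are made up, not code from the project. It only uses Ruby's standard `Logger` and `Benchmark`.

```ruby
require 'logger'
require 'benchmark'

LOG = Logger.new($stdout)

# Hypothetical nightly job, used only to illustrate the point; none of these
# names come from real project code.
def process_batch(rows)
  valid = nil
  total = nil

  # time the validation step on its own...
  t_validate = Benchmark.realtime do
    valid = rows.select { |r| r.is_a?(Hash) && r[:value].is_a?(Numeric) }
  end
  # a log message placed exactly where the interesting state is known
  LOG.warn("dropped #{rows.size - valid.size} bad rows") if valid.size < rows.size

  # ...and the aggregation step separately, so a slow run can be pinned down
  t_sum = Benchmark.realtime do
    total = valid.inject(0) { |sum, r| sum + r[:value] }
  end
  LOG.info(format('validate: %.4fs, sum: %.4fs', t_validate, t_sum))

  total
end
```

A wrapper that only times `process_batch` as a whole would give one number and no idea which half was slow, and it would have no way to know that the dropped-row count was worth logging.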

You need to have logging and timings sprinkled in your code. It's not homework coding, but I'm coming to believe that there's a lot to be said for the C++/Java world that I come from. Yes, it's not Ruby, and there are a lot of things to like in Ruby, but there's a lot that I think these guys take for granted and don't properly write code to guard against.

Make code simple, yes. But if you hide too much, then you'll forget what's really being done, and you're going to get hit in the rear pretty soon. Performance is a huge blind spot for the majority of these guys. They just don't see it, and don't see it as needed. Couldn't be more wrong. It's always important in a production system.

So I'm going to try and guide them away from this decision, as I am a firm believer that it's going to get them, and me, into hot water. We just don't need it.

Posted in Coding, Cube Life

Ruby `include` vs. `extend`

Wednesday, September 12th, 2012

I learned a valuable lesson about Ruby this morning - if you have a module of shared code in Ruby:

  module SharedStuff
    def log
      puts "log"
    end
  end

then:

include makes the module's methods available to an instance of the class
extend makes the methods available to the class itself

This means that by using the same base code, you can add them in as class methods with the extend directive, and as instance methods with the include directive.
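A quick sketch makes the difference visible. The module is the one above, changed only to return the string rather than print it so the result is easy to check; the class names `Included` and `Extended` are made up for the example.

```ruby
module SharedStuff
  def log
    "log"
  end
end

# include: log becomes an instance method
class Included
  include SharedStuff
end

# extend: log becomes a method on the class itself
class Extended
  extend SharedStuff
end

Included.new.log                  # instance method, works
Extended.log                      # class method, works
Included.respond_to?(:log)        # false - not on the class
Extended.new.respond_to?(:log)    # false - not on instances
```

If you want both from the same module, nothing stops a class from doing an `include` and an `extend` of it.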

Totally 'Ruby', as it's a strange, massive difference that might be missed by new users of the language.

Glad I learned it.

Posted in Coding, Cube Life

Google Chrome dev 23.0.1262.0 is Out

Tuesday, September 11th, 2012

This morning I noticed that, again, Google Chrome dev has been bumped to 23.0.1262.0 with some more good release notes. There's an update to WebKit (537.10), and the V8 JavaScript engine (3.13.6.0), and at least one Mac-specific fix. Nice! The page refresh speed is really quite amazing, and has been for the last two releases. It's really impressive. I'm hoping they keep it up!

Posted in Coding, Everything Else, Open Source Software

Refactoring the Web Page for Faster Loading

Monday, September 10th, 2012

This morning I finished a refactoring that I'd started on Friday when I realized that the UX of the web pages I was making (to be thrown away) was really all wrong. I was doing the more direct scheme - loading a lot of data into memory and then manipulating it quickly to render different graphs. The problem with this is that the data sets are really very large, and the load times for them are longer than you'd want to sit for.

One of the guys on the team was saying "Break it up into a bunch of small requests." And while I could see that approach, I thought that the overhead of all the calls was going to really kill performance. I never even really considered it.

But about 4:00 pm on Friday, when I was really very frustrated with the code I was working on for that page, I decided to give it a go. What made most sense to me was to break the requests into a few stages:

Get the list of all runs - based on the selected database, get the division/timestamp pairs for all runs in that database. We'll be able to parse them next, but this is a nice, short little method.
Parse the executions for divisions and timestamps - take the division/timestamp data for all runs in a database and create a data structure where the list of timestamps will be stored in a descending order for a given division.
Set HTML run options for a given division - when the user selects a division, take the parsed data and create the drop-down for the runtimes for that guy.
Query for the specific data - taking the data from the options - the database, the division, the timestamp, hit CouchDB for the exact data we need to visualize. In many cases this is less than 100 documents.
Parse the documents - once we have the targeted data from CouchDB, parse it into Google DataTable or ZingChart series.
Render the data - last step.

I was surprised to see that the resulting code was smaller than I'd had. The parsing of the data structures was really a lot more than I thought. Starting at the top of the list, the code to get the list of all runs is really simply:

  function reload_executions() {
    // hit CouchDB for the view of all executions it knows about
    var svr_opt = document.getElementById('server_opt');
    var url = svr_opt.value + '/_design/general/_view/executions?' +
              opts + '&callback=?';
    $.getJSON(url, function(data) {
      parse_execution_tags(data);
    });
  }

Once again, jQuery really helps me out. Next, I need to parse this data into a structure of all the runs by division:

  function parse_execution_tags(data) {
    divisions = new Array();
    runtimes = new Object();
 
    for(var i in data.rows) {
      // get the execution_tag and exclude the very early ones
      var exec_tag = data.rows[i].key;
      if (!/-\D+$/i.test(exec_tag) || (exec_tag.substring(0,10) < weekAgo)) {
        continue;
      }
      // now get the timestamp and division from the execution_tag
      var runtime = exec_tag.replace(/-\D+/g, '');
      var division = exec_tag.replace(/^.*\.\d\d\d-/g, '');
      if (typeof(runtimes[division]) == 'undefined') {
        runtimes[division] = new Array();
        divisions.push(division);
      }
      runtimes[division].push(runtime);
    }
 
    // sort the divisions and create the contents of the drop-down
    if (divisions.length > 0) {
      divisions.sort();
      var div_opt = document.getElementById('division_opt');
      div_opt.options.length = 0;
      for (var d in divisions) {
        div_opt.options[div_opt.options.length] =
            new Option(divisions[d], divisions[d]);
      }
 
      // given the default division, load up the run times we just parsed;
      // kept inside the guard so an empty result never indexes runtimes
      set_runs_for_division(divisions[0]);
    }
  }

where I'd created the variable weekAgo to be able to let me know what the "recent" data was:

  // get the date a week ago formatted as YYYY-MM-DD
  var when = new Date();
  when.setDate(when.getDate() - 7);
  var weekAgo = when.getFullYear()+'-'
                +('0'+(when.getMonth()+1)).slice(-2)+'-'
                +('0'+when.getDate()).slice(-2);

Once the data is all parsed into the structures, we can then build up the drop-down for the runs for a selected division with the function:

  function set_runs_for_division(division) {
    division = (typeof(division) !== 'undefined' ? division :
                document.getElementById('division_opt').value);
    runtimes[division].sort();
    runtimes[division].reverse();
    var run_opt = document.getElementById('run_opt');
    run_opt.options.length = 0;
    for (var i in runtimes[division]) {
      var tag = runtimes[division][i];
      run_opt.options[run_opt.options.length] = new Option(tag, tag);
    }
    // at this point, call back for the data we need, and then render it
    reload();
  }

Calling to get the actual data is pretty simple:

  function reload() {
    // hit CouchDB for the view we need to process
    var svr_opt = document.getElementById('server_opt');
    var view_opt = document.getElementById('view_opt');
    var run_opt = document.getElementById('run_opt');
    var div_opt = document.getElementById('division_opt');
    var et = run_opt.value + '-' + div_opt.value;
    var url = svr_opt.value + '/' + view_loc + view_opt.value + '?' +
              'startkey=' + JSON.stringify([et,{}]) +
              '&endkey=' + JSON.stringify([et]) + opts + '&callback=?';
    $.getJSON(url, function(data) {
      var tbl = parse_series(data);
      render(tbl);
    });
  }

but parsing it into a Google DataTable is not nearly as simple. The code is complicated by the different views we can request, each of which needs its own columns:

  function parse_series(data) {
    // now put all the data into an object keyed by the execution_tag
    var view_opt = document.getElementById('view_opt');
 
    var table = new google.visualization.DataTable();
    table.addColumn('string', 'Division');
    switch (view_opt.value) {
      case 'merchants_by_existing_merchant':
        table.addColumn('number', 'Deals');
        break;
      case 'merchants_by_research_ranking':
        table.addColumn('number', 'Rank');
        break;
      case 'merchants_by_status':
        table.addColumn('string', 'Status');
        break;
      case 'merchants_by_rep':
        table.addColumn('string', 'Rep SF ID');
        break;
    }
    table.addColumn('string', 'Merchant');
    table.addColumn('number', 'Sales Value');
 
    for(var i in data.rows) {
      var row = data.rows[i];
      var name = (row.value.name.length > 60 ?
                   row.value.name.substring(0,60)+'...' : row.value.name);
      var table_row = new Array();
      table_row.push(row.value.division);
      switch (view_opt.value) {
        case 'merchants_by_existing_merchant':
        case 'merchants_by_research_ranking':
        case 'merchants_by_status':
        case 'merchants_by_rep':
          table_row.push(row.key[1]);
          break;
      }
      table_row.push(name);
      table_row.push(row.value.sales_value);
      table.addRow(table_row);
    }
 
    // now let's apply the formatter to the sales value column
    var fmt = new google.visualization.NumberFormat(sv_format);
    switch (view_opt.value) {
      case 'merchants_by_existing_merchant':
      case 'merchants_by_research_ranking':
      case 'merchants_by_status':
      case 'merchants_by_rep':
        fmt.format(table, 3);
        break;
      default:
        fmt.format(table, 2);
        break;
    }
 
    return table;
  }

but the rendering is very simple:

    function render(tbl) {
      var dest = document.getElementById('table_div');
      var table = new google.visualization.Table(dest);
      table.draw(tbl, table_config);
    }

When I put it all together I was amazed to learn that the hits were exceptionally fast. The page is far more responsive, and in short - I could not possibly have been more wrong. The human lag is sufficient to make the calls invisible, and the sluggishness of the memory load on the old version was horrible. This is a far better solution.

I'm going to remember this for the future.

Posted in Coding, Cube Life

Spiffy Bash Prompt in Python

Monday, September 10th, 2012

This morning a co-worker tweeted that he found this spiffy Bash prompt generator built in Python. Now I'm not normally one to adorn my shells with command prompts like this, but Wow! this is impressive. I mean it's got your name, the path, the git branch you're on, and even history with a red background in case of an error. That's pretty impressive.

Then again, it's probably a serious Python script that takes some time to run, but for those that want a pretty prompt, this looks pretty amazingly stylish. I gotta hand it to him. It's really close to something that's nice enough for me to start using it.

So it goes… I'll keep it in mind for now.

Posted in Coding, Everything Else, Open Source Software

Lots of Web Stuff – And it’s All Going to be Tossed

Friday, September 7th, 2012

I did it again today. I pushed myself to get a lot of stuff done today for a big, important demo (again), and along the way a few people interrupted me to ask me to look at a few things. Had I been smarter… had I been wiser… I'd have said "I'm sorry, I'm on this push for the demo, how about I get to it on Monday?" But I didn't.

What I did was to push myself to the point that I was very upset with the things I was working on. Oh, it didn't start out that way. It started out with me thinking that I could easily track down this problem that one of the data science guys pointed out. It was in my code for grouping the merchants where I wanted to be smart and clever. I should have known.

First, I wasn't accounting for the case where two groups of overlapping merchants are built and then a single merchant bridges both groups. I messed up. So I needed to go back and fix a few things. To start, I didn't have a general overlapping service method. So I took the old one and expanded it:

  # this method returns true if ANY service is shared between the two
  # merchants. ANY.
  def self.services_overlap?(ying, yang)
    if ying.is_a?(Array)
      # explode the calls for an array in the first position
      ying.each do |d|
        return true if services_overlap?(d, yang)
      end
      return false
    elsif yang.is_a?(Array)
      # explode the calls for an array in the second position
      yang.each do |d|
        return true if services_overlap?(ying, d)
      end
      return false
    end
    # this is the simple call for a 1:1 check
    !(get_services(ying) & get_services(yang)).empty?
  end

Basically, I just allowed arrays to be passed in, and where necessary, I exploded them to allow the basic logic to be applied to the individual merchants. It's not hard, but from this point on, I didn't need to worry about what I was passing in to check for an overlap.

The next thing was to completely redo the group_by_service() method as it was far too complex, and it wasn't even working. I didn't like the fact that it was doing a lot of extra checks, etc. but that seemed to be the Ruby Way. Poo. I changed it into a simple single-pass loop that's far simpler and far faster:

  def self.group_by_service(otcs)
    # start with the array of groups that we'll be returning to the caller.
    groups = []
    otcs.each do |d|
      added = []
 
      # add the OTC to all groups that it has some overlap with
      groups.each_with_index do |g, i|
        if services_overlap?(d, g)
          g << d
          added << i
        end
      end
 
      # if we added it to more than one group, then consolidate those groups
      if added.size > 1
        added[1..-1].each do |i|
          groups[added[0]].concat(groups[i][0..-2])
          groups[i] = nil
        end
        groups.compact!
      end
 
      # if he hadn't been added to anything, make a new group for him
      groups << [d] if added.empty?
    end
 
    groups
  end

The ideas here are a lot clearer - add a new guy into all the groups he'd match, then look for multiple matches. If there are any, then simply and cleanly consolidate the groups, and continue. My co-worker in Palo Alto liked this code a lot more as well. I do too. It's Ruby, it's just not incomprehensible Ruby. And it's right.
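Here's a small, self-contained sketch of the bridging case that bit me, so the consolidation step can be seen working. The assumptions are loud ones: the `Grouping` module name is made up, a merchant is just a Hash with a `:services` list, and `get_services` is a stand-in for the real lookup. The `services_overlap?` here is a compacted version of the one above using `any?`; `group_by_service` is the same logic as posted.

```ruby
module Grouping
  # Hypothetical stand-in: a merchant is just a Hash with a :services list
  def self.get_services(merchant)
    merchant[:services]
  end

  # true if ANY service is shared; either side may be an array of merchants
  def self.services_overlap?(ying, yang)
    return ying.any? { |d| services_overlap?(d, yang) } if ying.is_a?(Array)
    return yang.any? { |d| services_overlap?(ying, d) } if yang.is_a?(Array)
    !(get_services(ying) & get_services(yang)).empty?
  end

  def self.group_by_service(otcs)
    groups = []
    otcs.each do |d|
      added = []
      # add the merchant to every group it overlaps with
      groups.each_with_index do |g, i|
        if services_overlap?(d, g)
          g << d
          added << i
        end
      end
      # more than one match: fold the later groups into the first one,
      # skipping the trailing copy of d that each of them just received
      if added.size > 1
        added[1..-1].each do |i|
          groups[added[0]].concat(groups[i][0..-2])
          groups[i] = nil
        end
        groups.compact!
      end
      groups << [d] if added.empty?
    end
    groups
  end
end

a = { name: 'A', services: ['spa'] }
b = { name: 'B', services: ['golf'] }
c = { name: 'C', services: ['spa', 'golf'] }   # bridges A and B
d = { name: 'D', services: ['yoga'] }          # overlaps with nobody

groups = Grouping.group_by_service([a, b, c, d])
names = groups.map { |g| g.map { |m| m[:name] } }
# names is [["A", "C", "B"], ["D"]]
```

A and B start out in separate groups, C lands in both, and the consolidation folds B's group into A's without duplicating C.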

But in the midst of this, I'm trying to get more web stuff done for another interruption for the demo, and it's nothing I'm happy with. It's all very lame. I don't like the interface I have to CouchDB, I don't have a lot of time, and it's making me very cranky.

I've run into this before, and it's not the first time I've not been able to say "No", and it's cost me something I didn't want to pay. I know I'm no great graphics designer. I know I can make things that work, but they aren't going to be "Wow!" with anyone but someone looking just at the functionality. That's just not in my DNA. And it is exceptionally frustrating to be in a situation where I'm forced to do this work.

I know what is good. I can appreciate it. But I can't generate it. And to be forced to do it is hard. Because I know they are going to throw it all away. Any half-decent designer will look at what I've done and say "Nice, but let's take out the data collecting and put it in a nice design" - as they should.

So I've got to work harder at keeping my cool.

I've got to say "No" more often.

Or I just won't last.

Posted in Coding, Cube Life

Code Monkeys

Friday, September 7th, 2012

I was talking with a good friend this morning and I came up with a name for a lot of the Ruby devs I've run into - but to be fair, it's not just for a good chunk of the Ruby devs I've met - it's for a general class of developers. Let's pretend to be a little more precise about this:

Code Monkey - a developer that is more interested in learning a language and how to solve a few problems in it, than using it to solve real-world problems. This includes, but is not limited to, the Clojure devs that have never written a comment, and only solved the zebra/water puzzle, as well as devs that never code defensively, or even think that production is important.

This came up because I'd been battling code that wasn't written at all defensively. It was basically assumed to have been run by a person, with a person to fix any problems as they occur. It's like a glorified Excel spreadsheet - I'm going to hit 'Go', and fix things that come up.

But this doesn't really work for real life, does it? Who wants a system that runs at night that has to be constantly monitored to make sure it doesn't get bad data, etc.?

Yet they are the first ones onto a new language - like Clojure. They say that the real solution is to use a language that doesn't need all that checking, as a functional language simply doesn't require it.

What world are they living in?

How is a language able to do ETL on its own? Answer: It can't. You still have to do it. But the Code Monkeys are really skipping all that because they start with good data and then the process is clean, and simple.

No kidding? Really? Well, of course it is! The same is true for C++, Java, and any other language you want to pick. Start with clean data, don't worry about exceptions and potential problems, and you're going to be able to write amazingly clean code. But that's not how life really works.

We agreed that there were guys with language knowledge, and skills, but they never really dug in and made it work. It's nice to talk to Code Monkeys, but it's not nice to have to work with them. You're always cleaning up their messes.

Posted in Coding, Cube Life

Crazy Tired from Crazy Hard Work

Thursday, September 6th, 2012

Once again, we're gearing up for a big demo tomorrow with some of the users, so today has been full of a lot of things that needed to get done in order to have a successful presentation. I had to make quite a few new views for CouchDB, and then work those into a web page and publish everything up to UAT and production for a run. I'm becoming a big fan of CouchDB and its views and reductions… those are some very powerful tools for looking at this JSON data in CouchDB. Very nice.

I've also done a little fiddling with the Sublime Text 2 styles, and I'm exceptionally happy with how that's all turned out. It's made it much nicer to work with. I can't believe I haven't tried it up to this point, and I can't imagine a better editor for me. I'm going to have to brush up on my Python and write some packages some day.

Finally, I'm just dead tired. Great feeling. Lots of really good work done, and the team is really working together far better than I'd have thought a month ago. This is really quite fun.

Posted in Coding, Cube Life

Hacking on the Sublime Text 2 Syntax Highlighting

Thursday, September 6th, 2012

This morning I was getting tired of the pretty lame syntax highlighting of YAML files in Sublime Text 2 - and I knew it could be better. So I started digging. The first thing I looked at was the tmTheme file that I cloned from the Eiffel theme in the standard release package. It's close… white background, nice colors, but it's not perfect, and I wanted perfect. So here's what I found out.

The matching of the language is really in the tmLanguage files in the packages. These are a bunch of regexes, and that's fine, but each pattern match then pins the color to use to some "classification" - a dotted notation similar to a Java package. The idea is that if you specify only the first part or parts, then the last parts are up for specialization.

For instance, if you want to have a numeric constant style, it makes sense to build them hierarchically: constant -> numeric -> yaml. This leads to the classification: constant.numeric.yaml. But if you want all constants to be a certain style (by default), you can simply specify the constant style in your tmTheme file.

Alternatively, if you want all your numeric constants to be a certain style except those in Java, you make a style for constant.numeric and then a new one for constant.numeric.java. Simple. But certainly not simple to figure out by looking at the files.
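The way I think about that hierarchy can be sketched in a few lines of Ruby. To be clear, this is only a simplified model of the "most specific prefix wins" idea, not how Sublime Text actually resolves scopes, and the `THEME` table and style names are made up for the example.

```ruby
# Hypothetical theme: scope prefix => style name (not taken from a real tmTheme)
THEME = {
  'constant'              => 'default constant style',
  'constant.numeric'      => 'numeric style',
  'constant.numeric.java' => 'java numeric style'
}

# Walk from the full scope back toward its root and return the first
# style found, so the most specific matching prefix wins.
def style_for(scope)
  parts = scope.split('.')
  parts.size.downto(1) do |n|
    style = THEME[parts.first(n).join('.')]
    return style if style
  end
  nil
end

style_for('constant.numeric.yaml')   # falls back to the constant.numeric entry
style_for('constant.numeric.java')   # picks up the Java-specific entry
style_for('constant.language.yaml')  # falls all the way back to constant
```

Seen this way, adding a style for a longer scope never disturbs the shorter ones; it only overrides them for the cases it names.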

So I realized that for YAML, I didn't want the 'Embedded source' to have a colored background. So I added:

  <dict>
    <key>name</key>
    <string>Embedded source</string>
    <key>scope</key>
    <string>source.php.embedded.block.html, string.unquoted.yaml</string>
    <key>settings</key>
    <dict>
      <key>background</key>
      <string>#FFFFFF</string>
    </dict>
  </dict>

so now it's got a white background. Nice.

The next thing was to notice that I didn't like that the keys in YAML were red like almost all the text (strings, constants, etc.) so I wanted to make those keys blue:

  <dict>
    <key>name</key>
    <string>Markup name of tag</string>
    <key>scope</key>
    <string>entity.name.tag.yaml</string>
    <key>settings</key>
    <dict>
      <key>fontStyle</key>
      <string>bold</string>
      <key>foreground</key>
      <string>#1C02FF</string>
    </dict>
  </dict>

and now the keys are a nice blue. Much better!

All this is just in my clone of the Eiffel theme in the Packages/User/ directory in the Application Support for Sublime Text 2. Very nice.

UPDATE: I realized it should be easy to do the same for PHP - which has the annoying background color, and it was! You simply have to look into the tmLanguage file and see the tag name that's used and place it in the string in a simple comma-delimited list. Very slick!

UPDATE: I noticed a few more that I wanted to add - all from the HTML syntax highlighting. The code became:

  <dict>
    <key>name</key>
    <string>Embedded source</string>
    <key>scope</key>
    <string>
      source.php.embedded.block.html,
      source.css.embedded.html,
      source.js.embedded.html,
      source.python.embedded.html,
      source.ruby.embedded.html,
      string.unquoted.yaml
    </string>
    <key>settings</key>
    <dict>
      <key>background</key>
      <string>#FFFFFF</string>
    </dict>
  </dict>

Posted in Coding, Vendors

Java 1.6.0_35 Out on Software Updates

Thursday, September 6th, 2012

This morning I noticed that Java 1.6.0_35 was updated on Software Updates most likely due to a security issue that's been patched. Given the way Oracle is handling Java, I'm really wishing that Apple would retain control of Java for OS X. Right now, I'm wishing they had it slightly better integrated into the OS such that starting a JVM instance wasn't so time-consuming. Linux handles this with all the shared libs being loaded. Then it's a very lightweight thing to spin up the JVM. On OS X, it's a lot more.

Still, it's nice to see that they have at least one more update. Maybe cooler heads will prevail in the coming months? Most likely not, but a guy can wish, can't he?

Posted in Apple, Coding, Everything Else