Archive for the ‘Coding’ Category

Really Hitting the Wall on this LKit Feature

Saturday, August 25th, 2012

LKit Language

I'm trying to implement user-defined functions in LKit, and I'm having a really hard time with it. I'm not at all sure how to handle things with this addition - especially since I want recursion and pass-by-value to work as you'd expect in a lisp-based language. The parsing of the code isn't the problem - that I've got figured out. But when I compile the code into an evaluation-tree, it's going to point to variables - some of which are defined outside the user-defined function, and some of which are the arguments to the function.

If they are the arguments, then we can't really have a static evaluation-tree… I'd have to have some kind of dynamic evaluation of the arguments for each invocation. It's getting to be a lot harder than I'd expected.

Now it's true that this was something that I didn't even attempt in my previous version of the code, but I was really hoping for something more this time. I'd even thought that I could bang out this function definition code today. Not so fast, it seems.

I need to handle the question of a call stack. Really. I need to be able to "dive into" the evaluation of a function and then return to where I was. This is really a different layer than what I'm doing now, and it's going to take some significant time to think about it.
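
Roughly, the plan I keep coming back to is a frame per invocation: evaluate the arguments in the caller's scope (that's the pass-by-value), bind them in a fresh frame chained to the function's defining scope, and evaluate the body there. Here's a minimal sketch of the idea in Javascript - this isn't LKit internals, and every name in it is hypothetical:

  // Hypothetical sketch, not LKit code. A Frame is one level of the
  // call stack: argument bindings plus a link to the enclosing scope
  // for the variables defined outside the function.
  function Frame(parent) {
    this.vars = {};
    this.parent = parent;
  }
  Frame.prototype.lookup = function(name) {
    if (name in this.vars) return this.vars[name];
    if (this.parent) return this.parent.lookup(name);
    throw "unbound variable: " + name;
  };

  // Toy evaluator: numbers are literals, strings are variable
  // references, arrays are calls - just enough to show the shape.
  function evaluate(expr, env) {
    if (typeof expr === 'number') return expr;
    if (typeof expr === 'string') return env.lookup(expr);
    var fn = evaluate(expr[0], env);
    var args = expr.slice(1);
    if (typeof fn === 'function') {       // a built-in
      return fn.apply(null, args.map(function(a) { return evaluate(a, env); }));
    }
    return invoke(fn, args, env);         // a user-defined function
  }

  function invoke(fn, args, callerEnv) {
    // pass-by-value: evaluate each argument where the call happens
    var values = args.map(function(a) { return evaluate(a, callerEnv); });
    // "dive into" the function: a fresh frame per invocation is what
    // makes recursion work - every call gets its own bindings
    var frame = new Frame(fn.definedIn);
    for (var i = 0; i < fn.params.length; i++) {
      frame.vars[fn.params[i]] = values[i];
    }
    // and returning from evaluate() is the "return to where I was"
    return evaluate(fn.body, frame);
  }

  // e.g. (add1 41), where add1 is user-defined over a built-in '+'
  var global = new Frame(null);
  global.vars['+'] = function(a, b) { return a + b; };
  global.vars['add1'] = { params: ['x'], body: ['+', 'x', 1], definedIn: global };
  evaluate(['add1', 41], global);   // => 42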

Well… maybe my next vacation.

Updating my WordPress CodeHighlighterPlus to GeSHi 1.0.8.11

Saturday, August 25th, 2012

WordPress

I was looking for something to do as a good excuse not to solve this LKit problem today, so I dug into the latest news on GeSHi. The version I'm currently running is 1.0.8.6, but 1.0.8.11 is out, and the language count is now above 200. Nice. Not that I usually write in more than 200 languages, but the odds that APEX code, for instance, is supported go up as the number increases.

Anyway, the first thing was to download the GeSHi package and place it into my repo for CodeHighlighterPlus. The package is basically the root of the repo - something I hadn't realized the last time I updated GeSHi.

The only things I needed to watch out for were those few edits that I made to geshi.php the last time, and those were pretty easily isolated. I'd repeat them here, but with the repo on GitHub, you can just go there and get everything you need. Simple.

The new language support is:

4cs            dot            lscript        pycon
6502acme       e              lsl2           pys60
6502kickass    ecmascript     lua            python
6502tasm       eiffel         m68k           q
68000devpac    email          magiksf        qbasic
abap           epc            make           rails
actionscript   erlang         mapbasic       rebol
actionscript3  euphoria       matlab         reg
ada            f1             mirc           rexx
algol68        falcon         mmix           robots
apache         fo             modula2        rpmspec
applescript    fortran        modula3        rsplus
apt_sources    freebasic      mpasm          ruby
arm            freeswitch     mxml           sas
asm            fsharp         mysql          scala
asp            gambas         nagios         scheme
asymptote      gdb            netrexx        scilab
autoconf       genero         newlisp        sdlbasic
autohotkey     genie          nsis           smalltalk
autoit         gettext        oberon2        smarty
avisynth       glsl           objc           spark
awk            gml            objeck         sparql
bascomavr      gnuplot        ocaml-brief    sql
bash           go             ocaml          stonescript
basic4gl       groovy         octave         systemverilog
bf             gwbasic        oobas          tcl
bibtex         haskell        oorexx         teraterm
blitzbasic     haxe           oracle11       text
bnf            hicest         oracle8        thinbasic
boo            hq9plus        oxygene        tsql
c              html4strict    oz             typoscript
c_loadrunner   html5          parasail       unicon
c_mac          icon           parigp         upc
caddcl         idl            pascal         urbi
cadlisp        ini            pcre           uscript
cfdg           inno           per            vala
cfm            intercal       perl           vb
chaiscript     io             perl6          vbnet
cil            j              pf             vedit
clojure        java           php-brief      verilog
cmake          java5          php            vhdl
cobol          javascript     pic16          vim
coffeescript   jquery         pike           visualfoxpro
cpp-qt         kixtart        pixelbender    visualprolog
cpp            klonec         pli            whitespace
csharp         klonecpp       plsql          whois
css            latex          postgresql     winbatch
cuesheet       lb             povray         xbasic
d              ldif           powerbuilder   xml
dcl            lisp           powershell     xorg_conf
dcpu16         llvm           proftpd        xpp
dcs            locobasic      progress       yaml
delphi         logtalk        prolog         z80
diff           lolcode        properties     zxbasic
div            lotusformulas  providex
dos            lotusscript    purebasic

Then it was time to check it all into GitHub, pull it down on the servers, and see how it went.

After the pulls were done, all was well, and things are looking very nice. Success!

Good Intentions and Real Development

Friday, August 24th, 2012

We're in the final hours for a big demo with the Top Brass, and I'm trying to get things done, but I go to check on a run being done on the UAT box, and I find out that someone has started another copy! Now I know he didn't mean to mess me over, but he did. And I know he didn't mean to trash 30 mins of my work, but he did.

It's all that old saying:

The road to Hell is paved with good intentions - Proverb

I know what it's like to be doing the best you can. I really do. I remember being on a Class A softball team where I was lucky to get hits and play catcher - I was clearly outclassed. But this is a job, not recreation. This is where you're supposed to be good - not just want to be good.

When I asked him if he'd started running things, he was honest and upfront, but said "Doesn't it check against that?" Of course not! Why would it? Well… now I have the answer to that question - I needed to write it for guys like him.
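
For the record, the check doesn't have to be fancy - an exclusive lock file taken at startup would have saved my 30 minutes. A tiny sketch of the idea, in Javascript/Node purely for illustration (the lock path and everything around it are made up):

  // Illustrative guard only - a second copy should refuse to start.
  var fs = require('fs');

  var LOCK = '/tmp/quantum_lead.lock';   // hypothetical lock path

  try {
    // 'wx' = create the file exclusively; this throws if it exists
    var fd = fs.openSync(LOCK, 'wx');
    fs.writeSync(fd, String(process.pid));
    fs.closeSync(fd);
  } catch (e) {
    console.error('Another run appears to be in progress - aborting.');
    process.exit(1);
  }

  // ... do the actual run ...

  fs.unlinkSync(LOCK);   // release the lock on a clean exit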

There was another exchange I had with someone who showed me the following Javascript:

  function(doc) {
    var filter = function(doc) {
      return doc.meta.label == "QuantumLead.results" &&
             doc.otcs.length > 0 &&
             doc.merchant.category != null &&
             doc.merchant.sales_value > 0;
    };
 
    var key = function(doc) {
      return [
        doc.meta.execution_tag,
        doc.meta.division,
        doc.merchant.category,
        doc.merchant.sales_value
      ];
    };
 
    var value = function(doc) {
      return {
        name:        doc.merchant.name,
        sf_id:       doc.merchant.sf_id,
        sales_value: doc.merchant.sales_value
      };
    };
 
    if (filter(doc)) {
      emit(key(doc), value(doc));
    }
  }

I asked him why he chose to write it that way. Just what the motivation was for the specific structure. His response was that this was about clarity and maintenance. To me, it seems awfully complex for something that I'd have written as:

  function(doc) {
    if (doc.meta.label == "QuantumLead.results" &&
        doc.otcs.length > 0 &&
        doc.merchant.category != null &&
        doc.merchant.sales_value > 0) {
      var key = [ doc.meta.execution_tag,
                  doc.meta.division,
                  doc.merchant.category,
                  doc.merchant.sales_value ];
      var value = { name:        doc.merchant.name,
                    sf_id:       doc.merchant.sf_id,
                    sales_value: doc.merchant.sales_value };
      emit(key, value);
    }
  }

When I asked him if he really thought that his was clearer than mine, he said "Yup", and so I let it drop. After all, there's no reason to make a big deal over this. Again, it's not what I'd call a good format, but hey… I'm trying to be more flexible, and I'm no code enforcer here.

I know they mean well… they really do. But it's stuff like this that is exactly why, in the past, I've stepped up and simply pushed folks like this aside.

I'm trying to be better.

Google Chrome dev 23.0.1243.2 is Out

Friday, August 24th, 2012

Google Chrome

This morning the Google Chrome Team bumped the major version number with the release of 23.0.1243.2 and a pretty decent set of release notes. The inclusion of the V8 3.13.1.0 Javascript engine and the update of WebKit to 537.6 are both really nice.

I was wondering if the recent string of least-significant-digit updates meant that we'd be seeing this, but I really thought they'd just hit a plateau. Guess not, and that's great.

I can say that refreshing a page feels noticeably snappier. Very nice, and this release just keeps moving the bar up. It's good to see forward progress on Chrome.

CouchRest Bug – Using a Proxy to Get to CouchDB

Thursday, August 23rd, 2012

CouchDB

Because we have several data centers - including boxes at EC2 - the standard set-up at The Shop is to have several proxies, each forwarding all traffic on one port to one datacenter. So all traffic to the eastern EC2 datacenter leaves my laptop on port 1234 (for instance). This means that when I want to use the CouchRest ruby client for CouchDB, I need to do something like this:

  require 'couchrest'
 
  CouchRest.proxy('http://localhost:1234/') if use_proxy?
  @db = CouchRest.database('http://whackamole.east:5984/hammer')

to connect to the hammer database on the whackamole.east server in the east EC2 datacenter. Not hard, but there seems to be a catch in there somewhere.

When I try this, I get about 2000 of 4000 documents saved, and then I hit a problem: a nasty stack trace that appears to be about some timeout, followed by an attempted reconnection. What's very interesting is that it has to be in the proxy-handling code of CouchRest.

Why?

Because when I do a simple port forwarding on my box to the database server on the other box, and use that port - thus bypassing the need for the proxy setting, everything works. Also, if I run it on a box in the east EC2 datacenter so that I don't need a proxy, then everything works.
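
(For reference, the port forwarding in question is just the standard ssh trick - something like the following, where the gateway host name is made up:)

  $ # forward local 5984 to the database box - gateway.east is hypothetical
  $ ssh -N -L 5984:whackamole.east:5984 gateway.east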

I need to dig into the proxy code. I'm not exactly sure what I'll find, but it's gotta be there.

CouchDB is Really Quite Amazing

Thursday, August 23rd, 2012

CouchDB

I've used mongoDB for a while, and it's a decent document database, but I've now been using CouchDB for a day or so, and I have to say that CouchDB beats mongoDB hands down! There just isn't any comparison - even though they may have slightly different targets, I'm not convinced that there's anything billed as a "strength" of mongoDB that you can't do in CouchDB.

The one thing mongoDB has that I wish CouchDB had is a nice CLI. It would be really nice to be able to do ad-hoc queries against CouchDB on the command line. But as a point of fact, you can do all of that with a Temporary View… and get what you need. It's a close point, and certainly a matter of taste.
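
For instance, hitting a Temporary View from the command line is just an HTTP POST - something like this against a local copy of the hammer database, with curl standing in for the CLI that isn't there (the map function here is just an example):

  $ curl -X POST http://localhost:5984/hammer/_temp_view \
         -H 'Content-Type: application/json' \
         -d '{"map": "function(doc) { if (doc.otcs.length == 0) emit(doc.merchant.name, null); }"}'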

But for everything else, CouchDB is a far better implementation of a document database. The views are a fantastic feature, and their speed and power are really very impressive. I can't believe this is all running on my laptop. It's really pretty neat.

I think I've found my new document database - CouchDB. Fantastic!

Agile Workflow and Gobs of Stories

Thursday, August 23rd, 2012

Agile Methodology Kool-Aid

Over the course of the last few weeks I've tried very hard to embrace the Agile methodology of writing a bunch of stories for simple, isolated tasks, and putting them into PivotalTracker. This morning is no exception, but I wonder when they'll ask me to stop! After all, I just put in about 10 stories for different CouchDB views and visualizations based on those views, and while it makes perfect sense to have them as individual stories, it's also a lot of stories to wade through.

Lots of guys talk about a "scrollbar tax" with large methods - hence the "ruby way" of having methods with no more than 10 lines in them. But then they have the "scrollbar tax" with dozens of stories in Tracker all basically doing the same thing, but about different data and different views. So it's a bit of a head-scratcher.

Is scrolling bad? Or are you just looking for a reason to have exceptionally tiny methods in your exceptionally tiny classes?

I'm not saying that I prefer 200+ line methods in 1000+ line classes - there's a limit to be sure. But there's also a limit on the low-end as well. I've honestly seen this class in the codebase:

  require 'pipeline'
  require 'pinned'
  require 'app_log'
  require 'json'
 
  class PinnerWorker
    def self.perform(data)
      merchant = data[:merchant]
      otcs = Pinner.pin(merchant)
      Pipeline.notify(self.name, data.merge(otcs: otcs))
    end
  end

Excluding the two lines that define the class, and therefore must be there, the actual functional code is three lines! There are more require statements than that! Sure, I can see why they did it - because there used to be something here, and they didn't want to retrofit all the code if they removed this class. But that's just being lazy.

Anyway… I'm trying to be a Good Citizen and make all the stories and then check them off as I go. It's kinda interesting, but it's amazing how much work there is in the Agile Methodology that has nothing to do with coding. Make these stories, but then you can't use them in the documentation - they live in the tracker. Or you can use them in the docs, but people don't write docs (except me), etc.

I'd be all for a scheme where making something meant that it stayed around! Then there would be a reason for doing a good job of writing up the "need" initially, as it'd be part of the eventual docs for the application or feature.

That would be nice!

Working with CouchDB’s Map/Reduce Framework

Wednesday, August 22nd, 2012

CouchDB

This afternoon I've been doing a lot with CouchDB's map/reduce framework for querying data out of CouchDB. The terminology is pretty simple: a Design Document can hold multiple Views, where each view has a Map function that looks at each document in the database and returns something based on the inspection of its data, and an optional Reduce function that takes all the results of the Map function calls and reduces them to a smaller dataset.

It's pretty standard in a lot of languages: first you operate on the individual elements in a collection, and then you summarize those values. In CouchDB it's all in Javascript. That's not bad, I've done a lot of that in my day, so it's pretty easy to get back into the swing of things.

One interesting issue is that CouchDB is written in Erlang, and while I don't see myself digging into the guts of this thing, it's interesting to know where it all comes from, as it makes it a lot easier to understand why they chose Javascript, for instance.

Anyway, let's say I want to see all the merchants that have no OTCs assigned to them. I'd create a Temporary View in the CouchDB web page, and then in the View Code I'd have something like this:

  function(doc) {
    if (doc.meta.label == "QuantumLead.results" &&
        doc.otcs.length == 0) {
      var key = [doc.division,
                 doc.meta.created];
      var blob = { name: doc.merchant.name,
                   sf_id: doc.merchant.sf_id };
      emit(key, blob);
    }
  }

The interesting part here is that the emit() function is really the action item in this view: when we want to add something to the output of this Map function, we call emit() with the first argument being the key and the second being the value. The key, as shown here, can be a multi-part key, and the value can be any Javascript object.
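
The Reduce side, which this example skips, has the same flavor. If I wanted a count of these merchants per key, a minimal reduce would look something like this - sum() is one of the helpers CouchDB makes available to view code:

  function(keys, values, rereduce) {
    // first pass: count the rows from the map; on a re-reduce pass,
    // add up the partial counts from each chunk
    if (rereduce) {
      return sum(values);
    }
    return values.length;
  }

(CouchDB also has built-in _count and _sum reducers that do this kind of thing without round-tripping through Javascript.)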

The thing I like about the use of Javascript here is that the attributes look like "dotted methods" and not hash members. This makes it so much easier to reference the data within a doc by just using the key names and dots. Very nice use of Javascript.

So now that I have my first few Views and Documents in the system, I need to work on getting things out of these calls, and into some nicely formatted output for the important demo that's coming up.
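
Getting the data back out is just a GET against the view, with startkey/endkey arrays that match the multi-part key. Something like this - the design document, view name, and division value are all hypothetical, and {} is the CouchDB idiom for a key that sorts after everything:

  $ curl -G http://localhost:5984/hammer/_design/merchants/_view/by_division \
         --data-urlencode 'startkey=["east"]' \
         --data-urlencode 'endkey=["east", {}]'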

Getting Ready for an Important Demo

Wednesday, August 22nd, 2012

I just got an email from our project manager about a demo he's set up for the COO. The email included a response from the COO about the significance and importance of this project, and how it'll play into the long-term plans for this place. It's pretty scary to think of.

So all of a sudden, I'm feeling that same pressure to perform that I have felt for 16 yrs in Finance. It's the first real demo with this level of visibility since I've joined The Shop, and while it might be ho hum for a lot of the guys, for me, it's the "first impression" this guy is going to have of me and my additions to the team. It's not life-or-death, but it's important, and I want it to go well.

So I'm a little nervous… So many things to get finished and in place for the demo… and it's not like we'll have time to run through it beforehand - it'll be wing-it all the way.

Yikes!

Problems Deploying CouchDB to EC2 Servers

Wednesday, August 22nd, 2012

Amazon EC2 Hosting

This morning Jeff is still having problems getting CouchDB deployed to our Amazon EC2 machines, and it's almost certainly due to the deployment system that's in place in The Shop. It's something I completely understand, but it's also based on the idea that you can't trust anyone. That, and it's an old RedHat-based distro that I know from experience is not as easy to deal with as something like a more recent Ubuntu.

Still, it's just the way it has to be, as that's the only way Prod Ops can deal with things, so there's no real way around it. The problem is that you need to be able to build the code on one box, and package it up - similar to an RPM or a deb package, and then deploy it across a lot of machines. All well and good, but Jeff is having a horrible time getting CouchDB 1.2.0 compiled on his build box.

There are some things he's trying, and he's even seeing if the other folks around here have any ideas. But the latest attempts have left something that looks like CouchDB running on the server - except that when I go to add things to it, I get a nasty stack trace about 'Connection refused' after some kind of timeout. I get about 1500 of the 2500 documents I need inserted, and then it stops.

At the same time, I was able to use Homebrew to simply:

  $ brew install couchdb

and then follow a few instructions about getting it to run on my login startup, and that's it. It Just Works.

I would say this would also be the case if we were looking at standard Ubuntu boxes in EC2 or Rackspace and using yum or apt-get. The real question is why we need to build these custom packages for Open Source software when it's so easy to just install.

Again… no way to know… no way to answer. It just is and that's it.