Archive for the ‘Coding’ Category

Changing Versions of Gems with Bundler

Wednesday, September 5th, 2012

RVM - Ruby Version Manager

I had to build a special gem today, and I wanted to write down how it was built and deployed, because I'd already done this once and didn't remember the steps. So here goes.

First off, make sure that you have the skeleton of a gem - including the gemspec file. Many gem projects on GitHub already have one, and if you make a new gem with Bundler, it generates a simple skeleton as well.

Next, write the code and then build it:

  $ rake build

This should deposit the built gem in your pkg directory.

Upload it to a site like RubyGems. Simple.

It's worth noting that it might help to delete older versions of the gem. This is easily done with:

  $ gem uninstall couchrest

If there are multiple versions of the gem installed, it'll give you a choice. If there's only one, it's gone.

Fix your Gemfile and then reload all the gems with:

  $ bundle install
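For the Gemfile itself, pinning the version is just a constraint on the gem line. A minimal sketch - the version numbers here are placeholders:

```ruby
# Gemfile - a minimal sketch; the version numbers are placeholders
source 'https://rubygems.org'

# pin to an exact release...
gem 'couchrest', '1.1.3'

# ...or allow patch-level updates only with the pessimistic operator
# gem 'couchrest', '~> 1.1.0'
```

After changing the constraint, `bundle install` resolves and installs the matching version.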

I’m Loving CouchDB More by the Day

Wednesday, September 5th, 2012

CouchDB

Today I was really battling a nasty problem with the CouchRest client for CouchDB. It's a Ruby gem, and in general it's pretty decent, but it really starts to fall down when you need to go through a proxy to get to CouchDB - then it starts having all kinds of problems.

There were timeout issues, so I decided to try to make it work. I forked the project on GitHub and got to work. The major point was that the underlying RestClient gem has the ability to set things like the timeout for creating the connection as well as timeouts for reading, etc. It's really very flexible. My goal was to allow these settings to be applied on a per-database basis. Then, for every command, use them as the defaults, but overlay any call-time options as well.
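The defaults-plus-overlay idea itself boils down to a hash merge. Here's a minimal sketch of the pattern, with a hypothetical Database class standing in for the CouchRest one - only the option names (:timeout, :open_timeout) come from RestClient:

```ruby
# Hypothetical sketch of per-database connection defaults that can be
# overridden at call time. The class is a stand-in, not CouchRest's API.
class Database
  attr_reader :connection_options

  def initialize(opts = {})
    @connection_options = opts    # the per-database defaults
  end

  # overlay any call-time options on top of the database defaults
  def request_options(call_opts = {})
    connection_options.merge(call_opts)
  end
end

db = Database.new(:timeout => 30, :open_timeout => 5)
db.request_options(:timeout => 60)   # => {:timeout=>60, :open_timeout=>5}
```

Call-time options win, everything else falls back to what the database was configured with.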

The idea was really nice, and I was planning on submitting a pull request since it only took me about an hour to do. But when I went to test it, it failed with a 502 Bad Gateway error.

Argh!

More proxy problems!

Then I was talking to one of the guys in the group about this issue and he brought up that I could write to my local CouchDB, and then replicate it to a different database on a different server!

BING!

This is exactly what I'd been looking for. I get fast, efficient writes to my local CouchDB, and it gets replicated up to the shared server while I'm connected to the network. This is great!

The configuration is simple - it's just a doc in the _replicator database, and I'm in business. This is really quite amazing. First, go to the overview in the CouchDB web page and select the _replicator database:

Replicator database

then create a new document:

New Document

Finally, populate it with these essential key/value pairs:

replication doc

  • source - the source for replication. Replication only goes one way, so this can be a local database or a remote one, but it's where the data is coming from
  • target - the destination for replication - again, local or remote, it makes no difference. Make sure to include the port and the database name in the URL
  • proxy - if needed, the URL of the proxy used to reach either of these databases
  • continuous - set to true if you want it to replicate continuously
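Pulled together, the document is plain JSON, so it can also be created from the command line - a sketch with placeholder hostnames and a hypothetical proxy URL:

```shell
# placeholder URLs throughout - adjust hosts, ports, and database names
curl -X POST http://127.0.0.1:5984/_replicator \
     -H "Content-Type: application/json" \
     -d '{
           "source": "local_db",
           "target": "http://shared.example.com:5984/local_db",
           "proxy":  "http://proxy.example.com:8080",
           "continuous": true
         }'
```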

Save this document and then look at the CouchDB status page to see it replicate. It's like magic! OK, not really, but the handling of the proxy is so far superior to the way CouchRest was dealing with it that it's not even funny. This just works!

I'm more and more convinced that CouchDB is an amazing product.

Creating Software Plumbers

Wednesday, September 5th, 2012

I just read this tweet this morning:

Twitter / davehoover: Young people: consider ...

which leads to this article advocating that young people look into entering an apprenticeship program rather than continuing in school. It says, in part:

Universities are the typical place that established businesses expect to find these high-potential beginners. While many software developers finish college with a good education, they’re often burned out, deep in debt, and understandably eager to cash in on their hard work. Apprentices, on the other hand, inject enthusiasm, hard work, and a thirst for knowledge into your teams. They will consistently launch from your apprenticeship program with context, momentum, and loyalty to your organization.

While I can understand the point of the article - and you should read it to see that it's not saying people shouldn't go on to higher education - it's saying that you, as a business owner, can capitalize on the cost of higher education by taking people who might otherwise go to college and getting them into the workforce.

But is that what we want to have happen, as an industry? I don't think so. I think it's robbing the future to staff the present, and that's a mistake. A big one.

I'm biased. I've got the higher education and the advanced degrees, and I think they were the right thing to do. But even if you discount my position and do what the author suggests, aren't we just creating a bunch of Software Plumbers? They'll know what they see and will be able to work with it, but their ability to solve new and unusual problems will be very limited. Oh sure, you'll have a few percent that naturally think outside the box, but their exposure to new things and new ideas will be incredibly limited.

This is the exact purpose of those liberal arts classes for engineers - to broaden a student's horizons. If we just allow people to learn what we want them to learn, aren't we really just forcing ourselves to re-train them when we want to change technologies? Of course we are.

While there are times for an apprenticeship program - for those who can't make it into college - I think it'll be overused, and it will draw the real future of the profession into one where only a few can really think creatively. And that would be very bad.

Logging all Incomplete Processing

Wednesday, September 5th, 2012

GeneralDev.jpg

This morning I decided that it'd be nice to have a complete list of all the merchants that didn't successfully complete their processing. Since a recent change to the code means we now process everything, we can look at each merchant and make sure it got through the critical processing phases. If it didn't "pick up" the right data, then we can assume it didn't get to that point. The point of all this is that we can be sure every merchant completed processing.

I log these and write them to CouchDB, so we keep a complete record of all the issues, and then I updated the summary script to list the number of incompletely processed merchants so we can watch the count over time.
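The check itself is simple. A sketch of the idea, with made-up field names - here a merchant that never "picked up" its data is missing a :picked_up_at stamp:

```ruby
# Hypothetical sketch: find merchants that never completed processing.
# The field names (:picked_up_at, :name) are made up for illustration.
def incomplete_merchants(merchants)
  merchants.reject { |m| m[:picked_up_at] }
end

merchants = [
  { :name => 'A', :picked_up_at => '2012-09-05T06:00:00Z' },
  { :name => 'B', :picked_up_at => nil }
]
incomplete_merchants(merchants).map { |m| m[:name] }   # => ["B"]
```

Each of those incomplete records then gets written to CouchDB, and the summary script just counts them.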

Nice. This is starting to really get close to verification that all was done as it was supposed to have been done.

Google Chrome dev 23.0.1255.0 is Out

Wednesday, September 5th, 2012

It didn't take long - just a few days - and now Google Chrome dev 23.0.1255.0 is out with a nice array of fixes for crashing bugs, including a video problem on retina MacBook Pros. There are a few changes to the security of apps in the browser, which I don't use, but I'm sure there are quite a few Angry Birds fans out there.

Slugging Through a Lot of Little Updates

Tuesday, September 4th, 2012

GeneralDev.jpg

Today has been a big day of a lot of little things. I've got five post-it notes on my desk - each filled with little things that needed to be updated in the code. There are new rules for how to handle new merchants, more rules about aggregations, different rules about when to ignore merchants… all of it needed to be in the code as soon as possible, and because no one thing was that horrible to do, it was possible to get all the changes in today.

Yet, while trying to get things done, taking more requests is a little frustrating. No… it's quite frustrating. But I was able to get through it all without getting really upset, which is a nice win for me.

In all, it has been a really nice day - there are a few more things I need to do, but it was a good day.

Google Chrome dev 23.0.1251.2 is Out

Tuesday, September 4th, 2012

This morning I noticed that Google Chrome dev 23.0.1251.2 was out with a few nice fixes for crashing bugs. I haven't noticed them myself, but I'm not hammering on it with Javascript like a lot of the other folks are. Still, it's nice to see the decent release notes, and the improvements in Chrome continue.

Creating Really Dense Code – The Ruby Way

Friday, August 31st, 2012

Ruby

This afternoon I've written some of the most compact, potentially confusing code I've written in many, many years - and it's perfect code for a Ruby developer. This may be specific to The Shop, but given what a big Ruby shop they are, I'm guessing this is the Ruby Way, and like a lot of the functional code I've seen, it tends to be completely undocumented. That's not to say my code is undocumented - in fact, it's got almost a 1:1 ratio of comments to code because of its compactness - but I've come to realize there's a bit of a blind spot in a solid group of young Ruby coders that looks a lot like what I call Homework Problems.

In any case, the code I wrote today was specified by the quantitative analyst in Palo Alto like this:

Group the merchants by the services they offer so that any one merchant in the group shares at least one service with at least one other merchant.

Logically, this means that we can have a series of seemingly unrelated services so long as the group has these pair-wise matchings with at least one service.

If we look at the group of Macy's, the Gap, and a Movie Theatre:
Really Odd Groupings

You'd think there's no way the movie theater fits in the same "group" as the Gap, but because Macy's sells jeans, and so does the Gap, and because a movie theatre sells candy, and so does Macy's, the Gap and a Movie Theatre "belong together" in a group.

I'm not making up these rules, I'm just trying to code them up.

Once we get these groups of merchants, we'll then process them and get some data from them. That's not the interesting part. The interesting part is the grouping, and how to get it.

My first idea was to write a few little methods I knew I was going to need: one to get the services of a merchant into an array, and another to see if there is any overlap (set intersection) between two merchants:

  # this method returns a clean, unique set of services for the provided OTC.
  def self.get_services(otc)
    (otc['taxonomy'] || []).map { |i| i['service'] }.compact.uniq
  end
 
  # this method returns true if ANY service is shared between the two OTCs. ANY.
  def self.services_overlap?(otc_a, otc_b)
    !(get_services(otc_a) & get_services(otc_b)).empty?
  end

At this point, I knew these were very "ruby-esque" methods - one line each, so it's got to be "minimal", right? With those in hand, I could start dealing with the problem of finding the right pairs to feed to the second method, and then collecting them into the right groups.

But therein was a real problem. If I just looked at the merchants serially, then the order matters. Imagine the order: Gap, Movies, Macy's. In this case, the Movies would not match the Gap, so there'd be two groups, and then Macy's would match the Gap, and strand the Movies. Bad. So I had to have multiple passes, or I had to think up some other way of looking for the sets.

What happened was that I was scanning the Ruby Array docs and noticed the product() method. Interesting. After about another 10 minutes of trying to think up a solution, the idea came to me: use product() to make pairs of merchants to check, then add things in and remove duplicates.

Sweet idea!

  def self.group_by_service(otcs)
    # start with the array of groups that we'll be returning to the caller.
    groups = []
    # look at all non-identical pairings in the original list and for each
    # pairing, see if there are ANY common services. If there are, try to find
    # a group to place the PAIR in, if we can't, then make a new group of this
    # pair.
    otcs.product(otcs).map do |pair|
      next if pair[0] == pair[1]
      if services_overlap?(pair[0], pair[1])
        groups << pair unless (groups.map do |g|
          (g << pair).flatten!.uniq!.size unless (g & pair).empty?
        end.compact.reduce(:+) || 0) > 0
      end
    end
    # verify that each OTC is in some kind of group - even alone
    otcs.each do |d|
      groups << [d] unless
          (groups.map { |g| g.include?(d) ? 1 : 0 }.reduce(:+) || 0) > 0
    end
    # return the array of groups to the caller
    groups
  end

It's like an expanded APL to me. Compact code. Chained method calls. More work for the CPU, but less code written by the person. It's not something I'd traditionally write because it's excessively wasteful in the work it's doing, but I'm guessing it'll be seen as evidence of me "getting it" by the other Ruby guys in the group.

I get it, and in certain instances I don't think it's wrong. But in a production app that's going to hit speed limitations, code like this is a killer to performance. There's too much being done that doesn't need to be done. Yeah, it looks nice, but it puts a tax on the machines that shouldn't have to be paid.
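For comparison, here's a sketch of a plainer, single-pass take on the same grouping that avoids the full product(): fold each merchant in, merging every existing group it shares a service with. It repeats the get_services helper from earlier so it stands alone, and it sidesteps the ordering problem entirely:

```ruby
# same helper as in the post: a clean, unique set of services for an OTC
def get_services(otc)
  (otc['taxonomy'] || []).map { |i| i['service'] }.compact.uniq
end

# Sketch of an alternative, single-pass grouping. Each merchant either
# joins (and possibly merges) the groups it bridges, or starts a new one.
def group_by_service(otcs)
  groups = []                       # each entry: [members, services]
  otcs.each do |otc|
    services = get_services(otc)
    # split groups into those this merchant bridges, and the rest
    hits, rest = groups.partition { |_, svcs| !(svcs & services).empty? }
    merged = [[otc], services]
    hits.each do |members, svcs|
      merged[0].concat(members)     # absorb the bridged group's members
      merged[1] |= svcs             # and its services
    end
    groups = rest << merged         # lone merchants stay their own group
  end
  groups.map { |members, _| members }
end
```

With the Gap, Movies, Macy's ordering from above, Macy's bridges both earlier singleton groups and they collapse into one, so no multiple passes are needed.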

I get it… I'm just not sure I think it's a great thing.

Having the Best Tools Really is Nice

Friday, August 31st, 2012

Apple Computers

I'm sitting here this morning smiling quietly to myself after working a while on some problems. I realize I'm smiling because the keyboard is just such a wonderful piece of work - the Apple (small) wireless keyboard and wireless trackpad, to be precise. These are simply the best input devices that I've ever used. The keyboard is low to the desktop, responsive, small, and without a cord. The trackpad is large, the gestures wonderfully thought-out. It's amazing what I can do so effortlessly on this machine.

Then there's the machine. I'll admit I'm not a fan of the glossy Apple displays. I never have been. The HP 30" is really the best I've used in a long while. I think Apple could make a nice 30" display, but they don't want to, and there's no making them do something they don't want to do.

But back to the laptops… they are, without a doubt, the best in the industry - and have been for a long while. Really, the only solid competition was the IBM ThinkPad, and IBM sold that to Lenovo, and the quality has become very ordinary since. Nothing to write home about.

But great hardware gets out of your way. You stop thinking about how to do something, and focus on the doing. In some cases, it even enables you to see things that you might have not seen - large displays with more code in them really are very beneficial.

So I have to tip my hat to the folks at The Shop that see this as well, and outfit the developers this way. It's really quite amazing.

Thread-Local Variables in Ruby

Thursday, August 30th, 2012

Ruby

Now this is probably not amazing news to long-time Ruby developers, but the simplicity and ease with which you can make thread-local variables in Ruby is simply shocking. Ruby developers just don't know how good they have it. This morning I was looking at a threading problem with the CouchRest CouchDB client and realized that it's not thread-safe. This isn't really shocking, as thread-safety is something I've come to realize is not standard in Ruby libraries.

Still… I was determined to make it work.

What seemed logical was to have multiple database connections - one per thread - as thread-local database connections. When a thread needs a connection, it creates one and uses it; when the thread dies, the connection is cleaned up automatically. Sweet. Simple.

But while dealing with thread-local storage in pthreads is not horrible, it's certainly not "easy". I dug into the Ruby support for thread-local storage, and it's trivial:

  Thread.current[:foo] = {}

This creates the thread-local variable :foo. How simple! This is something I never expected to see. Never! So why are these Ruby guys having so much trouble with thread-safety? I have no idea.
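That makes the per-thread connection idea nearly a one-liner. A sketch, with a bare Object.new standing in for a real CouchRest database handle:

```ruby
require 'thread'

# Lazily create one "connection" per thread. Object.new is a stand-in
# for a real CouchRest database handle.
def connection
  Thread.current[:couch] ||= Object.new
end

# within a single thread, the same connection is reused...
same = Thread.new { connection.equal?(connection) }.value

# ...but each thread gets its own
ids = 4.times.map { Thread.new { connection.object_id } }.map(&:value)
```

The first call in each thread creates the connection; every later call in that thread hits the memoized Thread.current slot.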

With the tools I've seen in Ruby, there's really no excuse for why there aren't more thread-safe libraries. All the tools are there - they're just unused. Lazy coders.