Archive for the ‘Cube Life’ Category

Letting Go – Regardless of Consequence

Wednesday, July 31st, 2013

I like what I do - I really do. I like the company I work for - there are a lot of nice folks here, and I generally like the decisions that management makes. But just as a little rain must fall into every life, there are times when your time in a group is done, and it's best to move on. The ideas that shaped the group and got it to this point were necessary and good, but now it's time to let someone else take over and take it from here.

Of course, that's not how it feels.

It feels like the new folks to the business think they have a monopoly on the project even though they just joined the company. It feels like they have no respect for the ideas the project was built on, so that their changes to the codebase make no sense, and in fact are counter to the goals that the project was built on.

It feels like they are being jerks.

And who knows… maybe they are. Maybe they aren't. It's not only impossible to tell, it's also completely unimportant. You find yourself in the minority and it's time to move on. No anger, no grief… maybe a bit of sadness for what's been lost, but loss is part of life. You can't allow the project to be what the new blood wants it to be - what they see it to be in their minds - if you're there holding them back.

It's also not really fair to just sit in the group and allow the changes to occur around you. That's just goldbricking. Yeah, you know the code, yeah, you like the project, but it's all going in a different direction and it's time to just cut the cord. Allow the project to be what it will be under their stewardship.

It's time for me to move out of this group. As much as I'd like to keep working on what I'm doing, it's not good for the group or me.

The Amazing Arrogance of Youth

Tuesday, May 28th, 2013

It probably shouldn't be surprising to me at all that the event that got me back to writing is the arrogance of Code Monkeys. These are the young Rock Stars of the community that think that Ruby is the only language you need to know, and everything else is so much less that it's hardly worth their time. Bash, C++, Python - all "toy" languages to the Code Monkey, as the One True Language is ruby.

Of course, they are pumped up by people telling them they are amazing. They solve a few problems by throwing hardware at them, and they think they are the Oracle at Delphi of everything software, and as long as they can do just a little bit more than the next guy, their confidence and assurance grow. It's kinda sad.

I can remember several humbling experiences in grad school, and for those I am eternally grateful. I have no desire to ever be like these guys, but I can realize that I probably was back before grad school. But that was a long time before I got out into "The World". By then I knew… I was not all that. I was just a hard worker.

So this came to mind this morning when I looked at a failed job in a cron email. I sent the guys responsible for the job an email saying they might want to up the memory for the JRuby JVM, as that was the cause of the failure. The Code Monkey responsible for the job said that he upped it for the run, but didn't change it in the code in the git repo because it wasn't needed.

Now in my mind this isn't possible. If the code is a one-time (a.k.a. throw-away) script, or code, then it's not in the repo as it has no value past its one use. But if it has value past its one use, then it's in the repo. If it has value, then it should be fixed for the memory problem. So he's either wrong in putting it in the repo, or wrong for not updating it. I know this, but I try to be nice and ask him "What's the harm in updating it?"

His response I could have guessed: "It's done, there's no reason to update it." Spoken like a true Code Monkey. They hate comments. Why? All ruby code is self-documenting. What they really mean is that everyone should get used to re-reading and re-understanding the code every time they approach it. Documentation just gets in the way of that constant process of self-re-learning. He had no belief that the script would ever be used again, but that's because he's got a time horizon of two weeks. And because he can't see it, it must not be needed. Done.

So I'm going to let it go. He's convinced he's shown me a thing or two. In reality, he's shown his inability to be a tech lead for me. That's his position in the group - Tech Lead. Not a chance in the world. I respect my boss - he's a sharp guy, and I respect his skills. But if he really thinks this guy has serious skills, then he's fooled. But hey… anyone can be fooled - look at me - fooled for 27 years.

Anyway, it's just amazing to me that these Code Monkeys are as plentiful as they are. Sure, I knew plenty of C++ coders that weren't any good. I knew plenty of Java coders that weren't any good, either. But it's the consistency of these Code Monkeys that's really throwing me for a loop.

But you know… maybe that's a good thing. After all, he got me to write something. Maybe I'm making a little progress?

I Love Magic – as Entertainment

Tuesday, March 5th, 2013

I love a magic show. Even one that my friends might think of as lame. I love the well-done illusion. I know it's not real, but it's fun to believe that it is. After all - it's entertainment, and if you don't enjoy entertainment, then watch something else. It's your time, your life, your choice. But where I don't like magic is in languages and coding - there I absolutely hate it.

Take clojure, for example - and ruby falls into this category as well. The Ruby-ism of Convention over Configuration is a nice thought, and can be helpful for new coders starting out, but it obscures all the details, and in that obscurity, it masks all the performance limitations, and that includes threading. Clojure is the same. What's really being done? Don't quite know with a lazy sequence, do you? What's loaded when? If it's a database result set, does the code load in all the rows and then construct them as a lazy sequence, or does it read a few rows at a time and leave the connection open? Big difference, right?
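To make the "what's loaded when?" question concrete, here is a minimal Ruby sketch. It assumes a made-up FakeCursor class standing in for a database cursor (it is not part of any real driver) that simply counts how many rows get read. The same question applies to a Clojure lazy sequence wrapped around a result set.

```ruby
# Hypothetical stand-in for a database cursor: it only counts row reads.
class FakeCursor
  attr_reader :rows_read

  def initialize(total)
    @total = total
    @rows_read = 0
  end

  # Yields one row at a time, recording each read.
  # Without a block it returns an Enumerator over the same rows.
  def each_row
    return enum_for(:each_row) unless block_given?

    @total.times do |i|
      @rows_read += 1
      yield({ 'id' => i })
    end
  end
end

# Eager: every row is pulled into an array before we look at the first one.
eager_cursor = FakeCursor.new(25_000)
eager_first = eager_cursor.each_row.to_a.first
puts eager_cursor.rows_read # => 25000

# Lazy: iteration stops as soon as the first row has been produced.
lazy_cursor = FakeCursor.new(25_000)
lazy_first = lazy_cursor.each_row.lazy.first
puts lazy_cursor.rows_read # => 1
```

Both calls hand back the same first row, and nothing at the call site tells you that one of them read 25,000 rows to do it.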

So I'm not a fan of this kind of code - except for simple one-off scripts and manual processes. You just have no idea what's really happening, and without that knowledge, you have to dig into the code and learn what it's doing. Don't forget to stay abreast of all the updates to the libraries as well - things could change at any time simply because of the cloaking power of that abstraction.

Why does this mean so much to me? Because there's never been a project I've been involved in over the last 15 years that doesn't come down to performance. It's always coming down to how fast can this all be done, and how much can be run on a single box, and so on. All these are performance issues, and without the in-depth, continual knowledge of every library in use, I'm bound to have to make some assumptions - until I'm proven wrong by the code itself.

And what's worse, is that I know to look, whereas plenty of the junior developers that I have worked with simply assume that it's par for the course, and don't even think about the performance consequences of their code. They've always had enough memory, and CPU speed, and if it takes 20 mins - so what? It takes 20 mins! I wonder if they would feel that way if it was charging a defibrillator for their parents? I'm guessing not.

Time is the one limited resource we all have. Waiting is not acceptable if you can figure out a way to reduce or eliminate the wait. That's what I've been doing for the better part of 15 years now: removing the wait.

Starting late yesterday, I realized that we have a real performance problem with the clojure code we are working with. I'm not sure that the code is really all that bad, as it works fine when the database isn't loaded, but when it is - and it doesn't have to be loaded very much - things slow to a crawl, and that's not good. So bad, in fact, that several processes failed last night as it was cranking through a new data set.

So what's a guy to do? Well… I know what to do to make JDBC faster, I just need to know what to do in clojure to get those parameters into the Statement creation code in the project. Unfortunately, there's no simple way to see how to do it. Clojure, like ruby, isn't well documented for the libraries - for the most part. This bites because I can see what I need to set, but not how to set it.

So I'm going to have to wait for our clojure expert to show up this morning and tell him to dig into it until he can give me a list of examples that I can work from. I have no doubt he's capable of doing it, but it's not terribly nice to have to wait for him to walk in.

But that's my problem, not his.

But Boy! I hate "magic" code.

Doing a Lot of Skut Work

Thursday, February 21st, 2013

Today has been a lot of skut work - clean-up stuff that has been sitting in the queue for months but no one wants to do. But if the project is going to really work, someone actually has to do it. So since I finished up a lot of tasks today, it seemed like a natural thing to just get to it and clear the decks.

None of this is hard stuff, it's just not very fun, and it takes time.

First off, I followed up with a request for backups to be made of all the database machines we use in the group. This includes CouchDB as well as PostgreSQL. It's nice in that the install of each of these packages places the data files in the largest partition on our boxes: /var/groupon/ so it's simple to just back up that partition. I submitted the request a few days ago, but hadn't heard anything back, so I followed up, asking if I was going to get a completion notice when the backups were working.

Response was: "Yup, likely tomorrow". Good enough.

Next, we needed to get Nagios monitoring of the free disk space on the boxes as well - so that should a process go crazy and start to fill up the disk, we can fix it before it becomes a database killer. This has happened to us on several occasions, and it's something to be avoided as the main processes can't run if the database is offline.

Finally, I needed to do what I could to compact the CouchDB databases on the production and UAT hosts because we're at 93% disk space used, and there's very little headroom left. If the compaction of the views doesn't work, then I'm going to just drop the database and start fresh. We have a replica of the production data, and with the backups (above) we'd be able to go back to it anyway. This is something I'd rather not do, but it's certainly a sure-fire way to get the space.
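For reference, CouchDB exposes compaction over its HTTP API: POST /{db}/_compact for the database file, POST /{db}/_compact/{design-doc} for the views of one design document, and POST /{db}/_view_cleanup to drop stale view index files. A minimal Ruby sketch of issuing those calls follows. The database name, design document name, host and port in the usage note are hypothetical, and it assumes the server accepts the requests without extra authentication.

```ruby
require 'net/http'

# Builds (but does not send) the compaction requests for one database.
# db and design_docs are names supplied by the caller.
def compaction_requests(db, design_docs)
  # CouchDB expects a JSON content type on these POSTs.
  headers = { 'Content-Type' => 'application/json' }
  paths = ["/#{db}/_compact"] +
          design_docs.map { |ddoc| "/#{db}/_compact/#{ddoc}" } +
          ["/#{db}/_view_cleanup"]
  paths.map { |path| Net::HTTP::Post.new(path, headers) }
end

# Sends each request in order and returns the HTTP status codes as strings.
def compact!(host, port, db, design_docs)
  Net::HTTP.start(host, port) do |http|
    compaction_requests(db, design_docs).map { |req| http.request(req).code }
  end
end
```

Something like compact!('localhost', 5984, 'demand', ['locations']) would send the three requests in order. Compaction writes a new file before dropping the old one, so it needs some free space of its own, which matters at 93% full.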

It's not glamorous work, but it needs to be done, and no one else is picking it up, so I might as well just do it all and have it done.

Cracked the Speed Problem with Postgres

Friday, February 1st, 2013

This morning I put a lot of effort into making the selection of the children from the postgres database faster. There's just no way it needs to be 30 sec. Since I'd already tried the sub-select, I needed to have a completely different approach. Thankfully, I have a lot of ideas, and it didn't take long to find the one that solved the problem.

The problem is that I was always looking in the "wrong direction". I was using IN and joins, when it was really a much simpler problem. Let's find all the demand_id values from the demands table - based on the demand_set_id - which is keyed. This is a very fast request. Then we use that as a cursor to get the locations based on that.

There are a few nice things with postgresql that make this even nicer - like the FOR loop with the select embedded - very nice. Turns out to be really pretty simple:

The difference was dramatic. Now it's about 450 msec for the same load. That's a factor of more than 60x! I felt such amazement - what a lucky find!

Cool Sub-Selects in korma

Thursday, January 31st, 2013

I was doing some selects from a postgres database into clojure using korma, and they were pretty straightforward for a master/detail scheme:

  (defn locations-for-demand-set
    [demand-set-id]
    (select locations
            (with demands)
            (fields "locations.*")
            (where {:demands.demand_set_id demand-set-id})))

and it was working pretty well. The data was coming back from the database, and everything was OK. But as the database got bigger and bigger, we started to see a real performance penalty. Specifically, the pulling of the locations was taking on the order of 30 seconds for a given demand set. That's too long.

The problem, of course, is that this is implemented as a join, and that's not going to be very fast. What's faster is a sub-select where we can get all the demand ids for a given demand-set, and then use that with an IN clause in SQL. Thankfully, I noticed that korma had just that capability:

  (select locations
    (where {:demand_id [in (subselect demands
                                      (fields :id)
                                      (where {:demand_set_id demand-set-id}))]}))

Unfortunately, this didn't really give me the kind of speed boost I was hoping for. In fact, it only cut off about a half-second of the 31 sec runtime. Kinda disappointing. But the cause had to be related to the size of the sub-select. It was likely 25,000 elements, and doing an IN on that was clearly an expensive operation.

I like that korma supports this feature, but I need a faster way.

Hitting Teradata from Clojure

Monday, January 28th, 2013

Today I worked on hitting Teradata from within clojure using clojure.java.jdbc, and I have to say it wasn't that bad. There are plenty of places that a few paragraphs of documentation could have saved me 30 mins or so, but all told, the delays due to googling weren't all that bad, and in the end I was able to get the requests working, and that's the most important part. I wanted to write it down because it's hard enough that it's not something I'll keep in memory, but it's not horrible.

First, set up the config for the parameters for the Teradata JDBC connection. I have a resources/ directory with a config.clj file in it that's read on startup. Its contents are, at least in part:

  {:teradata {:classname "com.teradata.jdbc.TeraDriver"
              :subprotocol "teradata"
              :subname "//tdwa"
              :user "me"
              :password "secret"}}

Then, because we're using Leiningen, the jars are loaded in with the following added to the project.clj file:

    [com.teradata/terajdbc4 "14.00.00.13"]
    [com.teradata/tdgssconfig "14.00.00.13"]

so that the next time we run lein, we'll get the jars, and they will know how to connect to the datasource.

Then I can simply make a function that hits the source:

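  (defn hit-teradata
    "Runs the example query against Teradata and returns the realized rows."
    [arg1 arg2]
    (let [info (cfg/config :teradata)]
      (sql/with-connection info
        (sql/with-query-results rows
          ["select one, two from table where arg1 = ? and arg2 = ?" arg1 arg2]
          ;; realize the lazy result seq before the connection closes
          (doall rows)))))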

Sure, the example is simplistic, but it works, and you get the idea. It's really in the config and jar referencing that I spent the most time. Once I had that working, the rest was simple JDBC within clojure.

Learning to Code Like a Rubist

Thursday, January 24th, 2013

Today I spent some time fixing a bug, and while I was in the code, I wanted to see if my manager was right - that I was writing code that no one on the team liked working with. After all, I'm new to writing in ruby, and there's a unique style, I've come to learn, and I wanted to see if I've been doing it right, or maybe not right enough. I had something like this in my code:

  if meetings.empty?
    ru = fills * misses / factor
    fu = (1.0 / ru) * fill[0]
  else
    ru = fills.map { |f| f * misses / factor }
    fu = ru.map { |r| 1.0 * fill[0] / r }
  end
 
  all_zips = location.map { |l| l['zip_code'] }.compact.uniq
  {
    :ru => ru,
    :fu => fu
  }

Now, in truth, this makes no sense and I had about 10 lines in both branches of the 'if' statement, but the gist was that if the data was a constant, do one set of calculations, and if it was a vector, then do a slightly different set, but along the same lines as the single-valued calculations.

Nothing complex.

But it seems that The Ruby Way is to make anything that's more than a line or two into a new function - even if it's significantly expanding the lines-of-code in the file. So contrary to my prevailing understanding of the reasons for this extreme composition, it's really about the complexity of the code.
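To make that concrete, here is roughly what the extracted version of the toy snippet above might look like. This is a sketch under assumptions: the method names (scalar_ru, vector_fu, utilization, and so on) are made up for illustration, and it is not the team's actual code.

```ruby
# Hypothetical names throughout; the inputs mirror the toy snippet above.

# Single-valued case: fills is one number.
def scalar_ru(fills, misses, factor)
  fills * misses / factor
end

def scalar_fu(ru, first_fill)
  (1.0 / ru) * first_fill
end

# Vector case: fills is an array, so each step maps over it.
def vector_ru(fills, misses, factor)
  fills.map { |f| f * misses / factor }
end

def vector_fu(ru, first_fill)
  ru.map { |r| 1.0 * first_fill / r }
end

# The original branch, now delegating each calculation to a small method.
def utilization(meetings, fills, misses, factor, fill)
  if meetings.empty?
    ru = scalar_ru(fills, misses, factor)
    { :ru => ru, :fu => scalar_fu(ru, fill[0]) }
  else
    ru = vector_ru(fills, misses, factor)
    { :ru => ru, :fu => vector_fu(ru, fill[0]) }
  end
end
```

The arithmetic is identical to the inline version; what changes is the number of methods a reader (or a debugger) has to step through.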

I was talking to a consultant new to the group, and his statement was that he trusted the code to do what it said it was doing, and by making the methods very small, with well-defined names, it was easy to understand what they were doing.

I asked him about debugging, and his statement was again about trust.

I was shocked. On so many levels.

First, that trust had anything to do with debugging. Second, that they felt this code was too complex, and required refactoring.

It's a few lines - say 10, and it's a series of calculations. There's no savings in making it functions. It's adding stack calls, tests on the input values, and re-testing the values that caused the branch in the first place. It was horribly less efficient than before, but that's what they wanted.

It's back to the "CPU cycles are Cheap - Dev cycles are Expensive" - Hogwash that I've heard before, and don't agree with in the slightest bit. Nothing is cheap. Everything is expensive, and it's all about balancing the costs of different factors into the final design. There's no free lunch, but it seems that fact hasn't found the Rubists yet.

As for trust, you never trust code when debugging. You 'single-step' all the code when you have a bug to find out where it lies. If you trust code, you're going to get burned. Guaranteed. Debugging is a thoughtful exercise. You have to think what should happen, and then watch and see exactly what does. This means you can't take anything for granted. Not a single thing. That they would makes me very nervous that "fixes" on their part aren't going to be fixes at all. They'll patch or "refactor" until the bug goes away, but that's not the same as really fixing it. It might just pop up somewhere else.

Finally, I can't imagine that I'm going to be able to write code like this long-term. I don't see it as overly-complex, so I'm not going to think that it needs refactoring. If I take the absurd approach that every calculation is a method, I know I'll be wrong on the other extreme, so I can't just pretend to know what to do.

I'm frustrated, and I feel like giving up. It's been a hard day, and by all measures I seem to be incapable of getting the right mindset to write this code like the rest of the team. They aren't happy with my code, and they aren't willing to put in the hours to write it before I do. So they are going to complain about it, and I'm going to be powerless to understand their mindset properly to write code like they would.

I wish I knew what I could do to fix this problem…

Adding More Metadata to the Demand Service

Thursday, January 24th, 2013

This morning I was asked to add something to the demand service so that it'd be easier debugging what's happening within that service. It wasn't horrible, and given that we already had the provision for the meta data associated with the demand, it was simply a matter of collecting the data and then placing it in the map at the right time.

I was really pretty pleasantly surprised to see how easy this was in clojure. Since everything is a simple data structure (as we're using it), it's pretty easy to change a single integer to a list, and add a value to the end. Then it's just important to remember what's what when you start to use this.

Placed it into the functions, deployed the code and we should be collecting the data on the next update. Good news.

Things are Getting Complicated it Seems

Wednesday, January 23rd, 2013

I was in a meeting today - talking about an interview that was coming up today and the guy looking to make this hire was very explicit about the type of person he was looking to get: Type A. Serious go-getter. You know… me. OK… people like me - and he specifically talked about hiring from the finance industry here in Chicago. I smiled - maybe even giggled a little bit, and he asked me what that was about.

I explained what had been on my mind for a while: Was this place really ready for the kind of change that these kinds of hires were going to represent? I was very specific about the fact that I honestly had no idea what the long-term effects were going to be, but just based on my experience, I knew that there was certainly going to be a division based solely on physical things like hours worked, and hours not worked while playing ping-pong.

These simple things, and the drive, intensity and others are going to make it clear to everyone that these new developers are not at all like the others here now. My fear is that the critical developers we have now will see this, sense the winds of change, and decide that this is no longer the place they know and love, and simply leave. It won't have to be all at once, but if the right few leave, then it'll be very hard to continue the development of the existing web apps, and then more significant changes will have to take place.

It could easily turn into an avalanche where a few Type A groups make enough of the old guard uncomfortable, and that forces even more of the new to replace/retool/re-work the old apps, and that makes even more old guard jittery, etc. It can be a nasty feedback cycle that causes a massive change in the landscape in a very short time.

So I was talking with my manager about this, and it quickly got to the point that we needed to talk about what's happening with me and this group.

They love the work I do. He said he's never seen anyone work anything like what I do. But he also sees that the other guys in the group are complaining about the fact that I'm taking ownership of too much - making it hard for them to feel involved in the project. I explained myself, and it's clear that we had the same situation that I've had several times before - I work to the point that others feel threatened, marginalized, and then they resent me even though there's nothing to resent. Not really.

I'm working on the things we, as a group, need to do. I'm doing my best to do this in the same way that I'm seeing them set by example. After all, I'm no great ruby coder - or clojure coder. I'm looking at what they do - how they do it, and following along as best I can.

I just happen to do it about four times their speed.

In the beginning, I was "cute" and interesting. Now that I've gotten up to speed, I'm not so cute, and I'm becoming a real annoyance to them. I get it. I really do. But I'm not doing any of this to be a trouble-maker. I'm doing this because I think this is what I should be doing. This is just how I work.

I'm 51, for Pete's sake! Aren't they, the 20-somethings, supposed to be showing this "Old Man" how things are done? Why are they crying "Uncle!"?

So I have to now work harder at including them. I do. I don't take things over. I don't change their code without serious reason, and when I do I document the heck out of 'why'. I am trying to fit in, but it's not turning out that way.

My manager is talking about finding something for me to do that's outside this, and building a team around me. But I know the key factor is that what I'm on now is the most important project he's got, and to take me off that is just crazy in management's mind. I'm the one that's making everyone breathe easy on this one. So to take me off now just isn't happening.

But I'm a problem for the guys in the group, it seems.

I wish I knew what to do - other than the obvious. The thing that pops to mind is a simple group chat. Plain, simple, clear the air. I don't expect anyone to change their minds, but it will at least let them know that I mean them no harm, and that it's all up to them. They get to choose how hard to work. They get to choose when to work. They choose.

Not me. Them.

And when they realize that - that it's less about me than them, then maybe - just maybe, they'll start to see that it's not me that's the problem.