Archive for the ‘Coding’ Category

The Illusion that Configuration isn't Code

Wednesday, June 3rd, 2020

Building Great Code

One of the Finance Shops I was at had a Post-Trade Management system, built in Java, with all kinds of interesting capabilities to aggregate, filter, and plot the positions at different levels - and folks really seemed to like it. One of the key components of that system was a UI toolkit from a consulting shop that was driven entirely by XML configuration files.

I can remember adding a few lines of XML and getting a very complex dialog box in the app, and thinking - there is no way that's a 1:1 mapping of the config! And I was right. These were heavily built library modules, and a few lines of config really added entire subsystems and connections for those UI elements.

I've also worked at a Dot-Com Shop where the complete build of a machine was described in a single YAML file: all the user accounts, all the software packages, and all the configuration for those packages - everything sitting in one YAML file.

I'm currently looking at a CI/CD Pipeline that is completely specified by a YAML file - including shell commands, with options and variable expansion. It's understandable why the developers of each of these systems chose XML or YAML for their configuration files: there are loads of solid, reliable parsers, the files can hold simple data structures, and in those data structures you can embed shell commands and then do variable expansion... so it makes sense.

What concerns me is that so many developers seem to feel that these configuration files are not nearly as important as the code backing them. That it's easy, safe, and simple to change the configuration and try it again... and maybe for some things it is... but chances are, if your configuration is really your code - then you need to treat it as such.

For example, it is very common to have multiple layers of configuration files. Maybe there's a company level, and overlaid on that is a project level, and overlaid on that is an environment level. These probably get parsed and then merged one on top of another, so certain things can be set at one level and others at another, and the sum total of all the config data is what's used.

What could go wrong?

Well... imagine if one component of the config is a list of strings that have to be processed in a specific order - let's say top to bottom, as they appear in the file. But then another of the layered config files has another list - with the same name - maybe it's a typo, maybe it's not. How are these merged?

Does the upper file overwrite the lower one? That's one way, but then it's hard to make sure that those lower-level commands are being run/parsed properly. It could lead to duplication in the upper-level files, and that's not really the point of the stacking, is it?

What if you simply append the upper entries to the lower entries? That could be just as bad because the writer of the lower ones may be making assumptions about the state left after the processing of the upper file.
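
To make the ambiguity concrete, here's a minimal Clojure sketch of the two merge strategies - the keys and values are made up, but both strategies are one-liners, and they give different answers:

  (def company {:steps ["install" "configure"]})
  (def project {:steps ["deploy"]})

  ;; last-one-wins: the project layer silently drops the company steps
  (merge company project)
  ;; => {:steps ["deploy"]}

  ;; append: the project steps simply run after the company steps
  (merge-with into company project)
  ;; => {:steps ["install" "configure" "deploy"]}

Neither answer is wrong - that's the problem. The merge policy is an unwritten part of the "language" these files are written in.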

In short - having configuration files store data structures is fine, and it's useful... but having them include what amounts to executable code means that they need to be treated like code. And can you imagine writing a function that's layered from multiple files, and then executed at runtime? The difficulty in tracking down errors would more than offset any gains in reuse.

So if you want to have layered configuration files - Great! Just leave it to data that's easily flattened, and tested... but if you're going to have it include executable code - make it simpler - a single layer. You'll be glad you did. 🙂

Being Kind to The Next Guy

Wednesday, June 3rd, 2020

It's always good to be kind to the "Next Guy" who has to pick up your code. Writing clean code, and making sure it's readable and understandable is very important, but there's really far more to it than that, and I was reminded about that in Slack this morning, and I wanted to write it down so that I wouldn't forget about it.

When a project gets so large that the tests take more than, say, 15 minutes to run, oftentimes a developer will disable the tests because they really get in the way of making progress. This makes perfect sense. But then you run into a problem: the more you do this, the more likely it is that someone will check in code that fails one of these disabled tests, and then you have a broken build.

Sure, some will say "That's OK... that's what the CI/CD Pipeline is for." - which could be true, until the tests take long enough that multiple people are submitting pull requests and don't want to wait for the full set of tests to run, and so they move on. But then there's a failure, and another team member has changes that need to go in, and their tests passed, so the first developer has to let the other dev go through, and then update their pull request based on the new code - and that's frustrating to them.

So now, developers don't want to add tests, because that just makes the situation worse, and now there's less coverage on the code, and less reliability on the system overall, and it just starts to slide downhill. But there's more... if they are waiting too long for tests, they could be skipping writing good docs or commit messages because they know they have a long wait ahead of them, and they aren't thrilled about that.

Take away good tests, complete tests, and good docs, and now we're really starting to erode the quality of the code - simply because things take too long to be comfortable, and it's perfectly understandable. But we each need to push through that initial response, and work to make the tests faster, and the documentation better, and make sure that the "Next Guy" who works in this code gets the best possible codebase they can have.

GitPod has Impressive Equipment

Sunday, May 17th, 2020

Today I was spending a little time moving another of my little projects to GitPod, and I was amazed at the execution speed I was seeing, compared to my 2019 decked-out 16" MacBook Pro. I mean Wow! On my MacBook Pro, the test ran in:

  $ make test
  ./quip 'Fict O ncc bivteclnbklzn O lcpji ukl pt vzglcddp' -kb=t -fwords
  [31967 us] Solution: When I see thunderstorms I reach for an umbrella

which isn't bad... 31.9 msec. About what I get on the ObjC code as well. And when I ran it on the GitPod instance:

  gitpod /workspace/Quip $ make test
  ./quip 'Fict O ncc bivteclnbklzn O lcpji ukl pt vzglcddp' -kb=t -fwords
  [26967 us] Solution: When I see thunderstorms I reach for an umbrella

So the instances they are fronting are at least as good as, if not better than, my recent MacBook Pro in single-core performance. That's impressive. Another good reason to look at doing a little more coding remotely.

Interesting Interoperability C++ to Java

Monday, March 9th, 2020

I have always hoped that GraalVM would be the thing that really launched Clojure, and Java, into something fast - like really fast. But then there was the licensing issue: GraalVM is licensed by Oracle, so any production deployment was going to cost real money. But even then... for a company, that might not be bad - especially when you have to worry about performance, and you are looking at JNI, or a mixed-mode environment with C++.

So it was very interesting to me to see this article about calling Java from C++, with GraalVM compiling the Java to a shared library (and header) that can then be linked in with C++ like any other shared library.

The key is that this would completely avoid JNI and its complications - in coding, versioning, and execution speed. I can imagine having lots of Java code, built and tested in the normal Java way, and then compiled into a shared library for inclusion in a much more performant C++ system - for those engines that have to run for long periods of time without restarts, be very aware of resource management (memory, ports, etc.), and yet be built in a simpler language than C++.
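
I haven't built one of these myself, but from the article, the entry points are just static Java methods annotated for export - something like this sketch, where the class and function names are hypothetical:

  import org.graalvm.nativeimage.IsolateThread;
  import org.graalvm.nativeimage.c.function.CEntryPoint;

  public final class PriceLib {
    // exported as a plain C symbol in the generated header, so the
    // C++ side can call it like any other shared-library function
    @CEntryPoint(name = "price_risk")
    public static double priceRisk(IsolateThread thread, double notional) {
      return notional * 0.0042;   // stand-in for the real Java logic
    }
  }

Building with native-image --shared should then produce the shared library and a header declaring price_risk(), with the C++ caller creating an isolate first - via the generated graal_create_isolate() - before making calls.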

The article is interesting, but again... with the licensing, it's only really useful for organizations with revenue streams to support it.

Trying out a new Editor Font

Tuesday, January 21st, 2020

A few days ago, I read about the new JetBrains Mono font that they created for their IDEs, and it looks interesting with the increased height of the lower-case letters, and the easily readable "1lI" combination... so I decided to give it a whirl in my editor, and in Xcode.

I have to say I like how Sublime Text and Xcode handle all the weights of the font, and the ligatures for common expressions make the code a lot easier to read - as if I'd written it on a chalkboard. Very nice indeed.

[Screenshot: JetBrains Mono for Clojure]

I'm not sure how it'll play out, but it's pretty nice so far, and I'll play with it for a bit and see what I think. So far... nice. 🙂

Fantastic Lighthearted Javascript Graphing Package

Monday, November 18th, 2019

This morning I was reading the newsfeeds and came across probably my favorite Javascript graphing package: Chart.xkcd. The idea is that it can be used to create those seemingly hand-drawn charts that the xkcd comic so often includes as part of its work. But this is easily put into React and other web frameworks, and dropped onto web pages for that casual look that brings a completely different feel to the data being presented.

From the simple docs on the web page, it seems pretty straightforward... you have to set up all the data for the graph, and then it just renders it. There are some nice mouse-over features as you dig a little deeper, but it's the casual nature of the presentation that really appeals to me:

[Screenshot: an example Chart.xkcd line chart]
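
Going from memory of the project's README, the setup is roughly this - the data is invented, and the option names are worth double-checking against the docs:

  // assumes an <svg class="line-chart"></svg> element on the page
  const svg = document.querySelector('.line-chart');
  new chartXkcd.Line(svg, {
    title: 'Coffee consumed while debugging',   // made-up example data
    xLabel: 'Day',
    yLabel: 'Cups',
    data: {
      labels: ['Mon', 'Tue', 'Wed', 'Thu', 'Fri'],
      datasets: [{ label: 'Me', data: [2, 3, 5, 8, 13] }],
    },
  });

All the styling - the wobbly lines, the hand-drawn lettering - comes for free, which is exactly the appeal.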

I don't have a project yet that's in need of this, but I can feel that there will be something coming soon that could really use this less-serious approach to data presentation. After all, not everything is a publication-ready presentation.

Fun RESTful Paging with Bash and jq

Tuesday, October 22nd, 2019

I was pulling data from a RESTful API, and the returned data was paginated, as such data often is, and I wanted to process it all in bash so I could share it with a large community. But how to do that efficiently? I'm not worried about the time it takes to process data in bash - just about making sure we're not fetching the same data multiple times. What I came up with was pretty interesting.

First, let's look at the body returned by the RESTful call. Standard JSON, with the data in the payload as well as a nextLink key:

  {
    "events": [ { "dish": "Stevie is at Poppers!", "userId": "ralph" },
                { "dish": "Ralph is at Waffle House", "userId": "dorris" }
              ],
    "nextLink": "https://yoyo.com/api/rumors?page=4"
  }

where the events key is really holding the important data from the call, but because there was too much data to get in one (efficient) call, the service gave a complete link to hit to get the next page of data. Simple.

Now some APIs return a page number, or a placeholder value to pass back to the same URL that generated this data - which saves maybe 100 bytes - but that's just a different way of getting the "next" page to the caller.

So... how to put all this in bash? Let's start with jq, and you can get it from Homebrew, with:

  $ brew install jq

and then we can look at the basic loop over all pages:

  # this is my YoYo API Token - you can get one too
  tok="111111111111111111111"

  # dump all the rumors into a file
  hit="https://yoyo.com/api/rumors"
  while [ "$hit" != "null" ]; do
    # make a temp file for the details of this page of the rumor group
    rumors=$(mktemp /tmp/yoyo.rumors.XXXXXXXX)
    # get the page, and possible next page URL...
    curl -s -X GET "$hit" \
       -H "accept: application/json" \
       -H "X-API-Token: $tok" > "$rumors"
    # write out as CSV, the data from this page
    for usr in $(jq -r '.events[] | select(has("userId")) | .userId' "$rumors"); do
      echo "$title,$usr"
    done
    # get the next URL to load from the service (-r drops the quotes)
    hit=$(jq -r '.nextLink' "$rumors")
    # clean up the file as we don't need it any more
    rm "$rumors"
  done

In this example, the $title variable is not defined here - it would presumably be set in the script before we get to the loop, whatever the point of gathering the data really is. Also note that jq's -r flag gives us the raw strings, without the surrounding quotes.

What I enjoyed about this is that we can get the data from the service with a simple curl command, and then process it with jq - and the flexibility of jq is really quite impressive. In just a few lines, I was able to make a script that fetched a lot of data from a service and extracted what I needed into a CSV for processing in a spreadsheet, and it didn't take all that long. Winner. 🙂

Getting Apache 2.4.41 + PHP 7.3.8 Going on macOS 10.15 Catalina

Tuesday, October 8th, 2019

This morning I thought I'd perform the ritual of getting the old web development tools that I've used in the past going again - this time on macOS 10.15 Catalina. Now I haven't used PHP in ages, but I've still got code that uses it, and Postgres databases behind that - so it makes sense to get this all working again, and it's always fun to see how things work out.

Getting Postgres 11.1

Loads of coverage here about Postgres, and it's just so simple to get the latest version from Homebrew:

  $ brew install postgresql

I've even posted how to upgrade from major version differences, so it's easy to get the latest Postgres running on your box, and the tools are just superb.

Activating UserDir in Apache 2.4.41

As in the previous updates, the UserDir extension is not enabled by default, so we need to get that going right away. This enables the code to be run from the development directories, and that's a big time-saver. First, we need to enable the UserDir module in Apache, and then make a specific config file for the user in question. Start by editing /etc/apache2/httpd.conf and line 183 needs to be uncommented to read:

  LoadModule userdir_module libexec/apache2/mod_userdir.so

and then similarly on line 520 uncomment the line to read:

  Include /private/etc/apache2/extra/httpd-userdir.conf

Next, make sure that the file we just included is set up right for including the user directories. Edit /etc/apache2/extra/httpd-userdir.conf and line 16 needs to be uncommented to read:

  Include /private/etc/apache2/users/*.conf

At this point, you need to make sure you have at least one file in the /etc/apache2/users/ directory for each user - like drbob.conf:

  <Directory "/Users/drbob/Sites/">
      Options FollowSymLinks Indexes MultiViews ExecCGI
      Require all granted
  </Directory>

where the last line - Require all granted - is new as of Apache 2.4, and without it you will get errors like:

  [Thu Dec 18 10:41:32.385093 2014] [authz_core:error] [pid 55994]
    [client fe80::7a31:c1ff:fed2:ca2c:58108] AH01630: client denied by server
    configuration: /Users/drbob/Sites/info.php

Activating PHP in Apache

The next thing to do is to activate PHP in the Apache 2 supplied with macOS 10.15. This is line 186 in the same file - /etc/apache2/httpd.conf - and you need to uncomment it to read:

  LoadModule php7_module libexec/apache2/libphp7.so

and then verify a file called /etc/apache2/other/php7.conf exists and contains:

  <IfModule php7_module>
    AddType application/x-httpd-php .php
    AddType application/x-httpd-php-source .phps
 
    <IfModule dir_module>
        DirectoryIndex index.html index.php
    </IfModule>
  </IfModule>

which does all the other PHP configuration in a separate file to make upgrades easy.

Finishing Up

At this point, a simple restart of apache:

  $ sudo apachectl restart

and everything should be in order. Hit a URL that maps to a file with the contents:

  <?php
    phpinfo();
  ?>

and you should see all the details about the PHP install - including the PostgreSQL section with the version of Postgres indicated:

[Screenshot: macOS 10.15 PHP configuration (phpinfo output)]

What's really great is that Apple has included lots of support in the default PHP install:

  • PHP 7.3.8
  • Postgres 9.3.7
  • MySQL 5.0.12
  • SQLite3 3.28.0

so there's no reason to do anything more to get the kind of support that I used to get. And I get the other databases for free. This is great news! I then run my little test page to make sure the database connection is working:

[Screenshot: macOS Catalina database test page]

and everything is working exactly as expected!

Frustration with Speed Coding Interviews

Wednesday, September 18th, 2019

Yesterday I had an interesting phone screen with someone, and the seemingly common practice of asking a candidate to code on the phone - in a shared web-based environment - came up again. I recognize that any employer can have any legal criteria for employment, and the "Coding Phonescreen" is a very common one. You get to see if the person can write code in a few minutes, as opposed to inviting them for a day-long interview cycle that costs considerably more. It's decent logic.

But it really doesn't tell the story, does it?

Speed Coding has about the same relationship to real development as a college classroom has to Jeopardy!... yeah, the material is the same, but the skills to be able to do well in one do not necessarily translate to the other. And the most critical skill in the speed forms is pattern recognition of the problem.

If you've seen this problem before, and written a simple solution to it, then you're in good shape. You know the flow, you know the pitfalls, and you can talk your way through it - like you're talking your way through directions to a local restaurant. No pressure, you're showing someone something you know, and it happens to take a few steps. No biggie.

But if you're at all unsure, then you're not going to get time to think about the solution before being expected to answer it. This is the problem with Speed Coding - if you know the answer, it's fine. But then it's not really seeing if you can think on your feet... if you don't know the answer, you're likely going to make one or two edge-case mistakes, and those will be clearly visible to the person that knows the solution.

The problem I ran into was a binary tree issue, and while I had been practicing my Speed Coding in Clojure, the nature of the binary tree really leaned towards a C++ solution - not horrible, but a lot less friendly to edge-conditions.

I ended up writing something like this:

  struct Node { int value; Node *left; Node *right; };

  // op is the expected ordering between parent and child values,
  // e.g. std::less<int>()
  template <typename Op>
  bool stored(Node *me, Op op) {
    bool   retval = true;
    if (me->left != NULL) {
      retval = op(me->left->value, me->value) && stored(me->left, op);
    }
    if (retval && (me->right != NULL)) {
      retval = op(me->value, me->right->value) && stored(me->right, op);
    }
    return retval;
  }

and the missed edge-case is that once you are one or more steps down in the tree, it's possible for the relative position of the values to be correct, but the absolute position to be wrong. There are two ways to solve this in this code:

  1. Pass limits down with the call - this could be done with max and min arguments, and then in the recursive calls, place the correct values there and test them as well (a sketch of this follows the list).
  2. Scan up the tree on each check - this could be a walk up the tree to check that you aren't in violation of the location you have. This would take more time because of all the walking, but it'd work.
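
For reference, option 1 is only a few lines of C++ - this is just a sketch, assuming the same Node as above, an increasing order, and a function name (storedInRange) I made up:

  #include <climits>

  bool storedInRange(Node *me, int lo, int hi) {
    if (me == NULL) return true;
    if (me->value < lo || me->value > hi) return false;
    // each subtree gets a tighter window it has to live within
    return storedInRange(me->left, lo, me->value) &&
           storedInRange(me->right, me->value, hi);
  }

  // callers start with the full integer range:
  //   storedInRange(root, INT_MIN, INT_MAX);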

But what I'd really wanted was to write it in Clojure - the data structure just didn't jump out at me. Until this morning. 🙂 This morning I spent the minute or two thinking about the data structure, and then formulated the following solution:

  ;; [val left right]
  ;;       10
  ;;    5      15
  ;;  1   7  12   20
  (def good [10 [5 [1] [7]] [15 [12] [20]]])
 
  ;;       10
  ;;    5      15
  ;;  1   17  12   20    -- the 17 is out of place
  (def bad [10 [5 [1] [17]] [15 [12] [20]]])
 
  (defn sorted?
    "Function to check a binary tree to see if it's sorted for proper
     searching."
    [[v lt rt] op]
    (and (or (nil? lt) (and (every? identity (map #(op % v) (flatten lt)))
                            (sorted? lt op)))
         (or (nil? rt) (and (every? identity (map #(op v %) (flatten rt)))
                            (sorted? rt op)))))

  ;; (sorted? good <)  => true
  ;; (sorted? bad <)   => false ... the 17 fails the root's test

What I really like about this solution is that it checks the entire subtree against the operation at every node. This means that the effort to check one node is really checking all of the values below it. This is what I wanted to write, and it works perfectly.

But I didn't force the issue, and pull back and take the time to think. My mistake. I won't make it again.

UPDATE: a friend and I were talking about this same problem, and he came up with a solution that was very clever - the structure can be validated by simply assuming that the structure is a sorted binary tree, and then calculating the min and max values of the tree.

The catch being that if you get to a node where the current value isn't within the min and max, then you have to fail, and return nil. It's really quite amazingly simple in that it's very fast, very easy to understand, and adds the additional benefit of returning the extrema of the tree.

  (defn bst-rx*
    "returns extent [min max] of valid bst or nil if invalid"
    [t]
    (cond
      (nil? t) nil
      (vector? t) (let [[v l r] t
                        lx (if l (bst-rx* l) [v v])
                        rx (if r (bst-rx* r) [v v])]
                    (when (and lx rx (<= (lx 1) v (rx 0)))
                      [(lx 0) (rx 1)]))
      :else [t t]))
 
  (defn bst-rx?
    [t]
    (boolean (bst-rx* t)))

Thrown for a Bit of a Loop

Tuesday, September 17th, 2019

Yesterday morning, I went to check my blog, and to my complete surprise, it was down. Like very down, and I had no idea what was happening. I've been using WordPress and HostMonster for more than a decade, and I've never had anything even remotely like this. I didn't know where to start - so I went to their admin site, thinking I'd been hacked somehow...

First, it was clear that they had done a ton of upgrades to the host and its support platform. Massive changes. So the first thing was to get logged in. This was a little odd because it wasn't working on Safari, and it had been in the past - so I switched to Chrome, and finally got logged in. Step 1 - accomplished!

Then I looked at the installed users on each of the three WordPress sites I have, and in each case, there was a user that I couldn't explain. It certainly appeared to me that these were bad actors, and I had no idea how they got there. I stay up to date, don't allow logins, don't allow replies... it's a Journal more than anything else. But still... I could not ignore these accounts. So I asked for help.

It took a long while to figure this out, but in the end, the logs for the site indicated that there was a PHP problem in one of my plugins, and one of my themes. Why this happened yesterday wasn't at all clear, but it became clear as I dug further.

HostMonster had dropped support for PHP 5.x, and the only versions available to me were 7.0, 7.1, 7.2, and 7.3, with the latter being the default. Now it seemed clear what had happened... nothing in my code had changed, but in all the upgrades to the infrastructure on the hosts, they had switched to a different config for PHP, and the plugin and theme were doing something that they shouldn't. OK... now to the code.

The first one I tackled was the plugin, as the error in the logs was pretty clear. I did a quick search for =& and sure enough, it was a PHP 5-ism, and that was easy to fix. That solved the plugin problem, and it loaded and ran fine. The theme had a different problem with a deprecated function that wasn't really even needed in the theme, but I found a replacement, and used that, and the theme was fine.

All told, this took more than 5 hours... it was not quick. And that's skipping the part where I found out that the timezone plugin I was using isn't needed in WordPress 5, so I didn't put that back into play. Also, once I got the site up, it was possible to see the errors on activation of the plugin (but not the theme), which made continued debugging a lot easier.

In the end, it was all cleaned up, and now it's set for PHP 7. I'm glad that there wasn't a bigger issue, but I really have to be careful of these things because there is almost no support for the plugin and theme - and I really like to have both of them working for me for this blog. 🙂