Archive for the ‘Coding’ Category

Mega Update Morning

Wednesday, January 23rd, 2019


This morning - well, actually, last night - Apple released macOS 10.14.3, iOS 12.1.3, and tvOS 12.1.2, so I had a good morning updating my devices. It always puts me in a good mood to update machines... it's a fresh start... a new beginning, and it makes me smile.

I am glad to have updated to the latest Apple TV software because the TV app is really a great thing for keeping track of the shows that I'm watching on Netflix and Hulu - all in one place. And it syncs to my iPhone just in case I'd like to watch on the train home.

All updated and ready to take on the day!

Simple Immutable Data in Postgres

Monday, January 14th, 2019


A friend asked me today for the trick I've used in the past to make sure that the data in a database is immutable and versioned. This is nice because it matches how Clojure treats data internally, and it makes it easy to see who's doing what, and when. Assume we start with a table - for the sake of argument, let's say it's customer, and looks something like:

  CREATE TABLE IF NOT EXISTS customer (
    id             uuid NOT NULL,
    version        INTEGER NOT NULL,
    as_of          TIMESTAMP WITH TIME ZONE NOT NULL,
    last_name      VARCHAR,
    first_name     VARCHAR,
    PRIMARY KEY (id, version, as_of)
  );

where the first three fields are really the key ones for any table to work with this scheme. You need a unique key - the easiest I've found is a UUID - and then you need a version and a timestamp for when that change was made. What's left in the table is really not important here, but it's something.

You then need to make an audit table that has the same structure, but with the string _audit added to the name:

  CREATE TABLE IF NOT EXISTS customer_audit (LIKE customer INCLUDING ALL);

We then need to create the following trigger, which intercepts the INSERT and UPDATE commands on the customer table and places the historical data into the audit table, so that the most recent version of the customer data is always kept in the customer table.

The trigger looks like this:

  --
  -- create the trigger on INSERT and UPDATE to set the version, to keep
  -- every version in the audit table, and to keep only the current version
  -- in the non-audit table. Importantly, no version is ever lost - the
  -- full history is always in the audit table.
  --
  CREATE OR REPLACE FUNCTION audit_customer()
  RETURNS TRIGGER AS $body$
  DECLARE
    ver INTEGER;
  BEGIN
    -- get the advisory lock on this id
    PERFORM pg_advisory_xact_lock(('x' || translate(LEFT(NEW.id::text, 18),
                                  '-', ''))::bit(64)::BIGINT);
 
    -- get the max of the existing version for the data now
    SELECT MAX(version) INTO ver
      FROM customer_audit
     WHERE id = NEW.id;
    -- and bump it up one and use that
    IF ver IS NULL THEN
      NEW.version := 1;
    ELSE
      NEW.version := ver + 1;
    END IF;
 
    -- if an update, then we just need to insert the new version
    IF TG_OP = 'UPDATE' THEN
      -- now let's insert the new row into the audit table
      INSERT INTO customer_audit
        VALUES (NEW.*);
    ELSIF TG_OP = 'INSERT' THEN
      -- now let's insert the new row into the audit table
      INSERT INTO customer_audit
        VALUES (NEW.*);
      -- and delete the old one in the customer table
      DELETE FROM customer
        WHERE id = NEW.id
          AND version <= ver;
    END IF;
 
    -- finally, return the row to be inserted to customer
    RETURN NEW;
  END;
  $body$ LANGUAGE plpgsql;
 
  CREATE TRIGGER set_version BEFORE INSERT OR UPDATE ON customer
    FOR EACH ROW EXECUTE PROCEDURE audit_customer();

At this point, we can INSERT or UPDATE on customer, and the previous version of that customer will be moved to the audit table, while the most recent version will be held in the customer table.
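To make this concrete, here's a minimal walk-through - the UUID and the names are made up purely for illustration:

  -- insert a brand-new customer; the trigger assigns version 1
  INSERT INTO customer (id, as_of, last_name, first_name)
    VALUES ('9b2dd188-dad9-4369-b5a8-9300b3598f0c', now(), 'Smith', 'Jane');

  -- update that customer; the trigger bumps the version to 2 and audits it
  UPDATE customer
     SET last_name = 'Jones', as_of = now()
   WHERE id = '9b2dd188-dad9-4369-b5a8-9300b3598f0c';

  -- customer now holds only version 2; the full history is in customer_audit
  SELECT version, as_of, last_name
    FROM customer_audit
   WHERE id = '9b2dd188-dad9-4369-b5a8-9300b3598f0c'
   ORDER BY version;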

I have found this very useful, and I've put it in a gist for easy access.

The point of:

    -- get the advisory lock on this id
    PERFORM pg_advisory_xact_lock(('x' || translate(LEFT(NEW.id::text, 18),
                                  '-', ''))::bit(64)::BIGINT);

is to get an exclusive lock on the data for a given id. This is necessary to make sure that updates from multiple services get serialized on the same data. This scheme can't merge concurrent changes - it can only guarantee a sequence of changes to the table, where each one is entirely correct for the time it was entered.
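If you're curious what that expression actually produces, you can run it by hand - here with a made-up UUID just for illustration:

    -- the first 16 hex digits of the UUID become the 64-bit lock key
    SELECT ('x' || translate(LEFT('9b2dd188-dad9-4369-b5a8-9300b3598f0c', 18),
            '-', ''))::bit(64)::BIGINT;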

So... what happens if you have a string as the primary key, and not a UUID? Well, you can use the MD5 checksum of the string as the lock indicator:

    -- get the advisory lock on a general string
    PERFORM pg_advisory_xact_lock(('x' || md5(NEW.wiggle::VARCHAR))::bit(64)::BIGINT);

where the field wiggle is a VARCHAR, and here we are just computing the MD5 and using that as the basis of the lock. Yes, there could be some hash collisions, but that's likely not a huge performance problem, and it's conservative in that we'll over-lock, and never under-lock.
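As a quick sanity check, you can run the expression by hand with any string - the value here is made up, and the cast to bit(64) simply keeps the first 64 bits of the 128-bit MD5:

    -- what the MD5-based lock key looks like for a sample string
    SELECT ('x' || md5('ACME Industries'))::bit(64)::BIGINT;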

UPDATE: a friend asked about using an int as the primary key, and in that case, the line would be:

    -- get the advisory lock on an integer id
    PERFORM pg_advisory_xact_lock(NEW.id::BIGINT);

where the column id is an int. Again, we just need to cast it to a BIGINT for the advisory lock call. After that, Postgres does the rest.

AWS Adds ARM Instances to EC2

Wednesday, November 28th, 2018


I was surprised to read that at its yearly conference, Amazon announced that you can now spin up EC2 instances based on their custom ARM CPU. This isn't a complete surprise - face it, Apple is close to launching ARM-based laptops and desktops. It's been batted about in the press for a while, and based on the old quad-fat binaries, the technology is there, and Apple certainly has all the experience to get macOS up and running on ARM.

These aren't necessarily the cheapest EC2 instances - the a1.medium, a 1-vCPU, 2 GiB RAM instance, is $0.0255/hr, which rolls up to $223.38/yr for the instance ($0.0255 × 8,760 hours), while the t3.nano starts at $0.0052/hr. But what's most interesting is that AWS did the math and decided that building their own CPU - and then, of course, their own machines - was the cost-effective way to go. Amazing.

I have to believe that Intel is missing out - or maybe they will stay tied to the x86 chipset and ride that for all it's worth. Who knows... but it seems like a real opportunity they're letting slip by. And how long can it be before we see laptops and desktops based on ARM? Not long.

SubEthaEdit 5 is Open Source

Wednesday, November 28th, 2018


This morning I saw a tweet from SubEthaEdit saying they were Open Sourcing the editor - and that the current version, SubEthaEdit 5, was still on the Mac App Store, and would now be free. This was a real surprise to me. I've paid for several versions of this collaborative editor on the Mac - heck, I've written syntax highlighting definition files for Make and Fortran for this editor. It's been a big part of my toolset over the years.

I have worked with my good friend on Macs for many years, and when this first appeared - as Hydra - I thought it would be a great tool for working on code with him. But it was commercial, and we were in different states, and we hadn't even started using Git - GitHub wasn't even an idea at the time. So it just fizzled out.

But several times in the last 5 years we've both talked about getting something like this going for remote pair coding. It's just an editor, and he's now using Cursive for his Clojure coding, so again, maybe it's not such a great fit... and there are other services going for an add-in mode for existing editors, so maybe it needs to be updated to really find its market. If so, I think that would be great.

I hope it finds a great group of developers now that it's Open Source. I'd love to have a good tool that's really written to handle the collaborative editing from the jump. Then again, I'm not all that sure what we'd need above GitHub... but it's an admirable goal.

Paw is a Great REST API Tool

Wednesday, November 28th, 2018


This morning I noticed that Paw 3.1.8 was released, so I updated right away - it's about the best tool I've ever used for testing and exercising REST APIs on any platform, and on the Mac, it's just gorgeous. This is a tool that I used constantly for years when working on Clojure REST services. It allowed me to have variables for each call, and to group them into environments, so that it was easy to switch from development to local to production and see the different responses - where the variables would include the name of the host, etc.


Postman is nice - and it's got a lot of features - but it isn't a native Mac app, and it's tied to the UI and workflow of a web app. Which is fine, and I've used Postman a lot, but once I started using, and configuring, Paw, it wasn't even close. This is how Mac apps - tools - should be written. Sure, it's not cheap, but good things rarely are.

I still smile when I pull up the sets of calls and remember how easy it was to load up a problem request, fire it off, document what was happening, and then see it in the logs... well... this was one of the tools that really made that job a dream.

Fixing Sublime Text 3 Clojure Markdown Blocks

Wednesday, November 21st, 2018


Today I was looking for a solution to a problem I saw in the syntax highlighting of Sublime Text 3's Markdown files when there were Clojure code blocks in the file. All the other code blocks I had been using were highlighted with a different background, and the text in the block was highlighted according to that language's rules. But not so with Clojure.

So I asked on the Sublime Text Forums about the issue and to my amazement, I got a response! The response was clear about what I needed to do, and while there was a slight issue with the installation of a package, I solved that with a git clone in the directory, and I was in business.

At the time, I also submitted an issue with the Sublime Text GitHub group to let them know I was having a problem - very much like the Forum post. When I got an answer on the Forums, I updated the GitHub issue and included the answer I got from the fellow on the Forum. He didn't sound like he was tied into the GitHub group.

Mistake #1.

I then got a response on the Issue kinda poking me to make a PR, since I had an answer that worked for me. And me, trying to be helpful to the folks that had helped me, wanted to respond. It was just the tone of the request that was a little off. I should have listened to that voice.

Mistake #2.

So I made the Pull Request on the project, put in the comments, etc., and submitted it. At this point, I really want to thank GitHub for making a tool with the most amazing workflow I've ever used. The ability to fork, make a PR on a GitHub repo, and have the fork linked to the PR so that updating one updates the other is just amazing. I know it's not an impossible feat of engineering - but it is very nice, and that's so nice to see.

The PR, as you can see, is really a very specific YAML file section, and there were, of course, no comments in it. I had no idea that the elements had changed, so when the reviewer chastised me for not looking at the rest of the file and detecting the changes, I got a little prickly. But I said to myself "Take it easy... this is a simple mistake", and I apologized on the PR, and made the changes.

At the same time, the guy who helped me on the Forum chimed in and threw me under the bus - saying "Yeah, he didn't do the right thing - I gave him something that worked on the release version, but not master". I love it when people do this. It's such a comforting thought to be in an industry of people with such high integrity.

So that was fixed, and I thought "OK... enough of this, moving on..."

Mistake #3.

I then got a note saying that I hadn't read the rules of PR submissions, and that I needed to write rendering tests for this new code block. And believe me, these are not easy, and they are not trivial - all this for something where I should have just said "Sorry, I'm too busy, you can do the PR on your own."

But I read up on the tests - how to write them - and they are stupid. But I did it. I wrote the rendering tests and added them to the PR, and they all passed. Which was nice. So now I'm thinking "OK... this is finally over." But when will I learn?

Then a maintainer came along and said the PR would have to be put on hold because they didn't understand something about what was happening in the code - something causing an issue unrelated to the PR, but which the PR would make worse, or something.

So... I try to be a nice guy after someone has been nice to me... you know... passing it on... and for that, I get to deal with the Slashdot Kids living in their parents' basement and holding some power trip over poor saps like me that try to help folks out.

But I have a fix. I know what to do for subsequent releases, and I'll never do this again with these folks. Lesson learned.

iTerm2 is Quite Impressive

Tuesday, November 13th, 2018


I've been using iTerm2 since it was forked from the original, as there were things in Terminal.app that I just didn't like at the time: forced scroll bars, difficult selection of words... lots of little usability and chrome things. Now it's true that since then, Terminal.app has gotten a lot better - and on each new macOS upgrade, I always give it a try for a little bit... just to make sure it's still not the tool I'd like to use. But for quite a while, it's been iTerm2.

So why write about it now? Well... with version 3.2, they have used Metal to make the text rendering amazingly fast and the scrolling super smooth. This makes the overall appearance a real treat. Just amazing, really.

Now I'm going to see what the status is for BBEdit and Sublime Text 3 - because this kind of scrolling and updating is really quite impressive!

UPDATE: Sublime Text 3 seems to be using the GPU for rendering! Great! No need to worry about that. I know that there are new MacBook Pros coming out this month with the new GPUs, and now may be the right time to look at upgrading!

Finished an Online Course

Monday, November 5th, 2018


This is interesting... I just finished an online course about Data Science, covered by The Shop, in an effort to be able to reach across the divide that currently exists between the science group and the engineering group. It doesn't need to exist, but it's there, and I was hoping that by taking this course, I'd be seen as trying to reach out. Maybe help things a little.

The class was meant to be 5 weeks, and from the sound of it, it was going to be mentored by some folks here in the science group. Again, it sounded like just what I wanted - bonding experiences in class, and all that. Good. But when I signed up for the class, it was clear that it was offered by a larger institution and wasn't really mentored by folks here - as much as we would have one-hour meetings each week about the content of the course for that week.

So not at all what I was hoping for. But I couldn't really get upset about the course - it was exactly what it said it was, I had just assumed facts without checking them first. That's all on me.

The course was focused on understanding the basics of Data Science work: installing and running R and RStudio, working with Git and GitHub, and a few shell commands. Each week of work was about 25-30 mins of videos to watch - which isn't a lot if you want to teach someone shell commands - but for an introduction, it's not bad.

But it got me thinking about a real Data Science class for The Shop. These developers all understand math, calculus, all that... and they know the tools... so what about really teaching them something? That would be a class to sit in on. So I sent the idea to my group just as a "This would be nice..." thought.

I guess this will be my first grade after my PhD, which is, in a way, very funny to me. But it's done, and now it's time to see what's next.

One of my Favorite Comments

Tuesday, October 30th, 2018


Many years ago, I built a market data server based on the Bloomberg API that was available on Solaris - back in the days when the Bloomberg Terminal ran on Solaris. In that code, I needed to solve a problem of thread starvation, and in order not to confuse the Next Guy - which could have been me - I made sure to comment the code that gets a reference to the "housekeeping mutex" in the object. This was all C++.

To this day, this is one of my favorite comments. It's exactly the kind of comment I want to write - and the kind I tell people they can write:

/*
 * When any client or chat command wants to do something with the
 * server they need to get my housekeeping mutex and place a read
 * lock on it. The reason for this is that we are going to have
 * times when the controller knows that things aren't really stable
 * in the system and therefore we need to hold off on doing certain
 * things. Since this is a read-write lock, most of the time things
 * will run along swimmingly, but when there is maintenance underway
 * we will obtain a write lock and make the normal clients wait for
 * the maintenance to be done.
 */
CKFWRWMutex *BBGServerController::getHousekeepingMutex()
{
    /*
     * With so many readers (clients) hitting the server at all hours,
     * we run into the problem that the write lock is almost impossible
     * to get. This is further hampered by the fact that the pthread
     * read-write mutex doesn't specify the writers as having a higher
     * priority than the readers. So we can get a writer starvation.
     *
     * The solution is to have another mutex "in front" of the read-write
     * mutex that controls everything. The way this works is that all
     * clients need to call this method to get the housekeeping mutex.
     * The first thing they'll need to do here is to get a lock on the
     * "out of order" mutex and then release it right away. Most of the
     * time this will happen with little to no delay. However... when the
     * major housekeeping tasks arise, they lock this guy and *leave it
     * locked* so that the write lock that comes after it can wait for
     * the pending readers to get done and *then* obtain its write lock.
     * 
     * When the housekeeping maintenance is done, we unlock the write lock
     * and then unlock the "out of order" lock and this method can resume.
     * It's a clean process that allows me to prioritize the write lock
     * above the readers which is as it should be in this application.
     */
    mOutOfOrderMutex.lock();
    mOutOfOrderMutex.unlock();
 
    return & mHousekeepingMutex;
}
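
The maintenance side of this protocol would look something like the sketch below - the method name is hypothetical, and I'm assuming CKFWRWMutex has calls along the lines of writeLock()/unlock(), so treat this as a sketch of the idea, not the original server code:

/*
 * Hypothetical sketch of the maintenance side of the protocol - the
 * method name and the exact mutex calls are assumptions, not code
 * from the original server.
 */
void BBGServerController::doMaintenance()
{
    // new readers now block in getHousekeepingMutex()...
    mOutOfOrderMutex.lock();
    // ...while the pending readers drain off, and then we get the write lock
    mHousekeepingMutex.writeLock();

    // ... do the maintenance that needs a stable system ...

    // release in the reverse order to let the readers flow again
    mHousekeepingMutex.unlock();
    mOutOfOrderMutex.unlock();
}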

Writing comments like this just makes me smile.

Simple File Encryption

Friday, October 26th, 2018


This morning I decided to turn on two-factor authentication on my GitHub account using the Authy app on my iPhone. I've been able to use it for different accounts, and it's always mine - not the company I work for - and while I've been using SMS for a while on GitHub and others, I've just decided that it's probably a good thing to get moving on this - given the privacy issues that we are all reading about these days.

The thing that I needed was a simple file encryption bash script/function so that I could store the recovery tokens in a file without having to worry about it getting snatched. So now I've got it. The code is based on openssl, and it's pretty simple:

#
# These are simple functions to encrypt and decrypt files so that I don't
# have to hassle with extreme things in order to secure one file at a time.
# They use openssl to do the work, and it's pretty simple.
#
function jack {
    openssl des3 -in "$1" -out "$1.enc"
}
 
function unjack {
    openssl des3 -d -in "$1" -out "$(basename "$1" .enc)"
}

and this simply allows me to encrypt a file - adding .enc to the end of the filename - and then decrypt it as well, stripping that suffix. Nothing fancy, but it really works just great.
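
Using them looks like this - the filename here is hypothetical, and openssl prompts for the passphrase on each call:

$ jack recovery-codes.txt          # writes recovery-codes.txt.enc
$ rm recovery-codes.txt            # keep only the encrypted copy
$ unjack recovery-codes.txt.enc    # prompts again, recreates recovery-codes.txt

One thing to keep in mind: because unjack uses basename, the decrypted file lands in the current directory, wherever you run it from.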