Archive for the ‘Coding’ Category

Ported the CryptoQuip Solver to Ruby

Wednesday, December 10th, 2014

This morning I finished the little project I started yesterday: porting my old CryptoQuip Solver from Obj-C to ruby. I wanted to brush up on my ruby skills, as it's been nearly 18 months since I've done any real, solid ruby development. I've also wanted to get this codebase ported to Swift, and there are a lot of similarities between ruby and Swift. So this morning I finally got it all working as I wanted. I had made a few changes to try and capitalize on the strengths of ruby, and those changes left a few things that needed to be fixed up.

Nothing major.

But wow... the codebase is significantly smaller. The original Obj-C code was well documented, and comes in at 3370 lines for all the headers and implementation files. The ruby code is not as extensively documented, but it's a lot smaller so I didn't feel the need. It's a total of 316 lines. That's roughly a factor of ten. Wow.

One of the nicest parts of the port was to change how I determined if a cyphertext word could potentially match a plaintext word. In the Obj-C code it looked like this:

  /*!
   One of the initial tests of a plaintext word is to see if the pattern of
   characters matches the cyphertext. If the pattern doesn't match, then the
   decoded text can't possibly match, either. This method will look at the
   pattern of characters in the cyphertext and compare it to the pattern in
   the argument and if they match, will return YES.
   */
  - (BOOL) matchesPattern:(NSString*)plaintext
  {
    BOOL   match = YES;
 
    // make sure that we have something to work with
    if (match && (([self getCypherText] == nil) || (plaintext == nil))) {
      match = NO;
    }
 
    // check the lengths - gotta be the same here for sure
    if (match && ([[self getCypherText] length] != [plaintext length])) {
      match = NO;
    }
 
    /*
     * Assume that each pair of characters is a new map, and then test that
     * mapping against all other cyphertext/plaintext pairs that SHOULD match
     * in the word. If we get a miss on any one of them, then we need to fail.
     */
    if (match) {
      unichar      cypher, plain, c, p;
      NSUInteger   len = [plaintext length];
      for (NSUInteger i = 0; (i < len) && match; ++i) {
        // get the next possible pair in the two words
        cypher = [[self getCypherText] characterAtIndex:i];
        plain = [plaintext characterAtIndex:i];
        // check all the remaining character pairs
        for (NSUInteger j = (i+1); (j < len) && match; ++j) {
          c = [[self getCypherText] characterAtIndex:j];
          p = [plaintext characterAtIndex:j];
          if (((cypher == c) && (plain != p)) ||
              ((cypher != c) && (plain == p))){
            match = NO;
            break;
          }
        }
      }
    }
 
    return match;
  }

where it's a pretty simple double-loop on the words. Nothing horribly hard here, but for ruby I went with a different idea to start with:

  # One of the initial tests of a plaintext word is to see if the pattern of
  # characters matches the cyphertext. If the pattern doesn't match, then the
  # decoded text can't possibly match, either. This method will look at the
  # pattern of characters in the cyphertext and compare it to the pattern in
  # the argument and if they match, will return true.
  def matches_pattern?(plaintext)
    return false unless @cyphertext && plaintext &&
                        @cyphertext.length == plaintext.length
    pairs = @cyphertext.chars.zip(plaintext.chars).uniq
    ctc = pairs.map { |p| p.first }.uniq.count
    ptc = pairs.map { |p| p.last }.uniq.count
    (pairs.count == ctc) && (ctc == ptc)
  end

I like the compactness of the ruby - very expressive, but the question is: at what cost? The ruby version will check all the pairs in the words, even if there is a mismatch in the second letter. This cost may - or may not - be a big deal. But it does simplify the code quite a bit.
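
For comparison, here is a minimal sketch of what an early-exit version might look like in ruby. This is not what the port uses; the method name and the two-hash bookkeeping are assumptions made purely for illustration. It walks both words once and stops at the first pair that contradicts an earlier one:

```ruby
# Hypothetical standalone helper (not part of the original CypherWord class):
# walks both words once and bails out at the first inconsistent pair.
def matches_pattern_early_exit?(cyphertext, plaintext)
  return false unless cyphertext && plaintext &&
                      cyphertext.length == plaintext.length
  to_plain  = {}  # cypher char -> plain char seen so far
  to_cypher = {}  # plain char  -> cypher char seen so far
  cyphertext.length.times do |i|
    c = cyphertext[i]
    p = plaintext[i]
    # a char that is already mapped to something else breaks the pattern
    return false if to_plain.fetch(c, p) != p || to_cypher.fetch(p, c) != c
    to_plain[c] = p
    to_cypher[p] = c
  end
  true
end
```

It trades the compactness of the zip/uniq version for the ability to quit early, which is the same trade the Obj-C double loop makes.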

The downside of the ruby implementation is that it's pretty slow. Where the Obj-C version was in the 60 msec range, the ruby version (1.9.3 MRI) is more like 650 msec. This isn't a total shock, and I could have tried JRuby which will be considerably faster once the JIT gets warmed, but that wasn't the real point. It was just a fun little project to get this into ruby so that it might make it a little easier to get it into Swift.

And I got to brush up on my ruby too.

UPDATE: I tried it with JRuby 1.7.0, and yes, it's old, but it's a point of reference, and the times were worse. I did not expect that. The run times were in the 7 sec range - about an order of magnitude over the MRI version. Same code. Very odd, but I wanted to see what it'd be, and now I know.

UPDATE: I couldn't resist the urge to convert the new ruby scheme to clojure simply because I think clojure is going to be a lot more performant, and so I wanted to give that a try. The key pattern matching function is now:

  (defn matches?
    "Function to see if the cyphertext and plaintext have the same pattern of
    characters such that they could possibly match - given the right legend."
    [ct pt]
    (if (and (string? ct) (string? pt) (= (count ct) (count pt)))
      (let [p (distinct (map vector ct pt))
            ctc (count (distinct (map first p)))
            ptc (count (distinct (map last p)))]
        (= (count p) ctc ptc))))

and in a REPL, it's very easy to see that it's working:

  (matches? "wdllpy" "rabbit")
    => true
  (matches? "wdllpd" "rabbit")
    => false 

I have to admit that clojure is about the most fun language that I've used in a very long time. Ruby is nice, and it's very close - but clojure is just functional - and performant - all the way.

[12/12] UPDATE: my friend wrote me back with a significant simplification to the code:

  (defn matches?
    "Function to see if the cyphertext and plaintext have the same pattern of
    characters such that they could possibly match - given the right legend."
    [ct pt]
    (if (and (string? ct) (string? pt) (= (count ct) (count pt)))
      (let [pc (count (distinct (map str ct pt)))
            ctc (count (distinct ct))
            ptc (count (distinct pt))]
        (= pc ctc ptc))))

I was worried about doing this, initially, because I was thinking that it would be possible to rearrange the letters while keeping the number of distinct characters in each word the same, and so let through pairings that could never actually decode to a match.

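What I realized with his help is that the key is that we have three checks - the count of distinct pairs, and the counts of distinct chars in each of the two words. Every distinct character shows up in at least one pair, so the pair count can only equal both character counts when each cyphertext character is paired with exactly one plaintext character, and vice versa. With all three, it's going to catch all the cases I was worried about.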

Also, I really like how he went to using str as opposed to vector. Very nice. Just as effective, but far cleaner and easier to deal with in the distinct.

I can then go back and look at the ruby code and update the constructor for the CypherWord:

  def initialize(cyphertext)
    @cyphertext = cyphertext
    @cyphertext_chars = cyphertext.chars
    @cyphertext_uniq_count = cyphertext.chars.to_a.uniq.count
  end

and then the matches_pattern? becomes the simpler:

  # One of the initial tests of a plaintext word is to see if the pattern of
  # characters matches the cyphertext. If the pattern doesn't match, then the
  # decoded text can't possibly match, either. This method will look at the
  # pattern of characters in the cyphertext and compare it to the pattern in
  # the argument and if they match, will return true.
  def matches_pattern?(plaintext)
    return false unless @cyphertext && plaintext &&
                        @cyphertext.length == plaintext.length
    pc = @cyphertext_chars.zip(plaintext.chars).uniq.count
    ptc = plaintext.chars.to_a.uniq.count
    (pc == @cyphertext_uniq_count) && (@cyphertext_uniq_count == ptc)
  end
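
To see the three counts doing their job, here is a small standalone sketch. The helper name pattern_counts is hypothetical and is not part of the CypherWord class; it simply returns the distinct-pair count, the distinct cyphertext character count, and the distinct plaintext character count, which are the three numbers the method above compares:

```ruby
# Hypothetical helper for illustration only: returns the three counts that
# the pattern check compares, as [pairs, cyphertext chars, plaintext chars].
def pattern_counts(cyphertext, plaintext)
  ct = cyphertext.chars.to_a
  pt = plaintext.chars.to_a
  [ct.zip(pt).uniq.count, ct.uniq.count, pt.uniq.count]
end

pattern_counts("wdllpy", "rabbit")  # => [5, 5, 5] - all equal, could match
pattern_counts("wdllpd", "rabbit")  # => [5, 4, 5] - 'd' pairs with 'a' and 't'
pattern_counts("wdllpy", "tablet")  # => [6, 5, 5] - 'l' pairs with 'b' and 'l'
```

Only the first call has all three numbers equal, so only that pairing survives the check.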

Yeah... we're beating this Bad Boy into the ground, but it's a lot of fun to be working with my friend on something - even if it's just a 30-year-old problem.

Refreshing my Ruby Skills

Monday, December 8th, 2014

Last week I was in an interview where they wanted me to code in Ruby - nothing else. I wasn't really prepared to write Ruby, as it's been a good 18 months since I've written a line of it - maybe more. But I excused my poor memory of the syntax and dug in. But it got me thinking, and I decided to brush up on my Ruby skills, and have a little fun at the same time.

So I made a new directory, and started grinding the gears on my memory to get things back in line for working on Ruby. This means remembering rvm, and how to even find the version of Ruby for the .ruby-version file in the root of the project. Then there were the .rspec lines for getting nice looking test output - and running the tests in random order every time. And I still hadn't gotten to the code.

So I started working on the programming task I got in the interview, and it was really amazing to me that some of the hints I'd gotten in the interview were really not at all how I wrote it when I had the chance to do it on my own. For example, I needed to take a string, and get the first character, and then the rest - very much like clojure would do it. I was given the hint that "hello".chars would give me an Array of the characters, and on their box it did - but in Ruby 1.9.2, it returns an Enumerator, and that's an entirely different thing. Also, it requires that we allocate new storage.

What I really wanted was: "hello"[0,1] - that's a string of the first character in the string. And then the rest is just: "hello"[1,10], or another long number to get the rest. This is what was sitting in the back of my memory, and when I saw it, a lot of other little things started flooding back in.
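
A minimal sketch of that first/rest split is below. The variable names are just for illustration, and the 1..-1 range is an alternative I'm suggesting to the "long number" trick, since it means "from index 1 through the last character" without having to guess at a length:

```ruby
word = "hello"

first = word[0, 1]   # start at index 0, take 1 character
rest  = word[1..-1]  # from index 1 through the last character

puts first  # prints "h"
puts rest   # prints "ello"
```

Both forms return new String objects, so this stays in string-land rather than going through an Array or Enumerator of characters.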

Things started accelerating, and in no time I was looking at a far cleaner version of the test code I'd written just a few days ago. Far more idiomatic Ruby, and the tests were super simple to write, and took advantage of the contextual nature of the tests.

I'm not going to claim that I'm even a decent Ruby coder. I'm just fair. The class library is just far too expansive to know it all without dealing with it on a daily basis, and even then, on a large codebase that takes advantage of the different classes as well. It's vast. Which is nice, don't get me wrong, but it's why I know it'd take me a very long time to master even where things are - let alone their syntax and usage. But it's something I may have to do, so it's nice to get back in it.

VisualVM is Bundled with the JDK

Wednesday, December 3rd, 2014

I have been using VisualVM for quite a while now... it's got the monitoring tools that work pretty nicely if you have the network connectivity you need to get to the box. What I learned this morning on the #clojure channel on IRC is that it's now included in the JDK.

That's sweet! So I had to check:

  $ which jvisualvm
  /usr/bin/jvisualvm

Really nice! It's the same tool, but now it's integrated, and that's even better.

Now I'll be able to use it on any platform and not have to hassle with downloading it as a separate install. Very nice!

Finding the Joy in Life Again

Wednesday, October 22nd, 2014

I honestly would have put money on the fact that this would not have happened today. Big money.

I'm sitting on the bus riding to work, and I realize that I'm pretty happy without a pain-causing personal relationship in my life. That was a wow! moment. I've been separated for about 2 years, and the divorce is in the works, but I would have bet real money I'd feel horrible for the rest of my natural life. But today... on the bus... for a few minutes... I didn't.

That was huge for me. Huge.

Then I'm in work, updating a few postings with the results of the tests I'd done overnight, and I'm back into the swing of posting like I used to. It's been a long two years, but I'm back to writing about what I'm doing, and it's really helping. I'm feeling like I'm enjoying myself again.

This, too, was huge for me.

I don't expect this to last all day... but the fact that I have felt this way tells me that I need to keep doing what I'm doing - keep moving forward, and then maybe this will come again. And maybe when it comes again, it'll last longer. Maybe.

Fixed Log Configs for Single-File Logging

Thursday, October 16th, 2014

This morning I wanted to take some time to make sure that all the log messages went into one file, and that the file was not just a redirection of stdout or stderr. This is something I've done a few times, and it just took some time to set up the Logback config file. The reason we're using Logback is that it's what Storm uses, and since this is a Storm jar, we needed to use the same style of logging.

Interestingly, it wasn't all that hard to get a config for a nice, daily-rotating, compressing log file for everything I needed:

<configuration scan="true">
  <appender name="FILE"
            class="ch.qos.logback.core.rolling.RollingFileAppender">
    <file>/home/${USER}/log/experiments.log</file>
    <rollingPolicy
        class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
      <fileNamePattern>
        /home/${USER}/log/experiments_%d{yyyy-MM-dd}.log.gz
      </fileNamePattern>
      <maxHistory>30</maxHistory>
    </rollingPolicy>
    <encoder>
      <pattern>
        [%d{yyyy-MM-dd HH:mm:ss.SSS}:%thread] %-5level %logger{36} - %msg%n
      </pattern>
    </encoder>
  </appender>
 
  <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
    <encoder>
      <pattern>
        [%d{yyyy-MM-dd HH:mm:ss.SSS}:%thread] %-5level %logger{36} - %msg%n
      </pattern>
    </encoder>
  </appender>
 
  <root level="INFO">
    <appender-ref ref="FILE" />
  </root>
</configuration>

I have to admit that this is a decent tool - if you know how to configure it properly. But I guess that's true for a lot of the Java logging frameworks.

Firefox 33.0 is Out!

Wednesday, October 15th, 2014

I haven't used Firefox a lot, lately, but it's still a browser I keep updated because I read that it's starting to turn things back around. The new OpenH264 in Firefox 33.0 should be an interesting take on the video streaming, and it's fast enough, so maybe I'll look at it a little more.

I really did like the workspaces concept, but I just didn't get the feel that it was as smoothly integrated as it could be. Maybe it's getting better?

Google Chrome 40.0.2182.4 is Out

Friday, October 10th, 2014

Well... I guess I can hit the major updates, and this morning brought the move to 40.x.x.x for Google Chrome. It's pretty amazing to me that for as long as I've been using Chrome, it's gone from not good enough to be second-tier, to very good, to primary for work. It's amazing what a boatload of money will do when it comes to getting things done. Case in point... why did they add the name button to the tab-bar?

Odd Name Button

They used to have a little face up in the same level, or when you had only the one profile, they didn't show anything. Why not that now? I have only one profile on all my machines, but they insist on showing me this box with "me" in it.

Someone didn't put a lot of thought into this. Keep working, guys...

Looking at Cassandra for Fast SQL Storage

Thursday, October 9th, 2014

I've got a lot of streaming data in Storm, and I'm doing quite a bit of analytical processing on that data, but I can't store much of it because the cost of storage - in latency - is so high. So I end up using a lot of redis boxes and having boxes read out of that into more conventional storage - like Postgres. But I'm starting to hear good things about Cassandra and Storm working well together.

My concern is real speed. I have an average message rate of 40k to 50k msgs/sec, with peaks as high as four times that. I need to be able to handle those peaks. They aren't the once-a-month extremes, which run several times higher still; they're something I see on a regular basis, and we need to be able to take all this data from the topology and not slow it down.

If I can really do this on, say, eight machines, then I'll be able to have the kind of deep dive we've needed for looking into the analytics we're calculating. This would be a very big win.
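
Just to put rough numbers on that, here is a back-of-the-envelope sketch. It assumes the load spreads perfectly evenly across the eight machines and ignores replication, neither of which is a given; the variable names are just for illustration:

```ruby
avg_low     = 40_000  # msgs/sec, low end of the average rate
avg_high    = 50_000  # msgs/sec, high end of the average rate
peak_factor = 4       # regular peaks run up to four times the average
nodes       = 8       # the "say, eight machines" figure

peak_low  = avg_low  * peak_factor  # 160_000 msgs/sec
peak_high = avg_high * peak_factor  # 200_000 msgs/sec

# Assumption: perfectly even spread across nodes, no replication overhead
per_node_low  = peak_low  / nodes   # 20_000 msgs/sec per machine
per_node_high = peak_high / nodes   # 25_000 msgs/sec per machine

puts "regular peak: #{peak_low} to #{peak_high} msgs/sec"
puts "per machine:  #{per_node_low} to #{per_node_high} msgs/sec"
```

So each box would need to sustain something like 20k to 25k writes/sec at the regular peaks, and more than that once any replication is counted.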

I've reached out to a few groups that are doing something very similar to this, and their preliminary results say we can do it. That's really good news. We'll have to see what happens with the real hardware and software.

Redis Cluster 3.0.0 RC is Out!

Thursday, October 9th, 2014

I have done a lot of work with redis, and of late, the most work I've been doing with it was to shard (aka cluster) many redis instances on a single box - and on multiple boxes. This isn't bad, but let's face it, it'd be great if redis could do this all on its own - like it currently does replication.

Then I read:

Basically it is a roughly 4 years old project. This is about two thirds the whole history of the Redis project. Yet, it is only today, that I’m releasing a Release Candidate, the first one, of Redis 3.0.0, which is the first version with Cluster support.

Very nice! I'll be very interested in seeing how it works, and how it will scale with network load and if it's able to be configured for on-box vs. off-box connectivity. That could make a huge difference in the communication bandwidth.

Nice to see that it's almost here.

Having a Rough Couple of Weeks

Tuesday, October 7th, 2014

I spent some time this morning going over my Git commit logs trying to find out what I've been doing in the last two weeks - and then converting those into decent posts - at least as decent as I could make them in hindsight. The reason for all this is the re-org, and the resulting group that I've been thrust into. It's not a pretty sight - and I've seen this far too many times in my professional life to think it's going to end well.

I had a conversation with the larger group about why I see so many git force push messages in HipChat. I'm sure there's a reason, and it's part of some work-flow - I just haven't used that, and it seems pretty harsh to be used so often. Fact is, their reason was that it just made the commits "prettier". That they wanted to edit their commits like they edit their code.

I was silent because that was almost certainly one of the most insane things I've heard in a long time. Yes, by all means, let's risk the corruption of the git repo on the server because you didn't take time to think about what you were doing before you decided to commit it. Yeah, that's a good plan.

Better yet - let's make it part of the standard work-flow that you teach new developers. Yeah, that's a great idea. I just can't wait to start messing things up because I refuse to put forth the effort to think about what I'm doing prior to doing it.

In another meeting, one of the other developers felt that a Core Value of the new group should be that... and I quote: We all should have fun. Yeah... I don't even have to respond to this because unlike this developer, I understand the point of a public company, and that they really aren't in existence to make their employees happy - or make sure they have fun - they are in the business of making money.

I tried to say that fun is a great thing to have, but holding it up as the reason we're doing - and not doing - things is kinda crazy. That we don't have to have fun to get work done. It would be nice - but it's not necessary. Flew right over his head.

And the odd thing is that this is not a bad guy - well... I've heard he's a good guy, so I want to believe that my assessment of him to date is way off, but when he goes on about how I couldn't be more wrong... well... it's hard to give him the benefit of the doubt - or write it off to a miscommunication or something.

Nope, I'm in a group that doesn't seem to share a single value I have. They think it's acceptable risk to have git force push in their workflows, and they think that they have to have fun or they don't have to work. Wacky.

It's been hard to get things done, and if I took their lead, I wouldn't have done a thing - but thankfully, I didn't, and did a good chunk of work each day. In the end, I need off this group. I like the work, I just can't stand the management and co-workers any longer.