Ported the CryptoQuip Solver to Ruby
Wednesday, December 10th, 2014This morning I finished the little project I started yesterday on porting my old Obj-C CryptoQuip Solver from Obj-C to ruby. I wanted to brush up on my ruby skills as it's been nearly 18 months since I've done any real solid ruby development, and I've also wanted to get this codebase ported to Swift, and there are a lot of similarities between ruby and Swift. So this morning I finally got it all working as I wanted. Nothing really major, but I had made a few changes to try and capitalize on the strengths of ruby, and those made a few things that needed to be fixed up.
Nothing major.
But Wow... the codebase is significantly smaller. The original Obj-C code was well documented, and comes in at 3370 lines for all the headers and implementation files. Ruby is not as extensively documented, but it's a lot smaller so I didn't feel the need. It's a total of 316 lines. That's a factor of ten. Wow.
One of the nicest parts of the port was to change how I determined if a cyphertext word could potentially match a plaintext word. In the Obj-C code it looked like this:
/*! One of the initial tests of a plaintext word is to see if the pattern of characters matches the cyphertext. If the pattern doesn't match, then the decoded text can't possibly match, either. This method will look at the pattern of characters in the cyphertext and compare it to the pattern in the argument and if they match, will return YES. */ - (BOOL) matchesPattern:(NSString*)plaintext { BOOL match = YES; // make sure that we have something to work with if (match && (([self getCypherText] == nil) || (plaintext == nil))) { match = NO; } // check the lengths - gotta be the same here for sure if (match && ([[self getCypherText] length] != [plaintext length])) { match = NO; } /* * Assume that each pair of characters is a new map, and then test that * mapping against all other cyphertext/plaintext pairs that SHOULD match * in the word. If we get a miss on any one of them, then we need to fail. */ if (match) { unichar cypher, plain, c, p; NSUInteger len = [plaintext length]; for (NSUInteger i = 0; (i < len) && match; ++i) { // get the next possible pair in the two words cypher = [[self getCypherText] characterAtIndex:i]; plain = [plaintext characterAtIndex:i]; // check all the remaining character pairs for (NSUInteger j = (i+1); (j < len) && match; ++j) { c = [[self getCypherText] characterAtIndex:j]; p = [plaintext characterAtIndex:j]; if (((cypher == c) && (plain != p)) || ((cypher != c) && (plain == p))){ match = NO; break; } } } } return match; }
where it's a pretty simple double-loop on the words. Nothing horribly hard here, but for ruby I went with a different idea to start with:
# One of the initial tests of a plaintext word is to see if the pattern of # characters matches the cyphertext. If the pattern doesn't match, then the # decoded text can't possibly match, either. This method will look at the # pattern of characters in the cyphertext and compare it to the pattern in # the argument and if they match, will return true. def matches_pattern?(plaintext) return false unless @cyphertext && plaintext && @cyphertext.length == plaintext.length pairs = @cyphertext.chars.zip(plaintext.chars).uniq ctc = pairs.map { |p| p.first }.uniq.count ptc = pairs.map { |p| p.last }.uniq.count (pairs.count == ctc) && (ctc == ptc) end
I like the compactness of the ruby - very expressive, but the question is At what cost? The ruby version will check all the paris in the words, even if there is a mismatch in the second letter. This cost may - or may not - be a big deal. But it does simplify the code quite a bit.
The downside of the ruby implementation is that it's pretty slow. Where the Obj-C version was in the 60 msec range the ruby version (1.9.3 MRI) is more like 650 msec. This isn't a total shock, and I could have tried JRuby which will be considerably faster once the JIT gets warmed, but that wasn't the real point. It was just a fun little project to get this into ruby so that it might make it a little easier to get it into Swift.
And I got to brush up on my ruby too.
UPDATE: I tried it with JRuby 1.7.0, and yes, it's old, but it's a point of reference, and the times were worse. I did not expect that. The run times were in the 7 sec range. About an order of magnitude over the RMI version. Same code. Very odd, but I wanted to see what it'd be, and now I know.
UPDATE: I couldn't resist the urge to convert the new ruby scheme to clojure simply because I think clojure is going to be a lot more performant, and so I wanted to give that a try. The key pattern matching function is now:
(defn matches? "Function to see if the cyphertext and plaintext have the same pattern of characters such that they could possibly match - given the right legend." [ct pt] (if (and (string? ct) (string? pt) (= (count ct) (count pt))) (let [p (distinct (map vector ct pt)) ctc (count (distinct (map first p))) ptc (count (distinct (map last p)))] (= (count p) ctc ptc))))
and in a REPL, it's very easy to see that it's working:
(matches? "wdllpy" "rabbit") => true (matches? "wdllpd" "rabbit") => false
I have to admit that clojure is about the most fun language that I've used in a very long time. Ruby is nice, and it's very close - but clojure is just functional - and performant - all the way.
[12/12] UPDATE: my friend wrote me back with a significant simplification to the code:
(defn matches? "Function to see if the cyphertext and plaintext have the same pattern of characters such that they could possibly match - given the right legend." [ct pt] (if (and (string? ct) (string? pt) (= (count ct) (count pt))) (let [pc (count (distinct (map str ct pt))) ctc (count (distinct ct)) ptc (count (distinct pt))] (= pc ctc ptc))))
I was worried about doing this, initially, because I was thinking that it would be possible to re-arrange the letters while keeping the number of distinct characters the same, and therefore make additional possible matchings that would make it impossible to match.
What I realized with his help is that the key is that we have three checks - the pairs, and the distinct chars in the words. With all three, it's going to catch all the cases I was worried about.
Also, I really like how he went to using str as opposed to vector. Very nice. Just as effective, but far cleaner and easier to deal with in the distinct.
I can then go back and look at the ruby code and update the constructor for the CypherWord:
def initialize(cyphertext) @cyphertext = cyphertext @cyphertext_chars = cyphertext.chars @cyphertext_uniq_count = cyphertext.chars.to_a.uniq.count end
and then the matches_pattern? becomes the simpler:
# One of the initial tests of a plaintext word is to see if the pattern of # characters matches the cyphertext. If the pattern doesn't match, then the # decoded text can't possibly match, either. This method will look at the # pattern of characters in the cyphertext and compare it to the pattern in # the argument and if they match, will return true. def matches_pattern?(plaintext) return false unless @cyphertext && plaintext && @cyphertext.length == plaintext.length pc = @cyphertext_chars.zip(plaintext.chars).uniq.count ptc = plaintext.chars.to_a.uniq.count (pc == @cyphertext_uniq_count) && (@cyphertext_uniq_count == ptc) end
Yeah... we're beating this Bad Boy into the ground, but it's a lot of fun to be working with my friend on something - even if it's just a 30-year olg problem.