Rewriting Bash to Ruby


With all the recent efficiency changes in the code, the next most inefficient piece was the bash script that analyzes the run logs and generates some summary statistics for us to view in the morning. When I first created this script, it wasn't all that complex, and the logs weren't nearly as big as they are now. I used the typical assortment of scripting tools: grep, sed, and awk. The problem was that as I added things to the summary script, it took longer and longer to execute, to the point that it took several minutes to run on the main pipeline process. That's no good.

So I wanted to rewrite it in something that was going to be fast and would process the file only once. The problem with the current version isn't that it's using bash or grep; it's that the files are hundreds of megabytes and we need to scan them a dozen or more times for the data. What we needed was a single-pass summary script, and that's not happening with bash and grep.

So what then?

Ruby popped to mind, but given that we're using JRuby, there's a significant startup penalty. Maybe we can force it to use a compiled MRI Ruby in the deployment environments, and that will speed up the loading.

C and C++ both seemed like ideal candidates, but I know how the rest of the guys in the group would react, and it's just not worth it.

So Ruby it is.

This shouldn't take long, as most of this is pretty simple stuff for Ruby. Let's get going...
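
To make the single-pass idea concrete, here's a rough sketch of the shape such a script might take. The patterns and the `summarize` name are made up for illustration, since the real log format and statistics aren't shown here. The point is only that every pattern is checked against each line as it streams by, so the file is read once no matter how many things we count.

```ruby
# Hypothetical patterns: stand-ins for whatever the real summary looks for.
PATTERNS = {
  errors:   /ERROR/,
  warnings: /WARN/
}.freeze

# Read the log once, testing every pattern against each line as it goes by.
# Takes any IO-like object that responds to each_line (File, StringIO, $stdin).
def summarize(io)
  counts = Hash.new(0)
  io.each_line do |line|
    PATTERNS.each do |name, regex|
      counts[name] += 1 if regex.match?(line)
    end
  end
  counts
end

# When run as a script, print a summary for each log file named on the command line.
if __FILE__ == $PROGRAM_NAME
  ARGV.each do |path|
    File.open(path) do |file|
      summarize(file).each { |name, count| puts "#{name}: #{count}" }
    end
  end
end
```

Adding another statistic then means adding another pattern, not another trip through a few hundred megabytes of log, which is exactly what the grep version was doing.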