Archive for the ‘Coding’ Category

When it Rains, it Pours

Thursday, October 15th, 2009

This morning I was hoping to see a nice, stable web app that I had to patch together last night. Instead, I had a crashing app that wasn't working at all. Crud.

It seems that the report-updating thread - the worker thread that updates all the reports as new data arrives in the system - was dead. And after a restart, it died again in only a few minutes. Not good.

So the first thing I did was put the HashMap back into the code in place of the LRUCache. That gave us back the stability, but it was a ticking time bomb: around 2:00 this afternoon it was going to run into that same 12 GB limit and we were going to be in a mess of trouble. Not good, but I had no choice.

Once the HashMap was back in the code, I started looking for the fall-out of using the LRUCache. Because, in theory, there's no reason that the LRUCache should not have worked just like the HashMap - it's a Map, after all. It's just a question of removing some aged entries.

The code was telling me that once again we had optimistic coding, and the problems were not as easily removed as switching out one Map for another. The problems came about because the code assumed that everything was going to exist - no checks whatsoever. With the LRUCache no longer holding some data that was once there, we got a quick NullPointerException. With no try/catch block on the thread, it simply died. Wonderful.

The changes weren't that horrible - basically just putting in the checks that should have been in the code in the first place - but the idea of not having something like a try/catch block on the Thread's run() method is just amazing.

Why?

Oh... I know. It's never going to fail.
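For what it's worth, the fix is nothing exotic. Here's a rough sketch of the pattern (the class, field, and method names are made up for illustration - they're not the real ones from the app):

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * A sketch of a report-updating worker that survives a missing cache
 * entry and an unexpected exception. The names are illustrative only.
 */
public class ReportUpdater implements Runnable {
    private final Map<String, String> cache = new ConcurrentHashMap<String, String>();

    @Override
    public void run() {
        while (!Thread.currentThread().isInterrupted()) {
            try {
                String report = cache.get("latest");
                if (report == null) {
                    // the LRU cache may have aged this entry out -
                    // rebuild it rather than assume it's still there
                    report = buildReport();
                    cache.put("latest", report);
                }
                // ... push 'report' out to whoever needs it ...
                Thread.sleep(1000);
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();   // shutting down
            } catch (Exception e) {
                // log it and keep going - one bad cycle shouldn't
                // kill the whole updating thread
                System.err.println("report update failed: " + e);
            }
        }
    }

    private String buildReport() {
        return "fresh report";
    }
}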

I finally got something that stays up, looks like it's all OK, and most importantly, the memory footprint is stable. It's been running in NYC for a few hours and the memory is flattening out nicely. One issue was that the LRUCache was really only needed on the archive storage for the reports. The other HashMap was really just a mapping of the report description to the last report instance. That makes sense to leave as a HashMap as it won't be growing without bound like the archive will.
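In code terms the split is trivial - something like this sketch, with made-up type names and an arbitrary cache size, just to show the shape of it:

import java.util.HashMap;
import java.util.List;
import java.util.Map;

import one.bkit.util.BKLRUCache;

public class ReportStore {
    // stand-in for the real report class
    static class Report { }

    // the archive grows without bound over the day, so this is the
    // one that needs the LRU cache to cap it
    private final Map<String, List<Report>> archive =
        new BKLRUCache<String, List<Report>>(500);

    // one entry per report description - bounded, so a plain HashMap is fine
    private final Map<String, Report> latestByDescription =
        new HashMap<String, Report>();
}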

The only way to know is to find out what tomorrow will bring.

The Hidden Gotchas of Dodgy Design

Wednesday, October 14th, 2009

I've been working on this new deployment of an app I inherited, and today I was nailed by a design decision that really shouldn't have been in the code in the first place, but was, and it had a ripple effect that was really quite spectacular.

The design decision was to save every generated report for the lifetime of the web app. This was done so that anytime during the day a difference report could be made (on the server) between any two points in time. The reason for these difference reports was to enable the client to essentially freeze the report and then track the changes to the frozen state as time progresses.

One might say that this was the job of the client code - face it, it's already getting the data - it just needs to hold on to one dataset and then difference the incoming set against the saved set. It'd be minimal coding, easy, but it wasn't done. Nope.

So... when we added a lot of fields for this release, each report became much larger, and we've added the roll-ups by product, which only adds to the data per report. What happened was that by 2:20 pm the memory usage of the Tomcat instance was at 12 GB! It was slogging through garbage collection and I had to restart it. That helped, but I had to get a solution, and fast.

I talked to the original author and he suggested putting an LRU cache on the data rather than holding all the results - keep only the last n that have been accessed. It turns out the implementation of an LRU cache is pretty simple. I added that in place of the HashMap data structures used for retaining the reports, and it appeared to work just fine. I checked that things worked, and that's what I put into production for tomorrow.

This should help a lot.

Implementing a Least Recently Used Cache in Java – Slick

Wednesday, October 14th, 2009

I have to admit that Java doesn't impress me very often. It's just the level of familiarity I have with the language - it's just not that often that something really surprises me. So when it happens, it really blows me away. Today is one of those days.

I was trying to implement a Least Recently Used (LRU) Cache, and a co-worker said it would be easy. I doubted it - it's not like one is trivial to build from scratch. But he Googled a few places, and when you base it on Java's LinkedHashMap, it really is easy:

package one.bkit.util;
 
/**
 * Java System-level Imports
 */
import java.util.*;
import java.util.Map.*;
 
/**
 * Superclass Imports
 */
 
/**
 * Class Imports
 */
 
/**
 * This class is a simple implementation of a Map where only the most
 * recently used entries stay in the map. You can use the default size
 * of 100, or you can give it a size. In either case, every <tt>get()</tt>
 * and <tt>put()</tt> runs through the 'most recently used' filter. If
 * the next <tt>put()</tt> pushes an entry out of the map, then it wasn't
 * among the most recently used.
 */
public class BKLRUCache<K, V> extends LinkedHashMap<K, V> {
 
    /**
     * This is the size of the cache. No more elements will be held
     * in this map than this. After this, the oldest goes to make room
     * for the newest.
     */
    private int             _maxEntries = 100;
 
    // this is the serial version tag for this class
    private static final long serialVersionUID = 20091014;
 
 
    /**
     * ----------------------------------------------------------
     *		Constructors
     * ----------------------------------------------------------
     */
    /**
     * The default constructor assumes a default maximum size of 100
     * elements such that <b>only</b> the most recently used 100 entries
     * will be maintained in the map. After that, the oldest is discarded
     * to make room for the newer entries.
     */
    public BKLRUCache() {
        this(100);
    }
 
 
    /**
     * The general form of the constructor takes the maximum size of the
     * LRU cache, and uses that as opposed to any default.
     */
    public BKLRUCache(int maxEntries) {
        // size for maxEntries, use a load factor of 1 to keep it flat, and iterate in access order
        super(maxEntries + 1, 1.0f, true);
        _maxEntries = maxEntries;
    }
 
 
    /**
     * This version of the constructor takes a map and populates this
     * map with up to 100 entries - actually, it'll hold the <b>last</b>
     * 100 entries of the iterator on the argument. It's doing the
     * <tt>putAll()</tt> on the argument, and the way this works, only
     * the last 100 are saved.
     */
    public BKLRUCache(Map<? extends K, ? extends V> m) {
        this(m, 100);
    }
 
 
    /**
     * This version of the constructor takes a map and populates this
     * map with up to maxEntries entries - actually, it'll hold the
     * <b>last</b> entries of the iterator on the argument. It's doing
     * the <tt>putAll()</tt> on the argument, and the way this works,
     * only the last 'n' are saved.
     */
    public BKLRUCache(Map<? extends K, ? extends V> m, int maxEntries) {
        this(maxEntries);
        putAll(m);
    }
 
 
    /**
     * The magic happens here as the Java LinkedHashMap has a method
     * that we can intercept and tell it to keep (or not) the oldest
     * entry in the map. This is where we look at the size and then
     * see if it's a keeper or not. Simple.
     */
    @Override
    protected boolean removeEldestEntry(Entry<K, V> eldest) {
        return (size() > _maxEntries);
    }
}

This is the kind of code I love to see: simple... elegant... compact. It does something very useful and it does it without a lot of grief. Amazing.
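Using it is just as unexciting - a quick sketch to show the behavior:

import one.bkit.util.BKLRUCache;

public class BKLRUCacheDemo {
    public static void main(String[] args) {
        // keep only the three most recently used entries
        BKLRUCache<String, Integer> cache = new BKLRUCache<String, Integer>(3);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.put("c", 3);
        cache.get("a");        // touch "a" so it's recently used again
        cache.put("d", 4);     // pushes out "b" - the least recently used
        System.out.println(cache.keySet());   // prints [c, a, d]
    }
}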

Days like this make me want to do more Java coding and learn more of the newest language additions. It might be nice.

Bumpy Rollout and Flex Licensing Issues

Tuesday, October 13th, 2009

Today was a rough day. Yesterday evening we rolled out new hardware for this application. Should have gone smoothly, but there were a few little glitches. Even so, it was all running and looking OK when I left last night and I had a good (but tired) feeling that things were going to be just fine.

Looking back, I have to giggle at my silliness.

Part of the rollout was a Flex component that I was given and needed to build. At the time, I was given the tools by the original author, and it all seemed to work. You can see it coming, can't you? Anyway, when I built things for this release, a few of the graphs had watermarks saying "Hey, this isn't licensed".

There was nearly a panic.

Did it work with the watermark? If not, then it's a rollback - but we'd done hours' worth of work, and it would take hours to get it all back. Nightmare! So I kept with my plan - push forward, not slide back.

What I needed was to get the Flex license codes so I could build this app without the watermarks. Simple, right? Ha! Silly me. I had to chase things around to first find out if the watermarked graphs worked (they did), and that took the gun away from my head. It was ugly, but not inoperable.

Next, I got the install package, but the problem was that there were no directions for getting it installed on Linux. The open source Flex compiler was already running on Linux (my build/deployment platform), so I knew it had to work there, but how?

I ended up finding a page on the Adobe web site about the different locations, per OS, for this license.properties file. I put it in two of the locations on Linux (just to be sure) and tried it again.

Sweet! It worked. I could just deploy the compiled SWF file to the web servers and be done. I showed the users and things were fine. Whew! That was nasty for a while.

[10/15] UPDATE: I just saw the PO for Flex Pro, which is what I need, and it's $699! Amazing. I can't believe it's that expensive. Maybe there's a ton of widgets that I don't see, but that seems like a lot for a few widgets as the rest is Open Source and works just fine. Well... it's what the customer will pay.

For the Children – Don’t be an Optimistic Coder

Monday, October 12th, 2009

For the love of the Children... or your favorite religion... don't be an optimistic coder. That's the worst thing you can possibly be. This evening I've been struggling with the rollout of a new application and hitting a problem that defied explanation for at least two hours. This wasn't the only problem I had this evening, it was just the one that delayed me the most.

To start off, I was in the middle of the roll-out (London done, ready to do NYC) when I got a little "prairie dog" from my manager: "Hey Bob, can you hold off on the roll-out for a bit? I just want to check on something." Well... sure... I was four minutes from the time I was to do the second of three phases of the roll-out, and some things were already done from an infrastructure point of view, but sure, I'd hold off.

So I held.

For 45 mins.

Then he said "OK, go ahead." Nice guy, but really... the time to say "Hold off" is before I start the roll-out, not between phases I and II. It's a little bit of a problem when you do it that way. Since I didn't have control of the DNS entries, I was already at a point that rolling back phase I was going to be hard, so what's up? Never found out, but that's OK. We went ahead with it.

Then I got to the problem that held me up for a few hours.

I'm not one to give up easily. In fact, for a roll-out, I can't remember ever backing it out as opposed to fixing the issues right then and there. So I had to figure out why two of the four boxes in NYC were giving us grief. I was able to skip it and roll-out Chicago, but I came back to the problem boxes soon enough and had to face the music. It was nasty.

I looked at the evidence in the logs and it was as if the code simply stopped. It did one request, started another and that was it. Dead. No crash... no core... just stopped.

We checked for network issues, DNS resolution issues... everything that might be a problem. In the end, I was just walking through all the steps the code was doing and memory popped into my head. I increased the memory on the JVM, and BINGO! It worked.

So here's the thing I can't stand about production coding: Optimistic Coders. The original coder of this little app had used a try/catch block in the code and 'swallowed' the exceptions. He didn't think they'd ever be needed, I guess. Well... he was wrong. Had I been able to see a Java OutOfMemoryError, this would have held me up for about two minutes and I'd never have wondered what was wrong - it would have been telling me what was wrong.
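The difference is literally a couple of lines. Here's a sketch of the two flavors (the method names are made up; the pattern isn't):

import java.util.logging.Level;
import java.util.logging.Logger;

public class RequestHandler {
    private static final Logger LOG =
        Logger.getLogger(RequestHandler.class.getName());

    // the optimistic version - anything that goes wrong simply vanishes
    public void handleOptimistically(String request) {
        try {
            process(request);
        } catch (Throwable t) {
            // swallowed... no log, no rethrow, no clue
        }
    }

    // at the very least, say what went wrong before giving up
    public void handleHonestly(String request) {
        try {
            process(request);
        } catch (Throwable t) {
            LOG.log(Level.SEVERE, "failed handling: " + request, t);
            throw new RuntimeException(t);
        }
    }

    private void process(String request) {
        // stand-in for the real work
    }
}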

No, by hiding the true cause, this coder has hurt the reputation of Java, the developers in the group, and certainly himself. It's sad that the JVM can't take an arg that says "up to the limit of the box" for memory usage. But it can't. You have to "size" the apps. So be it. But when you assume that everything is going to be OK, and never check return values, never check to see if the thing you asked to create was, in fact, created, then you leave yourself open to all kinds of problems. All kinds.

It's over now, but I've lived through this so many times, I don't even bother trying to educate the unaware. I'll say something, in passing, and if he's interested in really understanding his problems, he'll ask. But I'll bet you he won't. If he was interested in doing a good job, then he'd have thought of it already. But he hasn't. He'll be like this as long as he's coding. Too bad.

Getting my App through QA and Migrating to New Hardware

Thursday, October 8th, 2009

Today has been interesting, and even a little fun. I've been pushing my web app through QA and, interestingly, they have come up with some issues - primarily UI issues, but ugly nonetheless. Also, there were a few issues with the new sliding median filter on the data, and I needed to take care of that zippy pronto.

A Set Means Unique Elements

Seems obvious, no? But I forgot it. I was looking to make a clever median filter, and rather than continually sort the array with Collections.sort(), I thought: Hey! I'll use a TreeSet and then get the middle element. Clever idea, if it worked. But the TreeSet is, after all, a Set, and that doesn't allow duplicate values. This meant that when I put in a new value that was numerically the same as one already in the list, I was losing it, since only unique members can be in the set. When I then removed a value from the set, I removed the one value, but it might have stood for multiple originals. Nasty.

In the end, I needed to replace the TreeSet with a simple ArrayList. After I added the new point and removed the old one, I sorted the List<Double> using Collections.sort() and picked out the middle value - that was the median.
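The replacement is only a handful of lines. Here's a sketch of the idea (the window size and names are mine, not the production code):

import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedList;
import java.util.List;

/**
 * Sliding median over the last 'window' values. Duplicates stay in
 * the list, which is exactly what the TreeSet version got wrong.
 */
public class SlidingMedian {
    private final int window;
    private final LinkedList<Double> values = new LinkedList<Double>();

    public SlidingMedian(int window) {
        this.window = window;
    }

    public double add(double v) {
        values.addLast(v);
        if (values.size() > window) {
            values.removeFirst();               // drop the oldest point
        }
        List<Double> sorted = new ArrayList<Double>(values);
        Collections.sort(sorted);
        return sorted.get(sorted.size() / 2);   // middle element = the median
    }
}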

When I did this, the data started making a lot more sense.

Big Duh on my part.

Migrating to New Hardware

It's not glamorous, and there's a ton of things that can go wrong, but I think things are all working out pretty well on the migration from the old Linux VMs to the new Linux hardware. Four servers in all at this time - one I did several weeks ago in London - but they all had to be phased in smoothly.

For the two web servers for my main visualization app, I also set up mod_proxy and proxy_ajp.conf on the Apache servers for these boxes. It was a little time-consuming because I had to work out the changes, tell the admins what to change, and then iterate. It took a few rounds, but that's not too bad. The advantage here is that I don't have to hassle with the maintenance, and can enlist them when things go south.

The vanity URLs that I set up are making the access to the sites a ton easier. The long, drawn-out URLs for a typical Tomcat app are just too clumsy for users to really remember. The vanity URLs are just what the doctor ordered.

Setting Down Broad Brush Strokes for Next Project

Wednesday, October 7th, 2009

I've been asked to re-write my coding nemesis. It needs to do a lot more in the next cut, and there's just no way the existing version is going to be able to "stretch" to fit these needs. What I needed to do was to get the broad brush strokes down... the main ideas... on paper so that I was sure things would fit together and I'd be able to get the project working in a reasonable timeframe.

That's why I was looking at the embeddable web servers, among other things.

There's a lot to do, and how to divide up the work between myself and the other guy on the project needs to be worked out as well. Again, just the big broad strokes.

On the Art of Programming

Wednesday, October 7th, 2009

I was reading the web this morning and ran across this post by Guy English, and it perfectly expresses my beliefs on the subject of coding:

Programming is an exercise in overcoming how wrong you’ve been in the past. At first you’ll overcome the syntax errors, then you’ll overcome the structural errors, and then you’ll come to align your code with the standards of a greater community and you’ll feel safe and like you’ve made it. You haven’t – you’re still wrong because you’re always wrong. You are playing a game you cannot win. And let’s face it – if it was a game you could win you’d not be playing at all.

I read that, and it's exactly what I feel. There's always a better way to do it. But the beauty of it is that you can always make it better. Spend the hour and make that method better. Spend the day and refactor that mess and make it clean. You can do it. Always.

Adding Google Chrome Frame to Pages for IE Compatibility

Wednesday, October 7th, 2009


I've been working on a web app that's a very heavy JavaScript page, and as such, I've had to force users to run it in Google Chrome as that's the only browser that has a decent JavaScript Engine (on Windows) that can handle the load. It doesn't hurt that I'm using a lot of the Google AJAX libraries, and so they are sort-of made for each other.

This morning I took a whack at the new Google Chrome Frame. This slick IE plug-in is the complete Google Chrome environment in a plug-in frame for IE. This means that pages that run in Chrome will run in the Chrome Frame without modification.

That's big news. I was stunned.

So I decided to try it out. First, you need to put the meta tag in your pages to tell IE to use the Chrome Frame plugin:

  < meta http-equiv="X-UA-Compatible" content="chrome=1" />

That, in itself, will make things work, but if you wanted to verify that the plug-in was installed, then you can add the following to your page to do the check and redirect the user to the Google Chrome Frame download page if it's not installed. First, you need to load the script in the head of the page:

  < script type="text/javascript"
  src="http://ajax.googleapis.com/ajax/libs/chrome-frame/1/CFInstall.min.js" ></script>

Then in the body of the page, you need to define a div that will be the target of the download prompt frame:

  < div id="chrome_check" />

just make the id unique and you'll be fine.

Next, have the JavaScript code in the initialization of your page:

  CFInstall.check({ node: "chrome_check" });

The CFInstall.check() function checks to see if Chrome Frame is installed, and if not, displays a download prompt (hosted by Google) in the div indicated by the node property. The user can then download Chrome Frame and that's it. It's ready to go.

The docs on Chrome Frame say that the page should automatically reload, but I haven't found that to be the case. It could be just me - I've only tried it once - but even that isn't bad. It's something that allows users to send links in emails and have their default browser (IE) render the project's pages in a Chrome environment.

Super sweet.

Interesting Embeddable Web Server – mongoose

Tuesday, October 6th, 2009

I'm looking at the plans for my next project, and the guy I'll be working with side-by-side is pretty much set on C++ for his project, and since it feeds into mine, I really would need a spectacular reason not to pick the same thing. So I started looking at the current version of the project I'll be updating and seeing what it had that I might find difficult to get my hands on in C++. The key components I could come up with were:

  • Good Database Tools - with SQLAPI++, there's no problem there. We can connect to MS SQL Server which is the "standard" at the shop.
  • In-Memory Database - it might be very nice to have something like H2 in C++ so that should I choose to go down that path, it would be as easy to use as an external database, but much faster.
  • Web Server - the current project does a lot of serving of requests via 29 West and HTTP (it's a Tomcat web app). If I'm going to make the transition as simple and easy as possible, then I need to be able to handle HTTP requests.

When I started looking at the in-memory database, I realized that SQLite can be used to create in-memory databases, and since it's also supported by SQLAPI++, it's a perfect solution for what I might need. You only need to open a connection to the special file named :memory: and the connection returned will be to a blank, in-memory database. Close the connection, however, and the database goes away.

Additionally, if you open another connection to the same special file, you'll get a completely different in-memory database. This is really interesting in that it allows you to create as many unique in-memory databases as you wish. Unfortunately, it's not as nice as H2, where the in-memory database can be kept alive beyond a single connection, but it's a reasonable compromise for an in-memory database.
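Just to show the semantics concretely, here's a sketch from Java using the sqlite-jdbc driver (purely an illustration on my part - the real project would go through SQLAPI++ from C++ - and it assumes sqlite-jdbc is on the classpath):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;

public class InMemoryDemo {
    public static void main(String[] args) throws Exception {
        Class.forName("org.sqlite.JDBC");   // assumes the sqlite-jdbc driver is available

        Connection a = DriverManager.getConnection("jdbc:sqlite::memory:");
        a.createStatement().executeUpdate("CREATE TABLE config (k TEXT, v TEXT)");
        a.createStatement().executeUpdate("INSERT INTO config VALUES ('mode', 'fast')");

        // a second connection to :memory: is a completely separate database
        Connection b = DriverManager.getConnection("jdbc:sqlite::memory:");
        ResultSet rs = b.createStatement().executeQuery(
            "SELECT count(*) FROM sqlite_master WHERE name = 'config'");
        rs.next();
        System.out.println("'config' tables visible to b: " + rs.getInt(1));   // prints 0

        a.close();   // when this closes, the first database is gone for good
        b.close();
    }
}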

Even if I didn't use the in-memory database, the SQLite file-based database is robust enough not to corrupt the file in a power outage, and it's very fast for holding data on a single system - so should I want to use it to hold configuration details or parameters, it's certainly more than up to the task.

So I think I have more than enough on the database front for this project in C++. But the real question was the web server interface. I needed to have an embeddable web server that was going to be reasonably fast, small, easy to interface to, and handle all the things I needed. I had to hit Google for this one.

After a lot of poking around, I found several that might fit the bill. The most interesting is Lighttpd - pronounced "Lighty" by the author - and its goal was to be able to handle 10,000 connections on a single server. Admirable goal, and it's being used by some heavy hitters out there. The big problem seems to be that it's a complete web server with CGI capability. That means I'd need to create something to run as CGI and then plug that into the web server. Not a lot different from using Apache, except for the load the server is capable of handling. So I kept looking.

I found a GNU project, libmicrohttpd, which is a C library that you can embed and which gives you tons of the goodies you'd need - far more than I'd need in this application. It's in C, which means it's going to need to be wrapped at least a bit, but it's a strong possibility in my mind because it's a GNU project, and those tend to be pretty bug-free.

Probably the most promising candidate seems to be mongoose hosted on Google Code. It appears to be a nice C/C++ library that runs on a ton of platforms and handles all the things I need a web server to do. Additionally, it boasts a simple, clean API to plug into for request processing.

After looking at mongoose, I think I have enough to get started, technically, on this project. It's not necessarily ideal, but I have to say that if I ported my Google DataTable code from Java to C++ and put it in CKit, there's not a lot I couldn't do in the existing project. The real problem is that there's a ton I want to do in the new project that's a lot more complex than what's being done in the current project, so I'm still going to need to think about it.

I just don't want to get to the point that I have to re-write huge chunks of functionality from one language to C++ in order to get things working. It's a waste of time and a maintenance nightmare.