Java and HotSpot Errors – No Fun to Debug
This morning one of the components of a production system of mine died with a Java HotSpot error, and I have to say, the error report has not been very helpful. I've looked at the HotSpot file and the problem seems to be in java.util.zip.ZipFile.getEntry(). Google tells me this is because of a possibly corrupt jar file, or because someone has written to the jar file and left the reader very confused.
Unfortunately, all the jars have been stable for quite a while, but maybe the filesystem was the issue. We might have had some NFS issues which might have led to this. I can't do a lot about where the jars are stored: they need to be centrally located and there's only the one filer. Bummer.
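One cheap way to test the corrupt-jar theory is to open each jar with the same java.util.zip.ZipFile class and read every entry to the end. What follows is only a minimal sketch: the class name JarCheck and the method firstProblem are made up for illustration, and it sticks to syntax that should compile on a 1.5 JDK. It only tells you whether the jar on disk is readable at the moment you run it, and as far as I know ZipFile does not verify entry checksums, so a clean result doesn't rule out a transient NFS hiccup or a file that was rewritten while the JVM had it open.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper, not part of the component that crashed.
class JarCheck {

    /**
     * Returns null if the jar opens and every entry can be read to the end,
     * otherwise a short description of the first failure.
     */
    static String firstProblem(File jar) {
        ZipFile zip = null;
        try {
            // Fails with a ZipException if the file is not a readable zip archive.
            zip = new ZipFile(jar);
            byte[] buf = new byte[8192];
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                InputStream in = zip.getInputStream(entry);
                try {
                    // Drain the entry so damaged compressed data has a chance to surface.
                    while (in.read(buf) != -1) {
                        // nothing to do with the bytes, we only care that they can be read
                    }
                } finally {
                    in.close();
                }
            }
            return null;
        } catch (IOException e) {
            // ZipException is a subclass of IOException, so both land here.
            return jar + ": " + e;
        } finally {
            if (zip != null) {
                try {
                    zip.close();
                } catch (IOException ignored) {
                    // nothing useful to do if close fails
                }
            }
        }
    }

    public static void main(String[] args) {
        // Usage: java JarCheck first.jar second.jar ...
        for (String path : args) {
            String problem = firstProblem(new File(path));
            System.out.println(problem == null ? "OK   " + path : "BAD  " + problem);
        }
    }
}
```

Running it over the jars on the filer right after a crash, and again from a second machine, would at least show whether the files themselves are damaged or whether something else is to blame.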
I've sent a few emails to the guys who wrote this component to see if they have any ideas about what might be done to increase stability. I'm on JDK 1.5.0_10, but would 1.5.0_15 be better? 1.6.0? I'm searching for anything because there's no way the files were corrupted. I restarted the app and it all came back. Odd.
I just wish there were more information or a clear explanation of other possible causes so I might find the real culprit of this crash.
UPDATE: when it looks like the chips are down, sometimes you get a break. While digging into this with the developer who wrote the component having the problem, he noticed that the crash occurred when a script was being run through the system. I noticed that the last two crashes occurred at 8:00 am on the last two Friday mornings, just when cron launches these jobs. BINGO! While he was going to track down whether the problem was in the bsh jar I was using, I decided to take a completely different tack: forgo the bsh script and write it in perl. That way I'm not calling the system to do the work for me. I can do it outside the system, and it's much safer and cleaner. Not to mention faster. I recoded this and it works like a champ. It's going to make things so much easier. Whew!