Problems with Large Java Web Apps

I've got two large Java web apps that I'm working on: one I wrote from the ground up and another I inherited. Both are outgrowing the memory on the boxes we have, and I need to do something about their footprint. The catch is that in both cases most of the data matters, so thinning it out is not as trivial as it seems.

Multi-Level LRU Caches

In one of the apps, I had recently put in an LRU cache so that I would not have to save every value, only those that had shown themselves to be of use to the users. Sounds like a good idea. But there's a limit to that. Each operation on the BKLRUCache re-orders the aging list, and if a lot of updates come into the system in short order, they can purge the cache of "good" values.
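To make the failure mode concrete, here is a minimal sketch using a plain LinkedHashMap in access order. SimpleLruCache and LruFloodDemo are hypothetical stand-ins, not the BKLRUCache from the app, and I'm assuming that updates refresh the aging list the same way reads do.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical single-level LRU cache; a stand-in for illustration only.
class SimpleLruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    SimpleLruCache(int capacity) {
        // accessOrder = true: both get() and put() move an entry to the "young" end
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Evict the least recently used entry once we exceed the fixed size
        return size() > capacity;
    }
}

class LruFloodDemo {
    // Returns true if a value the user asked for is still cached after a burst of updates.
    static boolean survivesFlood(int capacity, int floodSize) {
        SimpleLruCache<String, Integer> cache = new SimpleLruCache<>(capacity);
        cache.put("popular", 1);
        cache.get("popular");                 // a user actually requested this one
        for (int i = 0; i < floodSize; i++) {
            cache.put("update-" + i, i);      // updates nobody has asked for yet
        }
        return cache.containsKey("popular");
    }
}
```

With a capacity of 3, the requested value survives two updates but is gone after three, even though none of the new entries was ever read.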

Sure, you can increase the size of the cache, but that defeats the purpose. You can also buffer up the changes and treat them as a single batch, but that has negative effects as well. No, the idea is to somehow "flag" the requested entries so that they stay over the "untouched" ones in the cache. But what's the easiest way to accomplish that?

What I came up with, after much hassle and arguing with myself, was a two-level LRU cache. The primary level is the same fixed-size cache as before. When the user asks for a value, it is first looked for in the secondary LRU cache; if it's not there, the primary cache is checked. If the primary has it, the value is put into the secondary cache, so that it no longer competes with entries that may never be touched by a client.

I don't have to worry about the size issues - they are independent caches, and I can size them to suit my needs. I also don't need to worry about the flood of updates wiping out the one previously asked for. It only has to compete against the other requests. Very nice.
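A rough sketch of the two-level idea, again built on LinkedHashMap rather than the real cache class. TwoLevelLruCache and its method names are made up for illustration. Two details are my assumptions rather than anything stated above: a promoted value is moved out of the primary (not copied), and an update to an already-promoted key is written to the secondary. Null values are not supported in this sketch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical two-level LRU cache: updates land in the primary,
// values that a client has requested live in the secondary.
class TwoLevelLruCache<K, V> {
    private final Map<K, V> primary;
    private final Map<K, V> secondary;

    TwoLevelLruCache(int primarySize, int secondarySize) {
        primary = lru(primarySize);
        secondary = lru(secondarySize);
    }

    // Fixed-size, access-ordered map that drops its least recently used entry.
    private static <K, V> Map<K, V> lru(final int capacity) {
        return new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > capacity;
            }
        };
    }

    // Incoming update: only touches the secondary if the key was already promoted.
    synchronized void put(K key, V value) {
        if (secondary.containsKey(key)) {
            secondary.put(key, value);
        } else {
            primary.put(key, value);
        }
    }

    // Client request: secondary first, then primary; a primary hit is promoted.
    synchronized V get(K key) {
        V value = secondary.get(key);
        if (value != null) {
            return value;
        }
        value = primary.remove(key);      // assumption: move, do not copy
        if (value != null) {
            secondary.put(key, value);
        }
        return value;
    }
}
```

With this layout, a burst of puts can only push out entries in the primary. A value that was requested once sits in the secondary until enough other requested values age it out.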

H2 File System Database Tests

In my other web app I've got a very large H2 in-memory database. This is getting to be a real problem, so much so that I'm starting to think about alternatives to the in-memory database. There's MySQL, PostgreSQL, and SQLite, and even H2 itself has a file-system database mode. As I looked at each of these, I realized that there's no way I would be able to tell the difference until I started running some tests.

Given that H2 has good specs and I already had it in the project, it seemed like a good enough place to start. What I did was just to make it not in-memory, but based on the local filesystem (simple disk), and run the application and see what the results were.
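For reference, switching H2 between the two modes comes down to the JDBC URL. The sketch below shows the two URL forms plus a small timing helper of the kind that could produce low/high numbers for a query. H2Urls, QueryTimer, the database name and the path are all hypothetical; the actual query code from the app is not shown here.

```java
// Hypothetical helpers; the names and paths are placeholders, not the app's real config.
class H2Urls {
    // In-memory database; DB_CLOSE_DELAY=-1 keeps it alive after the last connection closes.
    static String inMemory(String name) {
        return "jdbc:h2:mem:" + name + ";DB_CLOSE_DELAY=-1";
    }

    // Embedded database stored in files on the local filesystem.
    static String onDisk(String path) {
        return "jdbc:h2:file:" + path;
    }
}

class QueryTimer {
    // Runs the given work `runs` times (runs must be at least 1) and
    // returns { fastest, slowest } wall-clock times in milliseconds.
    static long[] minMaxMillis(Runnable query, int runs) {
        long min = Long.MAX_VALUE;
        long max = 0;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            query.run();
            long elapsed = (System.nanoTime() - start) / 1_000_000;
            min = Math.min(min, elapsed);
            max = Math.max(max, elapsed);
        }
        return new long[] { min, max };
    }
}
```

With the H2 jar on the classpath, a connection would come from something like DriverManager.getConnection(H2Urls.onDisk("./data/app"), "sa", ""), and the Runnable passed to minMaxMillis would wrap the query being measured.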

I was stunned, to say the least.

Location     Query                Format
In-Memory    300 -> 500 msec      8 -> 21 msec
File         3200 -> 3300 msec    9 -> 12 msec

I tried a lot of things, but I kept coming back to a 6x to 10x slowdown on the queries, while the formatting time barely moved. A 3 second response time is simply not acceptable for the queries that I need to process. No way. So I must come up with something that gets better speed while still not taking up a hundred GB of RAM.

No solution yet.