Java Allocation Speed
One of the developers using some objects in a Java library I wrote and maintain came to me to ask why one operation was taking as long as it was. The library is basically a table object plus views that can be placed on this table (and on other views) to "stack up" a "deck", so that the end result is a table that has just what you want, in the order you want, etc. After talking with this guy, I realized that there wasn't a good reason for an aggregation on top of an aggregation to take longer than the first aggregation. The data set was smaller, so it should, in theory, take less time. But it didn't. And not by a little. So I decided to dig into it.
The first step was to build a test frame for this kind of environment. You see, it wasn't affecting small data sets the way it was affecting the larger ones, so I built up a 100,000 x 100 table, then aggregated it to 10,000 x 100 and then to 100 x 100. What I saw was that the first aggregation took about 13 sec. and the second one took about 8 minutes. OK, this was a good test case, so I went into profiling mode to find where the time was really being spent.
My first thought was that the rows and columns were being accessed inefficiently, through linear searches of their labels. The code to avoid those searches was already in the base table, which is why I thought the first aggregation was faster, so I put it into the views as well. It turned out not to improve the speed by much: the times went down to 5.5 sec and 3+ min. Better, but not nearly good enough.
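For illustration, here is a minimal sketch of what replacing a linear label search with an indexed lookup can look like. It assumes a HashMap from label to position, which is one common approach and not necessarily what the library does; the class and method names are made up for the example.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Vector;

// Hypothetical sketch: these names are illustrative, not the library's API.
class LabelIndexSketch {
    // Linear search: Vector.indexOf walks the labels one by one, O(n) per lookup.
    static int findLinear(Vector<String> labels, String label) {
        return labels.indexOf(label);
    }

    // Build a label -> position map once; later lookups are O(1) on average.
    static Map<String, Integer> buildIndex(List<String> labels) {
        Map<String, Integer> index = new HashMap<>();
        for (int i = 0; i < labels.size(); i++) {
            // putIfAbsent keeps the first position of a duplicated label,
            // which matches what indexOf would return.
            index.putIfAbsent(labels.get(i), i);
        }
        return index;
    }

    // Returns -1 for an unknown label, the same convention as indexOf.
    static int findIndexed(Map<String, Integer> index, String label) {
        Integer pos = index.get(label);
        return pos == null ? -1 : pos;
    }

    public static void main(String[] args) {
        Vector<String> labels = new Vector<>(Arrays.asList("price", "volume", "date"));
        Map<String, Integer> index = buildIndex(labels);
        System.out.println(findLinear(labels, "volume") + " " + findIndexed(index, "volume"));
    }
}
```

The index has to be rebuilt, or updated, whenever the labels change, so it pays off mainly when lookups far outnumber label changes.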
Then I looked at some of the individual operations, and what was there blew me away. In one method, I was returning the column headers as a Vector of Strings. The base table's implementation returned the ivar that was the Vector of Strings, with a warning in the method's comments that this is a reference to the actual storage of the column headers, so mess with it at your own peril. In the aggregate view, the code made a copy of the Vector and returned that. This was a hold-over from several of the other views, where the underlying table's columns can change and the system needs to augment the column headers on the fly.
That was the killer.
Creating a new Vector of Strings each time a row was accessed took so much longer than simply returning the ivar that when I changed the view to use an ivar, the times went to 5.5 sec and 0.5 sec. They're going in the right direction now! I was amazed at this, but then I started to think about it. Java's allocator is probably doing a lot more than a typical C/C++ copy constructor, and as such its load on the system is greater. Even so, it was not the best idea to have a construction in the tight loop of the aggregator. All is fixed now, and I'm looking at the last two views that might need changing. I'm not sure that even they do, as they aren't doing the same kind of work that the aggregator was doing. But I'll give them a look and see if I can speed them up as well.
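The difference between the two accessors can be sketched roughly as follows. This is a simplified stand-in, not the library's code: the class, its methods, and the per-row loop are all hypothetical, and the timings it prints will vary by machine and JVM.

```java
import java.util.Arrays;
import java.util.Vector;

// Hypothetical sketch: class and method names are illustrative only.
class HeaderAccessSketch {
    private final Vector<String> columnHeaders;

    HeaderAccessSketch(Vector<String> headers) {
        this.columnHeaders = headers;
    }

    // Base-table style: returns the backing Vector itself. No allocation,
    // but callers must treat it as read-only.
    Vector<String> getColumnHeaders() {
        return columnHeaders;
    }

    // Copying style: allocates and fills a fresh Vector on every call.
    Vector<String> getColumnHeadersCopy() {
        return new Vector<>(columnHeaders);
    }

    // Stand-in for a tight loop that asks for the headers once per row.
    static long sumHeaderLengths(HeaderAccessSketch view, int rows, boolean copy) {
        long total = 0;
        for (int r = 0; r < rows; r++) {
            Vector<String> headers = copy ? view.getColumnHeadersCopy() : view.getColumnHeaders();
            total += headers.get(r % headers.size()).length();
        }
        return total;
    }

    public static void main(String[] args) {
        Vector<String> headers = new Vector<>();
        for (int c = 0; c < 100; c++) {
            headers.add("col" + c);
        }
        HeaderAccessSketch view = new HeaderAccessSketch(headers);

        long t0 = System.nanoTime();
        long shared = sumHeaderLengths(view, 1_000_000, false);
        long t1 = System.nanoTime();
        long copied = sumHeaderLengths(view, 1_000_000, true);
        long t2 = System.nanoTime();

        // Both loops compute the same total; only the allocation work differs.
        System.out.println("shared: " + (t1 - t0) / 1_000_000 + " ms, total " + shared);
        System.out.println("copied: " + (t2 - t1) / 1_000_000 + " ms, total " + copied);
    }
}
```

If callers can't be trusted with the backing Vector, a middle ground is to keep the copying accessor but call it once before the loop rather than once per row.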