Refactor DDO Loader for Long-Term Usage

DealPerf Svc

This weekend it became clear that the demand-data files we load five times a day were becoming a problem - they were blowing out the 16GB JVM process. When I first wrote the loader, the files had 1.7 million records, and that fit very nicely in 16GB, with room to spare. Now we're pushing 6 million records a file, 16GB isn't enough to do the job, and the ensuing garbage collection was eating up more than 10 cores and bringing the box to its knees.

Very bad.

So I needed to look at how I was loading the data and break it up so that the bulk of the work is done on a separate machine, and the only thing that happens on the database server is the COPY of the CSV file into the table. That's by far the fastest way to load 6 million rows while keeping the database online and taking requests the whole time.
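
A minimal sketch of that load step, assuming a PostgreSQL COPY - the database, table, and file names here are just placeholders, not the real ones:

```bash
# Sketch of the COPY load step - database, table, and path are hypothetical.
# psql's \copy runs a server-side COPY but streams the file over the client
# connection, so the whole 6M-row load is one fast bulk insert.
psql -d dealperf \
     -c "\copy demand_data FROM '/data/incoming/demand_data.csv' WITH (FORMAT csv, HEADER true)"
```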

I realized that it wasn't all that hard - moving the bulk of the processing to a new box was easy, and then I just had to update a few crontabs and scripts with the new file locations. From there I simply scp the finished file from the processing box to the database server, and then use ssh to kick off the loader.
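
The hand-off itself is just a couple of commands. Here's a rough sketch of what the cron-driven script might look like - the hostnames, paths, and loader script name are all placeholders:

```bash
#!/bin/bash
# Sketch of the hand-off from the processing box to the database server.
# Hostnames, paths, and the loader script name are placeholders.
CSV=/data/out/demand_data.csv
DB_HOST=db-server
DB_PATH=/data/incoming/demand_data.csv

# push the finished CSV over to the database server...
scp "$CSV" "$DB_HOST:$DB_PATH"

# ...then kick off the loader there to COPY it into the table
ssh "$DB_HOST" "/usr/local/bin/load_demand_data.sh $DB_PATH"
```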

Really not that bad. I still want to walk through a complete cycle end-to-end, but that should be straightforward.