Doing a Lot of Scut Work
Today has been a lot of scut work - clean-up stuff that has been sitting in the queue for months but no one wants to do. But if the project is really going to work, someone actually has to do it. Since I finished up a lot of tasks today, it seemed like a natural time to just get to it and clear the decks.
None of this is hard stuff. It's just not very fun, and it takes time.
First off, I followed up on a request for backups to be made of all the database machines we use in the group. This includes CouchDB as well as PostgreSQL. It's nice that the install of each of these packages places the data files in the largest partition on our boxes, /var/groupon/, so it's simple to just back up that partition. I submitted the request a few days ago but hadn't heard anything back, so I followed up asking if I was going to get a completion notice when the backups were working.
Response was: "Yup, likely tomorrow". Good enough.
Next, we needed to get Nagios monitoring of the free disk space on the boxes as well, so that if a process goes crazy and starts to fill up the disk, we can fix it before it becomes a database killer. This has happened to us on several occasions, and it's something to be avoided, as the main processes can't run if the database is offline.
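For reference, the kind of check I have in mind can be sketched in a few lines of Python. This is only a rough illustration, not what actually got deployed: the 80% and 90% thresholds, the function names, and the default path are my assumptions here. The only fixed part is the standard Nagios plugin convention of exit code 0 for OK, 1 for WARNING, 2 for CRITICAL and 3 for UNKNOWN.

```python
import shutil

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def classify(used_pct, warn=80.0, crit=90.0):
    """Map a used-space percentage to a Nagios status code and label.

    The default thresholds are illustrative assumptions.
    """
    if used_pct >= crit:
        return CRITICAL, "CRITICAL"
    if used_pct >= warn:
        return WARNING, "WARNING"
    return OK, "OK"


def check_disk(path, warn=80.0, crit=90.0):
    """Measure how full the filesystem holding `path` is."""
    usage = shutil.disk_usage(path)
    used_pct = 100.0 * usage.used / usage.total
    code, label = classify(used_pct, warn, crit)
    return code, f"DISK {label} - {path} is {used_pct:.1f}% used"


def main(argv):
    """Plugin entry point: print one status line, return the exit code."""
    # /var/groupon is the data partition mentioned above.
    path = argv[1] if len(argv) > 1 else "/var/groupon"
    try:
        code, message = check_disk(path)
    except OSError as exc:
        code, message = UNKNOWN, f"DISK UNKNOWN - {exc}"
    print(message)
    return code

# To run it as a plugin: import sys; sys.exit(main(sys.argv))
```

With these assumed thresholds, a box at 93% used would come back CRITICAL. Nagios ships its own disk check plugin as well, so in practice a hand-rolled script like this is probably only worth it if the stock one doesn't fit.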
Finally, I needed to do what I could to compact the CouchDB databases on the production and UAT hosts, because we're at 93% disk space used and there's very little headroom left. If compacting the views doesn't work, then I'm going to just drop the database and start fresh. We have a replica of the production data, and with the backups (above) we'd be able to go back to it anyway. I'd rather not do that, but it's certainly a surefire way to get the space back.
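The compaction itself is just a few POSTs against CouchDB's HTTP API: `/{db}/_compact` for the database file, `/{db}/_compact/{design-doc}` for the views of one design document, and `/{db}/_view_cleanup` to remove index files for views that no longer exist. A minimal sketch of how that could be scripted is below. The host, database name, and design document name are placeholders I made up, and I've left out authentication, which these calls normally need (admin credentials).

```python
import json
import urllib.request
from urllib.parse import quote


def compaction_urls(base_url, db, design_docs=()):
    """Build the maintenance URLs for one database, in the order to call them."""
    db_path = f"{base_url.rstrip('/')}/{quote(db, safe='')}"
    urls = [f"{db_path}/_compact"]
    # One view-compaction call per design document (names without "_design/").
    urls += [f"{db_path}/_compact/{quote(d, safe='')}" for d in design_docs]
    urls.append(f"{db_path}/_view_cleanup")
    return urls


def trigger(url, timeout=30):
    """POST to a maintenance endpoint and return the decoded JSON reply."""
    request = urllib.request.Request(
        url,
        data=b"",
        method="POST",
        # CouchDB requires a JSON content type on these requests.
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def compact_all(base_url, db, design_docs=()):
    """Kick off every maintenance task; compaction then runs in the background."""
    return {url: trigger(url) for url in compaction_urls(base_url, db, design_docs)}

# Hypothetical usage (placeholder host and names):
# compact_all("http://localhost:5984", "mydb", ["reports"])
```

One thing to keep in mind at 93% used: compaction writes a fresh copy of the file before swapping it in, so it needs some free space of its own to finish.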
It's not glamorous work, but it needs to be done, and no one else is picking it up, so I might as well just do it all and have it done.