Backing up
2007-11-17 12:11 pm
So I finally decided to get serious about off-site backups: i.e., stop planning and start doing. This was assisted by the fact that work finally got around to installing a second T1 line yesterday -- my upstream bandwidth at home is barely sufficient to keep up with incremental backups; it would be hopeless for uploading the roughly 80GB already on the fileserver and needing to be backed up. (There's a lot that doesn't need to be backed up, fortunately.)
Sometime last Friday I dragged home a bare 500GB drive that was sitting around at work (originally intended for an outside-the-firewall server that never quite got off the ground), stuck it into a USB/eSATA enclosure, and loaded it up. Yesterday I mounted it on my desktop machine, and started uploading to my server at Dreamhost last night. Got about 250KB/s, which works out to about 890MB/h.
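As a quick sanity check on those numbers, here is the arithmetic as a small Python sketch. It assumes decimal units (1MB = 1000KB, 1GB = 1000MB) and reads the transfer rate as kilobytes per second, which is the only reading that matches the per-hour figure. The function names are made up for illustration.

```python
def mb_per_hour(kb_per_second: float) -> float:
    """Convert a rate in KB/s to MB/h, using decimal units (1 MB = 1000 KB)."""
    return kb_per_second * 3600 / 1000


def hours_to_upload(total_gb: float, rate_mb_per_hour: float) -> float:
    """Hours needed to move total_gb at the given rate (1 GB = 1000 MB)."""
    return total_gb * 1000 / rate_mb_per_hour


if __name__ == "__main__":
    print(f"{mb_per_hour(250):.0f} MB/h at 250 KB/s")
    print(f"{hours_to_upload(80, 890):.0f} hours for 80 GB at 890 MB/h")
```

A flat 250KB/s comes to 900MB/h, so "about 890" fits a rate slightly under 250. At that pace the full 80GB would need roughly 90 hours of continuous transfer, which is why doing it in pieces makes sense.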
I'm doing it in pieces, of course: the web master directories last night, then my working directories today -- which amount to about 10GB, excluding the Audacity projects. Those are another 60GB -- I'll do those a little bit at a time, at night, with bandwidth limiting.
At that point, the only thing left will be the /home
partition -- I can't do that until I have my planned encryption scheme in
place. (Although in the interim I can fake it with an encrypted
tar file.)
The general idea is to categorize directories according to whether they need to be backed up at all, and how much protection they need.
- /share is stuff that's basically public -- it consists mainly of website staging directories and open-source software projects. A few files may be password-protected or otherwise obscured on the web, but only for copyright and licensing reasons, not because there's anything confidential in them.
- /users is for users' working directories: works in progress, including concert recordings, studio tracks, and the like. Much of it eventually gets published (by copying it into /share and uploading it to a website). Again, it's only copyright and licensing considerations that keep it from being completely public.
- /home contains users' actual home directories. These include things like stored passwords, private email, digital camera images, etc. This is potentially sensitive, so it needs encryption if it's ever taken offsite. There's also a directory called /home/Config, which is where I put backups of all my machines' /etc directories, which of course contain passwords, host keys, and other equally sensitive information.
There's other stuff, of course: downloads, ripped CDs, installed packages, and so on. I'm already backing that up locally, and I'll keep a copy at work, but there's no hurry about uploading it.
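The categories above amount to a small lookup table from directory to backup policy. A minimal sketch follows; the policy labels (`offsite-clear`, `offsite-encrypted`, `local-only`) and the `policy_for` helper are hypothetical names invented for illustration, not anything the real setup uses.

```python
from pathlib import PurePosixPath

# Hypothetical policy labels; the post describes the categories, not these names.
POLICIES = {
    "/share": "offsite-clear",      # basically public
    "/users": "offsite-clear",      # works in progress, nothing confidential
    "/home": "offsite-encrypted",   # passwords, private mail, /home/Config
}
DEFAULT_POLICY = "local-only"       # downloads, ripped CDs, installed packages


def policy_for(path: str) -> str:
    """Return the policy of the path itself or its nearest listed ancestor."""
    p = PurePosixPath(path)
    for candidate in (p, *p.parents):
        policy = POLICIES.get(str(candidate))
        if policy is not None:
            return policy
    return DEFAULT_POLICY
```

Because the lookup walks up through parent directories, anything under /home/Config inherits the encrypted policy from /home without needing its own entry.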
Backing up /share and /users "in the clear" is
easy and moderately safe. It loses ownership information when it goes up
on Dreamhost, but that's easily reconstructed.
The plan for /home is to use key-per-file encryption: a
file's encryption key is its MD5 hash, and its public identifier is the
SHA-1 hash of the resulting encrypted "blob". That's trivial; directories
are a little more complicated because you have to remember the filename,
the other metadata, and of course the ID and key. This results in a new
blob for each directory; all you have to do to keep track is to encrypt
the ID and key of each user's home directory blob, using their public key.
I know how to do it, and I've been planning this for at least two years;
I'm just lazy.
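To make the scheme concrete, here is a rough Python sketch of the file and directory steps. Several things in it are assumptions rather than part of the plan: the cipher is a placeholder (a SHA-256 keystream XOR, chosen only because it is deterministic and needs nothing outside the standard library), the directory blob is serialized as JSON, and all function names are invented. The last step, encrypting the top-level ID and key with each user's public key, is left out.

```python
import hashlib
import json


def xor_keystream(key: bytes, data: bytes) -> bytes:
    """Placeholder cipher (an assumption, not the planned one): XOR with a
    SHA-256 counter-mode keystream. Deterministic and self-inverse, but not
    vetted cryptography; substitute a real cipher for actual backups."""
    out = bytearray()
    for block_no, start in enumerate(range(0, len(data), 32)):
        pad = hashlib.sha256(key + block_no.to_bytes(8, "big")).digest()
        out.extend(b ^ p for b, p in zip(data[start:start + 32], pad))
    return bytes(out)


def put_blob(content: bytes, store: dict) -> dict:
    """Encrypt content under its own MD5 hash; file the blob under its SHA-1."""
    key = hashlib.md5(content).digest()
    blob = xor_keystream(key, content)
    blob_id = hashlib.sha1(blob).hexdigest()
    store[blob_id] = blob
    return {"id": blob_id, "key": key.hex()}


def get_blob(ref: dict, store: dict) -> bytes:
    """Fetch a blob by ID, check it against its ID, and decrypt it."""
    blob = store[ref["id"]]
    if hashlib.sha1(blob).hexdigest() != ref["id"]:
        raise ValueError("blob does not match its ID")
    return xor_keystream(bytes.fromhex(ref["key"]), blob)


def put_directory(entries: dict, store: dict) -> dict:
    """entries maps filename -> {"meta": ..., "ref": ...}. The listing itself
    becomes one more blob, so only its ID and key need remembering."""
    listing = json.dumps(entries, sort_keys=True).encode("utf-8")
    return put_blob(listing, store)
```

Here `store` is a plain dict standing in for the remote server. Since the key is derived from the content, storing the same file twice yields the same blob and the same ID, so duplicates cost nothing extra, and the server only ever sees IDs and encrypted bytes.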
The other bit of complication is that the mh-style mail
folders I'm using have hard links in them; it's probably not worth
worrying about, at least at first. If I ever have to restore from the
off-site backups I can reconstruct the links (not perfectly, perhaps, but
well enough) because hash-based IDs are globally unique.
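One way the link reconstruction might look after a restore, sketched in Python: group the restored paths by blob ID and hard-link every group back together. The manifest format (relative path mapped to ID) and the `relink_duplicates` name are assumptions made for the example.

```python
import os
import tempfile
from collections import defaultdict


def relink_duplicates(manifest: dict, root: str) -> int:
    """manifest maps restored relative paths to blob IDs. Paths that share an
    ID are replaced with hard links to the first such path; returns the number
    of links made. Files that merely had identical content get linked too,
    which is the "not perfectly" caveat."""
    by_id = defaultdict(list)
    for rel_path, blob_id in sorted(manifest.items()):
        by_id[blob_id].append(os.path.join(root, rel_path))

    made = 0
    for paths in by_id.values():
        first, *others = paths
        for other in others:
            os.remove(other)        # drop the separate copy...
            os.link(first, other)   # ...and point the name at the first file
            made += 1
    return made


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        for name in ("inbox-1", "lists-7"):
            with open(os.path.join(root, name), "w") as f:
                f.write("same message\n")
        print(relink_duplicates({"inbox-1": "id-a", "lists-7": "id-a"}, root))
```

This needs a filesystem that supports hard links, and every path in a group has to live on the same filesystem.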
A trivial stopgap, as I mentioned above, would be to just tar
up and encrypt each home directory and save those.
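A sketch of that stopgap in Python, using the standard `tarfile` module for the archive. The encryption step is only assembled as a command line, and it assumes GnuPG with symmetric (passphrase) encryption; no tool is actually named in the plan, so treat that part as one possible choice. The function names and the `alice` directory are hypothetical.

```python
import os
import tarfile
import tempfile


def tar_home(home_dir: str, archive_path: str) -> str:
    """Write a gzip-compressed tar of home_dir to archive_path."""
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(home_dir, arcname=os.path.basename(os.path.normpath(home_dir)))
    return archive_path


def gpg_encrypt_command(archive_path: str) -> list:
    """Build (but do not run) a GnuPG symmetric-encryption command.
    GnuPG is an assumed tool here; the post does not name one."""
    return ["gpg", "--cipher-algo", "AES256",
            "--output", archive_path + ".gpg",
            "--symmetric", archive_path]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        home = os.path.join(scratch, "alice")  # hypothetical user
        os.mkdir(home)
        with open(os.path.join(home, "notes.txt"), "w") as f:
            f.write("private\n")
        archive = tar_home(home, os.path.join(scratch, "alice.tar.gz"))
        print(" ".join(gpg_encrypt_command(archive)))
```

A side benefit of the tar route is that the archive records ownership and hard links, so neither has to be reconstructed after a restore. The cost is that any change to a home directory means re-uploading the whole archive.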
Hopefully I'll have everything uploaded by the end of the year, which would be nice.
no subject
Date: 2007-11-19 03:07 am (UTC)
no subject
Date: 2007-11-19 06:59 am (UTC)
For comparison, a T1 is roughly the speed of a carrier pigeon hauling a couple of SD cards across town.