mdlbear: (technonerdmonster)

It's been a while since I described the way I do backups -- in fact, the only public document I could find on the subject was written in 2006, and things have changed a great deal since then. I believe there have been a few mentions in Dreamwidth and elsewhere, but in this calamitous year it seems prudent to do it again. Especially since I'm starting to feel mortal, and starting to think that some day one of my kids is going to have to grovel through the whole mess and try to make sense of it. (Whether they'll find anything worth keeping or even worth the trouble of looking is, of course, an open question.)

My home file server, a small Linux box called Nova, is backed up by simply copying (almost -- see below) its entire disk to an external hard drive every night. (It's done using rsync, which is efficient because it skips over everything that hasn't been changed since the last copy.) When the disk crashes (it's almost always the internal disk, because the external mirror is idle most of the time) I can (and have, several times) swap in the external drive, make it bootable, order a new drive for the mirror, and I'm done. Or, more likely, I buy a new pair of drives that are twice as big for half the price, copy everything, archive the better of the old drives, and update the archive occasionally.
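For anyone who wants to try the nightly mirror, here is a minimal sketch of what such an rsync invocation could look like. It only builds the command, it doesn't run it. The mount point, the exclude patterns, and the use of --delete are my assumptions for illustration, not a copy of the actual script on Nova.

```python
import shlex


def mirror_command(source="/", dest="/mnt/mirror/",
                   excludes=("/tmp/*", "/var/cache/*")):
    """Build (but do not run) an rsync command that mirrors source onto dest.

    The paths and exclude patterns are hypothetical placeholders.
    """
    cmd = [
        "rsync",
        "-a",        # archive mode: recurse, keep permissions, times, symlinks
        "-x",        # stay on one filesystem (skips /proc, /sys, other mounts)
        "--delete",  # assumption: drop files from the mirror once they leave the source
    ]
    cmd += [f"--exclude={pattern}" for pattern in excludes]
    cmd += [source, dest]
    return cmd


if __name__ == "__main__":
    # Print the command so it can be inspected before it goes into cron.
    print(shlex.join(mirror_command()))
```

Building the argument list separately from running it makes it easy to check by eye before trusting it with a whole disk.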

That's not very interesting, but it's not the whole story. I used to make incremental backups -- instead of the mirror drive being an exact copy of the main one, it was a sequence of snapshots (like Apple's Time Machine, for example). There were some problems with that, including the fact that, because of the way the snapshots were made (using cp -l to copy directories but leave hard links to the files that hadn't changed), it took more space than it needed to, and made the backup disk very difficult -- not to mention slow -- to copy if it started flaking out. There are ways of getting around those problems now, but I don't need them.
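To make the hard-link trick concrete, here is a rough Python sketch of the same idea. It is not the cp -l script I used; the function name and the compare-by-content step are assumptions chosen to keep the example short. Unchanged files become hard links into the previous snapshot, and changed or new files get a fresh copy.

```python
import filecmp
import os
import shutil


def snapshot(source, previous, new):
    """Create snapshot `new` of `source`, hard-linking files unchanged since `previous`."""
    for dirpath, _dirnames, filenames in os.walk(source):
        rel = os.path.relpath(dirpath, source)
        # Every snapshot gets its own full directory tree, even if nothing changed.
        os.makedirs(os.path.normpath(os.path.join(new, rel)), exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            old = os.path.normpath(os.path.join(previous, rel, name))
            dst = os.path.normpath(os.path.join(new, rel, name))
            if os.path.isfile(old) and filecmp.cmp(src, old, shallow=False):
                os.link(old, dst)        # unchanged: share the data with the old snapshot
            else:
                shutil.copy2(src, dst)   # new or changed: store a fresh copy
```

Note that each snapshot still carries a complete set of directories and directory entries, which is probably a large part of why a disk full of these is so slow to copy.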

The classic solution is to keep copies offsite. But I can do better than that because I already have a web host, and I have Git. I need to back up a little.

I noticed that almost everything I was backing up fell into one of three categories:

  1. Files I keep under version control.
  2. Files (mostly large ones, like audio recordings) that never change after they've been created -- recordings of past concerts, my collection of ripped CDs, the masters for my CD, and so on. I accumulate more of them as time goes by, but most of the old ones stick around.
  3. Files I can reconstruct, or that are purely ephemeral -- my browser cache, build products like PDFs, executable code, downloaded install CDs, and of course the entire OS, which I can re-install any time I need to in under an hour.

Git's biggest advantage for both version control and backups is that it's distributed -- each working directory has its own repository, and you can have shared repositories as well. In effect, every repository is a backup. In my case the shared repositories are in the cloud on Dreamhost, my web host. There are working trees on Nova (the file server) and on one or more laptops. A few of the more interesting ones have public copies on GitLab and/or GitHub as well. So that takes care of Group 1.

The main reason for using incremental backup or version control is so that you can go back to earlier versions of something if it gets messed up. But the files in Group 2 don't change, they just accumulate. So I put all of the files in Group 2 -- the big ones -- into the same directory tree as the Git working trees; the only difference is that they don't have an associated Git repo. I keep thinking I should set up git-annex to manage them, but it doesn't seem necessary. The workflow is very similar to the Git workflow: add something (typically on a laptop), then push it to a shared server. The Rsync commands are in a Makefile, so I don't have to remember them: I just make rsync. (Rsync doesn't copy anything that is already at the destination and hasn't changed since the previous run, and by default it ignores files on the destination that don't have corresponding source files. So I don't have to have a complete copy of my concert recordings (for example) on my laptop, just the one I just made.)
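The two Rsync behaviors in that parenthetical can be imitated in a few lines of Python. This is only a simulation of the default "quick check" (same size and same modification time means skip), written to show why a nearly empty laptop directory is safe to push. The function name is made up, and real Rsync does a good deal more.

```python
import os


def files_to_send(source, dest):
    """Return the relative paths that a size-and-mtime quick check would transfer.

    Files that exist only under `dest` are never examined, which mirrors
    rsync's default of leaving them alone unless --delete is given.
    """
    to_send = []
    for dirpath, _dirnames, filenames in os.walk(source):
        for name in filenames:
            src = os.path.join(dirpath, name)
            rel = os.path.relpath(src, source)
            dst = os.path.join(dest, rel)
            if not os.path.exists(dst):
                to_send.append(rel)          # missing on the destination
                continue
            s, d = os.stat(src), os.stat(dst)
            if s.st_size != d.st_size or int(s.st_mtime) != int(d.st_mtime):
                to_send.append(rel)          # size or timestamp differs
    return sorted(to_send)
```

With one new recording on the laptop and years of concerts on the server, the list contains just the new file, and nothing on the server is touched.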

That leaves Group 3 -- the files that don't have to be backed up because they can be reconstructed from version-controlled sources. All of my working trees include a Makefile -- in most cases it's a link to MakeStuff/Makefile -- that builds and installs whatever that tree needs. Programs, web pages, songbooks, what have you. Initial setup of a new machine is done by a package called Honu (Hawaiian for the green sea turtle), which I described a little over a year ago in Sable and the turtles: laptop configuration made easy.

The end result is that "backups" are basically a side-effect of the way I normally work, with frequent small commits that are pushed almost immediately to a shared repo on Dreamhost. The workflow for large files, especially recording projects, is similar, working on my laptop and backing up with Rsync to the file server as I go along. When things are ready, they go up to the web host. Make targets push and rsync simplify the process. Going in the opposite direction, the pull-all command updates everything from the shared repos.
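For the curious, something like pull-all can be sketched as follows. This is not the real command, just an illustration of the idea under my own assumptions: walk a directory tree, treat anything containing a .git entry as a working tree, and run git pull there. The --ff-only flag and the dry-run default are choices I made for the example.

```python
import os
import subprocess


def find_working_trees(root):
    """Yield directories under `root` that contain a .git entry."""
    for dirpath, dirnames, filenames in os.walk(root):
        if ".git" in dirnames or ".git" in filenames:
            yield dirpath
            dirnames[:] = []  # do not descend into a working tree (nested repos are skipped)


def pull_all(root, dry_run=True):
    """Return the git commands for every working tree; run them unless dry_run."""
    commands = []
    for tree in sorted(find_working_trees(root)):
        cmd = ["git", "-C", tree, "pull", "--ff-only"]
        commands.append(cmd)
        if not dry_run:
            subprocess.run(cmd, check=False)  # keep going if one tree fails
    return commands
```

Directories without a repository, such as the big-file trees in Group 2, are simply passed over.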

Your mileage may vary.


Another fine post from The Computer Curmudgeon (also at computer-curmudgeon.com).
Donation buttons in profile.

mdlbear: blue fractal bear with text "since 2002" (Default)

A few days ago I got a comment on my weekly post that went "Oohh, you're doing what looks to me like a bullet journal? Only online." So I wrote a quick explanation. And then I realized that I might be doing something unusual that I ought to write up in more detail. So here you are:

The Legend

Let's start off with the file called Journals/Dog/legend.do:

			       ===legend.do===

= item flag notation for to.do and to.done files:

= notation for to.do and to.done items:
  = note: keep  o to do  * done  x abandoned  ~ modified  . in progress
  & added after completion  (recurring items get * when completed)
  $ financial transaction (flagged as  o before completion)
  ? query/decision...  - choice  + chosen  ->chosen
  @ link/research
  ! emotion noted at the time, or soon after.  NOT added the next morning; 
    I'm trying to pay more attention at the time
  | body sensation worthy of note: pain, noticeable change...
    (more recently replaced by %; should maybe go back to |)
  : observation or external event.  Weather, news, etc
    + external observation with positive emotional content
    - external observation with negative emotional content
  % observation/insight about myself
  # meta - flags, flist, filters, ...
  <b>...something I feel good about...</b> (may be added next day)
  <i>...something I feel bad about...</i>
  [ ... ] delete from public posts
  ... ongoing items
  " quotation
  ' interior dialog

= Notation for meetings and conversations:
  <- point to bring up.  After meeting, point to bring up next time
  *- point brought up
  x- point not brought up
  ~- point partially brought up, or brought up in different form
  &- additional point raised  
  -> information/point raised by someone else/consequence/resolution
  => action item for me
  =* action item done
  <= action item for somebody else.

===

The History

My usage has shifted a little over the years. I first started posting "to.do" items around 2006, though I'd undoubtedly been using at least the o and * flags for years before that. At first, since I was part of a support group working on procrastination and avoidance, I used it as an accountability thing: I would post a list of open items, followed (hopefully) by the items as they got checked in. It was a little discouraging, until somebody suggested just posting about what I'd done. That led to &, and my expanded use of the file as more a log than a to-do list and calendar.

Whenever the list of "done" items got too long, I would move them into a ".done" file -- the first one I have is 2006.done. In 2009 I switched to quarterly archives; by 2009/q4.done the file had most of its present features. By 2011 I was archiving monthly. I don't remember offhand when I stopped making daily posts in LJ and switched to weekly.

Sometime in September of 2011 I decided that the set of unfinished and probably never-to-be-completed items had gotten too long, and moved it to wibnif.do, as in "Wouldn't It Be Nice If..." My present Makefile plugin reports the current number of unfinished items in to.do and wibnif.do; the current numbers are 70 and 126 respectively.
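The counting itself is simple enough to sketch in Python. This is not the Makefile plugin, only a guess at the core of it, and it assumes an open item is any line whose flag, indented two spaces, is a lone o.

```python
import re

# Assumption: an open item is a line starting with two spaces, "o", then a space.
OPEN_ITEM = re.compile(r"^  o ")


def count_open(lines):
    """Count the lines flagged as still to do."""
    return sum(1 for line in lines if OPEN_ITEM.match(line))


if __name__ == "__main__":
    import sys
    # Usage (hypothetical): python count_open.py to.do wibnif.do
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8") as f:
            print(path, count_open(f))
```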

The Files

So there's that. The file is called to.do, and edited with emacs. There are a couple of important marker lines in it:

=========================================================================================+
Ongoing:                                                                             89->|
recurring items and long-term goals go here
=then===================================================================================>|
this contains entries from the first of the month to the present
=now===-^-===this-month-v-==============================================================>|
scheduled items for later this month
=later===-v-===this-month-^-============================================================>|
scheduled items after this month
=sometime===-V-===later-^-==============================================================>|
items with no specific due date
=Done-v-================================================================================>|

Dates, in the form mmddWw (e.g., 0122Su), start in the first column; flag characters are indented two spaces. The marker at column 89 makes it easy to properly size the editor window when I first open it after rebooting; it's where lines wrap.
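Since the date stamps carry no year, a program reading the file has to be told which year it is looking at; the weekday then works as a handy sanity check. Here is a small sketch of a parser for the mmddWw form. Only Su appears in the example above, so the other two-letter weekday codes are my assumption.

```python
import re
from datetime import date

# Assumed weekday codes, Monday first to match date.weekday().
WEEKDAYS = ["Mo", "Tu", "We", "Th", "Fr", "Sa", "Su"]
DATE_LINE = re.compile(r"^(\d{2})(\d{2})(" + "|".join(WEEKDAYS) + r")\b")


def parse_date(line, year):
    """Return the date at the start of `line`, or None if it has no date stamp.

    Raises ValueError if the weekday code does not match the calendar.
    """
    m = DATE_LINE.match(line)
    if m is None:
        return None
    day = date(year, int(m.group(1)), int(m.group(2)))
    if WEEKDAYS[day.weekday()] != m.group(3):
        raise ValueError(f"{line[:6]} is not a {m.group(3)} in {year}")
    return day
```

For example, parse_date("0122Su", 2017) gives January 22, 2017, which was indeed a Sunday; a flag line indented two spaces gives None.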

I'll put approximately-scheduled items in the this-month and later sections after the dated entries, and a few of the more important ones above =now. That doesn't keep me from procrastinating them, but it does help keep them where they'll be noticed.

Note that, except for the breakpoint at =Done, entries are in chronological order from top to bottom. That makes this a log, not a blog or feed. My to.do and its associated history (see below) are one of a handful of journal-like collections under my Journals directory; the to.Do lOG is kept in a directory called Dog.

The Archives

By now, I have a fairly well-established routine:

  • I maintain the to.do file using emacs, of course.
  • Sometime on Sunday, I move the last week's worth of entries from the working location near the top of the file, to the end.
  • At this point I still have the week's entries in the Region (emacs terminology for the current selection). I move point down two lines to scoop up the HTML boilerplate that I'll need for my weekly post, and copy (M-w).
  • Then I run lj-update, currently bound to M-L, and yank into the body. The boilerplate is arranged so that all I have to do is move back up two lines, cut, down one, and yank.
  • From there it's an easy step to go back to the first line (which is invariably the start date) copy it, and yank it into the subject line.
  • Write my summary. Edit out any [...] sections, if necessary.
  • Post.

Then,

  • Every month -- actually, on the first Sunday of the month, after making my weekly post -- I move the month's entries to yyyy/mm.done.
  • Every so often I go through and pull out obsolete entries, marking them with * or x as appropriate, and put them after the preceding week's entries at the end of the file.
  • Every year, on New Year's Eve, I gather up my list of goals and make my end-of-the-year post.
  • The next day, I cons up my new list of goals and make a New Year's post.
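The monthly step is mostly date arithmetic, which can be sketched like this. I'm assuming here that "the month's entries" means the month that just ended, so the archive made on the first Sunday of February is 2017/01.done; the function names are invented for the example.

```python
from datetime import date, timedelta


def first_sunday(year, month):
    """Return the first Sunday of the given month."""
    first = date(year, month, 1)
    # date.weekday() counts Monday as 0 and Sunday as 6.
    return first + timedelta(days=(6 - first.weekday()) % 7)


def archive_path(day):
    """Return the yyyy/mm.done path for the month before `day` (an assumption)."""
    last_of_previous = day.replace(day=1) - timedelta(days=1)
    return f"{last_of_previous.year:04d}/{last_of_previous.month:02d}.done"
```

The January case is the one worth checking by hand, since the archive lands in the previous year's directory.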

Variations

I keep other, project-specific, to.do files. Most of them are much simpler, with undated items above the =done line (which is usually just a line of equal signs), and dated items after it in what I now call a "work log". It's convenient, because I can just go to the end of the file and make an entry, but it wouldn't work nearly as well if I had to schedule things.
