[nSLUG] Backups: dealing with large, growing photo collections
synrg at sanctuary.nslug.ns.ca
Tue Mar 30 13:28:25 ADT 2010
On 03/30/2010 01:37 PM, Richard Bonner wrote:
> On Tue, 30 Mar 2010, Ben Armstrong wrote:
>> You've got to be kidding. OK, I won't disagree that backups are a good
>> thing, but as my 1 terabyte drive fills up, you would counsel me to use
>> such small media?
> *** No, but we were talking about those with older, used computers
> with considerably less storage areas. (-:
>> Even the largest of these, flash drives, would be prohibitively
>> expensive at that size.
> *** Costs are dropping, but that aside, one only need back up data,
> not programs. Even with movies, one could transfer them to DVD.
/home is almost entirely data, so that's all we're talking about. Even
after dropping junk/transient material, it still exceeds a "comfortable"
number of DVDs to archive. That's because it includes years of
accumulated photos; in recent years the resolution of the cameras owned
by family members has been rising, and they're starting to shoot
video clips with their cameras too.
> *** My guess is that many of those photos can be dumped. People
> have a tendency to keep too many photos taken milliseconds apart,
> plus all the duds, too. )-:.
I encourage people to weed those out. But it's a losing battle. Every
inch of footage (as much as you can say that digital videos can be
counted in inches :) shot by our aspiring videographers is precious and
could be mixed and remixed at a later date to make some new creation.
Disk drives are large and cheap. Time is not so cheap. And as a
parent, as well as a sys admin, "pick your battles" is my motto. I
could be a BOFH and put quotas on /home, forcing them to do the cleanup,
but I don't think I could handle all of the screaming and moaning.
>> I'm pondering whether a unionfs would help here. I could build a
>> unionfs on top of already-archived material (as the read-only layer)
>> with a read-write layer on top. In my regular system-wide backups, I
>> need only backup the read-write layer. At some point, I would want to
>> re-archive, with the new archived disks containing any updated files,
>> and then re-build the union with the newly archived material forming the
>> new read-only layer. The trick would be to identify and throw out any
>> disks in the archive that have directories that have changed ...
> *** What about using a second TB drive?
That wouldn't fundamentally change the problem. Ultimately, the problem
is not "where do I put all of this?" That's obvious, I put it on a
bunch of DVDs. As I indicated, I have plenty. The problem is, how do I:
- archive stuff that is large, slowly grows over time, needs to all be
on the hard drive at once, and yet *sometimes* changes in mostly very
- keep a set of archive disks that stores the whole collection with as
little duplication as possible
My standard backup approach fails here:
A cycle of full & incremental archives
Let's say I do "full" archives only monthly and the incrementals never
actually go to DVD, but rather are copied to the second drive (on a
second machine, so if the whole machine goes up in smoke, the data
The problem with this approach is that in the monthly full archives I
keep rewriting archives of the same old material: all of those years of
accumulated photos, which haven't really changed.
"ok, so make the full cycle longer, say, one year ..."
Well, this is going to put my data on the incrementals at bigger risk of
loss, isn't it? Both drives could go at the same time ...
"ok, then make the full cycle longer, but burn a backup of all
incremental data accumulated that month to disc every month for
Well, now the risk is shifted to the full ... what if last year's full
backup goes bad? Now both it *and* all of my incrementals are worthless.
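
To put rough numbers on the schedules above, here is a minimal sketch that
counts how many DVDs get burned in a year. Everything in it is an
assumption made for illustration: the function names, the 200 GB starting
collection and 5 GB monthly growth in the usage note, and the
simplification that an incremental holds only that month's new data
(changes to old files are ignored). The only fixed figure is the nominal
4.7 GB capacity of a single-layer DVD.

```python
DVD_MB = 4700  # nominal single-layer DVD capacity, in MB


def discs_needed(size_mb, disc_mb=DVD_MB):
    """Whole discs required to hold size_mb of data (integer ceiling)."""
    return -(-size_mb // disc_mb)


def discs_burned_per_year(collection_mb, monthly_growth_mb,
                          full_every_months, burn_incrementals):
    """Count discs burned over 12 months of a full/incremental cycle.

    A full archive is burned whenever month % full_every_months == 0.
    In the other months, one month's worth of new data is burned only
    if burn_incrementals is True (otherwise it stays on the second drive).
    """
    burned = 0
    size = collection_mb
    for month in range(12):
        if month % full_every_months == 0:
            burned += discs_needed(size)               # re-burn everything
        elif burn_incrementals:
            burned += discs_needed(monthly_growth_mb)  # new data only
        size += monthly_growth_mb                      # collection keeps growing
    return burned


if __name__ == "__main__":
    # Hypothetical figures: 200 GB collection growing by 5 GB a month.
    print("monthly full:", discs_burned_per_year(200_000, 5_000, 1, False))
    print("yearly full, incrementals on drive only:",
          discs_burned_per_year(200_000, 5_000, 12, False))
    print("yearly full, incrementals burned monthly:",
          discs_burned_per_year(200_000, 5_000, 12, True))
```

The disc count only captures the re-burning cost. It says nothing about
the risk side of the trade-off, which is the harder part of the problem.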
So it is merely a matter of scheduling. I haven't yet found a way to
rotate the data to archival disks (oh yeah, and while we're on this
topic, my father has urged me to look into "archival quality" disks
because standard DVDs will eventually go bad ...) and replace any
material that has changed on old disks with new as needed (should be
fairly rare) without re-burning a whole set.
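
One possible way to frame this, offered only as a sketch and not as a
tested backup scheme, is to keep a manifest on the hard drive recording
which disc holds the current copy of each file, along with a checksum.
Comparing the manifest against the live tree then shows which files need
to go on the next disc and which old discs hold superseded copies, so
nothing else has to be re-burned. The manifest layout, the disc labels and
the function names below are all hypothetical.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large videos need little memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def scan(root):
    """Map each file's path (relative to root) to its content digest."""
    current = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            current[os.path.relpath(full, root)] = file_digest(full)
    return current


def plan_burn(manifest, current):
    """Compare the archive manifest with the live tree.

    manifest: {relative_path: (disc_label, digest)} for files already on disc
    current:  {relative_path: digest}, e.g. as returned by scan()

    Returns (to_burn, stale):
      to_burn: sorted paths that are new or changed and belong on a new disc
      stale:   {disc_label: [paths]} whose copy on that disc is now
               superseded, or whose file was deleted from the hard drive
    """
    to_burn = []
    stale = {}
    for path, digest in current.items():
        entry = manifest.get(path)
        if entry is None:
            to_burn.append(path)                          # never archived
        elif entry[1] != digest:
            to_burn.append(path)                          # changed since archived
            stale.setdefault(entry[0], []).append(path)
    for path, (disc, _digest) in manifest.items():
        if path not in current:
            stale.setdefault(disc, []).append(path)       # deleted from disk
    return sorted(to_burn), stale
```

A disc would only need replacing once enough of its contents show up in
`stale` to make it worth consolidating; until then the old disc plus the
newer one together still hold the current copy of everything.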
> *** To how many users are you referring?
Fair question. Eight, seven of whom are family members and six of whom
live under my roof.
> *** I dump millisecond-duplicates and duds, and tend to keep the
> rest on CD-ROMs.
And sure, that's another option: just remove the material entirely from
the hard drive once you've archived it to disc, and make the users
pull the disc and copy stuff from it if they want to mess with their
older material. But that adds wear-and-tear to those discs and adds a
barrier to creativity (my users are more likely to reach for material on
the hard drive than go hunting through archival disks, even if they have
a good index of them on the hard drive).
So what I was fishing for, and still haven't heard from anyone about, is
some sort of schedule you have set up for photo archiving to DVD that
works around the whole conventional full/incremental scheduling problem
I outlined above.