[nSLUG] Backups: dealing with large, growing photo collections

Ben Armstrong synrg at sanctuary.nslug.ns.ca
Tue Mar 30 13:28:25 ADT 2010

On 03/30/2010 01:37 PM, Richard Bonner wrote:
> On Tue, 30 Mar 2010, Ben Armstrong wrote:
>> You've got to be kidding.  OK, I won't disagree that backups are a good
>> thing, but as my 1 terabyte drive fills up, you would counsel me to use
>> such small media?
> ***   No, but we were talking about those with older, used computers
> with considerably less storage areas.  (-:

Fair enough.

>> Even the largest of these, flash drives, would be prohibitively
>> expensive at that size.
> ***   Costs are dropping, but that aside, one only need back up data,
> not programs. Even with movies, one could transfer them to DVD.

/home is almost entirely data, so that's all we're talking about.  Even 
after dropping junk/transient material, it still exceeds a "comfortable" 
number of DVDs to archive.  That's because it includes years of
accumulated photos; in recent years the resolutions of cameras owned
by family members have been rising, and they're starting to make
video clips with their cameras too.

> ***   My guess is that many of those photos can be dumped. People
> have a tendency to keep too many photos taken milliseconds apart,
> plus all the duds, too.  )-:.

I encourage people to weed those out.  But it's a losing battle.  Every 
inch of footage (as much as you can say that digital videos can be 
counted in inches :) shot by our aspiring videographers is precious and 
could be mixed and remixed at a later date to make some new creation.  
Disk drives are large and cheap.  Time is not so cheap.  And as a 
parent, as well as a sys admin, "pick your battles" is my motto.  I 
could be a BOFH and put quotas on /home, forcing them to do the cleanup, 
but I don't think I could handle all of the screaming and moaning.

>> I'm pondering whether a unionfs would help here.  I could build a
>> unionfs on top of already-archived material (as the read-only layer)
>> with a read-write layer on top.  In my regular system-wide backups, I
>> need only backup the read-write layer.  At some point, I would want to
>> re-archive, with the new archived disks containing any updated files,
>> and then re-build the union with the newly archived material forming the
>> new read-only layer.  The trick would be to identify and throw out any
>> disks in the archive that have directories that have changed ...
> ***   What about using a second TB drive?

That wouldn't fundamentally change the problem.  Ultimately, the problem
is not "where do I put all of this?"  That's obvious: I put it on a
bunch of DVDs.  As I indicated, I have plenty.  The problem is, how do I:

- archive stuff that is large, slowly grows over time, needs to all be 
on the hard drive at once, and yet *sometimes* changes in mostly very 
small ways
- keep a set of archive discs that stores the whole collection with as
little duplication as possible

My standard backup approach fails here:

A cycle of full & incremental archives

Let's say I do "full" archives only monthly, and the incrementals never
actually go to DVD but are instead copied to the second drive (on a
second machine, so if the whole machine goes up in smoke, the data
survives).

The problem with this approach is that in the monthly full archives, I
keep rewriting archives of the same old material: all of those years of
accumulated photos, which haven't really changed.
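
To put rough numbers on that re-burning cost, here is a minimal sketch.
The 200 GB collection size is a hypothetical figure, not one given in
this thread; the 4.7 GB capacity is the nominal size of a single-layer
DVD, and filesystem overhead is ignored.

```python
DVD_BYTES = 4_700_000_000  # nominal single-layer DVD capacity (decimal GB)


def discs_needed(total_bytes, disc_bytes=DVD_BYTES):
    """Discs required to hold total_bytes, rounding up to whole discs."""
    return (total_bytes + disc_bytes - 1) // disc_bytes


def yearly_burns(collection_bytes, fulls_per_year=12):
    """Discs burned per year if every full archive rewrites everything."""
    return discs_needed(collection_bytes) * fulls_per_year


# Hypothetical collection size, chosen only for illustration.
collection = 200 * 10**9
print(discs_needed(collection))   # 43
print(yearly_burns(collection))   # 516
```

Almost all of those burns would be copies of photos that did not change
between one month and the next.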

"ok, so make the full cycle longer, say, one year ..."

Well, this is going to put my data on the incrementals at bigger risk of
loss, isn't it?  Both drives could go at the same time ...

"ok, then make the full cycle longer, but burn a backup of all 
incremental data accumulated that month to disc every month for 
safekeeping ..."

Well, now the risk is shifted to the full ... what if last year's full
backup goes bad?  Now both it *and* all of my incrementals are worthless.


So it is merely a matter of scheduling.  I haven't yet found a way to
rotate the data to archival discs and replace any material that has
changed on old discs with new as needed (should be fairly rare) without
re-burning a whole set.  (Oh yeah, and while we're on this topic, my
father has urged me to look into "archival quality" discs because
standard DVDs will eventually go bad ...)
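
One possible way to find the discs that need replacing, sketched under
assumptions that are not from this thread: a manifest is saved at burn
time that maps each disc label to the files on it and their SHA-256
digests.  The manifest layout, the disc labels and the function names
below are all hypothetical.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file's contents, read in chunks to keep memory flat."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def scan_tree(root):
    """Map each file's path (relative to root) to its content digest."""
    current = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            current[os.path.relpath(full, root)] = file_digest(full)
    return current


def plan_reburn(manifest, current):
    """Compare burn-time manifests with the current tree.

    manifest: {disc_label: {relpath: digest}}  (assumed format)
    current:  {relpath: digest}, e.g. from scan_tree()

    Returns (stale, to_burn).  stale maps a disc label to the files on it
    that have since changed or been deleted.  to_burn lists current files
    with no up-to-date copy on any disc.
    """
    stale = {}
    archived_ok = set()
    for label, files in manifest.items():
        changed = []
        for path, digest in files.items():
            if current.get(path) == digest:
                archived_ok.add(path)
            else:
                changed.append(path)
        if changed:
            stale[label] = sorted(changed)
    to_burn = sorted(p for p in current if p not in archived_ok)
    return stale, to_burn


# In-memory example with made-up digests, so nothing touches the disk.
manifest = {
    "disc-001": {"2008/beach.jpg": "aaa", "2008/party.jpg": "bbb"},
    "disc-002": {"2009/hike.jpg": "ccc"},
}
current = {
    "2008/beach.jpg": "aaa",
    "2008/party.jpg": "edited",
    "2009/hike.jpg": "ccc",
    "2010/new.jpg": "ddd",
}
stale, to_burn = plan_reburn(manifest, current)
print(stale)    # {'disc-001': ['2008/party.jpg']}
print(to_burn)  # ['2008/party.jpg', '2010/new.jpg']
```

Only the files in to_burn would go onto a new disc.  A stale disc still
holds good copies of its unchanged files, so whether to replace it or
just note the superseded entries is a separate policy decision.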

> ***   To how many users are you referring?

Fair question.  Eight, seven of whom are family members and six of whom 
live under my roof.

> ***   I dump millisecond-duplicates and duds, and tend to keep the
> rest on CD-ROMs.

And sure, that's another option: just remove the material entirely from
the hard drive once you've archived it to disc, and make the users pull
the disc and copy stuff from it if they want to mess with their older
material.  But that adds wear-and-tear to those discs and adds a
barrier to creativity (my users are more likely to reach for material on
the hard drive than go hunting through archival discs, even if they have
a good index of them on the hard drive).

So what I was fishing for, and still haven't heard from anyone about, is 
some sort of schedule you have set up for photo archiving to DVD that 
works around the whole conventional full/incremental scheduling problem 
I outlined above.

