
Storing Digital Memories (Part 1)

It is surprisingly difficult to digitally store personal records such as photos so that they are available for generations. During the last decade, millions of dollars were spent on strategies for preserving digital media in institutions such as public libraries. But what about all the personal records before they are deemed worthy of preservation by institutions?

Looking at my immediate family, the last 10 years of digital photography produced about 500GB of data. And that data is growing exponentially with our appetite for higher resolution (and our need for greater data safety, but we will get to that). With this kind of data accumulation, burning CDs or even DVDs quickly becomes unwieldy. Furthermore, CDs and DVDs might only have a lifetime of two years. So in addition to burning new data to DVDs, one would have to copy everything to new DVDs every two years. Sounds like this could quickly grow into a full-time job, and an incredibly boring one, too!
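To put a rough number on "unwieldy", here is a back-of-the-envelope sketch in Python. The 500GB archive size and the two-year lifetime come from the paragraph above; the disc capacities (4.7GB for a single-layer DVD, 0.7GB for a CD) are nominal values I am assuming, and the function name is made up for illustration.

```python
import math

ARCHIVE_GB = 500           # archive size quoted above
DVD_GB = 4.7               # assumed: nominal single-layer DVD capacity
CD_GB = 0.7                # assumed: nominal CD capacity
DISC_LIFETIME_YEARS = 2    # pessimistic lifetime quoted above


def discs_needed(archive_gb, disc_gb):
    """Number of discs required to hold the whole archive once."""
    return math.ceil(archive_gb / disc_gb)


dvds = discs_needed(ARCHIVE_GB, DVD_GB)   # 107
cds = discs_needed(ARCHIVE_GB, CD_GB)     # 715

# Re-burning everything once per lifetime means this many discs per year,
# before counting any newly taken photos.
dvds_per_year = dvds / DISC_LIFETIME_YEARS  # 53.5
```

So even with a single copy and no growth, that is roughly one DVD burned every week just to stand still.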

So I ended up using disk drives. They are very convenient but they come with their own set of problems. The first problem of course is that disk drives fail and do so surprisingly often (see for example here and here). Mirroring disks is one obvious measure — but not a sufficient one.

It matters how disks are housed and how they are connected: disks running in insufficiently cooled enclosures have a significantly higher failure rate. Disks mirrored across individual FireWire or USB enclosures can easily be disconnected accidentally, which breaks the mirror. Reestablishing the mirror requires a lengthy reconstruction period during which only one copy of the data exists, which exposes the archive to data loss. And finally, disk drive controllers fail, which can lead to weird transient failure modes such as temporarily garbled data, disappearing files, or even permanent data corruption. I currently run four drives in a single four-bay FireWire enclosure that is well-vented (but also very noisy). Each mirrored pair is the ATA master and slave on the same controller, so that a disconnected FireWire cable or a controller failure does not break a mirror.

It also matters how disks are used. It turns out that disks that don’t spin for an extended period of time can develop crystals between the head and the platter. Once these crystals have formed, the disk self-destructs on the next spin-up. More generally, I can only discover the failure of a disk while I use it.

And it matters that disks are writable. I can have my disks mirrored, well housed, cooled, and cabled. I can have my disks well-exercised by a program that, for example, randomly accesses all my data over time. But that program can also corrupt my data. Short of re-introducing the aforementioned DVD backup solution, one approach would be to aggressively write-protect originals. There is a trade-off between frequently using a disk-based archive so that failures are discovered quickly, and restricting the use of the archive to prevent data corruption due to software bugs or human error.
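One way to get both sides of that trade-off is an exerciser that only ever reads. The following is a minimal Python sketch, not the setup I actually run: it records a checksum for every file, clears the write permission bits on the originals, and later re-reads a random sample to check that nothing has changed. All function names here are hypothetical, and storing the manifest somewhere safe outside the archive is left as an assumption.

```python
import hashlib
import os
import random
import stat


def sha256_of(path, chunk_size=1 << 20):
    """Read the whole file in chunks and return its SHA-256 hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:  # opened read-only, never written
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def list_files(root):
    """Return every file under root as a path relative to root."""
    found = []
    for folder, _subdirs, names in os.walk(root):
        for name in names:
            found.append(os.path.relpath(os.path.join(folder, name), root))
    return sorted(found)


def build_manifest(root):
    """Record a checksum per file; run once when originals are added."""
    return {rel: sha256_of(os.path.join(root, rel)) for rel in list_files(root)}


def write_protect(root):
    """Clear the write permission bits on every file under root."""
    no_write = ~(stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH)
    for rel in list_files(root):
        path = os.path.join(root, rel)
        os.chmod(path, os.stat(path).st_mode & no_write)


def scrub(root, manifest, sample_size, rng=random):
    """Re-read a random sample of files; return those missing or changed."""
    names = sorted(manifest)
    sample = rng.sample(names, min(sample_size, len(names)))
    problems = []
    for rel in sample:
        path = os.path.join(root, rel)
        if not os.path.isfile(path) or sha256_of(path) != manifest[rel]:
            problems.append(rel)
    return sorted(problems)
```

Running `scrub` on a schedule with a modest `sample_size` keeps the disks spinning and touches all data over time, while permission bits guard against accidental overwrites. They are no defense against a program running with administrator rights, so this narrows the trade-off rather than removing it.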

For the past three years I have been using mirrored drives, so far without any data loss. I have already migrated to larger disk drives twice, with hardly any effort. However, I noticed that there were weeks when I didn’t spin up the disk drives at all. When I started up the archive after those longer periods, the disks didn’t mount until I had power-cycled the enclosure repeatedly, and the fans started to get really noisy. That concerned me.

In the next part of Storing Digital Memories I will talk about how I ensured more frequent use of the archive by integrating it into an interactive entertainment system in the living room.
