Re: [ARSCLIST] long range file storage
----- Original Message -----
From: "David Seubert" <seubert@xxxxxxxxxxxxxxxx>
> We are currently creating parallel preservation copies, both online and
> on optical media, but eventually I see us phasing out the physical
> media. Before we do that, one thing I feel is necessary and that we have
> been discussing is the integrity of the data stored online. Once the
> data goes onto disk, there is no practical way to manually make sure
> that files haven't become corrupted over time, during a backup and
> restore process, or during a migration from one system to another. We've
> discussed using checksum files created upon ingest that would be
> periodically and automatically compared against the files to ensure that
> nothing has become corrupted. In case of corruption, the original file
> could be restored from tape. I've noticed that the audio files in the
> Internet Archive have associated checksum files so you can make sure
> that the file you have downloaded is identical to the original. I don't
> know if they also use these to ensure data integrity over the long term.
>
> Has anybody looked into this further or implemented this for archiving
> audio files?
>
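On the checksum idea: here is a minimal sketch of how a manifest created
at ingest could later be compared against the files. It is only an
illustration, not a description of what any archive actually runs. The
choice of Python and SHA-256 is my assumption, and the function names
(sha256_of, build_manifest, verify) are hypothetical.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks so that
    large audio files do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """At ingest: map each file's path (relative to root) to its checksum."""
    manifest = {}
    for folder, _subdirs, names in os.walk(root):
        for name in names:
            full = os.path.join(folder, name)
            manifest[os.path.relpath(full, root)] = sha256_of(full)
    return manifest


def verify(root, manifest):
    """Periodic check: list (relative path, problem) for every file that
    is missing or whose checksum no longer matches the manifest."""
    problems = []
    for rel, expected in sorted(manifest.items()):
        full = os.path.join(root, rel)
        if not os.path.isfile(full):
            problems.append((rel, "missing"))
        elif sha256_of(full) != expected:
            problems.append((rel, "corrupted"))
    return problems
```

The manifest would need to be saved somewhere other than the disk it
describes, and verify() run on a schedule and again after every restore
or migration. Any file it reports is a candidate for restoring from tape.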
Note that hard drives are rated by "MTBF", or "Mean Time Between
Failures". This implies that a failure will, at some point, be inevitable,
which is not surprising considering the precision involved. I have
personally lost 400MB of data on a hard drive due to a controller
failure. Most of it could be recovered, of course, but only at
great expense. We also know that magnetic media can be easily
corrupted by exposure to strong magnetic fields, and that physical
damage can make it impossible for the disk to rotate within its
enclosure. Magnetic storage is therefore less than perfect or eternal!
As well, using any type of media for long-term preservation assumes,
by implication, that devices to read the media will still be available
when access is necessary. Consider the several media formats that are
already effectively "dead"...
...stevenc
http://users.interlinks.net/stevenc/