Re: [ARSCLIST] Hard disk drives and DAT
Hi Richard:
Can you explain this:
"One neat thing I just negotiated is that I will make MD5 hashes of my
files and then when they get in T-Space, I will get the MD-5 hashes
generated automatically within T-Space back and I'll run a comparison
between the T-Space MD-5 hashes and my originals. That way, I can be
sure that (a) all files got onto T-Space and (b) There is no
corruption of any files."
I'm not as informed on this topic as some of you other guys and ladies, which is why I suggested
others pipe up.
What's an "MD5 hash"? Also, what do you mean by "community"?
-- Tom Fine
----- Original Message -----
From: "Richard L. Hess" <arclists@xxxxxxxxxxxxxxx>
To: <ARSCLIST@xxxxxxxxxxxxxxxx>
Sent: Monday, March 26, 2007 10:43 PM
Subject: Re: [ARSCLIST] Hard disk drives and DAT
At 05:48 PM 2007-03-26, Tom Fine wrote:
I hope Richard and/or Parker and/or Spec Bros. jump in here. The ONLY answer is managed and
constantly migrated storage. You simply cannot live by the old "put it on a shelf in a clean, cool
room" idea anymore. Digital storage must be in constant motion --
literally since hard drives have been known to fail or never start up again if left idle on a
shelf (ask around Hollywood, everyone has a horror story or two). You have to plan to have a
"living" hard drive array that is redundant, preferably with a constantly mirrored clone at a
different location, and plan on swapping out drives every XX hours of use or at worst when they
inevitably fail. There are firms that do this on an out-source basis, I think. I believe the 90's
dot-bomb term was "storage farms." Some of them are actually located in old bomb shelters and
missile bunkers.
Tom, I don't know why you'd want Parker, Peter, or me to jump in here. You stated it excellently
yourself.
While I don't think we need a mass storage system for someone's wedding tape -- that will work
nicely with several gold CD copies (LOCKSS - LotsOfCopiesKeepStuffSafe), much above that you
really do need managed storage.
The good news is that many universities and other organizations are implementing such storage
systems and, especially if you wish to make your material publicly available, you can find sites
willing to host your material in perpetuity for a relatively small fee.
The metadata and search capabilities of some of these systems are excellent, as is the ability of
an organized repository to be included in federated searches. There are fewer options if
you want to squirrel your data away someplace and keep it hidden. But that is another discussion.
There has been a rather heated discussion over on AMIA-L about the shelf-storage model of HDDs
which Jim Wheeler is promoting. I think it has flaws -- especially when you consider a movie is in
the neighbourhood of 5TB or more.
LTO is becoming one of the few options with a future. I think one of the other tape formats is at
end of life now. I'm not sure which one. S-AIT has not been getting much traction. LTO
(LinearTapeOpen) is gathering supporters. The neat thing is that it is OPEN, as in NOT
proprietary.
As all manner of data multiplies and remultiplies, we will continue to see more attractive storage
options, but bringing the data into a digital repository is a good method for the future.
Here is a community that I have nothing to do with at Univ of Toronto's T-Space, but I've been
studying this as a model for my project that will hopefully drop into T-Space in a year or so.
https://tspace.library.utoronto.ca/handle/1807/3004
The "Community" that I am working on will be at this level in the system. Note that there are
persistent handles for all the items. Each item can contain multiple files. That's the D-Space
model.
T-Space (U of T's implementation of the D-Space system from MIT and HP) includes checksums and the
hardware is IBM with the Tivoli Storage Manager and LTO tape underneath.
One neat thing I just negotiated is that I will make MD5 hashes of my files and then when they get
in T-Space, I will get the MD-5 hashes generated automatically within T-Space back and I'll run a
comparison between the T-Space MD-5 hashes and my originals. That way, I can be sure that (a) all
files got onto T-Space and (b) There is no corruption of any files.
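The comparison described above could be sketched roughly as follows in Python. This is only an illustration under assumptions: the post does not say how T-Space reports its hashes, so the sketch assumes they arrive as a simple mapping of file name to MD5 hex digest. The function names (md5_of_file, build_manifest, compare_manifests) are hypothetical and are not part of T-Space or D-Space.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the MD5 hex digest of one file, reading it in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its MD5 hex digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): md5_of_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_manifests(local, remote):
    """Return (missing, mismatched) for a local and a repository manifest.

    missing:    names in the local manifest that the repository did not report
    mismatched: names present in both whose digests differ (case-insensitive)
    Both manifests are assumed to be dicts of name -> hex digest.
    """
    missing = sorted(name for name in local if name not in remote)
    mismatched = sorted(
        name
        for name in local
        if name in remote and local[name].lower() != remote[name].lower()
    )
    return missing, mismatched
```

An empty "missing" list corresponds to check (a), all files arrived, and an empty "mismatched" list corresponds to check (b), no file was corrupted in transit. MD5 is adequate for catching accidental corruption like this; it is not a defence against deliberate tampering.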
While D-Space is one model, ContentDM is another, about which I don't know anything substantial.
There are others. The nice thing about D-Space is that there is a link between my client and the
Univ of Toronto, so they have access to place their oral history archive in something that is part
of the U of T library.
Cheers,
Richard
Richard L. Hess email: richard@xxxxxxxxxxxxxxx
Aurora, Ontario, Canada (905) 713 6733 1-877-TAPE-FIX
Detailed contact information: http://www.richardhess.com/tape/contact.htm
Quality tape transfers -- even from hard-to-play tapes.