Cornell University and the Xerox Corporation, with the support of the Commission on Preservation and Access, have collaborated for the past two years in a Joint Study to investigate the use of digital technology to preserve library materials. The primary emphasis of this study has been on the capture of brittle books as digital images and the production of printed paper facsimiles.[1] Of equal interest, however, has been the role of digital technology in providing networked access to library resources, and preliminary work in this area has also been accomplished.
The Joint Study has led to a number of conclusions regarding preservation, access, electronic technology, and the role of the library. In particular, participation in this study has convinced Cornell of the value of digital technology for preserving research library materials and making them available. Such digital preservation presents a cost-effective alternative to photocopying, and--subject to the resolution of certain remaining problems--a potential adjunct or alternative to microfilm preservation. The greatest promise of digital technology as a preservation option is improved access to materials. Cornell expects to work with others to find ways to resolve the remaining issues surrounding the use of digital technology.
The primary preservation benefits of the current state of this technology are image quality, duplication capabilities, paper output, and cost-effectiveness. Provided the original scanning is performed at sufficiently high resolution, the intellectual content of a brittle book may be captured in a highly acceptable manner, and a hardcopy version produced on permanent, durable paper that replicates the presentation and format of the original. The Joint Study indicates that the quality of scanning is competitive with photocopying and the costs are projected to be about 20 percent lower in a production environment (see Section V--Cost Study).[2]
The high-quality paper facsimiles produced to replace the deteriorating originals have proven attractive to members of the Cornell research community and increased their support for the program. For the same reasons, this approach could generate support from the research community at large. At a fraction of the original scanning cost, additional paper copies can be "printed on demand" from the digitally stored images at any time in the future.[3] The Joint Study suggests that digital images can also be duplicated without loss of fidelity and distributed widely across the nation's networks, providing remote access to other institutions either by local printing of facsimiles or by viewing at desktop workstations. There may also be opportunities to underwrite some of the costs of preservation through the sale of facsimile editions.
Digital images may be transmitted over communications networks for use at remote locations. The Joint Study demonstrated the feasibility of such remote access by delivering digital images over the Cornell network for printing and for viewing on a prototype workstation. With network access, researchers will ultimately be able to use the resources of a library at any time from across the campus or across the country. As many observers have noted, access will no longer be defined geographically or temporally. Digital technology could also spur the library to shift further along the continuum from physical ownership of materials to providing access to information regardless of its location. Much remains to be done, however, to define the architecture and develop the systems needed to support such remote access. This will be the focus of the next phase of Cornell's investigation in this area.
The Joint Study has recognized the need for a document control structure to facilitate navigation through the scanned digital book. Xerox has designed a flexible file format and indexing structure to facilitate direct access to the contents of individual books in the digital library. The architectural definition and early testing of this document control structure have been completed. So far, the document structure file stores only references to whole books; details about the content and structure of each book have not yet been included. Once this information is stored in the document structure file, researchers at desktop workstations will be able to move from the on-line catalog record to a description of the content and structure of the book, which will help them determine whether the book meets their needs. This, in turn, will allow direct retrieval or printing of all or parts of the book as identified in the document structure file.
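The idea of a document structure file can be illustrated with a minimal sketch. This is not the Xerox file format, which is not described here; the names `BookEntry`, `Section`, `catalog_id`, `image_count`, and `images_for` are all hypothetical, chosen only to show how an entry might begin as a reference to a whole book and later gain structural detail that allows retrieval of a part of the book.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Section:
    """Hypothetical structural unit of a book, such as a chapter."""
    title: str
    first_image: int  # number of the first page image (1-based, inclusive)
    last_image: int   # number of the last page image (inclusive)


@dataclass
class BookEntry:
    """Hypothetical entry in a document structure file (not the actual format)."""
    catalog_id: str   # assumed link back to the on-line catalog record
    image_count: int  # total number of page images scanned for the book
    # Empty while the file holds only whole-book references.
    sections: List[Section] = field(default_factory=list)

    def images_for(self, title: Optional[str] = None) -> range:
        """Return page-image numbers for one section, or for the whole book."""
        if title is None:
            return range(1, self.image_count + 1)
        for section in self.sections:
            if section.title == title:
                return range(section.first_image, section.last_image + 1)
        raise KeyError(title)


# A whole-book reference only, as in the current state described above.
book = BookEntry("example-0001", image_count=200)
whole_book = book.images_for()

# After structural detail is added, part of the book can be requested.
book.sections.append(Section("Chapter 1", 9, 40))
chapter_one = book.images_for("Chapter 1")
```

Under these assumptions, a request to retrieve or print "Chapter 1" resolves to page images 9 through 40, while a request with no section name falls back to every image in the book.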
The library application can be broadened beyond an exclusive focus on preservation for use in storing, accessing, and distributing other information. For example, the Cornell Campus Store has started to use the system to produce customized course packs and class notes, and Cornell University Press plans to test its application to a reprint series. Electronic journal production and dissemination are also being investigated. Application to other areas helps to ensure that the substantial investments required in technology development can be amortized over a variety of projects.
These general conclusions of the Joint Study derive from extensive experimentation with one digital scanning system. This report describes the findings, the assumptions underlying the study, and the products and processes adopted. While many issues surrounding preservation, access, the library's role, and electronic technology remain to be resolved, this study represents a solid beginning. It also represents the initial phase of Cornell's continuing investigation into the use of digital technology. The second phase, again funded by the Commission on Preservation and Access, establishes a testbed to further the exploration and use of digital technology to meet library preservation needs.