[padg] The "Google Five" Describe Progress, Challenges
For those who did not make it to the Google Five panel discussion at
ALA:
The "Google Five" Describe Progress, Challenges
Brittle books, quality control, and better metadata loom large for
scan plan.
http://www.libraryjournal.com/info/CA6456319.html?nid=2673#news3
Their numbers have now swelled to 25, but what's up with the five
pioneering libraries that signed on with the ever-growing Google Book
Search? At the American Library Association Annual Conference,
panelists from each library said they were pleased with the progress,
though they acknowledged continuing challenges ranging from damaged
books to search quality. Google product manager Adam Smith led off by
describing the new "About the Book" page under construction for
titles in Google Book Search, which includes key terms and phrases,
references to the book from scholarly publications or other books,
chapter titles, and a list of related books—even for books that
aren't digitized.
At four Harvard libraries, public domain works have been scanned and
links are being put in the catalog, said Harvard University Library's
Dale Flecker. "We're filtering out a lot of works that are not
physically up to being scanned," he noted, citing not just brittle
paper but problems with binding. "We also find that condition is a
filtering factor," said John Balow of New York Public Library (NYPL).
Sarah Thomas of Oxford University's Bodleian Library said that "there
are many books rejected because of fragile conditions." By contrast,
Catherine Tierney of Stanford University said that less than one
percent of books can't be sent for scanning; however, a surprising
fraction of volumes are limited because they lack bar codes. Are
damaged copies, one person asked, good enough to scan elsewhere, or
is any library ready to sacrifice a volume to be digitized? "The
things we can't send to Google, we have in the queue," Tierney said.
The accumulated texts would take 36 years, 24/7, to be digitized, she
said, suggesting that the issue would be reviewed as more scans
appear elsewhere.
Flecker praised the "About this book" feature and predicted that
"text mining" will be an important part of research. Tierney said
that seven to ten reference questions or interlibrary loan requests a
week are generated by use of Google Book Search. Dunkle added that
Michigan has received more international reference questions through
GBS. Thomas said that the scan plan has produced "much more detailed
knowledge about our collection," including the surprise that about
one percent of the Bodleian Library's books have uncut pages, meaning
they've never been opened.
Challenges remain, Smith conceded, including generating better
metadata. Dunkle said that librarians in the Committee on
Institutional Cooperation (CIC), the 12-library group that recently
signed a deal with Google, hope to find ways to search across the
books, though "I personally think Google will get there first."
Flecker said Harvard librarians also hope Google will solve some
access problems. "Right now, to be frank, I don't find the retrieval
in Book Search to be that impressive," Flecker said. "There's a long
ways to go." NYPL's Balow said that "good, old-fashioned librarian
work" will be needed to refine searches. "There's still a great deal
of room for the skills we've been working on for a long time."
As for specific drawbacks, Tierney said her library received email
complaining that scans have thumbs visible. "It's a lot of work,"
conceded Flecker. "C'mon, that's it?" asked a voice from the crowd.
"Are we going to sing 'Kumbaya'?" Dunkle called the tension over
whether the scan plan is the right thing to do "unfortunate."
Emory University's Martin Halbert, speaking from the audience,
briefly described his university's alternative plan in which
libraries retain control of the digital volumes, and can focus on
coherent subject areas. Google's Smith was magnanimous. "From
Google's perspective," he said, "we view this as complementary."
How to measure success? "We'll define success as getting as much of
our collection digitized as we can," observed Oxford's Thomas, noting
that most of the collection doesn't circulate, and that digital
access can transform scholarship. Stanford's Tierney said that she
hoped the growth of the program would help convince publishers to
make more in-copyright material "available in non-snippet view."
She said she hoped the "orphan works" issue, which leaves so much
published material in copyright limbo, is resolved. "I would not want
my physician to be using pre-'23 medical texts," she observed.
---------------------
Holly Robertson
Preservation Librarian
University of Virginia Library
Alderman Library
Preservation - Rm 113
Charlottesville, VA 22904-4105
434.924.1055
(f) 434.243.7756
AIM: h011y2121 | GoogleTalk: h011yr0b3rts0n
hollyr@xxxxxxxxxxxx
www.lib.virginia.edu/preservation