Google To Scan Library Books

Books from University collections will be searchable online in new initiative

Google is partnering with Harvard to make 40,000 of the University’s library books searchable online in a pilot project that could lead to the digitization of all 15 million books in Harvard’s collections.

The initiative, set to be announced today, will allow anyone on the internet to browse through uploaded works as part of the Google Print project, an arm of Google. The University of Michigan, Oxford University, Stanford University and the New York Public Library are also participating in similar programs to be announced today.

Harvard will make the books available and Google will scan the books and bear all costs, according to Pforzheimer University Professor Sid Verba ’53, who is director of the Harvard University Library (HUL). Verba said that making Harvard’s books available digitally has long been a priority but until now has been infeasible.

“The collaboration with Google allows us to do something that we could not have possibly afforded on our own,” Verba said. Google Product Manager Adam Smith declined to comment on the cost of the project, citing company practice of not disclosing the terms of business deals.

The 40,000 books will be selected mostly at random from the 5 million books at the Harvard Depository, with the most delicate works exempted from the process, according to Peter Kosewski, director of publications and communications for HUL. He said Google will keep the books at the depository and digitize them on site to minimize disruption to researchers.

The project—which is estimated to take six months—has been in the planning stages for over a year and received the approval of the University’s top governing body, the Harvard Corporation, Verba said.

Verba said the new partnership with Google marks an unprecedented step in making Harvard’s books accessible to the public. Harvard’s libraries have made a smaller number of books available digitally through their Open Collections project.

Verba explained that researchers will not “really get the book out of their computer.”

“What they’re going to get is information about what’s in books and information on where you can find the books,” Verba said.

Books in the public domain will be fully available online so researchers can use Google Print in place of a trip to the library, according to Smith. Much smaller excerpts of copyrighted works will be displayed, he said.

While Google may upload some of Harvard’s copyrighted works, they will not be displayed for now, Kosewski said.

Verba said the collaboration was sparked when Google approached Harvard a few years ago. He originally expressed concerns about keeping the collections available to researchers and avoiding damage to the books. But he said that when representatives from Google came back to talk a year later, they had addressed these concerns with an effective scanning technology.

“It is very specifically designed to be non-destructive,” Smith said. “In working with these libraries and their collections, we need to be extremely careful.”

“One of the reasons we are doing this pilot study with 40,000 books is to see if that’s true,” Verba said. “We really don’t want to plunge into a really mega project without that assurance.”

Kosewski said the project would be a “learning experience.”