Google Book Search

From Wikipedia, the free encyclopedia

Developed by: Google
Operating system: Any (web-based application)
Type: Online Library Book Search
Website: http://books.google.com/

Google Book Search is a tool from Google that searches the full text of books that Google scans, converts to text using optical character recognition, and stores in its digital database. The service was known as Google Print when it was introduced at the Frankfurt Book Fair in October 2004. When relevant to a user's keyword search, up to three results from the Google Book Search index are displayed above search results in the Google Web Search service (google.com). A user may also search just for books at the dedicated Google Book Search service. Clicking a result from Google Book Search opens an interface in which the user may view pages from the book as well as content-related advertisements and links to the publisher's website and booksellers. Through a variety of access limitations and security measures, some based on user tracking, Google limits the number of viewable pages and attempts to prevent page printing and text copying of material under copyright.[1]

The Google Book Search service remains in a beta stage but the underlying database continues to grow. Google Book Search allows public-domain works and other out-of-copyright material to be downloaded in PDF format. For users outside the United States, though, Google must be sure that the work in question is indeed out of copyright under local laws. According to a member of the Google Book Search Support Team, "Since whether a book is in the public domain can often be a tricky legal question, we err on the side of caution and display at most a few snippets until we have determined that the book has entered the public domain."[2]

Many of the books are scanned using the Elphel 323 camera[3][4] at a rate of 1,000 pages per hour.[5]

The initiative has been hailed for its potential to offer unprecedented access to what may become the largest online corpus of human knowledge,[6][7] as well as criticized for potential copyright violations.[8]

Number scanned

By March 2007, Google had digitized one million books, according to The New York Times, at a cost that outside experts estimated to be at least US$5 million.[9] On October 28, 2008, Google stated that it had 7 million books searchable through Google Book Search, including those scanned by its 20,000 publisher partners.[10] Of the 7 million books, 1 million are "full preview" based on agreements with publishers, and another 1 million are in the public domain. Most scanned works are no longer in print or commercially available.[11]

Competition

  • Microsoft started a similar project called Live Search Books in late 2006. It ran until May 2008, when the project was abandoned.[12] All of the Live Search Books scans are now available on the Internet Archive. The Internet Archive is a non-profit and the second-largest book scanning project after Google. As of November 2008 it had over 1 million full-text public domain scanned works online.
  • Europeana is to host about 3 million digital objects including video, photos, paintings, audio, maps, manuscripts, printed books, and newspapers from the past 2,000 years of European history from over 1,000 archives in the European Union.[13]

Timeline

2004

  • December 2004: Google signaled an extension to its Google Print initiative known as the Google Print Library Project.[14] Google announced partnerships with several high-profile university and public libraries, including the University of Michigan, Harvard (Harvard University Library), Stanford (Green Library), Oxford (Bodleian Library), and the New York Public Library. According to press releases and university librarians, Google plans to digitize and make available through its Google Book Search service approximately 15 million volumes within a decade. The announcement soon triggered controversy, as publisher and author associations challenged Google's plans to digitize not just books in the public domain but also titles still under copyright.

2005

  • September - October 2005: Two lawsuits against Google charge that the company has not respected copyrights and has failed to properly compensate authors and publishers. One is a class action suit on behalf of authors (Authors Guild v. Google, Sept. 20, 2005) and the other is a civil lawsuit brought by five large publishers and the Association of American Publishers (McGraw Hill v. Google, Oct. 19, 2005).[8]
  • November 2005: Google changed the name of this service from Google Print to Google Book Search.[15] Its program enabling publishers and authors to include their books in the service was renamed "Google Books Partner Program" (see Google Library Partners) and the partnership with libraries became Google Books Library Project.

2006

2007

  • January 2007: The University of Texas at Austin announced that it would join the Book Search digitization project. At least one million volumes will be digitized from the University's 13 library locations. (As of late 2008, the University of Texas had withdrawn from the digitization project.)
  • March 2007: The Bavarian State Library announced a partnership with Google to scan more than a million public domain and out-of-print works in German as well as English, French, Italian, Latin, and Spanish.[20]
  • May 2007: A book digitizing project partnership was announced jointly by Google and the Cantonal and University Library of Lausanne.[21]
  • May 2007: The Boekentoren Library of Ghent University will work with Google to digitize 19th-century books in French and Dutch and make the digitized versions available online.[22]
  • June 2007: The Committee on Institutional Cooperation (CIC) announced that its twelve member libraries would participate in scanning 10 million books over the course of the next six years.[23]
  • July 2007: Keio University became Google's first library partner in Japan with the announcement that they would digitize at least 120,000 public domain books.[24]
  • August 2007: Google announced that it would digitize up to 500,000 items, both copyrighted and public domain, from Cornell University Library. Google will also provide a digital copy of all works scanned to be incorporated into the university’s own library system.[25]
  • September 2007: Google added a feature that allows users to share snippets of books that are in the public domain. The snippets may appear exactly as they do in the scan of the book or as plain text.[26]
  • September 2007: Google debuted a new feature called "My Library", which allows users to create personal customized libraries: selections of books that they can label, review, rate, or full-text search.[27]
  • December 2007: Columbia University was added as a partner in digitizing public domain works.[28]

2008

  • May 2008: Microsoft tapers off and plans to end its scanning project, which had reached 750,000 books and 80 million journal articles.[29]
  • October 2008: A settlement is reached between the publishing industry and Google after two years of negotiation. Google agrees to compensate authors and publishers in exchange for the right to make millions of books available to the public.[30][8]
  • November 2008: Google reaches the 7 million book mark for items scanned by Google and by their publishing partners. 1 million are in full preview mode and 1 million are fully viewable and downloadable public domain works. About five million are currently out of print.[31][11][32]
  • December 2008: Google announces the inclusion of magazines in Google Book Search. Titles include New York Magazine, Ebony, and Popular Mechanics, among others.[33]

Google Books Library Project participants

The number of participating institutions has grown since the inception of the Google Books Library Project.[14] The University of Mysore has been mentioned in many media reports as being a library partner.[34][35] It is not, however, listed as a partner by Google.[36]

Initial partners

Additional partners

Other institutional partners have joined the Project since the partnership was first announced.

Copyright infringement, fair use and related issues

The publishing industry and writers' groups have criticized the project's inclusion of snippets of copyrighted works as infringement. In the fall of 2005, the Authors Guild of America and the Association of American Publishers separately sued Google, citing "massive copyright infringement." Google countered that its project represents a fair use and is the digital-age equivalent of a card catalog with every word in the publication indexed.[8] Although Google provides the full text only of works in the public domain and only a searchable summary for books still under copyright protection, publishers maintain that Google has no right to copy the full text of copyrighted books and store them in large numbers in its own database.[37]

Other lawsuits followed. In June 2006, a French publisher announced its intention to sue Google France.[38] Also in 2006, a previously filed German lawsuit was withdrawn.[39]

In March 2007, Thomas Rubin, associate general counsel for copyright, trademark, and trade secrets at Microsoft, accused Google of violating copyright law with its book search service. Rubin specifically criticized Google's policy of freely copying any work until notified by the copyright holder to stop.[40]

The Authors Guild, the publishing industry and Google entered into a settlement agreement on October 28, 2008, with Google agreeing to pay a total of $125 million to rightsholders of books it had scanned, to cover the plaintiffs' court costs, and to create a Book Rights Registry. The settlement must still be approved by the court, a decision that will come some time after May 2009.[8] Reaction to the settlement has been mixed, with Harvard Library, one of the original contributing libraries to Google Library, deciding to withdraw from its partnership with Google unless "more reasonable terms" can be found.[41]

As part of the $125 million settlement signed in October 2008, Google created a Google Book Settlement web site that went active on February 11, 2009. This site allows authors and other rights holders of out-of-print (but still copyrighted) books to submit a claim by January 5, 2010.[42] Claimants will receive $60 per full book, or $5 to $15 for partial works.[42] In return, Google will be able to index the books and display snippets in search results, as well as up to 20% of each book in preview mode.[42] Google will also be able to show ads on these pages and make available for sale digital versions of each book.[42] Authors and copyright holders will receive 63 percent of all advertising and e-commerce revenues associated with their works.[42]
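
The payment terms above reduce to simple arithmetic. The following Python sketch is purely illustrative: the function and constant names are hypothetical, and it assumes that the per-work payments and the 63 percent revenue share apply uniformly to every claimed work, which the reported terms do not spell out in detail.

```python
from decimal import Decimal

# Figures as reported for the settlement; all names here are hypothetical.
FULL_BOOK_PAYMENT = Decimal("60")        # US$ per full book
PARTIAL_WORK_MIN = Decimal("5")          # US$ lower bound per partial work
PARTIAL_WORK_MAX = Decimal("15")         # US$ upper bound per partial work
RIGHTSHOLDER_SHARE = Decimal("0.63")     # share of ad and e-commerce revenue


def one_time_payment_range(full_books, partial_works):
    """Return the (lowest, highest) one-time payment for a claim."""
    base = full_books * FULL_BOOK_PAYMENT
    low = base + partial_works * PARTIAL_WORK_MIN
    high = base + partial_works * PARTIAL_WORK_MAX
    return low, high


def rightsholder_revenue(total_revenue):
    """Return the rightsholder's 63 percent share, rounded to cents."""
    return (total_revenue * RIGHTSHOLDER_SHARE).quantize(Decimal("0.01"))


if __name__ == "__main__":
    low, high = one_time_payment_range(full_books=3, partial_works=2)
    print(f"One-time payment: ${low} to ${high}")
    print(f"Share of $1,000 in revenue: ${rightsholder_revenue(Decimal('1000'))}")
```

Under these assumptions, a rights holder claiming three full books and two partial works would receive between $190 and $210 up front, plus $630 of every $1,000 in advertising and e-commerce revenue associated with those works.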

Siva Vaidhyanathan, associate professor of Media Studies and Law at the University of Virginia, has argued[43] that the project poses a danger for the doctrine of fair use, because the fair use claims are arguably so excessive that they may lead courts to limit that right.[44] Because Authors Guild v. Google did not go to trial, the fair use dispute is left unresolved.

Google's licensing of public domain works is also an area of concern.[45] Google apparently claims a restrictive "no commercial use" term for the PDF electronic versions it provides, and applies digital watermarking techniques to them. Some published works that are in the public domain, such as all works created by the U.S. federal government, are still treated like other works under copyright, and those published after 1922 are therefore locked.[46]

Language issues

Some European politicians and intellectuals have criticized Google's effort on "language-imperialism" grounds, arguing that because the vast majority of books proposed to be scanned are in English, the project will result in disproportionate representation of natural languages in the digital world. German, Russian, and French, for instance, are popular languages in scholarship; the disproportionate online emphasis on English could shape access to historical scholarship, and, ultimately, the growth and direction of future scholarship. Among these critics is Jean-Noël Jeanneney, the former president of the Bibliothèque nationale de France.[47]

Google Books vs. Google Scholar

While Google Book Search has digitized large numbers of journal back issues, its scans do not include the metadata required for identifying specific articles in specific issues. This has led the makers of Google Scholar to start their own program to digitize and host older journal articles (in agreement with their publishers).[48]

References

  1. Greg Duffy (March 2005). "Google's Cookie and Hacking Google Print". Kuro5hin. http://www.kuro5hin.org/story/2005/3/7/95844/59875.
  2. Ryan Sands (November 9, 2006). "From the mail bag: Public domain books and downloads" (blog). Inside Google Book Search. http://booksearch.blogspot.com/2006/11/from-mail-bag-public-domain-books-and.html.
  3. Google currently uses Elphel cameras for book scanning and for capturing street imagery in Google Maps
  4. "Adapted firmware of Elphel 323 camera to meet needs of Google Book Search"
  5. Kelly, Kevin (May 14, 2006). "Scan This Book!". New York Times Magazine. http://www.nytimes.com/2006/05/14/magazine/14publishing.html?_r=1&oref=slogin&pagewanted=all. Retrieved on 2008-03-07. "When Google announced in December 2004 that it would digitally scan the books of five major research libraries to make their contents searchable, the promise of a universal library was resurrected. ... From the days of Sumerian clay tablets till now, humans have "published" at least 32 million books, 750 million articles and essays, 25 million songs, 500 million images, 500,000 movies, 3 million videos, TV shows and short films and 100 billion public Web pages."
  6. Bergquist, Kevin (2006-02-13). "Google project promotes public good". The University Record (University of Michigan). http://www.umich.edu/~urecord/0506/Feb13_06/02.shtml. Retrieved on 2007-04-11.
  7. Pace, Andrew K. (January 2006). "Is This the Renaissance or the Dark Ages?". American Libraries. American Library Association. http://www.ala.org/ala/alonline/techspeaking/2006columnsa/techJan2006.cfm. Retrieved on 2007-04-11. "Google made instant e-book believers out of skeptics even though 10 years of e-book evangelism among librarians had barely made progress."
  8. Copyright infringement suits against Google and their settlement: The original lawsuits in 2005:
  9. Hafner, Katie (March 11, 2007). "History, Digitized (and Abridged)". New York Times. http://www.nytimes.com/2007/03/10/business/yourmoney/11archive.html. Retrieved on 2008-04-10. "Google, on its own, is digitizing books at the Library of Congress, which has its hands full with other items. ... In its quest to scan every one of the tens of millions of books ever published, Google has already digitized one million volumes. Google refuses to say how much it has spent on the venture so far, but outside experts estimate the figure at at least US$5 million. The company has also been scanning and indexing academic journals to make them searchable, and is working with the Patent Office to digitize thousands of patents dating back to 1790."
  10. "New Chapter". Google. http://googleblog.blogspot.com/2008/10/new-chapter-for-google-book-search.html. Retrieved on 2008-10-29.
  11. "In Google Book Settlement, Business Trumps Ideals". PC World. October 28, 2008. http://www.pcworld.com/businesscenter/article/153085/in_google_book_settlement_business_trumps_ideals.html. Retrieved on 2008-10-31. "Of the 7 million books Google has scanned, 1 million are in full preview mode as part of formal publisher agreements. Another 1 million are public domain works."
  12. "Microsoft starts online library in challenge to Google Books". AFP. http://www.theage.com.au/news/biztech/microsoft-starts-online-library-in-challenge-to-google-books/2006/12/07/1165081127665.html. Retrieved on 2008-11-24. "Microsoft launched an online library in a move that pits the world's biggest software company against Google's controversial project to digitize the world's books."
  13. "Europe's Answer to Google Book Search Crashes on Day 1". Wired. 2008. http://blog.wired.com/business/2008/11/eu-launches-mas.html. Retrieved on 2008-11-24.
  14. O'Sullivan, Joseph and Adam Smith. "All booked up," Googleblog. December 14, 2004.
  15. Jen Grant (November 17, 2005). "Judging Book Search by its cover" (blog). Googleblog. http://googleblog.blogspot.com/2005/11/judging-book-search-by-its-cover.html.
  16. UC libraries partner with Google to digitize books
  17. University Complutense of Madrid and Google to Make Hundreds of Thousands of Books Available Online
  18. UW-Madison + WHS + Google digitization project partnership announced
  19. The University of Virginia Library Joins the Google Books Library Project
  20. Bavarian State Library + Google digitizing project partnership announced
  21. Reed, Brock. "La Bibliothèque, C'est Google" (Wired Campus Newsletter), Chronicle of Higher Education. May 17, 2007.
  22. Ghent/Gent + Google digitizing project partnership announced
  23. CIC + Google digitizing project partnership announced
  24. Keio + Google digitizing project partnership announced
  25. Cornell + Google digitizing project partnership announced
  26. Google's digitized "snippets" feature announced
  27. Google's "personal library" feature announced
  28. Columbia + Google digitizing project partnership announced
  29. "Microsoft Will Shut Down Book Search Program". New York Times. May 24, 2008. http://www.nytimes.com/2008/05/24/technology/24soft.html?_r=1&ref=technology&oref=slogin. Retrieved on 2008-05-24. "Microsoft said it had digitized 750,000 books and indexed 80 million journal articles."
  30. "Some Fear Google’s Power in Digital Books". New York Times. February 1, 2009. http://www.nytimes.com/2009/02/02/technology/internet/02link.html?em. Retrieved on 2009-02-02. "Today, that project is known as Google Book Search and, aided by a recent class-action settlement, it promises to transform the way information is collected: who controls the most books; who gets access to those books; how access will be sold and attained."
  31. "Massive EU online library looks to compete with Google". Agence France-Presse. November 2008. http://www.google.com/hostednews/afp/article/ALeqM5gQBJ3FLg32GX_cAVFLQo1feO6Ckg. Retrieved on 2008-11-24. "Google, one of the pioneers in this domain on the other hand, claims to have seven million books available for its "Google Book Search" project, which saw the light of day at the end of 2004."
  32. "Google Hopes to Open a Trove of Little-Seen Books". New York Times. January 4, 2009. http://www.nytimes.com/2009/01/05/technology/internet/05google.html?partner=rss&emc=rss&pagewanted=all. Retrieved on 2009-01-05. "The settlement may give new life to copyrighted out-of-print books in a digital form and allow writers to make money from titles that had been out of commercial circulation for years. Of the seven million books Google has scanned so far, about five million are in this category."
  33. "Google updates search index with old magazines". Associated Press. 2008. http://www.businessweek.com/ap/financialnews/D94VIH600.htm. Retrieved on 2008-12-10. "As part of its quest to corral more content published on paper, Google Inc. has made digital copies of more than 1 million articles from magazines that hit the newsstands decades ago."
  34. Ars Technica
  35. Hindustan Times. "Google to digitise 800,000 books at Mysore varsity"
  36. Google Library Partners
  37. People's Daily Online (August 15, 2005). "Google's digital library suspended". http://english.peopledaily.com.cn/200508/15/eng20050815_202595.html.
  38. John Oates (June 7, 2006). "French publisher sues Google". The Register. http://www.theregister.co.uk/2006/06/07/france_sues_google/.
  39. Danny Sullivan (2006-06-28). "Google Book Search Wins Victory In German Challenge" (blog). Search Engine Watch. http://blog.searchenginewatch.com/blog/060628-152950. Retrieved on 2006-11-11.
  40. Thomas Claburn (March 6, 2007). "Microsoft Attorney Accuses Google Of Copyright Violations". InformationWeek. http://www.informationweek.com/internet/showArticle.jhtml?articleID=197800578.
  41. "Google Online Book Deal at Risk". http://www.thecrimson.com/article.aspx?ref=524989.
  42. Erick Schonfeld (February 11, 2009). "Google Book Settlement Site Is Up; Paying Authors $60 Per Scanned Book". TechCrunch.
  43. Siva Vaidhyanathan. “The Googlization of Everything and the Future of Copyright,” University of California Davis Law Review volume 40 (March 2007), pp. 1207–1231.
  44. First Monday Transcript September 2007.
  45. Michael Liedtke (May 24, 2005). "Publishers Protest Google's Online Library Project". Associated Press. http://www.livescience.com/technology/ap_050524_google_scan.html.
  46. Robert B. Townsend, Google Books: Is It Good for History?, Perspectives (September 2007).
  47. Jean-Noël Jeanneney (2006-10-23) (book abstract; Foreword by Ian Wilson). Google and the Myth of Universal Knowledge: A View from Europe. ISBN 0-226-39577-4.
  48. Barbara Quint, "Changes at Google Scholar: A Conversation With Anurag Acharya", Information Today, August 27, 2007.
