On 5/18/2016 9:45 PM, JT Whitfield wrote:
One concern I have is a situation when the same file is available via more than one web page/landing page/wrapper/access page. If you are cataloging from a page created by or for the institution or individual that created or digitized the file, you would have one set of information. If you were cataloging a page where the file was “reposted” you may not have an attribution, and the further removed you are from the “creator” the greater the chance may be that you have erroneous or incomplete information about the file. (You may even have a “spoofed” or altered file on the page, and you may not know what alterations have been made or why.) Of course, you would catalog based on the information you have “on screen.”

One example of “spoofed” files is the output of the agency “BiblioBazaar”, which takes a book that is in the public domain and has already been digitized by someone else (Google, a library), puts a new t.p. on it, and “re-publishes” it. Here is an example: “Sinks of London Laid Open” with authors “BiblioBazaar, LIGHTNING SOURCE INC” and publication information “BiblioBazaar, 2008” (https://books.google.co.uk/books?id=On5ViQkr43QC).

You can’t access the book, but we find this note: “This is a pre-1923 historical reproduction that was curated for quality. Quality assurance was conducted on each of these books in an attempt to remove books with imperfections introduced by the digitization process. Though we have made best efforts – the books may have occasional errors that do not impede the reading experience. We believe this work is culturally important and have elected to bring the book back into print as part of our continuing commitment to the preservation of printed works worldwide.”

This noble-sounding statement is all hogwash. It turns out that this company has done nothing of the sort. All they have done is take this book, already digitized by Google (https://books.google.co.uk/books?id=lPxnKPkEiIUC), and put a new t.p. on it, along with new publication information. (You can see some of their books too, e.g. https://books.google.dk/books?id=yFkCAAAAQAAJ, where all they did was add a new t.p. and some metadata.) This company does this because they hope that someone will ask for a print-on-demand version from them.

An interesting discussion of this phenomenon is at http://sappingattention.blogspot.co.uk/2014/04/biblio-bizarre-who-publishes-in-google.html. This article discusses the problems of spamming the Google n-gram service, but there is a more general problem of spamming the entire bibliographical apparatus. There are several “publishers” who are doing this.

There is nothing new about this, and it should be expected during transitional times like these. I remember reading a story (but don’t remember where) about a group of booksellers, printers, and librarians in early 1700s London who had all been complaining about a certain printer who kept reprinting the same few books over and over, but under different titles and publication information. The group found the printer and forced him to sign an agreement that he wouldn’t print anything for 10 years. I don’t know if he held to his pledge.