Copernicus, Cataloging, and the Chairs on the Titanic, Part 1 [Long Post]

Posting to NGC4LIB

Alexander Johannesen wrote:

Jim, give up *now*. There is no way you even have the slightest time to even look at a squirt of what’s available. You simply cannot go through it all; not only is it too much (the 2003 estimates are no good; “According to an IBM study, by 2010, the amount of digital information in the world will double every 11 hours.”), but it is *ever* changing! The Internet doesn’t stand still. Even Wikipedia pages change, not only in content, but in links and in meaning (as content changes). Resources and content die and get born in a continuous line that will never end. There’s no way for you to go through it all, no way to monitor it, no way to catalog it … you cannot put it on a shelf. Of course, you can make a copy and catalog the copy, and as such make it obsolete like old books, that’s fine, I’m sure you can do that for a selection of sorts. But the sheer amount of stuff you have to wade through to even make that selection is simply unsurmountable.

It would be the easiest thing in the world to give up, but one thing I have seen that all of my users want–from students to researchers–much more than our cataloging, which they find weird, is selection. (That’s why I started the thread.) What library selection means in the popular mind is quite different from what it really is. The public has always believed that libraries selected materials because they were the “best” and the most “correct”, but that is untrue. Library selectors are only human, and the role of the selector is absolutely *not* to mold the library’s collection into a mirror of the selector’s own opinions and tastes but instead to help show people, as much as possible, the range of information that is available.

As an example, I heard a story, perhaps apocryphal but nevertheless enlightening, about a great (unnamed) library collection of Russian literature at a great (unnamed) university offering a doctorate in Russian literature. The person who did the selecting for the library in Russian literature was not a librarian, but actually a great (unnamed) Russian writer and faculty member. It turned out that this great Russian writer hated Dostoyevsky with every drop of (his or her) being and therefore not only refused to mention Dostoyevsky in any literature classes, but also refused to purchase anything by or about Dostoyevsky for the library. When this great Russian writer passed away, a librarian was put in place who had to begin building the collection on Dostoyevsky because, after all, how could an important collection of Russian literature have almost nothing on Dostoyevsky? I think this shows rather clearly the difference between the attitudes of a librarian and those of a faculty member. Faculty members can both have and teach personal opinions about anything they want–after all, that is an important part of their jobs and is vital for academic freedom. But a librarian definitely has other goals. Both are needed, but they are quite different.

Therefore, when I am selecting, I must add materials to the collection that I do not agree with, even adding opinions that I violently oppose. My opinions should not get in the way of other people before they form their own opinions. (I wrote a page in my library’s information wiki about this.)

I agree that selection as it has traditionally been done must change–somehow–for materials on the web. And we *must* face it: selection is happening right now, only it is done automatically through Google’s spiders (which do not get everything) and their page ranking algorithm, the details of both of which are quite secret. After all, if something is #500 in a search result, it may as well not exist. Businesses and other organizations understand this very well now and realize that their goods & services must rise to the top, otherwise they die. Therefore we have a strange situation: selection is being done using automated means by a very secretive company (Google), and their “selection policies” (here I am thinking of page rank) can be, and are being, manipulated to serve the private agendas of all kinds of other individuals and organizations. For a quick overview, see … These are merely statements of fact.

Again, is the job too big, as you suggest? So long as librarians remain mired in 19th and early 20th century thinking and processes, it definitely is. I think selection must change from the traditional methods I mentioned above (i.e. a sense of disinterestedness), not necessarily toward deciding “what is best” but toward providing alternative opinions on a topic: pro/con, left/right, fascist/anarchist, technical/humanistic, or whatever. We could leave the task of deciding “what is best” to various types of crowdsourcing. Such a tool, no matter what it is, and perhaps only an add-on to Google, would be the “catalog”. But no matter what the catalog would be, it means little without the concept of “selection”.

In short, selection of materials on the web *is* being done today. I am asking if these rather bizarre, secretive methods of selection are best for us. If we do not deal with the problem of selection, we leave it to some of the most unscrupulous characters in the world, who understand its power very well and will do anything–I mean *anything*–to continue manipulating the page ranking algorithms. Going through my library catalog’s log files, I have discovered some attempts at spamming that I consider nothing less than works of genius. Too bad these clever minds get dragged in such directions. I don’t think it is best that these are the people selecting information for the citizenry, who believe they are looking at the most “relevant” results, but if we leave it all up to Google et al., that is exactly what everybody gets.

I find this potentially extremely dangerous, which is why I believe there must be something better, although it may not be perfect. Libraries have their codes of ethics, which should make them important and vital players in this world.

That is, so long as librarians are willing to change in fundamental ways. I don’t know if they can do it though.