Re: How Google makes improvements to its search algorithm

Posting to NGC4LIB 

On 29/08/2011 14:54, Eric Lease Morgan wrote:

The comment above is very interesting because it begs the question, “To what degree are search results against a database or index expected to be objective?” Our profession has taught us there are right ways and wrong ways to do searching. As a corollary, there must be correct search results and incorrect search results. “If you search the database in the right way, then you will get the correct results, the most accurate (precision) and complete (recall).”

Yet our profession does not emphasize the inherent characteristics of the reader. (Increasingly I don’t like writing the word “user”.) If I put the word “pizza” into Google, I get pizza things close to my geographic location. Our profession does not make such assumptions, and we expect the searcher to qualify the query with a location.

This is why Google is easier to use, and why different search results will be returned for different searchers. Google sort of knows about you. Ironically, a good librarian will also know about their patrons, and they will create search results tailored for the individual. This is what reference librarianship is all about. Unfortunately we have yet to migrate this expertise into a computerized environment. “That is artificial intelligence and it can’t be done. That threatens me; I will lose my job if that comes to fruition. In order to provide that sort of functionality we will have to record characteristics of readers, and that violates privacy.” In short, our own professional ethics have limited us, and others, who don’t have these beliefs, have literally profited and grown without them.

These are some of the reasons why I think we need to look at the controls in the library catalog as aimed more at librarians (read “information experts”) than at members of the general public. I also don’t think we should label the results of a catalog as more or less “correct” or “better” than a result from a full-text search engine; that is a treacherous path since anybody can justifiably take issue with it.

What is “correct” and/or “better” is always subjective.

Results from library catalogs should be framed more in terms of “standardized” or perhaps “guaranteed”. *If* a database has personal name authority control, this means that within certain parameters (i.e. currently, rule of three, and perhaps practices of analysis) it is possible to guarantee that all items authored by certain people can be retrieved by the catalog. This is achieved through standardization. The library catalog allows standardized methods of finding resources by other concepts, too: corporate names, various kinds of titles, subjects, and so on. Full-text retrieval tools do not allow this.
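
To make the idea of a “guaranteed” result a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the tiny authority file, the three records, and the function names search_by_authority and search_full_text are hypothetical and do not describe how any real catalog or search engine is implemented. It only shows why a controlled heading can promise all items by a person, while matching on the words in the record cannot.

```python
# Hypothetical authority file: variant name forms -> one authorized heading.
AUTHORITY = {
    "mark twain": "Twain, Mark, 1835-1910",
    "samuel clemens": "Twain, Mark, 1835-1910",
    "samuel langhorne clemens": "Twain, Mark, 1835-1910",
}

# Hypothetical records. The statement of responsibility is transcribed as it
# appears on the item; the heading is assigned by the cataloger.
RECORDS = [
    {"title": "Adventures of Huckleberry Finn",
     "statement": "by Mark Twain",
     "heading": "Twain, Mark, 1835-1910"},
    {"title": "The Innocents Abroad",
     "statement": "by Samuel L. Clemens",
     "heading": "Twain, Mark, 1835-1910"},
    {"title": "Life on the Mississippi",
     "statement": "by S. L. C.",
     "heading": "Twain, Mark, 1835-1910"},
]


def search_by_authority(name):
    """Resolve any known variant to its heading, then match on the heading."""
    heading = AUTHORITY.get(name.strip().lower())
    if heading is None:
        return []  # name is not in the authority file
    return [r["title"] for r in RECORDS if r["heading"] == heading]


def search_full_text(query):
    """Match only on the words that happen to appear in the record text."""
    q = query.strip().lower()
    return [r["title"] for r in RECORDS
            if q in (r["title"] + " " + r["statement"]).lower()]


if __name__ == "__main__":
    print(search_by_authority("Samuel Clemens"))  # all three titles
    print(search_full_text("Mark Twain"))         # only the first title
```

The authority search returns every record under the heading no matter which form of the name the searcher types, but only for names the authority file knows about. The full-text search returns whatever matches the typed string, which may be more or less than the searcher wanted.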

The existence of such a tool does not mean that an untrained person can retrieve those records successfully, just as an untrained person cannot necessarily use a band saw very effectively. They very well may need

Google does not allow any kind of “guaranteed” or “standardized” access; just the opposite. If the results vary for you and me, and even vary for ourselves depending on where we are searching from, and if the algorithm itself is tweaked almost twice a day, I think the public could possibly understand the argument for a more standardized means of access.

But I wouldn’t call our results better.