RE: Old School Search Engines

Posting to Autocat

On Wed, Dec 14, 2011 at 3:27 PM, Mitchell, Michael wrote:

James Weinheimer wrote:

For instance, the subject heading browse (alphabetical) for “chess” mashes together not only the topic with its subdivisions, but people’s names, series titles, names of computer programs, corporate bodies, and so on. Additionally, before and after the topic of chess come personal names, and all kinds of topics and other entities that have nothing whatsoever to do with chess.

I don’t see why a properly functioning catalog would mash together “people’s names, series titles, names of computer programs, corporate bodies, and so on” when doing a subject browse. That sort of mashup is what one gets from a keyword search. And, that is why I don’t care to use keyword searching in a library catalog except in an initial search to discover LCSH terminology or classification areas.

This is exactly what I tried to demonstrate happens right now in the LC Authority File, but this is not to find fault: all dictionary-type catalogs work the same way. This has been known for a long, long time and has always been one of the main complaints people have had with the dictionary catalog.

For instance, in the LC Authority File, you can look for Dogs. What is the subject heading that comes just before that? It takes a while to go through the personal names, corporate bodies and such (nothing having anything to do with the topic of dogs), but then the first subject heading you come to is Dogrib mythology. So, even if we could magically get rid of everything except 150s, we would still be looking from Dogrib mythology to Dogs. That is a very strange leap that happens only because of English spelling. If I am interested in Dogs, I don’t want to see Dogrib mythology.

So, what comes after Dogs? Again, there are personal names, but then comes a reference from Dogsharks (nothing to do with Dogs). At least there are dogsleds, but after that there is nothing whatsoever to do with dogs. This occurs right now whenever you browse an alphabetically arranged list, and it always has.
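To make the point concrete, here is a minimal sketch of an alphabetical browse in Python. The heading list is a toy one: the topical headings echo the examples above, while "Doggett, Thomas" and "Dogwood Press" are invented purely for illustration and are not taken from the LC Authority File. The function name `browse` and its `window` parameter are likewise my own assumptions, not anything a real catalog exposes.

```python
from bisect import bisect_left

# Toy heading list (heading, type). Illustrative only; not copied from
# the LC Authority File. The name and corporate-body entries are invented.
HEADINGS = [
    ("Dogrib mythology", "topic"),
    ("Dogs", "topic"),
    ("Dogsharks", "reference"),
    ("Dogsledding", "topic"),
    ("Doggett, Thomas", "personal name"),
    ("Dogwood Press", "corporate body"),
]


def browse(headings, term, window=2):
    """Return the entries surrounding `term` in a case-insensitive A-Z list."""
    ordered = sorted(headings, key=lambda h: h[0].lower())
    keys = [h[0].lower() for h in ordered]
    pos = bisect_left(keys, term.lower())  # where `term` sits in the sorted list
    return ordered[max(0, pos - window): pos + window + 1]


neighbours = browse(HEADINGS, "Dogs", window=1)
for heading, kind in neighbours:
    print(f"{heading} ({kind})")
```

Nothing in the sort knows what a heading means. Spelling alone decides who the neighbours of Dogs are, which is why filtering by heading type would still leave Dogrib mythology next door.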

A classified list can be imagined by disassembling the alphabetical lists we see, and rearranging them all by the BT, NT, RT that are inside each authority record. We can see it to a point in with, but a much better example is the Getty Vocabularies.
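The rearrangement described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each record is reduced to a single broader term (BT), RTs are ignored, and apart from the chairs/armchairs pair the sample headings and their relationships are made up. The names `BROADER` and `classified` are hypothetical.

```python
from collections import defaultdict

# Toy records: heading -> broader term (BT), or None for a top term.
# Hypothetical data for illustration, not real LCSH or A&AT records.
BROADER = {
    "Furniture": None,
    "Chairs": "Furniture",
    "Armchairs": "Chairs",
    "Rocking chairs": "Chairs",
    "Armies": None,
}


def classified(broader):
    """Arrange headings as an indented hierarchy built from their BT links."""
    children = defaultdict(list)
    for term, bt in broader.items():
        children[bt].append(term)  # invert BT to get the NTs of each term

    lines = []

    def walk(parent, depth):
        # Siblings are still alphabetical, but only within their concept.
        for term in sorted(children[parent]):
            lines.append("  " * depth + term)
            walk(term, depth + 1)

    walk(None, 0)
    return lines


print("\n".join(classified(BROADER)))
```

In a plain alphabetical sort of the same five headings, Armchairs lands directly before Armies. In the classified display it sits under Chairs beside Rocking chairs, and Armies is left on its own at the top level.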

The Art & Architecture Thesaurus display for “armchairs” (subject ID 300037776) includes the cryptically-named “hierarchical position”. Click on the little triangle-thingy for “chairs” and you will see the amazing number of different types (NTs) of chairs. The A&AT is arranged alphabetically to a point, as we see here, but the primary arrangement is classified (conceptual).

Once again, compare this to the dictionary-type authority file in the LCSH: look for “armchair” and the headings before and after it, which include people, corporate bodies, titles and all kinds of things that have nothing whatsoever to do with any aspect of the concept of “armchair”.

So, it’s not as if the information is not in the records, because it is. The issue is: what is the best arrangement for someone interested in “dogs” or “armchairs”: by alphabetical arrangement or a classified arrangement? This is an old, old debate that will probably come alive again.