Posting to Autocat
On 12/04/2012 08:12 PM, William Anderson wrote:
Are we essentially talking about leveraging the cataloging record, authority and bibliographic, based on an initial keyword search of the full text when such text is available? That’s how I’m interpreting “make our records work coherently and usefully with full-text searching” at first glance.
More or less. I think experience has demonstrated that given the choice, people will always prefer to search full-text over our metadata. Therefore if we want to have people use our metadata, it must be included within full-text searches somehow and done as easily and as seamlessly as possible.
Why do I think this?
Simple observation. It is obvious that people much prefer full-text searching to our tools, primarily because they perceive full-text searching to be simpler. This perception is not correct but is generally held. People have a strange and magical trust in relevance ranking algorithms, confusing algorithmic relevance with the everyday idea of “relevance” and even with the concept of “best”.
Full-text searching can only improve (or perhaps more precisely, people will perceive it as improving) and everything will become digitized. I have been shocked at how quickly it has happened already. The Google Books project shows how a bold company can break through all the red tape, all of the hostile feelings and actually do something that most said, just 15 years ago, would take hundreds of years.
What might have happened if everyone had cooperated instead of spending their time fighting? For instance, in 2009, it was estimated that Google had spent $300 million to scan 10 million books (http://news.cnet.com/8301-30684_3-10321371-265.html), or about $30 per book, while a single B-2 Stealth bomber costs over $2 billion (http://blogs.telegraph.co.uk/news/willheaven/100080689/the-b-2-stealth-bomber-how-the-us-military-will-break-gaddafis-spirit/). Things could work out differently. While the amount Google has spent is, for me, a huge amount of money, for a Berlusconi, a Buffett, or a Gates, let alone an entire group of rich countries, it is pocket change.
It is obvious that everything will be online sooner or later–or at least the vast majority of materials that the vast majority of people want. The publishers cannot ignore the rising public demand forever and stick to their dried-up old business methods that no longer serve the purposes they were meant to serve. Our library methods and tools were made for that old world and they work pretty well there, but it is the new world that the cataloging community must prepare itself for. In fact, except for the decision of a single judge, we could be living in it already.
In that world, will our cataloging records, already not wanted by the public, be needed? If yes, which parts? The description? The name headings? The subjects? The item records? The authority files? The classification numbers? All of them? I have read arguments for and against each of those parts.
How would each work in that environment? The attempts I have seen haven’t worked too well. That’s OK–you must stumble and fall before you run, but you also need imagination. How could these tools work in a keyword environment so that both keyword and controlled vocabulary supplement one another? And do it in a way that is simple and seamless for the untrained user?
As only one example, in another thread on this list there is a discussion about cataloging abbreviations, in this case [sic]: some like it, some don’t. The question should rather be: when an item can be retrieved at the click of a button, or maybe even appears at the same time you see the metadata record, is the principle of strict transcription still so important? Why, or why not?
I made up a quick example: http://www.jweinheimer.net/temp/balfour.html (I took the cataloging from another page, so I don’t know if it is exactly the same.)
What is the need here for “writen [sic]” when everybody can immediately see what it was in the original? In those cases when you don’t have immediate access to the t.p., there definitely is a need; of this I have no doubt. But times are changing, and in this case, I am not so sure there is a need for it. I suspect there is, but at least the point is debatable. What about the rest of the record?
If catalogers do not come up with good reasons why their metadata is needed for online resources that can be searched with algorithms, which will happen much sooner than they would like, then the direction we are headed can only be called “ominous”.
But once again I keep forgetting that RDA and FRBR are what people really want! 🙂