Posting to RadCat
On 6/19/2014 5:58 AM, Ron Miller wrote:
When I was creating these types of products to sell to libraries, my solution was to allow the user to use the single search box that Google has conditioned them to think they know how to use, but to craft the algorithm to transparently weight our subjects and other metadata to appear at the top of the results set. Librarians are correct that users prefer a single search box, but I tried to set it up so that without realizing it they were returned results with subject level accuracy automatically. Of course, that means that someone does have to pay the freight to hire subject experts to create and implement the vocabulary, though there are some poseurs out there who claim to get subject headings from software. Librarians can make the arguments about user preferences that you cite, but they must realize that these efforts have nothing to do with what works best for users. It’s entirely about maximizing return to the corporations that are creating these products.
Yes. The world of interest to the users/searchers has always been greater than the world of the library catalog. Anybody who wanted to do at least half-way comprehensive searching has always had to go far beyond the catalog: to subject indexes, to journal articles and so on. To do so, however, you always had to look in many more places than the catalog, and there is nothing new in this respect. What is new is that you can search it all with a single search box.
I like analogies, so I try to compare the situation to someone who works in a multi-story office building and is told to find all the information the company holds on a specific person. To do that, you would have to go onto each floor and into each office to search everybody’s file cabinets. Also, the way the files are arranged may be completely different from one cabinet to another. One cabinet may arrange employees by surname and forename, another by date of employment, another by department, another by employee number, another in some totally different way. In any case, it would be a lot of work to go from office to office to search through all of the hundreds of filing cabinets.
To make things easier for you, you could get the workmen to move all of the filing cabinets into a single room. While that would help you in a physical sense in that you would only have to go to one room and not all over the building, it would not help so much in any other way because you will still have to search each filing cabinet separately, knowing each one’s peculiarities. If you really wanted to make it “one search” for all information for employees, you would have to merge all of the files into a single sequence. That would be the real work–not just moving all of the cabinets into one room!
The single search box replicates this. It brings everything together in one “virtual room” but it stops there. One database may use “United States. Congress” while another may use “U.S. Congress” or “Congress (USA)” or any of a multitude of other forms that we can only guess at. How do you find all of those?
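To make the point concrete, here is a minimal sketch of what "merging the files" means for a single heading. The authority file, the function names, and the normalization rule are all assumptions made up for illustration; they do not describe any actual catalog or discovery product.

```python
# Hypothetical authority file: one authorized heading and the variant
# forms that somebody has recorded for it. Illustrative data only.
AUTHORITY_FILE = {
    "United States. Congress": ["U.S. Congress", "Congress (USA)"],
}


def normalize(term):
    """Fold case and collapse whitespace so trivial differences match."""
    return " ".join(term.casefold().split())


def build_variant_index(authority_file):
    """Map every recorded form (authorized or variant) to its heading."""
    index = {}
    for heading, variants in authority_file.items():
        for form in [heading, *variants]:
            index[normalize(form)] = heading
    return index


def resolve(term, index):
    """Return the authorized heading for a term, or None if unrecorded."""
    return index.get(normalize(term))


VARIANT_INDEX = build_variant_index(AUTHORITY_FILE)

print(resolve("U.S. Congress", VARIANT_INDEX))
print(resolve("congress (usa)", VARIANT_INDEX))
print(resolve("US Congress", VARIANT_INDEX))
```

The first two lookups resolve to "United States. Congress" because someone recorded those variants. The third returns None: "US Congress" is an obvious variant to a human reader, but nobody entered it, so the lookup cannot find it.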
The IT folks put their faith in their algorithms, which are truly impressive. We should not minimize any of that. But allowing this kind of specific access has clearly been beyond them. Linked data is a step toward a solution, but not a complete solution in itself because, in the final analysis, somebody somewhere needs to add the links and link everything together: in other words, to do the real work.
The other, very powerful argument that I have run into is: There is no debate that traditional authority control is good, wonderful, immensely helpful and so on, but facts are facts: it just cannot be done in today’s environment. There is just so much being created that authority control is overwhelmed. I refer people to the excellent talk “E-Books Do Not Exist (and Other Conundrums of Digital Asset Management)” https://www.youtube.com/watch?v=BJ5sSUHJagg where the speaker mentioned that his library is putting 8000+ records into the catalog every DAY! As I mentioned in a post to RDA-L (https://blog.jweinheimer.net/2014/04/re-acat-advantages-of-rda.html), the average large library adds 1000 to 2000 per week.
8000 x 5 (days per week) x 50 (weeks per year) = 2,000,000 per year
1000 to 2000 x 50 (weeks per year) = 50,000 to 100,000 per year
(or no more than 5% of what is being added to the collection)
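The figures above can be checked with a few lines of Python. This is only a sketch of the arithmetic; the variable names are made up, and the 5-day week and 50-week year are the same rough assumptions used in the calculation above.

```python
# Rough working-calendar assumptions taken from the calculation above.
DAYS_PER_WEEK = 5
WEEKS_PER_YEAR = 50

# Records loaded into the catalog: 8000 per day.
added_per_year = 8000 * DAYS_PER_WEEK * WEEKS_PER_YEAR

# Records added by the average large library: 1000 to 2000 per week.
low_per_year = 1000 * WEEKS_PER_YEAR
high_per_year = 2000 * WEEKS_PER_YEAR

# Share of the yearly total, at each end of the range.
low_share = low_per_year / added_per_year
high_share = high_per_year / added_per_year

print(f"{added_per_year:,} per year")
print(f"{low_per_year:,} to {high_per_year:,} per year")
print(f"{low_share:.1%} to {high_share:.1%} of the total")
```

The last line prints "2.5% to 5.0% of the total", which is where the "no more than 5%" comes from.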
This is indeed a very powerful argument, especially to administrators. If what we are doing applies to, at most, 5%, how important does that make us in the “universe of information that is available to the users”?
I think there are equally powerful answers to this, but the cataloging community should not ignore these simple facts. I think that authority control can be maintained, at least to a point, but it will require massive changes among all kinds of communities. I think it can happen, and will happen eventually because people will demand it, but as I keep saying: everybody must look at the situation from the users’ point of view, not from their own. Not from the catalogers’ viewpoint, the authors’ viewpoint, or the publishers’, but from the users’ viewpoint, and consider the situation as objectively and dispassionately as possible.
Once we can ask a question honestly and dispassionately, such as: “How do we control 2,000,000 items per year? That number will probably rise dramatically in the future,” and expect an answer that is realistic and makes it possible, then we are making progress.
That may take a while yet.