ACAT Linked data, bibliographic data, and the nature of being

Posting to Autocat

On 26/05/2015 22.52, Schutt, Misha wrote:

The ultimate goal, I believe, is to sift WorldCat into a database of such triples, so that the query “Do you have any novels by Belgian women authors writing in the 19th century” can be answered as easily as “Do you have Season 2 of Breaking Bad on hand?”

I agree that this seems to be the goal. The big question, unanswered and, as far as I am aware, still not seriously asked, is: Are these really the kinds of searches that the public wants? Consider what it would take to provide a reliable answer to the sort of question you have posed:
Do you have any [novels, non-fiction, poems, plays, …] by [Belgian women authors, male Russian authors, transgendered British authors, …] writing in the [all kinds of time periods are possible]?
Our databases cannot answer such questions at the moment. Our collections can answer that kind of question, and always have, but using a library collection takes work and knowledge. If we want to create such a tool, a tremendous amount of information or, at the very least, a tremendous number of links must be added to our data, and those links will go all over the place. Will all that information, or all those links, be put in manually?
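To make concrete how many links such a question depends on, here is a minimal sketch in Python of a triple store small enough to read at a glance. Everything in it is an assumption made for illustration: the identifiers (work:1, person:1), the predicate names (hasForm, hasAuthor, firstPublished, nationality, gender), and the data itself are placeholders, not real catalog records or any real vocabulary. It also treats "writing in the 19th century" as "first published between 1801 and 1900", which is itself a cataloging decision somebody would have to make.

```python
from typing import List, NamedTuple, Set


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Placeholder data: none of these identifiers refer to real works or people.
TRIPLES: List[Triple] = [
    Triple("work:1", "hasForm", "novel"),
    Triple("work:1", "hasAuthor", "person:1"),
    Triple("work:1", "firstPublished", "1862"),
    Triple("work:2", "hasForm", "novel"),
    Triple("work:2", "hasAuthor", "person:2"),
    Triple("work:2", "firstPublished", "1871"),
    Triple("work:3", "hasForm", "poem"),
    Triple("work:3", "hasAuthor", "person:1"),
    Triple("work:3", "firstPublished", "1865"),
    Triple("work:4", "hasForm", "novel"),
    Triple("work:4", "hasAuthor", "person:1"),
    Triple("work:4", "firstPublished", "1905"),
    Triple("person:1", "nationality", "Belgian"),
    Triple("person:1", "gender", "female"),
    Triple("person:2", "nationality", "Russian"),
    Triple("person:2", "gender", "male"),
]


def objects(subject: str, predicate: str, triples: List[Triple]) -> Set[str]:
    """All values attached to a subject by a given predicate."""
    return {t.obj for t in triples if t.subject == subject and t.predicate == predicate}


def subjects(predicate: str, obj: str, triples: List[Triple]) -> Set[str]:
    """All subjects that carry a given predicate/value pair."""
    return {t.subject for t in triples if t.predicate == predicate and t.obj == obj}


def find_works(form: str, nationality: str, gender: str,
               year_from: int, year_to: int, triples: List[Triple]) -> List[str]:
    """Works of a given form, published in a year range, with a matching author."""
    matches = []
    for work in sorted(subjects("hasForm", form, triples)):
        years = objects(work, "firstPublished", triples)
        if not any(year_from <= int(year) <= year_to for year in years):
            continue
        for author in objects(work, "hasAuthor", triples):
            if (nationality in objects(author, "nationality", triples)
                    and gender in objects(author, "gender", triples)):
                matches.append(work)
                break
    return matches


if __name__ == "__main__":
    print(find_works("novel", "Belgian", "female", 1801, 1900, TRIPLES))
```

The query itself is short. The cost lies in the data: drop any one of the five statements about work:1 and its author, and the work silently disappears from the answer, which is why the question of who adds all those links matters so much.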

Creating the tools to answer such questions will therefore require an enormous effort, costing scads of money and resources over a long time. It seems only responsible to ask: How many people need answers to such questions? Is this the wisest use of our money and resources? Above all, do we have sufficient resources to complete the task? If we don’t have the resources now, when will we get them? If never, what does that mean? These are serious (and obvious) questions that should be neither ignored nor underestimated.

A corollary of this is the assumption that our catalogs will turn into something quite different from what they have always been. Until now, the library collection has provided the information users need, but the user must enter the collection and use it; the catalog is the tool that tells people where they need to go to get their answers. With this idea of linked data, the catalog would instead provide that information directly. The catalog would become an answer machine.

I certainly have nothing at all against such a change, so long as it is determined to be what the public really wants. But I am skeptical. Do people want something that they can rely upon to lead them to where the information is, so that they themselves can pick what they want among materials that have been expert-selected and objectively arranged? Or do they want some opaque algorithm or “URI links” that may come from who knows where, to choose their answers for them? If someone wants the latter, there are lots of places they can go now, and those options will only improve in the future. If you want the former, however, it is difficult to find anything. Besides libraries.

Personally, I think a good step would be just to see the catalog actually work as it was supposed to, so that people could see, understand, and use the cross-reference structures that are now pretty much hidden in the catalog. I think people would like to discover that when they search for Mark Twain, there are other forms they need to search too (even if those “textual strings” become URIs). Also, if somebody searches a subject such as “Love”, I think they would like to see how many choices they have, with related, broader, and narrower concepts, and so on.
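As a rough sketch of what "showing the cross-references" could mean in practice, the Python below models an authority record with its variant forms and its see-also, broader, narrower, and related references, and returns all of them for whatever form the searcher happened to type. The class and function names (AuthorityRecord, find_record, cross_references) are hypothetical, and the references listed under "Love" are labeled placeholders rather than real subject headings. Only the Twain/Clemens pairing reflects the familiar fact that Mark Twain was the pen name of Samuel Langhorne Clemens.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class AuthorityRecord:
    preferred: str                                      # the established heading
    variants: List[str] = field(default_factory=list)   # "see from" forms
    see_also: List[str] = field(default_factory=list)   # other headings to search
    broader: List[str] = field(default_factory=list)
    narrower: List[str] = field(default_factory=list)
    related: List[str] = field(default_factory=list)


# Illustrative records only; the entries under "Love" are placeholders.
RECORDS = [
    AuthorityRecord(
        preferred="Twain, Mark, 1835-1910",
        variants=["Mark Twain"],
        see_also=["Clemens, Samuel Langhorne, 1835-1910"],
    ),
    AuthorityRecord(
        preferred="Love",
        broader=["[placeholder broader heading]"],
        narrower=["[placeholder narrower heading 1]", "[placeholder narrower heading 2]"],
        related=["[placeholder related heading]"],
    ),
]


def find_record(query: str, records: List[AuthorityRecord]) -> Optional[AuthorityRecord]:
    """Match the query against preferred and variant forms, ignoring case."""
    wanted = query.strip().casefold()
    for record in records:
        forms = [record.preferred, *record.variants]
        if any(wanted == form.casefold() for form in forms):
            return record
    return None


def cross_references(query: str, records: List[AuthorityRecord]) -> Dict[str, List[str]]:
    """Everything the catalog could show the searcher alongside the results."""
    record = find_record(query, records)
    if record is None:
        return {}
    return {
        "preferred": [record.preferred],
        "see also": record.see_also,
        "broader": record.broader,
        "narrower": record.narrower,
        "related": record.related,
    }


if __name__ == "__main__":
    for label, headings in cross_references("mark twain", RECORDS).items():
        print(f"{label}: {headings}")
```

Nothing here chooses an answer for the user. It only surfaces structure that catalogers have already built, so that the searcher can decide where to go next.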

The purpose of linked data was never to make your own data obsolete, and that is a danger I see. Linked data was seen simply as a way of sharing the information you have in ways that are more structured, and hopefully more useful, than regular links into PDFs or databases with local structures that are difficult to figure out.