BIBFRAME Linked data

Posting to Bibframe

On 3/6/2015 10:59 PM, Ross Singer wrote:
I’ve spent the last 20 years working to make computers understand AACR2. Theoretically, I could be “licensed librarian” in two (full disclosure: I am not a librarian). Are you saying that as a 20 year library technologist I can’t crack the marc/aacr2 nut? If so, wouldn’t that be more damning of MARC/aacr2?

I guarantee that if you actually worked with the MARC records that are in the majority (and didn’t have a stake in a company that supplies a subset of quality MARC data) you would never suggest relying on the fixed fields for this.

Fixed fields could potentially help me limit to what’s been published in England. If I’m not interested in things published in London, Oxford, or Cambridge (which, let’s be realistic, will be the lion’s share), the burden’s on me to wade through the results to find what I want. And yet, all it takes is one interested (possibly non-library) person to identify the publishers in the West Midlands and all of the libraries in Birmingham, warwickshire, worcestershire, staffordshire, etc. has a curated set locally published resources.

Something libraries are completely unable to do currently.

These are some revealing comments. I would like to analyze them, since I think they show some of the basic differences between how a systems person and a librarian/cataloger approach the same problem. I have discussed this problem at some length in a podcast of mine, and also in a presentation I gave for students at La Sapienza here in Rome.

You have asked a question, namely what has been published in the West Midlands, and then said that libraries are completely unable to answer it currently. But is that correct? Can libraries provide that answer? Yes, but they do it in a way different from what a systems person might expect.

To expect the library catalog to do it is to use the wrong tool. The catalog was never designed to answer such questions and (probably) never will be. So, expecting the library catalog to answer such a question is (to me as a librarian/cataloger) much the same as expecting a hammer to help you examine the rings of Saturn.

That comparison may be shocking and seem incorrect, but to expect a library catalog to do what it is not designed to do is just as shocking to the cataloger. It is unfair to expect a tool to do what it is not designed to do.

It should be clear that if you want to examine the rings of Saturn, you must use another tool–and if the hammer does not help you, you should realize you are using the wrong tool, and not conclude that the hammer is therefore worthless.

A catalog does not contain data in the normal IT sense of the word. That is probably another strange idea, but it is nevertheless a fact. It contains information (data) that will help you find the information you want. In other words, it contains directional information to what you want, but it does not contain the information itself. Let’s see how this works in reality.

Going through the process of answering your question as a reference librarian can illustrate it. If you want materials published in that area of the world, can the library catalog help?

Yes, but you need to know how to use it. There is a subject heading “Publishers and publishing” that can be subdivided geographically. There are also lots of narrower terms and a nice scope note as well.

Following these headings (i.e. by browsing), we eventually come to “Publishers and publishing–England” with further geographic subdivisions, and we should not forget “Publishers and publishing–Great Britain”. It is very possible that the “data” you want is in some of these sources.
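For readers who think in code, the browsing step described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the heading strings use the conventional double-hyphen notation for subdivisions, the Birmingham subdivision and the record identifiers are invented placeholders, and the `browse` function is a hypothetical helper, not how any real catalog system is implemented.

```python
from bisect import bisect_left

# Hypothetical heading index: heading string -> placeholder record ids.
# None of these ids refer to real catalog records.
HEADINGS = {
    "Motion picture producers and directors--Great Britain": ["record-001"],
    "Publishers and publishing": ["record-002"],
    "Publishers and publishing--England": ["record-003"],
    "Publishers and publishing--England--Birmingham": ["record-004"],
    "Publishers and publishing--Great Britain": ["record-005"],
}


def browse(index, start):
    """Return, in alphabetical order, every heading that begins with `start`.

    This mimics opening the heading list at one entry and reading
    forward through its subdivisions until the headings change.
    """
    ordered = sorted(index)
    position = bisect_left(ordered, start)
    found = []
    for heading in ordered[position:]:
        if not heading.startswith(start):
            break  # we have moved past this heading and its subdivisions
        found.append(heading)
    return found


if __name__ == "__main__":
    for heading in browse(HEADINGS, "Publishers and publishing"):
        print(heading, "->", HEADINGS[heading])
```

Note what comes back: headings and pointers to records, not a list of publishers. The answer itself is still in the works those records describe.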

For movies, there is the subject heading “Motion picture producers and directors” that can also be subdivided geographically. So yes, libraries very definitely can do what you want. And they do it–every single day. Your question is neither unique nor especially difficult. But it takes the entire library to answer it, and to focus only on the catalog is incorrect and unfair.

Then, let’s add the idea of a new tool you suggested, i.e. the example you give of the “… one interested (possibly non-library) person to identify the publishers in the West Midlands and all of the libraries in Birmingham, warwickshire, worcestershire, staffordshire, etc. has a curated set locally published resources”. How does the library handle this?

It would be handled in the following way. That person’s work would be selected by a library selector to ensure quality, and then the cataloger would make a record for it, with at least one subject under “Publishers and publishing” with the appropriate geographic subdivision. The cataloger would not add the actual information from that work into the catalog itself–because the catalog was never designed to work that way.
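In rough data terms, the record the cataloger makes might look something like the sketch below. Everything in it is an assumption made for illustration: the `CatalogRecord` class, its field names, the title, and the shelf location are all hypothetical and do not represent MARC or any actual cataloging format.

```python
from dataclasses import dataclass


@dataclass
class CatalogRecord:
    """A deliberately simplified, hypothetical catalog record."""
    title: str
    shelf_location: str   # where to find the work in the collection
    subjects: tuple = ()  # headings assigned by the cataloger


def records_under(catalog, heading):
    """Return records filed under `heading` or any of its subdivisions."""
    return [
        record
        for record in catalog
        if any(s == heading or s.startswith(heading + "--")
               for s in record.subjects)
    ]


# Hypothetical record for the curated list of West Midlands publishers.
catalog = [
    CatalogRecord(
        title="(hypothetical) Directory of West Midlands publishers",
        shelf_location="(placeholder call number)",
        subjects=("Publishers and publishing--England",),
    )
]

if __name__ == "__main__":
    for record in records_under(catalog, "Publishers and publishing"):
        # The record says where to go; it does not list the publishers.
        print(record.title, "at", record.shelf_location)
```

The record has a title, a location, and subjects, but no field holding the publishers themselves. That information stays in the work, which is the sense in which the catalog is directional.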

To be honest, I am not that great of a reference librarian–there are many others who are much better than I am. They may be able to help in better ways than I can, but I at least know there is this heading, although there are probably others.

Of course, it’s not easy for a user to find a heading such as “Publishers and publishing”. Everybody has known that from the beginning, and the solution was the extremely important job of the reference librarian.

Therefore, a catalog and the collection are designed to work very closely together, and to separate the two makes both practically unusable. Perhaps people don’t like this, but that’s the way it all works.

Still, your question illustrates the basic problem very well: you expect the library catalog to do what the library collection is designed to do. I agree that lots of people expect this too, and so we arrive at a major problem for libraries and their catalogs in the 21st century: new user expectations.

I think this shows how the “data” in the library catalog is fundamentally different from the “data” in other kinds of databases. And it also illustrates how the normal tools used for “data mining” and “data extraction” that work fairly well in other venues are more or less doomed to failure when applied to library catalogs. They contain a different kind of data.

So, to compare the situation to earlier times, it is like someone who wants to know what has been published in a certain part of the world, walks into a library, thumbs through the catalog, and walks out angry because they have decided that the information they want isn’t there, all without asking anybody anything.

Unfortunately, this happens all the time today with visitors on the web, so of course people have bad experiences. Times have changed and the scenario I described–once the norm–happens less and less, so I agree that something must be done. Do we conclude that the “directional information” found in the catalog is a useless relic of the past? Or do we try to re-imagine what can be done with the tools we have? Libraries have always had lots of tools.

I would like to think people are trying for the latter, but it seems as if the current trends are for the former. In either case, the first step is to understand where the real problems are and then it may be possible to find solutions–or maybe not.

There are many other related issues of course. What do we want from a library catalog?

For a deeper discussion, there is my podcast “Cataloging Matters No. 17: Catalog Records as Data”, and my presentation to La Sapienza (shorter) is at