BIBFRAME Linked data

On 4/2/2015 7:52 PM, Kelley McGrath wrote:
> We will have to agree to disagree. It may be *easier* to get information out of MARCXML, but you can’t get *more* out of MARCXML than out of binary MARC. If you add things into a MARCXML record that won’t go back into binary MARC then it may be XML, but it isn’t MARCXML anymore. MARCXML gives you a wider variety of tools that you can use to interact with the data, but it doesn’t address any other limitations of MARC.
> My problem is that I want information that isn’t easy to get at. Or at least I can’t find an easy way to do it. Can you tell me (via algorithm and not human eyes) whether the contents of 245$b are a subtitle, a parallel title or title(s) that are part of a resource without a collective title? Can you find the titles that are hidden in 245$c?

That hasn’t been my argument. My argument is that you can’t get more out of RDF/Bibframe than you can out of MARCXML, so long as you *know the schema.* This means that if a developer understands the complexities of MARC, they can do just as much with MARCXML as they can with RDF/Bibframe. Most developers do not understand the complexities of MARCXML, but they will be able to use Bibframe, which is supposed to be more accessible to non-library developers.

Concerning your question about various types of titles, parallel, subtitle, etc., if the information is not encoded separately, it cannot be extracted separately, at least not without some additional work. To see this in action, I took a MARCXML record with a parallel title and put it into Bibframe. Here is the 245 from the MARCXML:

Eleven short stories =Undici novelle /Luigi Pirandello ; translated and edited by Stanley Appelbaum.

and the parallel title is coded in a 740, not as a 246 11 (which would display the parallel title note):

Undici novelle.

What came out of the Bibframe conversion was (look at the bottom line):

Eleven short stories Undici novelle Undici novelle parallel

I saw that it did come out labelled as a parallel title. I wondered why, so I checked the Bibframe site on GitHub and found the conversion code. In lines 3564+ (I won’t copy the code), we see that it digs the equals sign, which denotes the parallel title, out of the MARCXML 245 field,


and if it finds it, then adds

element bf:titleType {"parallel"}

which outputs as


It looks like it works with the $b and $c too but I haven’t tested it.
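
To make the equals-sign check concrete, here is a minimal sketch in Python. It is not the Bibframe converter’s code (that is XQuery, and I have not copied it); it only illustrates the same idea under some assumptions. The subfield boundaries in the sample 245 are my guess at how the record is coded, and the function name `parallel_titles` is made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by MARCXML ("MARC21 slim") documents.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Hypothetical sample: the text matches the 245 quoted above, but the
# split into $a / $b / $c is an assumption, not copied from the record.
SAMPLE_245 = """
<datafield xmlns="http://www.loc.gov/MARC21/slim" tag="245" ind1="1" ind2="0">
  <subfield code="a">Eleven short stories =</subfield>
  <subfield code="b">Undici novelle /</subfield>
  <subfield code="c">Luigi Pirandello ; translated and edited by Stanley Appelbaum.</subfield>
</datafield>
"""


def parallel_titles(datafield_xml):
    """Split a 245 title statement into (title proper, [parallel titles]).

    Relies only on the ISBD equals sign, so it cannot tell a parallel
    title from a title that happens to contain '='.
    """
    field = ET.fromstring(datafield_xml.strip())
    # Only $a and $b carry the title; $c is the statement of responsibility.
    parts = [
        (sf.text or "").strip()
        for sf in field.findall("marc:subfield", MARC_NS)
        if sf.get("code") in ("a", "b")
    ]
    # Drop the trailing " /" that introduces $c, then split on "=".
    statement = " ".join(parts).rstrip(" /")
    proper, *parallels = [piece.strip() for piece in statement.split("=")]
    return proper, parallels


if __name__ == "__main__":
    print(parallel_titles(SAMPLE_245))
```

The sketch deliberately ignores subtitles (the colon) and titles without a collective title (the semicolon), so it answers only the parallel-title part of the question. The point is that the label comes from punctuation already sitting in the MARCXML, provided you know what the punctuation means.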

All very nicely done, but as we see, it can be done (is done) with MARCXML and is an example of something that is too difficult for a non-librarian developer to do. After Bibframe, it will be easier for others to take it, if they want it. While I see nothing wrong with this and am all for it, it is also something that libraries could have been working with for a long time, since the beginning of MARCXML. You don’t have to have RDF if you know the schema. This illustrates my point.

From another viewpoint, it is also important to realize that this represents no *additional access* for the user beyond what they have always had, because the catalogers have made added title entries (246, 7xx$t, 730, 740) for all of these types of titles. We saw it in this record, which used the older practice of 740. Therefore, the computer processing that finds the equals sign (=) does *not* create additional access, because people have always been able to find the title by searching the extra title supplied by the cataloger. What the processing actually does is translate the librarian’s secret language (=) into the words “parallel title”.
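
A rough sketch of that existing access, again in Python and again under assumptions: the record fragment below is invented around the 245 and 740 quoted earlier (subfield boundaries and indicators are guesses), and `title_access_points` is a hypothetical helper, not part of any library. It simply gathers every title a searcher could match, without caring what kind of title it is.

```python
import xml.etree.ElementTree as ET

# Namespace used by MARCXML ("MARC21 slim") documents.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Hypothetical record fragment; subfield boundaries and indicators are assumed.
SAMPLE_RECORD = """
<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">Eleven short stories =</subfield>
    <subfield code="b">Undici novelle /</subfield>
  </datafield>
  <datafield tag="740" ind1="0" ind2=" ">
    <subfield code="a">Undici novelle.</subfield>
  </datafield>
</record>
"""

TITLE_FIELDS = ("245", "246", "730", "740")   # title lives in $a
NAME_TITLE_FIELDS = ("700", "710", "711")     # title lives in $t


def title_access_points(record_xml):
    """Return every searchable title string, with no label for its type."""
    record = ET.fromstring(record_xml.strip())
    titles = []
    for field in record.findall("marc:datafield", MARC_NS):
        tag = field.get("tag")
        for sf in field.findall("marc:subfield", MARC_NS):
            code = sf.get("code")
            text = (sf.text or "").strip()
            wanted = (tag in TITLE_FIELDS and code == "a") or (
                tag in NAME_TITLE_FIELDS and code == "t"
            )
            if wanted:
                # Crude cleanup of trailing ISBD punctuation; this would
                # also strip a final period that belongs to the title.
                titles.append(text.rstrip(" =/:;."))
    return titles


if __name__ == "__main__":
    print(title_access_points(SAMPLE_RECORD))
```

Here “Undici novelle” is found through the 740 alone; the equals sign in the 245 plays no part in making it searchable.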

I think it is worthwhile at this point to step back a moment and reconsider: all of this is for the user, or in other words, non-librarians. How do they understand “parallel title”? “Parallel title” is a very library concept, and not even all libraries have the concept of a “parallel title”. For instance, the AGRIS model has different titles for different languages, English title, French title, Spanish title, etc. but not the precise concept of a “parallel title”. For an example of how it is treated, see where one item has three parallel titles, handled as three equal language titles.

In this sense, the AGRIS model has been more exact than ISBD-type practices, and I think that for a non-librarian the AGRIS-type practice is much more understandable than the more abstract ISBD concept of “parallel title”. I have no idea how a web developer, who may not have stepped into a library for the last decade or so, would understand terms such as “parallel title”, “alternative title”, or “running title”, much less “work title”, “expression title” and so on. Lots of librarians don’t understand the differences.

To consider further, does the public need these various titles labelled so precisely, or do they just need the titles themselves? It seems to me that the vast majority of searchers don’t understand these distinctions and anyway, they don’t need these distinctions to find the information they need. They just need assurance that the titles really are entered into the record so that they can be found. Who cares if it’s a parallel, alternative, spine or whatever title? Catalogers care. A lot, and for all kinds of reasons, but others, hmmm….

These are some of the issues that I have been hoping would be discussed, but it would take broad participation: not just catalogers and IT people, but public services especially, and regular users. People in other bibliographic endeavors who have different bibliographic concepts should be included. After all, everything is supposed to be linked now. That includes their stuff. It will take many groups to find out what the public(s) really need and want.

This is not the correct list for discussing these considerations, but they should be discussed somewhere.

James Weinheimer
First Thus