Monday, March 19, 2012

Re: [RDA-L] Card catalogue lessons

On 17/03/2012 15:17, Brenndorfer, Thomas wrote:
<snip>
Why is the local catalog definitely not the correct tool here?
Catalogers go to great lengths to record the very same data as in relationship designators in the form of notes and statements in the record. That's the whole "justify the added entry" concept.

Decisions about what headings can become added entries are decisions based upon identifying the role played by the creating or contributing entity. As a catalog user I find it extremely frustrating to have to scan notes to understand why a search result came up attached to a particular person's name. It's great when I eventually find the note, but this is not a user-friendly design, and looks more like inefficiencies built upon inefficiencies.

A good data entry system has vast impact on the efficiency of cataloging. The issue of time spent on entering relationship designators quickly becomes moot when one factors in:

a) the time spent already on making those very same decisions and entering comparable data,
b) the ease with which some newer systems allow catalogers to enter data, validate data, or harvest data and integrate it, and
c) the overall goal of FRBR and RDA which is to create a universal baseline of understanding of what is going in catalogs.

This last point means that catalogers, systems developers, and other data providers all understand the same model, the same techniques, and the same application possibilities. That's the core business case for FRBR and RDA, and it's the same reason why the rest of the data management world relies heavily on entity-relationship models to design data systems that solve real world problems. Why ignore a tool (entity-relationship modeling) that the entire world of data systems now depends on?

And understanding this includes understanding the possibilities for retrospective conversion. It's far better to start thinking of retrospective conversion when there's a good new home for that data than just to recycle that data from one inferior system into another similar kind of inferior system (having gone through three ILS's in the last 12 years, data migration issues are expected, but it's frustrating when the benefits of a new system only come down to some low-hanging fruit, and the larger scale benefits are still out of reach of many library systems, and so the same weaknesses of AACR/MARC data abound).

AustLit ( http://www.austlit.edu.au/ ) did it correctly when old databases were FRBRized-- much wasn't difficult, and the rest was handled by good tools and trained data specialists. And when it was done, it was done, and everyone was better off because of it.
</snip>
Now we finally get some mention of the business case. And we see that "entity-relationship" models are the justification. "This last point means that catalogers, systems developers, and other data providers all understand the same model, the same techniques, and the same application possibilities. That's the core business case for FRBR and RDA, and it's the same reason why the rest of the data management world relies heavily on entity-relationship models to design data systems that solve real world problems. Why ignore a tool (entity-relationship modeling) that the entire world of data systems now depends on?"

Creating a data model from scratch is quite different from creating one based on models in long use. We have decades' worth of data, much of it representing records created over 100 years ago. As I have mentioned before, FRBR has turned the relationships of the traditional catalog into entities. In the card catalog, there were no explicit "works" or "expressions". In some cases, you might find a card that explained in more detail how the cards were arranged, e.g. http://imagecat1.princeton.edu/cgi-bin/ECC/cards.pl/disk6/0488/B5379?d=f&p=Bible+...+Texts--Anglo-Norman+%3E&g=51627.500000&n=2&r=1.000000&thisname=0000.0002.tiff but you would not find one card giving you information about the "Work" of the Bible and then another card telling you about the Bible in English.

The works and expressions were theoretical constructs that determined how the separate cards were arranged in the catalog. These arrangements, in turn, were derived from the arrangements found in some (but not all) printed book catalogs and were based on the rules defined by Panizzi and Cutter. With FRBR, these relationships between and among all the separate records were turned into "entities": the formerly theoretical constructs that guided arrangement suddenly became "things" that had to "exist". Every record therefore needed associated work and expression "entities", even though such arrangements applied to relatively few cases in the catalogs (less than 20%), because the vast majority of everything written exists in a single manifestation.
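To make the distinction concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions: the class names, the field names (`filing_title`, `language`, `description`) and the three sample cards are invented for the example and are not taken from FRBR, RDA, MARC, or any real system. In the first half the "work" exists only as a filing key that controls arrangement; in the second half the same groupings are turned into explicit work and expression records, which every card then needs, including the card for a title that exists in a single manifestation.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class Card:
    """A flat catalog card. All field names here are hypothetical."""
    filing_title: str   # uniform title, used only to arrange the cards
    language: str
    description: str


# Illustrative sample data, not drawn from any real catalog.
cards = [
    Card("Walden", "English", "Walden (Boston, 1854)"),
    Card("Bible", "English", "The Holy Bible (London, 1611)"),
    Card("Bible", "Anglo-Norman", "Bible texts in Anglo-Norman"),
]


def file_cards(cards):
    """Card-catalog approach: the 'work' is implied by arrangement alone."""
    return sorted(cards, key=lambda c: (c.filing_title, c.language))


@dataclass
class Expression:
    language: str
    manifestations: list


@dataclass
class Work:
    title: str
    expressions: list


def frbrize(cards):
    """Entity approach: each grouping becomes an explicit record."""
    works = []
    for title, by_title in groupby(file_cards(cards), key=lambda c: c.filing_title):
        expressions = []
        for language, by_language in groupby(by_title, key=lambda c: c.language):
            expressions.append(
                Expression(language, [c.description for c in by_language])
            )
        works.append(Work(title, expressions))
    return works


works = frbrize(cards)
for work in works:
    print(work.title, [e.language for e in work.expressions])
```

In this sketch the single "Walden" card still receives its own work and expression records, although filing it required nothing more than its title.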

I agree that *if* we were setting up a brand new database today, we might want to consider creating an FRBR-type structure, but this is not the case. We have massive amounts of data that will need converting to this new model, so the questions must be asked: how much effort will it take to create and maintain these new structures, and will the costs be worth it? The example of adding the "producer" attribute is only an insignificant part of this when considering all of the relator codes, past and present, and relator codes are themselves only an insignificant part of FRBR structures. Yet it serves to accentuate the enormity of the problem. Are there any alternatives that accomplish the same goals? And I think a vital question is: will it make any difference to those who will use the system?

I think I have shown that there are alternatives that accomplish the same goals, and only research can determine whether creating the FRBR structures will make any difference to today's public. Yet it seems as if utility to the public may *not* be the main purpose of FRBR now, and that the purpose is instead just to get into the linked data universe. There are many ways to get into the linked data universe, if that is the goal, and if we want database interoperability, there are alternatives that have been in use for some time.

It would be useful to discover whether the database interoperability that currently exists has made a difference. One instance is the "Find in a Library" link in the Google Books interface: http://books.google.it/books?id=df6Ug_9CQccC. I wonder if this has made any substantial difference to library circulation. After all, although this is not linked data, it is one demonstration of how it could work in the linked data universe. Does anyone know if it has increased circulation in their libraries?
