Sunday, July 29, 2012

[ACAT] RDA/FRBR buy in

Posting to Autocat

On 28/07/2012 16:00, Brenndorfer, Thomas wrote:
James Weinheimer wrote: ....
 So, if you had a database of 10,000 documents and all were in XML, to make those same 10,000 documents available in an audio format, you could generate them automatically through a single XSLT file that transforms the source file. It could even be done on the fly. If you wanted to make them available in some new format, e.g. epub or mobi or some other version, it can be done by adding only one file. Do we want to catalog these sorts of items separately?
Why would anyone need to "catalog" these? File characteristic metadata are generated on the fly, and can easily and automatically populate any database.
Because normally, they would be considered separate manifestations. If the metadata creation is handled automatically, I agree that is one option. But I would think that catalogers would believe that such matters should be handled at least semi-consistently in the real world. Let us imagine that a selector has decided upon the Thomas à Kempis, The Imitation of Christ site at There are obviously different ways to catalog it. I discovered that another translation is still available in an earlier version in the Internet Archive at

I can catalog it as a single webpage or I can catalog each instance of each file, so this can be either one record or a number of records, or I imagine I can do it both ways. As we see, when I first found this site in 1999, there were actually different bibliographical editions but there seems to be only one now. If I am to do any of it automatically, how do I do that? The people who own the site could do some things automatically perhaps, if they were interested enough and saw adequate value but otherwise, it seems to be up to the cataloger.
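To make the "automatic" option in this exchange concrete, here is a minimal sketch of generating several versions from one XML source and recording the file characteristics as each version is produced. It uses Python's standard library rather than XSLT, and the element names (`document`, `chapter`, `heading`, `para`), the sample text, and the function names are all assumptions made up for the illustration, not the markup of any real site.

```python
import xml.etree.ElementTree as ET

# Hypothetical source document; element names and text are invented for illustration.
SOURCE_XML = """<document>
  <title>The Imitation of Christ</title>
  <chapter>
    <heading>Book One</heading>
    <para>First paragraph.</para>
  </chapter>
</document>"""


def to_plain_text(root):
    """Render the source as plain text, one line per title, heading, or paragraph."""
    lines = [root.findtext("title", default="")]
    for chapter in root.findall("chapter"):
        lines.append(chapter.findtext("heading", default=""))
        lines.extend(p.text or "" for p in chapter.findall("para"))
    return "\n".join(lines)


def to_html(root):
    """Render the same source as a very small HTML page."""
    html = ET.Element("html")
    body = ET.SubElement(html, "body")
    ET.SubElement(body, "h1").text = root.findtext("title", default="")
    for chapter in root.findall("chapter"):
        ET.SubElement(body, "h2").text = chapter.findtext("heading", default="")
        for p in chapter.findall("para"):
            ET.SubElement(body, "p").text = p.text or ""
    return ET.tostring(html, encoding="unicode")


# Supporting a new output format means adding one entry here.
RENDERERS = {"text/plain": to_plain_text, "text/html": to_html}


def derive_all(xml_source):
    """Produce every registered format plus the carrier metadata for each."""
    root = ET.fromstring(xml_source)
    title = root.findtext("title", default="")
    results = []
    for media_type, render in RENDERERS.items():
        output = render(root)
        results.append({
            "title": title,                              # content data, copied from the source
            "media_type": media_type,                    # carrier data, generated
            "size_bytes": len(output.encode("utf-8")),   # carrier data, generated
            "content": output,
        })
    return results


derived = derive_all(SOURCE_XML)
```

Whether each entry in `derived` deserves its own catalog record is exactly the question under discussion; the sketch only shows that the carrier-level fields can be filled in without anyone typing them, provided somebody controls the source files.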

The benefit of the FRBR model is that the content can be cataloged separately from the carrier, which would mean descriptive data of the content, clarifying data about the content, relationships to creators and subjects, etc., could still involve cataloger intervention. There is no need to spend time "cataloging" the carrier data when this can be largely automated, or generated as needed. Relational databases work on the principle of entities, attributes, and relationships. If you don't like the bibliographic elements in FRBR and RDA, then use other ones suitable for the metadata task at hand.
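As a rough illustration of the entity-relationship idea described here, the following sketch stores content-level data once and attaches any number of carriers to it, using Python's built-in `sqlite3` module. The table names, columns, and sample rows are assumptions chosen for the example; they are a simplification and not the FRBR or RDA element set.

```python
import sqlite3

# In-memory database; the schema is a simplified, hypothetical reading of
# work / expression / manifestation, not an official FRBR implementation.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE work (
    id      INTEGER PRIMARY KEY,
    title   TEXT NOT NULL,
    creator TEXT
);
CREATE TABLE expression (
    id       INTEGER PRIMARY KEY,
    work_id  INTEGER NOT NULL REFERENCES work(id),
    language TEXT
);
CREATE TABLE manifestation (
    id            INTEGER PRIMARY KEY,
    expression_id INTEGER NOT NULL REFERENCES expression(id),
    media_type    TEXT NOT NULL
);
""")

# Content data: entered once, possibly with cataloger intervention.
conn.execute("INSERT INTO work VALUES (1, 'The Imitation of Christ', 'Thomas à Kempis')")
conn.execute("INSERT INTO expression VALUES (1, 1, 'eng')")

# Carrier data: rows like these could be generated by a script, one per derived file.
conn.executemany(
    "INSERT INTO manifestation (expression_id, media_type) VALUES (?, ?)",
    [(1, "text/html"), (1, "application/epub+zip"), (1, "audio/mpeg")],
)


def carriers_for_work(connection, title):
    """List every carrier format that is linked, through an expression, to a work."""
    rows = connection.execute(
        """
        SELECT m.media_type
        FROM work w
        JOIN expression e    ON e.work_id = w.id
        JOIN manifestation m ON m.expression_id = e.id
        WHERE w.title = ?
        ORDER BY m.media_type
        """,
        (title,),
    )
    return [row[0] for row in rows]


formats = carriers_for_work(conn, "The Imitation of Christ")
```

Adding a fourth format means adding one `manifestation` row; the title and creator are never re-entered.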

The point that is being missed is that the overall framework and requirements for thinking about how data elements inter-relate and inter-operate are still the same. The point that is being missed is that the FRBR model **** IS BASED ON THE SAME FRAMEWORK USED IN BUILDING ALL RELATIONAL DATABASES ****. The cognitive dissonance resulting from criticizing FRBR but bringing up data problems that have already been solved by the same entity-relationship model used in FRBR is so obvious that all one can do is watch the spectacle of such a trainwreck of illogical and nonsensical ideas.

I understand relational databases and how they work; I've built a few myself. It is important to acknowledge that they are not the latest in technology and that there are other options. Relational databases are certainly not good enough for more advanced searching capabilities: for instance, if Google were a relational database, it would blow a bunch of gaskets. Lucene-type indexing technologies have proven themselves superior for those matters. (It's all based on flat files, by the way!) That is why tools such as WorldCat with facets, which can now provide the FRBR user tasks, can operate as well as they do. Many systems use both: a relational database in the background for the technicians to actually manage the data (to edit and create the records), and Lucene technologies for the actual searching.
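For readers who have not seen what "Lucene-type indexing" amounts to, here is a toy sketch of the underlying idea: an inverted index built from flat records, with facet counts computed over the hits. This is only the general technique, not Lucene's actual data structures or API, and the sample records and function names are invented for the example.

```python
from collections import Counter, defaultdict

# Flat sample records, as they might be exported from the editing database.
# The titles and formats are made up for illustration.
RECORDS = [
    {"id": 1, "title": "The Imitation of Christ", "format": "html"},
    {"id": 2, "title": "The Imitation of Christ", "format": "epub"},
    {"id": 3, "title": "A Sample Cataloging Manual", "format": "html"},
]


def build_index(records):
    """Map each lowercased title word to the set of record ids containing it."""
    index = defaultdict(set)
    for rec in records:
        for term in rec["title"].lower().split():
            index[term].add(rec["id"])
    return index


def search(index, query):
    """Return the ids of records whose titles contain every query word."""
    terms = query.lower().split()
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)


def facet_counts(records, ids, field):
    """Count the values of one field across the matching records."""
    return Counter(rec[field] for rec in records if rec["id"] in ids)


index = build_index(RECORDS)
hits = search(index, "imitation christ")
by_format = facet_counts(RECORDS, hits, "format")
```

Here `hits` contains records 1 and 2, and `by_format` shows one HTML and one EPUB version, which is the kind of grouping a faceted display presents. The search never touches the relational tables; it only needs the flat records the index was built from.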

And I will say once again that there are many ways of modeling data. FRBR is one way of doing it, but it is not the only way, nor does that make it the best way. One of the first steps in modeling is figuring out what is important to the stakeholders (i.e. the people who will use the system) and attempting to give it to them as much as possible. The FRBR data model was based (I assume) on this same idea of giving the public what it wanted, and the model with works/expressions/manifestations/items was created. Yet there was no research to discover whether these really were the tasks that the majority of users wanted; it was simply assumed that the model should follow the purposes of the catalog as laid out initially by Panizzi and Cutter and expanded later. We have also lived in a different informational world since FRBR was created (in Internet time, the 1990s are now a different era, that is, pre-Google times), and brand new resources are being created. Now we are standing on the edge of even more profound changes.

The first task should be to fix what has been broken for such a long time: upgrade the antediluvian MARC format, include the absolutely essential syndetic structures in keyword searches, get the subject headings to function coherently again (somehow), and link whatever can be linked. This is plenty to do, but absolutely necessary no matter what else happens. After watching how people work with all of this for a few years, perhaps we will have a better understanding of what the public wants and how to adapt to it, and perhaps we will even see that FRBR structures are necessary. But there is no evidence of that now.

Perhaps everything is obvious to those so entranced by the prescriptions of FRBR/RDA "that all one can do is watch the spectacle of such a trainwreck of illogical and nonsensical ideas" of those who criticize, but for me, the amount of unsubstantiated library superstition is equally astonishing.
