Monday, August 30, 2010

Cataloging Matters No. 3: The Functional Requirements for Bibliographic Records, a personal journey

Cataloging Matters, podcast No. 3


The Functional Requirements for Bibliographic Records, a personal journey

Hello everyone. My name is Jim Weinheimer and welcome to Cataloging Matters, a series of podcasts about the future of libraries and cataloging, coming to you from the most beautiful, and the most romantic city in the world, Rome, Italy.

Since my first podcast, which was about FRBR, I have received questions about what FRBR is and I have gotten several requests to discuss it further, so I conclude that relatively few people feel they understand it. Actually, this surprised me and I wasn’t expecting to talk about anything like this; still, I am very happy to do my best, but I will do it on my own terms. By this I mean that I will seek to describe FRBR as objectively as I possibly can, and afterwards I will provide my own personal opinions about it, but along the way, I would also like to talk about my own personal experiences with it since I think this may hold more meaning for people, and at the same time make it more interesting.

Also, I sincerely hope that this will help more than it confuses. If it confuses you more than before, I deeply regret it. My purpose here is not to give a complete overview of FRBR, but instead to demystify it, and to let catalogers know that FRBR is not all that new from what they are doing now. Ultimately, I hope this discussion may encourage people who want to know more to go through some of the excellent FRBR presentations and workshops that exist on the web. 

I am very active on listservs and have written about FRBR many times, so I will be repeating myself here on several occasions, but not on everything. For those who realize that I am repeating myself, please bear with me, but in this podcast, I will take nothing for granted, except that my listeners are experienced catalogers.

To begin, I would like to outline my personal journey with FRBR, which proceeded through the following stages:
Determination (to understand it)
--Incomprehension (not understanding anything)
----Humiliation (not telling anyone about my incomprehension)
------Renewed Determination (to understand it)
--------Joy (at the first glimmers of understanding)
----------Comprehension (full success)
------------Consternation (the first questions)
--------------Serious questioning
----------------Serious doubts

(By the way, believe it or not, this has turned out to be a twelve-step process, but I must state that any comparisons must stop there, or at least I hope so!)

Before talking about my journey, I believe some background is necessary.

To come to grips with these changes we are facing, it must be accepted from the outset that when people come to a library, they do not come for the librarians. Although they may be friends with the librarians, they actually come for something else: they come for what we have, that is, for our collections, and these people will decide how good a job we do by how well our collections respond to their needs. No matter what else, people come to us for our collections, and not for us personally. Even though this is the way it has been from the very beginning, fundamental changes were set in motion when worthwhile resources became generally accessible through the World Wide Web, and those changes continue. Once we accept that fundamental changes are going on, questions inevitably arise: do we really want to maintain that the library’s collection, that is, the sum total of the information that is available to our patrons, is limited only to those materials that the library has paid for and organized? Is it wise to try to persuade our patrons that this is correct? I think it is not, because they know that this is obviously not true. Therefore, I believe that it is vital for librarians to accept that the very idea of the “library’s collection” has changed forever and, in many ways, has evolved beyond traditional library controls.

Given this scenario, how does the local library catalog fit in? Can it at all?

Additionally, the library, the library’s collection, and the catalog cannot be disconnected, even though people want to do it all the time. Patrons still ask all the time, “Where are your books on business? Where are your books on ancient Greece?” The answer is: everywhere, and that’s why you must use the catalog. If your library only has a shelf or two of books, that may be one thing, but when people come to a library that has thousands or millions of different kinds of items, the only way into all of that is through some kind of tool to find the materials that interest them, that is, through what I call a “catalog”, no matter what form the catalog might take. This “catalog” can be a traditional one--and there have been many, many forms of catalogs throughout the millennia that differ radically from one another--or it can be something very new such as the search engine of Google Books and whatever comes in the future. Nevertheless, there must be some sort of organized, semi-useful way into the collection, whatever form that collection takes, otherwise the collection itself will not be useful.

In fact, the World Wide Web would be useless if there were no tool such as Google, something that does the job of finding resources and presenting them to a searcher in ways that people find useful. If there is no organized, semi-useful way into a collection, people will not use that collection and will go someplace else that will be easier to use, especially in today’s Internet world.

None of this should be surprising and merely reflects the reality of almost any other service we use every day. When people go to a butcher shop, they go not because of the butcher, but because of the meat they will buy and take home. They may be close friends with the butcher and know his family; everyone may agree that he is the best butcher in the city, but if the meat is rotten, no one will buy it, nor should they. Also, even when the meat is good and people want to purchase it, if they encounter terrible difficulties actually getting it, for example, if they have to stand in line for hours waiting their turn, or the butcher is surly and foul-mouthed, they will go somewhere else.

Of course, if there is no other butcher in town, people will just have to accept whatever the butcher decides to give them or become vegetarians, but if a new butcher comes to town, watch out!

Consequently, to get people to use our libraries, our collections must be useful to them, and whatever tools we provide for finding relevant materials must not present a barrier. If one falls, the other does; both are intimately connected by their very natures, even though many would like them to be separate. Before the existence of the Internet and the World Wide Web, libraries were like the single butcher in town, but those times are over, so we had better watch out!

With that prologue out of the way, I would like to proceed on to my personal journey with FRBR.

When FRBR came out, I was still working at Princeton University, and since I had been responsible for the web presence of the Catalog Department and Technical Services, it became my job to be the “metadata expert” for the library as well. Therefore when FRBR came out, I took on as my personal responsibility the task to learn and understand FRBR as completely as I possibly could: to understand what it was, and to try to imagine how it could be put to the best uses. I was relieved that FRBR had appeared since I was beginning to sense the first changes in the information world and was happy the library field was responding.

When the physical volume came in (there were no decent ebooks in those days!), I got it as soon as I could, checked it out to myself, and began to read it. I confess that I read FRBR from cover to cover very closely (well, I didn’t actually read the index, I only perused the bibliography, but I did read the t.p. verso completely!).

As I neared the end of the book, my anxiety level began to rise. Even today, I remember my feelings very clearly when I closed that book; how I looked up and had no choice but to admit to myself that I hadn’t understood anything at all!

This was a devastating revelation but I had no choice except to admit it. I felt like the dumbest person on the face of the earth and was at a complete loss about what to do next. Of course, I couldn’t tell anyone what had happened, and all I could think of was: to read that horrible book again! It was humiliating, it was awful in every way, but I could see no other choice. I waited a few weeks, and dove in again.

It can be amazing how things turn out. Almost immediately as I began to read it the second time, I realized that... I knew it all already! In fact, there was actually nothing much new in it at all!

Here I would like to pause before I continue and let those of you who are listening know: if you are a cataloger--and not necessarily even an ISBD/AACR2 cataloger, just someone who has worked as a cataloger--you already know much, if not almost all, of FRBR. Remember that. The only differences are some very strange vocabulary and a weird structure of the records, which has some unexpected consequences that can range anywhere from the surprising to--what I think is--dismaying. But in any case, calm down: you already know it and you do it every day right now.

When you see those strange hieroglyphics in the text of FRBR, realize you are seeing what you have always done, only described using completely different terminology and methods. So, when you see things like w1, e2, m3, and so on, this is not nuclear physics although that is certainly what it looks like in FRBR. These are not mathematical formulas, but attempts to describe in a different way what you are dealing with now. You know this stuff.

The main change is that FRBR represents a different viewpoint from the traditional cataloger’s view, where you start from the item on your desk, describe it, and then fit it into the rest of the collection. (Electronic resources are different and I will try to talk about them in the next installment.) FRBR, on the other hand, starts with the collection and works its way down from there. Let’s see how this works in practice.

Currently, a cataloger starts from the item itself and gradually fits it into the rest of the collection, doing extra work when necessary. This is illustrated by the actual structure of AACR2, which in many ways, follows the workflow of most catalogers. You begin by describing this “thing you are holding in your hands”: finding the chief source of information, then transcribing the title and statement of responsibility, the publisher, paging, and so on. This is the ISBD part of the record, or the manifestation. Once you are done with this, you add any authors’ headings, creating any new headings when required.

If it is then necessary to fit this “thing you are holding in your hands” into the collection more specifically, e.g. let’s imagine you have a book that is an edition of Dante’s Divine Comedy, the cataloger will have to deal with various types of uniform titles, and this is the work. Then, if it is further necessary, e.g. this book has selections of parts of an English translation of the Divine Comedy, the cataloger must do further work with the uniform title (language and perhaps selections, if not something more), and may even have to further distinguish the precise version in hand with other variants: the translator (maybe), maybe an editor. In fact, the cataloger may even have to work with the publication history for printers and dates, and so on. All of this is the expression. Subjects and a call number can be added at various points in this process. Finally, if you have a barcode that relates to the specific “thing” in your hand, plus some other information, this is part of the item.

Of course, if the book you are working with is not a translation or an edition or version of something else in some way, which is the vast majority of materials, you don’t need to worry about a lot of this.

Comparing the current situation to FRBR, it’s best to think in terms of things called “attributes,” which can be thought of as similar to the subfields in MARC. Attributes are the tiniest bits of information, just as subfields are in MARC. For example, the 260 field has the subfields a, b, and c, and each subfield needs the 260 tag to be understood. In FRBR, there are no subfields--only the equivalent of fields, but this is no real problem since you can have the equivalent of 260a, 260b, and 260c, that is, where every bit carries all the information it needs to make it independent.

So, imagine all of the different subfields taken out from MARC in this way: 245c, 300b, and so on, plus all of the subfields from MARC authorities, and each of these “attributes” is independent of one another. There are a lot of them. Now, let’s also imagine that the MARC field and subfield codes are put into human-readable language, e.g. instead of 245c it would actually be “statement of responsibility.” Keep in mind also that this is not one-to-one since there are many subfield “concepts” in MARC that are not in FRBR, and there are some concepts in FRBR that are not in MARC. That is one of the purposes of RDA. But anyway, we now have all of these independent attributes. How do we group them?
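For those following along in the transcript, here is a minimal Python sketch of that idea, where each tag-plus-subfield pair is pulled out as its own labeled attribute. The sample record, the label table, and the function name `flatten` are all made up for illustration; the labels are plain-language stand-ins, not official FRBR attribute names or an authoritative MARC mapping.

```python
# Hypothetical, simplified MARC-like record: tag -> {subfield code: value}.
record = {
    "245": {"a": "Geometric dissections", "c": "Harry Lindgren"},
    "260": {"a": "London", "b": "Van Nostrand", "c": "1964"},
}

# Illustrative human-readable labels only (an assumption, not a standard).
LABELS = {
    "245a": "title proper",
    "245c": "statement of responsibility",
    "260a": "place of publication",
    "260b": "publisher",
    "260c": "date of publication",
}


def flatten(rec):
    """Turn each tag+subfield pair into a standalone (label, value) attribute."""
    attributes = []
    for tag, subfields in rec.items():
        for code, value in subfields.items():
            key = tag + code
            # Fall back to the raw key (e.g. "300b") when no label is defined.
            attributes.append((LABELS.get(key, key), value))
    return attributes


attributes = flatten(record)
for label, value in attributes:
    print(f"{label}: {value}")
```

Once flattened this way, no attribute depends on its neighbors to be understood, which is what makes regrouping them possible.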

In MARC and AACR2, all of these things are grouped by fields; for example, the 260 a, b, and c, mentioned earlier to create publication information, or the 245 a, b, and c to create title/statement of responsibility. But FRBR groups these attributes in a different way, by creating what are called “entities”. There are three groups of entities. I’ll save group 1 for later because that is the hard part. Group 2 includes the name entities, which pretty much equals what we have today for our name authority records. Group 3 is similar to our subject records as they are now. What is really different is group 1, which is where the work, expression, manifestation, and item exist, those bibliographic concepts we have all come to know and love so much!

To explain Group 1 entities, we need to return to the example I used before. While traditional cataloging starts from the item in hand and works up to the collection, FRBR works differently: with Dante’s Divine Comedy, FRBR starts from the top down, that is, it starts from the work (Divina commedia), then goes on to the expression (the translation information, the specific edition, or whatever), and only then to the manifestation and then the item, the thing you are holding in your hands. Exactly how the practical workflow for the cataloger will change is still very unclear at this point, at least to me.

Still, I think that when you keep this in mind, something like the following example from the text of FRBR will make more sense (you may want to look at the transcript at this point). W means work, e means expression, and m means manifestation, with each entity including its own attributes.

w1 (the first work) Harry Lindgren's Geometric dissections
  • e1 (the first expression) original text entitled Geometric dissections (how it first appeared)
    • m1 (the first manifestation) the book published in 1964 by Van Nostrand in London
  • e2 (the second expression) revised text entitled Recreational problems in geometric dissections ....
    • m1 (the first manifestation of the second expression) the book published in 1972 by Dover in New York

To quickly put this into FRBR public displays, it could be something like this, and here I am using ISBD punctuation:

Geometric dissections / Harry Lindgren.
    • [Book] London : Van Nostrand, 1964
  • Recreational problems in geometric dissections .... [Rev. ed.]
    • [Book] New York : Dover, 1972

In this system, it is easy to see how another edition of Lindgren’s Geometric dissections, e.g. one published in New York by Knopf in 1968, would fit in. Assuming it carries the original text, it would be a second manifestation (m2) under the first expression (e1) of the work, and the patron would see:

Geometric dissections / Harry Lindgren.
    • [Book] London : Van Nostrand, 1964
    • [Book] New York : Knopf, 1968
  • Recreational problems in geometric dissections .... [Rev. ed.]
    • [Book] New York : Dover, 1972

As we can see, more than anything else, FRBR defines the multiple view of catalog records from the top down. Otherwise, it’s very similar to what we do today.
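The work, expression, and manifestation nesting in the Lindgren example can be sketched as a small data structure. This is only an illustration in Python under my own assumptions: the class names, the field names, and the `display` function are invented here and do not come from FRBR or from any real catalog system. The 1968 Knopf edition is the hypothetical one from the example above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Manifestation:
    carrier: str
    place: str
    publisher: str
    year: int


@dataclass
class Expression:
    title: str
    note: str
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    title: str
    creator: str
    expressions: List[Expression] = field(default_factory=list)


def display(work):
    """Build a top-down display: work, then expressions, then manifestations."""
    lines = [f"{work.title} / {work.creator}."]
    for expr in work.expressions:
        # Show an expression line only when its title differs from the work's.
        if expr.title != work.title:
            lines.append(f"  {expr.title} [{expr.note}]")
        for man in expr.manifestations:
            lines.append(
                f"    [{man.carrier}] {man.place} : {man.publisher}, {man.year}"
            )
    return lines


w1 = Work("Geometric dissections", "Harry Lindgren", [
    Expression("Geometric dissections", "original text", [
        Manifestation("Book", "London", "Van Nostrand", 1964),
        Manifestation("Book", "New York", "Knopf", 1968),  # hypothetical edition
    ]),
    Expression("Recreational problems in geometric dissections ....", "Rev. ed.", [
        Manifestation("Book", "New York", "Dover", 1972),
    ]),
])

lines = display(w1)
print("\n".join(lines))
```

Adding another edition means appending one more manifestation under the right expression; the display code does not change.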

Any cataloger knows that many catalog records carry a lot of the same information, so for example, multiple editions of a certain book can repeat exactly the same title, subtitle and statement of responsibility, if not a lot more, so in a computerized environment, it makes sense that this kind of repeated information is placed one time separately where it can be used when necessary. To a certain point, this is what happens in many catalogs today that use relational database structures for the name, title, and subject headings; for example where the heading for Shakespeare is not typed in 2,000 times, but only one time. When you find Shakespeare’s name in your local authority file, his heading is not actually copied into your record, but a link is made, so that when your record displays, the patron will see his name displayed from this separate authority record, which appears along with everything else on one screen. This is normal database practice, where repeated information is entered only one time. The purpose is to make both maintenance and searching much easier and faster.

FRBR attempts to do something very similar, but extends this practice to the Group 1, 2, and 3 entities. This means that there will be separate records for each of these things. (There is a problem with the concept of the “record” but we will discuss it in another podcast. For now, we will call it a record since the final product is the same.) As we have already discussed, some of these records already exist: Group 2 entities (the name headings) and Group 3 entities (subjects) are pretty much what we have today. What is really different is in the Group 1 entities, where FRBR posits that there should be separate entities that can be linked to for the work, for the expression, for the manifestation, and for the item. These entities can get a little more complicated since each can have links as well. To continue with our example, if you are cataloging a version of Dante’s Divine Comedy, instead of adding links to Dante’s heading, and maybe the uniform title, you would link to the “work record” for Dante’s Divine Comedy, but this record in turn needs Dante’s heading and therefore the work record would link to Dante’s name in the Group 2 entities through a special “responsibility relationship”. This way, everything could be imported at one time.

The final product will work almost exactly like the Shakespeare heading works today in relational databases, as discussed earlier, except FRBR will also do it for the works, expressions, manifestations and items. I think there are very obvious problems here that will immediately make an experienced cataloger a little suspicious. Still, in theory--and I stress in theory--it can be imagined that such a structure could lead to a great savings in database design and record creation.
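To picture the linking described above, here is a minimal Python sketch in which each record stores only a link to the record above it, and the display is assembled by following those links. The identifiers, field names, heading text, and the two editions are all assumptions made up for illustration, not a real catalog schema or real bibliographic data.

```python
# Group 2 (names): the heading is stored once and referred to by identifier.
names = {"n1": {"heading": "Dante Alighieri, 1265-1321"}}

# Group 1 records: each level links upward instead of copying information.
works = {"w1": {"title": "Divina commedia", "responsible": "n1"}}
expressions = {
    "e1": {"work": "w1", "note": "original Italian text"},
    "e2": {"work": "w1", "note": "English translation"},
}
manifestations = {
    "m1": {"expression": "e1", "imprint": "hypothetical Italian edition"},
    "m2": {"expression": "e2", "imprint": "hypothetical English edition"},
}


def display(m_id):
    """Follow the links manifestation -> expression -> work -> name."""
    man = manifestations[m_id]
    expr = expressions[man["expression"]]
    work = works[expr["work"]]
    name = names[work["responsible"]]
    return f'{name["heading"]}. {work["title"]} ({expr["note"]}; {man["imprint"]})'


before = [display(m_id) for m_id in manifestations]

# Correct the heading in one place; every display that links to it changes.
names["n1"]["heading"] = "Dante Alighieri (corrected heading)"
after = [display(m_id) for m_id in manifestations]

print("\n".join(before + after))
```

The point of the sketch is the maintenance saving: the heading is edited once, and both manifestations pick up the change without being touched.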

Next come more specific relationships among everything we have discussed, so FRBR defines all different kinds of relationships: work to work relationships, work to expression, whole/part expression to expression, manifestation to manifestation, and so on and so on. These get rather involved, but actually, they are no more involved than what a cataloger does every day. So, none of this is really new.

Now come the user tasks, that is, what people want from bibliographic records, both from searching the entire catalog (i.e. multiple records) and from single records. From all of this finally emerge the Functional Requirements, which state that people want to Find-Identify-Select-Obtain specific parts of the group 1 entities (work-expression-manifestation-item), finding them by their group 2 entities (name headings) and/or by their group 3 entities (subjects).

So, while you may want to find bibliographic records by their subjects (e.g. find resources about evolution), you do not want to obtain all of their items, which could number in the hundreds of thousands, if not more. Before deciding which item(s) you want to obtain, you first need to identify certain resources, and then select more precisely what you want. At the very end, FRBR declares that if bibliographic records are to function correctly, they are required to achieve these tasks and thus we have the Functional Requirements for Bibliographic Records. If records do not do this, then they do not function correctly.
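The four user tasks can be read as successive narrowing steps. The Python sketch below is a rough illustration under that reading only: the tiny catalog, its barcodes, and the four function names are invented, and real systems would of course do each step in far richer ways.

```python
# Hypothetical catalog entries; barcodes and field names are invented.
catalog = [
    {"title": "On the origin of species", "subject": "Evolution",
     "language": "eng", "items": ["barcode-001", "barcode-002"]},
    {"title": "L'origine delle specie", "subject": "Evolution",
     "language": "ita", "items": ["barcode-003"]},
    {"title": "Divina commedia", "subject": "Italian poetry",
     "language": "ita", "items": ["barcode-004"]},
]


def find(subject):
    """Find: gather every record under a subject."""
    return [r for r in catalog if r["subject"] == subject]


def identify(records, title_words):
    """Identify: keep the records that match what the user has in mind."""
    return [r for r in records if title_words.lower() in r["title"].lower()]


def select(records, language):
    """Select: choose the records that suit the user's needs."""
    return [r for r in records if r["language"] == language]


def obtain(record):
    """Obtain: return one specific item (here, the first barcode)."""
    return record["items"][0]


found = find("Evolution")
identified = identify(found, "origin")
selected = select(identified, "eng")
item = obtain(selected[0])
print(len(found), len(identified), len(selected), item)
```

Each step works on the output of the one before it, so the user ends up with one item in hand rather than everything the subject search returned.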

As a result, we can still fulfill Cutter’s Objects, or at least most of them!
(this is a quote)
  1. To enable a person to find a book of which either
             (A) the author       }
             (B) the title            } is known.
             (C) the subject     }
  2. To show what the library has
             (D) by a given author
             (E) on a given subject
             (F) in a given kind of literature.
  3. To assist in the choice of a book
             (G) as to its edition (bibliographically).
             (H) as to its character (literary or topical).
(end of quote)

I have to admit that when I finally understood all of this, I found it comforting to know that Cutter’s Objects would still be fulfilled. After all, that is what we have all been working toward for over 100 years!

In this segment, I have tried my best to present FRBR as simply and as honestly as I can, and with a minimum of bias. I would like to point out once again that the purpose of this part of the podcast is simply to demystify FRBR and make catalogers aware that they already know most of it, but not to provide a complete description. Still, it is interesting to note that FRBR will provide the same access as we have now, and the emphasis rather is placed toward providing a more coherent view to our patrons than the multiple card displays we have today that pretty much replicate how the card catalog worked. This means that instead of seeing 100 completely separate bibliographic records under Shakespeare’s Hamlet, where each record needs to be viewed to understand what it is, users will be able to see a more comprehensive and coherent view of the range of materials of Hamlet authored by Shakespeare, or materials about Shakespeare’s Hamlet, or still other resources based on Shakespeare’s Hamlet.

I have decided to stop here and save the rest for the next podcast. There is plenty to deal with in this one and I genuinely want others to make up their own minds before I relate the rest of my personal journey with FRBR, including the struggles with my own doubts, my despair and hope. Oh yes! And there is RDA to discuss as well, which is the practical implementation of FRBR.

To sum up, at this point in my twelve-step process, I had gone through several of the steps, and was in the stage of Comprehension and perhaps a little smug with my success. But very soon my Consternation stage would come, and matters would begin to snowball from there. There are various reasons why this happened but it’s not that I think I am any smarter than anybody else. Primarily, I think it was because my own horizons expanded far more than ever before: I was living in a different culture; I started working in different organizations that didn’t use MARC, AACR2 or even ISBD; I began to work more with computers and saw the power of other formats, and I learned a lot when I began to do reference work with the public. But perhaps most important of all, I examined myself and saw that what I had always done with information had begun to change in fundamental ways.

The music I would like to close with is a short piece by Luigi Boccherini, the passacaglia from his “Night music from the streets of Madrid.” This recording is from the Internet Archive, and as before, there is no information about the performance itself, but this time, I am including the entire piece since it is so short, and so much fun.

That’s it for now. Thank you for listening to “Cataloging Matters” with Jim Weinheimer, coming to you from Rome, Italy, the most beautiful, and the most romantic, city in the world.


  1. Can you please provide a download link so that I don't need to dive in the mark-up of the Flash player to extract the download URL (

  2. Done! Thanks a lot. I'll try to include this on all of them now. As I mentioned, I'm still learning.