Sunday, November 14, 2010

The Functional Requirements for Bibliographic Records, a personal journey Part 4

Cataloging Matters No. 6



TRANSCRIPT

Hello everyone. My name is Jim Weinheimer and welcome to Cataloging Matters, a series of podcasts about the future of libraries and cataloging, coming to you from the most beautiful, and the most romantic city in the world, Rome, Italy.

This installment continues my personal journey with the Functional Requirements for Bibliographic Records (or FRBR). Will I finish it at last? Stay tuned!

This series has gone on for three previous podcasts. I believe that this installment will make no sense at all without the others, so I strongly suggest that you listen to them first, in order. Links to the earlier podcasts, along with everything else discussed here, are available from the transcript.

I have been very busy lately with the school year and other matters, and that is why it has taken some time for me to continue this series. But as I warned in my first podcast, this is a true “irregular” in every sense of the word so don’t expect too much!

I have decided to spare everyone from having to listen to me recite my twelve-step process yet again. If anyone is listening, I can imagine the sighs of relief!--and I will take up from the time that I worked at the Food and Agriculture Organization of the United Nations, where I had entered my Serious Doubts Phase.

I was away from the library cataloging I had become accustomed to and was working with AGRIS cataloging rules, actually indexing separate chapters, papers, and articles when necessary, working with a thesaurus called AGROVOC; all of this along with a certain amount of systems development. While there, I also dealt with bibliographic formats and practices from other organizations, practices I had never seen before. These had to do with statistics, images, geographic information, internal documents, and all kinds of other types of resources. Therefore, I came into contact with separate “metadata worlds,” each world coherent and meaningful on its own, e.g. the metadata world of AACR2, our own metadata world of AGRIS, metadata worlds of different indexes, the metadata world of statistics, and so on. None was necessarily “better” than any other, and each on its own made sense more-or-less, and I would have loved to import any or all of those records into our catalog, but when I tried to get them to work together, all I got was hash and it would have been easier to just do everything from scratch. I saw how the power of the newer formats such as XML could manipulate “correctly-encoded data” in all sorts of amazing ways--and I could even do some of it myself!--yet the automatic methods could only go so far.

One sticking point lay in the details of “correctly-encoded data” and exactly what that meant. This went both for the formats as well as the data itself. It turned out that just getting other organizations to send author information coded as was tough enough, or to get people to use more or less in the same way. But trying to bring uniformity to the data itself, that is, so that everyone would use, e.g. “FAO” and not one of the dozens of other possible forms of its name, turned out to be overwhelming. It was very clear that you could work yourself to death trying to regularize this information. True, there were possible solutions using URIs instead of the exact forms of names, but it still seemed to me that there would have to be an incredible amount of agreement among all sorts of parties before any progress could be made. Being in the middle of all this propelled me deeply into my Serious Doubts phase.
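To make the URI idea a little more concrete, here is a minimal sketch in Python. It assumes a small, hand-built authority table; the URI, the list of variant forms, and the function names `normalize`, `build_index`, and `resolve` are all hypothetical illustrations and are not taken from AGRIS or any real authority file.

```python
import unicodedata

# Hypothetical authority table: one URI per corporate body, with some of
# the variant name forms that might turn up in incoming records.
AUTHORITY = {
    "http://example.org/authority/fao": [
        "FAO",
        "F.A.O.",
        "Food and Agriculture Organization of the United Nations",
        "Organisation des Nations Unies pour l'alimentation et l'agriculture",
    ],
}


def normalize(name):
    """Strip accents and punctuation, fold case, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    kept = "".join(c for c in no_accents if c.isalnum() or c.isspace())
    return " ".join(kept.casefold().split())


def build_index(authority):
    """Map every normalized variant form to the URI of its body."""
    index = {}
    for uri, variants in authority.items():
        for variant in variants:
            index[normalize(variant)] = uri
    return index


INDEX = build_index(AUTHORITY)


def resolve(name):
    """Return the URI for a known variant form, or None if it is unknown."""
    return INDEX.get(normalize(name))
```

Under these assumptions, `resolve("f.a.o.")` and `resolve("FAO")` lead to the same URI, while a form that nobody has recorded yet simply returns `None`. That last case is where the hard part remains: someone still has to agree on the table.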

But these considerations were soon brushed aside, since I accepted a position as Library Director at the American University of Rome, a small, undergraduate institution located in a beautiful area among graceful palazzi on the top of the Janiculum Hill, which provides visitors with some of the most spectacular views of the city that anyone could hope for. I took the position for various reasons: I wanted to be a real librarian again, but also, I had always felt that it was the smallest collections where the materials on the World Wide Web offered both the greatest opportunities, and posed the toughest challenges. A Harvard or a Yale will have a great collection no matter what, and in many cases for the people there, the materials on the web result only in an “extra format” of something already available to them. For a small collection however, such an opportunity should provide the difference between night and day!

But how do you do this in a small library with very little help? Perhaps in another podcast, I’ll talk about my own attempts and what I think are my successes and failures, but for now I want to focus on FRBR.

Returning to the Anglo-American cataloging world got me back into AACR2 and gradually, FRBR. More importantly, I began to have my very first substantial and regular contacts with the public as a reference librarian. I learned a lot and am still learning all the time. What have I learned so far? First, it’s not easy at all to think on your feet when a student has put everything off until the last second and is half frantic. It’s also not easy to wheedle out of someone what they really want to know and not allow yourself to get sidetracked into showing them all kinds of things they do not want, wasting your time and having them think you don’t know what you are doing; or to try and match your understanding of what they want to what is actually available. Plus, for loads of reasons, doing reference is far more difficult when you actively add materials on the World Wide Web into the mix, as I wanted to do: it is a practical impossibility to keep up with them; there are concerns of “quality of information”; World Wide Web sites change their names and locations; and, as I discovered, Google and other search engines are always tweaking their results, so a search that, in a manner of speaking, “worked” yesterday or last week may not work the same today. That can be maddening!

There are dozens or hundreds of very thorny obstacles when you actively try to incorporate the information available on the World Wide Web into part of the local collection, but I felt I had to do it, and I still do. The old methods just don’t work well enough however, and I have succeeded only partially.

Through my experiences with the public as a reference librarian and watching how they tried to work with the library’s catalog, I very soon fell into my Disillusionment phase. I saw firsthand how difficult a library catalog is for the public to use. At the same time, I saw how much easier it is for the public to use tools such as Google and Google Scholar, along with databases such as those from Ebsco or Sage. The Google-type tools and databases obviously were made with the user’s comfort and so-called “customer satisfaction” in mind, while the library catalogs had different ideas. It was at that time, while I reflected on my observations of my users’ troubles, that I remembered the purposes of the catalog as laid out in the FRBR user tasks: to find/identify/select/obtain works/expressions/manifestations/items by their authors/titles/subjects. Therefore, I began thinking about FRBR consciously once again.

And yet the main thing that I discovered very quickly was: when people ask for help, they very, very rarely are asking for works, expressions, manifestations or items. Of course, on a certain level, people continually ask for a book that they cannot find on the shelf, so they are asking for items, but beyond this purely mechanical/clerking need, they ask for answers to their questions, no matter where those answers may come from. Most questions are similar to: “I am writing a paper on the House of the Vestal Virgins, and I can’t find anything useful”, “I need statistics on drug crime over the last 10 years”, “I’m updating a book I wrote several years ago and need the newest information”, or similar questions. Rarely, and I emphasize very rarely but it does actually happen, I have gotten a question such as, “I need Hobbes’ translation of the Iliad” or, what I hear more often, “I need the latest edition of such and such a book”. To be honest, I am almost the only person I know of who wants highly specific editions. As one example of my own needs, I have been looking for a specific edition of Thomas Middleton’s play “A game of chess”, where I would like a first edition (yeah, sure!), but so far I can find no scan of the first edition. Still, I can download my own copy of the play from an edition of Middleton’s collected works from the Internet Archive, which includes excellent commentary http://www.archive.org/details/worksthomasmidd04bullgoog
and there is another copy of this set at HathiTrust. Plus, I do have access to a couple of scans of different frontispieces, one is from the first edition, while the other is probably what the editor in the collected works mentions in his preface to the play:
http://de.wikipedia.org/w/index.php?title=Datei:Middleton_%27A_Game_of_Chess%27.jpg&filetimestamp=20070708233726; http://commons.wikimedia.org/wiki/File:A_Game_of_Chess.jpg.

I have found no copies of the first edition here in Rome, and I hope that no library would make such a rare book available for inter-library loan, but in spite of all that, I still have access to quite a bit of information just by sitting at my desk and knowing where to look. While I am not the only person who wants resources of this type, there are nevertheless very few in comparison with everyone else, and such people are certainly not in the majority. Of course, I would also like all of this to be much easier to find. Nevertheless, I freely admit that this is not the primary type of information that I need either, since I search much more often for answers to my own questions no matter where those answers happen to be. So, in the vast majority of situations, I am not much different from my patrons. Naturally, the very idea of relating the concepts of works/expressions/manifestations/items to web resources seemed to be nonsensical to me: while I agreed that with enough mental effort, you could probably force sites such as YouTube, microsoft.com, blogs, or Facebook into an FRBR structure, I could not see how the final product would be useful to anyone or worth the effort.

These observations may seem obvious and rather unimportant, but to me, realizing and accepting all of this was simply devastating: if it is true that very few people want works/expressions/manifestations/items, then it follows that people want something else. Then, the conclusion is unavoidable: the catalog does not supply what people want! That is when I found myself on new ground, in a place I did not want to be, and I did not like it one bit.

I squirmed, but I could not avoid the unpleasant conclusion: the very premises of FRBR toward users were clearly and utterly wrong. FRBR had confused “user tasks” with what the traditional catalog actually did. It dawned on me that FRBR describes how the traditional catalog has always functioned and although it may be correct so far as it goes, it does not then logically follow that this is also what people want or need. That is where the fallacy lies. And I could see that fallacy in operation every single day when I worked with my patrons, or even when I did my own searching.

Still, if the premises of FRBR were wrong, what did that entail for everybody out there? What did this mean for RDA, which was just really getting off the ground? And I stood face to face with my Despair phase.

In retrospect, my Disillusionment phase was not so difficult because it passed rather quickly, but my Despair phase (which I confess I still fall into occasionally) lasted significantly longer and was therefore, much more difficult for me. I could still do my job of course: select materials, create records in the catalog, work with patrons and so on, but it all had much less meaning since I saw how the newer tools were being used more often, with more relish, and many times, with results that were really not all that bad. People still had major problems with the new tools of course, especially when it came to writing a paper, but the problems with full-text seemed trivial compared to those they encountered when they worked with a library catalog.

I also became responsible for the university’s bibliographic instruction, or, what is now styled as the library parts of information literacy. Many things I learned about my students surprised me, but primarily, I was surprised that when people type the terms they want into a search box, I have not yet met anyone who understands what they are searching or what is happening. People even find such a question surprising. To the people I have met, a search box is a search box is a search box, no matter what it is and what it is connected to. There is one box and it does everything for them. Therefore, it becomes highly difficult for people to understand that when they are searching a library catalog, they are searching, as I now call them in my Information Literacy sessions, “Summary records” and they are not searching full-text. Several students have been honest enough to tell me that they didn’t understand what it meant to search by author, title, or subject! This was some of the reality I encountered.

I do not think any of these people are stupid, and in fact, the way they approach matters makes a lot of sense: all search boxes look the same, therefore they are the same, so all should be searched the same way. It seems to me that using search boxes demands an intellectual leap that doesn’t apply when searching physical catalogs and indexes. When working with physical catalogs and indexes, it is very clear what you are searching and what you are not, since it is obvious that there is no way you could be searching the full-text of the books on the shelves when you are using a card catalog, or when leafing through the volumes of an index. But when the catalogs and resources are all virtual, the relationships among catalogs, indexes and everything else become far more nebulous and the searcher can sense no clear boundaries. The entire environment becomes far more abstract, and you don’t know what you are doing, and what you are not doing. That is, you can’t know without doing a lot of work.

This is why I believe people search library catalogs in the same way they search Google, and why they almost always get such poor results when compared to full-text searching. Searching a catalog competently is a skill that must be learned; and not only learned once, it must be exercised or it atrophies, just as any other skill that goes unused. So, even if I had made some strides forward in some of my classes and people actually learned something, it would turn out that after a few months or a year later, they forgot. That should not be surprising, but it was for me.

While I was doing some research on user education, I came across a provocative article, which quoted a Mr. Line from a paper in 1983 entitled “Thoughts of a non-user, non-educator” http://www.londonmet.ac.uk/deliberations/courses-and-resources/wilson.cfm, where he was quoted as saying that the term user education is, “meaningless, inaccurate, pretentious and patronising and that if only librarians would spend the time and effort to ensure that their libraries are more user friendly then they wouldn't have to spend so much time doing user education.”

While this made a lot of sense to me, I am also interested in library history, and one of my favorite authors is William Warner Bishop from the Library of Congress (who also happened to be the first head of cataloging at Princeton University). He gave a talk to the NY State Library School in 1915 (http://www.archive.org/details/catalogingasasse00bishrich). In this talk, he said something about catalogs that always rang true for me and I would like to quote him at some length:

“Now no instrument can always be worked easily, safely and successfully by the chance comer. Herein lies much of the difficulty found in the use of card catalogs.

For who uses a card catalog? For whom is it made? This is the real crux of much of the current discussion of the merits--and failings--of that machine. Obviously it is not for the way-faring man; equally obviously not for the child just entering school. Clearly persons who wish to read or study some definite book or some subject are the normal users of card catalogs. For the idle or the curious browser, there are the open shelves; for the fiction seeker, the finding list and more open shelves; for the child, the children's room; for the man in haste, the reference collection and its attendants.”

and later he says:

“Is it not a perfectly fair statement that in the users of a card catalog there may be presumed some modicum of intelligence and a more than passing interest in some topic? I do not believe that the card catalog can ever be made so easy of operation, especially in this day of huge libraries, that every chance comer can handle it successfully without some instruction.”
He goes on to say that a catalog is complex because books and people are complex. This is beautifully said and very convincing, but I fear it is absolutely outmoded in our day and age. Mr. Bishop went to great pains to talk about how different kinds of people can avoid using the card catalog, but today with an ever-growing demand to use library collections remotely and the ubiquity of what is now called search, “every chance comer” has no choice but to work with the catalog, and to do it without any real instruction. Since the number of reference questions is also plummeting, patrons even do it without asking anyone for help. Of course they will have poor results! As a result, I reluctantly came to believe that Mr. Line is correct and not Mr. Bishop. Today, people who want information for whatever purpose can avoid not only the catalog, as Mr. Bishop pointed out in 1915 for “the man in haste” and the “curious browser”; everyone, including the serious searcher, can avoid the entire library as well.

As a result, I concluded that since there is absolutely no possibility of training all users of our catalogs, it is the catalog that must change and no longer be seen as an impediment. This means that it must change in ways that will be more “user friendly”. But added to this imperative were all of the other problems I have mentioned in my earlier podcasts: a mushrooming number of worthwhile materials available online--Google Books is only one site which alone has been adding millions and millions of books but there are an enormous and ever-growing number of other great sites out there with new ones popping up all the time, each containing innovative and wonderful resources; I had also noticed that there is a huge amount of metadata, but it did not interoperate because formats, data, and bibliographic concepts are not coordinated and consequently, the metadata others create is not “good enough” and must be redone over and over again by each group; there was the genuine challenge of full-text retrieval methods plus the new “social web” which were difficult to assess, but showed great promise and it only made sense to work with these things somehow; almost all of us were also looking at flat budget lines and on and on the problems went. These were some of the real and serious challenges that I saw we were facing, and what was the library community’s response?

FRBR and then, RDA.

To be honest, I had always been looking forward to RDA since it was clear that changes were needed, particularly in raising productivity, and dealing with the new, weird things I saw on the web, where it seemed that the only thing that was constant was that they changed.

I need to pause for a moment here to avoid a potential misunderstanding: when I say that productivity needs to increase, I am absolutely not saying that catalogers are slackers or anything of the sort. Increases in production come primarily through the introduction of technological innovations and adherence to shared standards, not through individuals working harder. There have been relatively few technical improvements in the creation and sharing of catalog records since the introduction of Z39.50. Some tools provide help in making authority records and so on, but a lot more could be done. Much more important in my view is for catalogers to produce records that are of a sufficiently high standard that other bibliographic organizations can just accept them without local editing. I think we all know that while libraries claim that they create records that follow AACR2, they often fail in many ways and local editing is necessary with the result that the same items are re-cataloged repeatedly, or it turns out that the volume of copy records that require editing becomes so overwhelming that libraries just give up and accept whatever comes their way. Such a situation cannot be considered adherence to standards and is unsustainable in the long run. Imposition of genuine and realistic standards that must be followed, as they are in other industries such as foods and drugs, or the automobile industry, if such standards were possible to implement, would doubtless increase productivity tremendously.

So this is what I mean when I say that productivity must rise; we work smarter so that we can genuinely cooperate, not that each cataloger must produce 500 original records a day!

To return: while it was clear to me that FRBR did not provide what users wanted, I was very interested in seeing what RDA would come up with. Perhaps the actual practice would improve on theory by avoiding the problems I saw and provide some real solutions. But when RDA came out for general review and I could see it, I plunged into the darkest depths of my Despair phase. I couldn’t even discuss matters of detail of RDA because I saw that it was silent about the tremendous challenges we were really facing: of productivity, how to work with the other “worlds” of metadata, or interoperate with full-text tools. RDA did nothing new except change a couple of procedures, and it stuck faithfully to FRBR. As a result, our patrons’ experience would not change at all.

About this same time the economic bubble burst, and lots of things changed. Before the bubble, I could at least consider retraining and retooling, but afterwards, it was simply unthinkable. Perhaps even then, if I had honestly thought that RDA represented a step forward, I might have considered fighting for funding (still unsuccessfully, I have no doubt), but I could not ignore that in my professional opinion, RDA is not a solution for anything and I could not justify spending precious dollars (or euros) on that.

In the depths of my Despair phase, I contacted others and it turned out that they also shared many of my concerns; they also had no money for retraining staff and switching over to RDA. This was when I found a ray of Hope because I learned I was not alone, and I decided to initiate the Cooperative Cataloging Rules Wiki (http://sites.google.com/site/opencatalogingrules/). It’s still new and I don’t know what will happen with it; it may be doomed to oblivion, but at least for me it represents a bit of hope and an option for libraries who either cannot or will not switch to RDA. I thought long and hard before announcing anything, but decided to simply forge ahead.

That pretty much describes my own, personal journey with FRBR up to the present, and the difficulty I experienced of accepting that FRBR changes nothing of substance and avoids the real problems facing modern librarians. Perhaps you will find this ending anticlimactic or unsatisfying, but it is not for me.

A very important concern of mine can be inferred from those who have listened to my earlier podcasts: that the FRBR user tasks are based on the work of Panizzi and Cutter, two giants in the field whom I have admired immensely. For me, renouncing FRBR was equivalent to renouncing Panizzi and Cutter and this made me exceedingly uncomfortable. Nothing improved until an exchange on the RDA-L list with Bernhard Eversberg, who helped me understand things better. http://www.mail-archive.com/rda-l@listserv.lac-bac.gc.ca/msg02048.html

I was discussing details of how difficult it had been for me to find some small bit of information I wanted (it turned out to be only a single page published over 100 years ago), but nevertheless I could do it and I considered the fact that I actually could find what I wanted nothing less than amazing. I mentioned that these are the sorts of things people want to do today and they have nothing whatsoever to do with the FRBR user tasks. Bernhard pointed out that in earlier days, people wanted to do the same things, but “they had to first align their intention with a bookish mindset and then walk into a library,” which seemed true, and I replied that in a case like mine, it was probably only after prolonged consultation with an experienced reference librarian.

So, perhaps Panizzi and Cutter were right after all, but for them, the existence of a reference librarian was simply too obvious to mention, since it went without saying that untrained people could never use a catalog competently.

The information environment has changed far too much and the presence of an ever-watchful, skilled reference librarian can no longer be taken for granted. This narrows the choices at our disposal: either to expect patrons to struggle with our catalogs as we can see them doing now and if patrons don’t find something it’s their problem and not ours, or we can try to make the catalog more useful and user friendly so that people can operate it more easily. Of course, in one way, shape, or form, our patrons pay our salaries, and since patrons can now actually get worthwhile information without the library, it is logical to assume that if we do nothing and expect everyone to continue fighting with our catalogs, those patrons will see us either as useless or obstructionist, and suddenly, their problems really do become our problems. For me, FRBR and RDA head in the wrong directions and are the equivalent of doing nothing.

So, we are left with improving the catalog. There are a lot of things we can and should be doing using the power of the computer systems, plus focusing on increasing quality and standards. Fixing this situation will demand time and imagination, a lot of trial and error; and I hope it will be done with fantasy, taste, and even a bit of fun here and there.

For those of you who have had the patience to share my journey, I hope you have enjoyed it, whether you happen to agree with me or not.

The music I would like to close with is from the first movement of Vivaldi’s stirring, and rather dark Double Cello Concerto in G Minor, performed by the King’s Consort. http://www.youtube.com/watch?v=IYdTLnlc4q4

That’s it for now. Thank you for listening to “Cataloging Matters” with Jim Weinheimer, coming to you from Rome, Italy, the most beautiful, and the most romantic, city in the world.

9 comments:

  1. I agree with much of what you say. On the point of "This is why I believe people search library catalogs in the same way they search Google, and why they almost always get such poor results when compared to full-text searching" -- I think we CAN make our library catalog search results a LOT better, even without full text. Our current interfaces are awful. Now, is the "lot better" we can get good enough? It's hard to say, and if it won't be good enough then it might be a misuse of time to try. On the other hand, since that (non-full-text) metadata is all we have right now, it might be worth a try.

    Curious what you think of my own (alpha, development demo) attempt to improve search based on traditional cataloging records greatly, using Blacklight, not too different than other people's Solr-based attempts to improve catalog search, but this one is mine. :) https://blacklight.mse.jhu.edu/demo

    On a broader topic -- it's true that finding a known WEMI is seldom the _ultimate_ goal of our patrons. And I think this in fact has probably ALWAYS been true, it's not new. But we've got a lot of books that can help meet the actual user end goal -- finding the book that can meet your goal is an intermediate step to meeting your goal. At least that's always been the goal of the library catalog, right? So, on the one hand, we can imagine a world where people _won't_ want to find books at all, they'll get all their information in individual pieces online. In which case libraries might as well not worry at all about how people find books in their catalog, the books on the shelves will basically just be historical preservation. I'm not sure how likely or how soon this scenario is. But as long as the books on our shelves (as well as electronic things packaged into book packages) form a large and useful part of what we have to offer to meet actual goals, it is our mission to connect people to the books that will meet their actual goals, even though a book known in advance is not the goal itself.

    I agree with you that the FRBR "goals" are kind of completely wrong, they probably don't really represent user goals at all, and certainly not the actual ultimate goals. They may represent some intermediate goals though, whether those are the intermediate goals or not, the job of our catalog, with regard to the books on the shelves, is still to translate the actual user goals into _books_. And one thing that has hampered the ability of software to do that is the fairly messy state of our data. I think the FRBR _model_ is a lot more interesting/useful than the FRBR goals (despite the fiction of the FRBR project itself that it's about 'requirements', it's not). An attempt to have our data be clear about what exactly it is about, to allow more flexible software approaches to matching actual ultimate goals to books on the shelf.

    So, again, if matching actual user goals to books on the shelf is not a useful activity in the first place, then okay, forget it all. But if it is, we've got to model our data more clearly to support that, and I think FRBR is a pretty good framework for that -- being based on a hundred+ years of library experience figuring out how to describe books clearly in records is a plus in this aspect, rather than a minus.

  2. Thanks for the comments. I think we are 90% in agreement. A few of my own comments:

    Concerning your demo at: https://blacklight.mse.jhu.edu/demo.
    I love it. It works fast and well, with lots of links, lots of options and so on. So, e.g. https://blacklight.mse.jhu.edu/demo/catalog/bib_122331 you have some kind of mechanism searching behind the scenes to provide people with various copies of Gray's Anatomy that are online in a way that is simple and easy. Plus, when I am looking at a record for the physical copy, I am aware of others, e.g. https://blacklight.mse.jhu.edu/demo/catalog/bib_57898. Very impressive!

    I am a book lover, too, and I am not trying to minimize books or libraries, but I also do not believe that we should emphasize certain materials to our patrons just because they happen to be on our shelves. I fully realize that this involves internal political considerations; still, people should be able to limit their searches to physical materials available locally, but people need information no matter where it happens to be, and it is the job of the librarian to connect the patron to the relevant information, that is, to the best of the librarian's ability. If it is a matter of connecting people to books, our catalogs do this right now and have done so for a long, long time, and there is little reason to institute FRBR or RDA. The card-catalog interface we have needs to be updated, and you have improved on it already very nicely without FRBR structures or RDA. Perhaps adding circulation information would be useful for patrons: e.g. other undergraduate students have also checked out these books, other professors have also checked out these books, etc. This could mimic Amazon in several ways. Lots more could be done right now, without implementing FRBR or RDA.

  3. This gives me a chance to illustrate a problem in FRBR perhaps a bit more clearly: Today, we must assume that people will do keyword searching and have pretty much ceased to browse as people did in the card and book catalogs. Now, the multiple display in a book catalog was very predictable, pre-determined by the filing rules, but the display of multiple results from modern keyword searches is unpredictable--the best we can do is let people manipulate them in different fashions, as you have done; therefore, this means that the real starting point for the searcher will tend to be *the individual record* they have chosen through mechanisms such as yours, and they will then proceed from there.

    This is 100% different from browsing a card or book catalog, which is what underlies FRBR, or finding WEMI. For me, the main focal point for the user is the bibliographic record, which we call the manifestation (but I have already discussed my problems with the manifestation in the 2nd installment of the FRBR series). The bibliographic record is what people will focus on. This has major consequences for FRBR: in this scenario, when a person already is looking at a record for a specific item, do they then want to see all other variants, i.e. the WEMI? Very, very, very few people need that, and our systems allow this right now anyway.

    What do people want from a bibliographic record? Certainly, they want the item, but that is mechanical; what else do they want? FRBR claims that the patron wants to use the information in the bibliographic record to find other items in the "collection" by subject, author, or title. People want all of this (title, I don't know), but I think there are other things that are much more important to people, and anyway, they want more--a lot more. I have tried to call this, for want of a better term, an "intellectual microcosm" that exists around any item. This microcosm includes conference papers and public lectures on related topics by different people; it includes reviews, debates, blog entries, journal articles, links to other information through citations, and all kinds of other things I can't even imagine now. People want this information and it should be presented to them in a coherent way that reveals the wealth of possibilities available to them.

    When looked at in this way, the individual bibliographic record could, and should, become the locus for some new and very powerful possibilities. I think we can see it to a small extent in the "About this book" display in Google Books, where you can see some truly novel information: word clouds, maps, "popular phrases" and so on.

    That's enough. But I think you can see where I am going. And your demo catalog could be a great part of it!

  4. Yep makes sense. Having clearly modelled data is totally important for creating that 'intellectual microcosm' around a record though.

    FRBR totally got the 'requirements' part wrong, but I think it still got the _data model_ pretty much right. The kind of clear data model that FRBR provides, where it's clear what data is really saying and is really about in an unambiguous way, is necessary for doing unanticipated things with our data, and necessary for sharing and re-using our data clearly.

    It occurred to me recently that there's a very clear way to explain why W and M are different and needed -- is a particular piece of data about the work in general without regard to specific versions or editions (W), or is it about a specific version or edition (M, or sometimes E). Without being clear about which is which, you can't re-use data across editions without entering it over and over again in each edition, and also can't tie in your data with other data sources in a clear way -- in my Umlaut-powered attempts to find, for instance, full text copies of a book, I'm never sure if a full text copy at, say, the Internet Archive is the same edition or a different edition and can't tell the user that, they just need to look and try to figure it out for themselves.

    And even just within a search, I think knowing these edition relationships will be increasingly important for providing clear and concise displays that don't mislead the user. While the user's end goals may not be articulated as 'find all the versions of this thing', when you have multiple versions split across different items in the results, that may or may not even be adjacent to each other and do not link to each other, I think this quickly begins to get in the way of the actual end user goals. If someone ends up with something other than the latest edition of a textbook because your catalog didn't tell them a more recent edition was available; if someone can't find what they're really looking for because 20 different editions of something _else_ are cluttering up the search results as 20 different hits; etc. Good interfaces require knowing when two different records for two different things that aren't _identical_ (different M) are nonetheless the same W. Google Books actually _tries_ to do this, purely algorithmically, but does a very poor job, it's hard to have a computer guess this from our current cataloging records.
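
The W/M distinction described in this comment can be illustrated with a minimal sketch. It assumes each record already carries an explicit work identifier, which is exactly what current catalog records lack; the class names, field names, and sample records below are all hypothetical and are not taken from FRBR, MARC, or any existing system.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Work:
    """Data that is true of the work regardless of edition (W)."""
    work_id: str
    title: str
    subjects: tuple = ()


@dataclass(frozen=True)
class Manifestation:
    """Data that is true of one specific edition only (M)."""
    record_id: str
    work_id: str      # explicit link to the work; an assumption of this sketch
    edition: str
    year: int


def group_by_work(hits):
    """Collapse a flat result list into one group per work, newest edition first."""
    groups = defaultdict(list)
    for m in hits:
        groups[m.work_id].append(m)
    for editions in groups.values():
        editions.sort(key=lambda m: m.year, reverse=True)
    return dict(groups)


def newer_editions(record, hits):
    """Return editions of the same work that are more recent than `record`."""
    return [m for m in hits if m.work_id == record.work_id and m.year > record.year]


# Invented sample data: subjects are entered once, on the work, not per edition.
works = {"w1": Work("w1", "An example textbook", ("Cataloging",))}
hits = [
    Manifestation("bib_1", "w1", "2nd ed.", 1995),
    Manifestation("bib_2", "w2", "1st ed.", 2001),
    Manifestation("bib_3", "w1", "3rd ed.", 2008),
]
```

With data shaped like this, `group_by_work(hits)` would show the two editions of `w1` as a single hit, and `newer_editions(hits[0], hits)` would let a display warn that a later edition exists. The hard part, as the comment says, is obtaining a reliable `work_id` in the first place.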

  5. [Just noticed your first comment. I disagree that our catalogs do a fine job of connecting people to books right now -- they do an AWFUL job, and are honestly embarrassing, they appear to our users as ancient technology. While there IS a lot of room for improvement without any cataloging changes -- there are ALSO a lot of things that are ridiculously expensive or even infeasible for us to do because of our cataloging not really intended for machine action. My demo has been VERY expensive to us in staff time, in large part because of having to write really really complicated code to try and tease out useful data from our poor records -- and there are some things it still doesn't do very well or at all, because the data simply doesn't support it. I've tried to document some of those things on my blog as I go. But please don't use my catalog as an example of how our data is just fine and doesn't need to change -- my experience in developing it has been to come to the opposite conclusions. ]

  6. Like you, I have set aside FRBR and RDA because we cannot afford retraining our staff. But also, looking at a catalog, like (http://www.library.miami.edu) University of Miami's Encore Catalog, I feel that we can make our catalogs a lot friendlier using our old AACR2 and MARC 21. So what's the deal?

  7. In reply to Jonathan: You're right. Our catalogs do not do a "fine" job of connecting people to books now. That was a poor choice of words on my part. I meant "adequate" in the sense that this is what people have been using up to now.

    I also agree that someone looking at a specific record needs to know that a later edition exists. This was pretty easily handled in the card catalog, since the card for the later edition filed right behind the earlier one, and in any case, everything stands together on the shelves. But my question is: do we have to redo everything to get these sorts of things to work? Especially since we cannot consider our records working independently of everything else on the web.

    Is there adequate information right now within a record to find later/earlier editions and other related works? I think there is, but I confess I cannot program it: doing automatic searches of 1xx/240/245, and/or 700at, and/or call number searches for editions, making fuzzy searches, or not. If there are any results, they display, and if there are no results, you see nothing, just as with your search of the Internet Archive and Google Books.
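
As a rough illustration of the kind of automatic search described above, here is a minimal sketch using only the Python standard library. It assumes records have already been flattened into plain dictionaries keyed by MARC tag, which is a large simplification (real MARC fields have indicators and subfields); the function names, the sample records, and the 0.9 similarity threshold are all invented for illustration.

```python
import re
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase, drop punctuation, and collapse whitespace."""
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return " ".join(text.split())


def match_key(record):
    """Build (author, title) from the 100 field and the 240 or, failing that, 245."""
    author = record.get("100", "")
    title = record.get("240") or record.get("245", "")
    return normalize(author), normalize(title)


def related_editions(record, catalog, threshold=0.9):
    """Return other records with the same author and a closely matching title."""
    author, title = match_key(record)
    found = []
    for other in catalog:
        if other is record:
            continue
        other_author, other_title = match_key(other)
        if other_author != author:
            continue
        # Fuzzy title comparison; the threshold is an arbitrary choice.
        if SequenceMatcher(None, title, other_title).ratio() >= threshold:
            found.append(other)
    return found


# Invented sample records, not real bibliographic data.
catalog = [
    {"id": "r1", "100": "Example, Author", "245": "Introduction to cataloging"},
    {"id": "r2", "100": "Example, Author.", "245": "Introduction to cataloging /"},
    {"id": "r3", "100": "Example, Author", "245": "Filing rules for catalogs"},
    {"id": "r4", "100": "Other, Author", "240": "Uniform title example",
     "245": "A different title page title"},
]
```

Calling `related_editions(catalog[0], catalog)` on this sample would pick out only `r2`. If nothing matches, the list is empty and the display simply shows nothing, as described above. A sketch like this would not be 100% either: variant author headings and retitled editions would slip through.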

    These are the ways I think we should try to proceed forward: repurpose what we have for new uses. Perhaps it won't be 100%--big deal. Nothing I have ever seen is 100%, Google isn't 100%, but gradually, gradually, we need to work in ways that promote greater interoperability in all kinds of ways. In this sense, instituting FRBR structures will not promote greater interoperability, unless everybody adopts those structures, something I can't imagine happening.

    We also need to keep in mind that we are not the leaders. For example, I am pretty sure that Google will not redo everything for FRBR structures--after all, they wouldn't even take OAI-PMH!

    I just wrote a comment to Autocat about editions/manifestations that may interest you in this regard.

    Finally, while I am sure that your demo is very expensive to create, I will state that retraining, retooling, redoing local documentation at every library, and upgrading every library's catalog, necessarily splitting the library's bibliographic universe in the process and undoing a lot of the gains made from the shared standards with ISBD, will probably be far more expensive than what your unit is spending. The improvements should be demonstrated before we just set out blindly on such a course. I, and many others out there, do not have the faith.

    Once again: I am not at all against changes in the library catalog or in the practice of cataloging. Changes must happen and they should have started long ago, but that is water under the bridge. What I am saying is that any changes we make should be in ways our users want, and not just for the sake of change.

  8. Hi Jim, I listened to your Cataloging Matters podcasts today in one fell swoop. They're excellent. Thanks very much for sharing your F-R-B-R journey in this audio format.

  9. At the public library reference desk where I work, the majority of our reference questions are from people looking to fulfill FRBR user tasks. And I see people using the catalog all the time with the same goals. We *occasionally* get a reference question of the kind you get at academic libraries, but probably not more than 2 or 3 per week. I'm excited about FRBR, although less so about RDA.
