Monday, February 7, 2011

Cataloging Matters, podcast #8


RDA: the Wrong Solution 
for the Wrong Problem
a paper given at the 
RDA@yourlibrary online conference
hosted by the Amigos Consortium
February 4, 2011
I believe the title of this talk pretty well sums up what I intend to discuss. Libraries and their catalogs are in great distress, yet I remain optimistic, since I believe there are many possible solutions. Unfortunately, RDA goes in the wrong direction: it will not help the public use our catalogs any better, and more importantly, RDA ignores the real problems that libraries, librarians, and catalogers are facing. Above all, catalogers need help. And lots of it.

What are these problems? I want to discuss a few of them, but remember that this is *not* an exhaustive list!

Today, with the exponential growth of the Internet, libraries are faced with a huge and ever growing number of resources that need description and organization; we are also experiencing the apparently paradoxical situation of an increasing number of variants of resources, along with fewer of them at the same time. By this I mean that instead of hundreds or thousands of more-or-less exact physical copies of a single resource, such as we see with multiple copies of a single book, these new resources are truly unique, single websites (i.e. each website equals a single copy) but this single copy can be viewed simultaneously by any number of people wherever they are on the web. At the same time, there is an increasing number of variants of resources as they are reworked in all kinds of novel ways. We have never seen these types of resources before. There are many examples, such as the “Star Wars Kid” video on YouTube, which went viral and spawned multiple versions.
[The best discussion of the many facets of Star Wars Kid is in Chapter 16 of the lecture Jonathan Zittrain: The Future of the Internet http://fora.tv/2008/05/15/Jonathan_Zittrain_The_Future_of_the_Internet#chapter_16 and the many videos are at http://www.youtube.com/results?search_query=star+wars+kid&aq=f. You must see the original first, though: http://www.youtube.com/watch?v=iQibs3albtM]
There are also “mashups” [http://en.wikipedia.org/wiki/Mashup_%28web_application_hybrid%29], i.e. webpages that bring together bits and pieces of other webpages. All of these resources change constantly and sometimes there is a new version every few minutes or even seconds, while the older versions are not saved, and consequently, they disappear forever. Back when I saw the first examples of these resources, I thought they were the very definition of “ephemera” and therefore, out of the scope of library catalogs. That was a simple and satisfactory solution for me, but too bad: I was wrong. These kinds of resources turn out to be very important.

Articles are written about all of these new resources in scholarly journals. Online social networks are providing completely new ways to find information. One such is Aardvark http://vark.com/ that bypasses searching altogether and links you to “someone” who can answer your question! There is little doubt that developments using these, and still more innovative tools, will continue into the future. These are some of the consequences of the changing nature of modern information.

Then, once people have found the resources they want, they can use citation management software [http://en.wikipedia.org/wiki/Comparison_of_reference_management_software] that not only manages their documents and citations, but also hooks them into collegial networks where they can collaborate in various ways, and this software will even search for new documents automatically, using semantic analysis tools that ferret out your needs, without the necessity for anybody to consciously search for anything at all, not even a subject or a name heading! http://blogs.nature.com/mfenner/2008/09/05/interview-with-victor-henning-from-mendeley We can all take it for granted that these tools will become increasingly sophisticated as they develop.

None of this is science fiction; it is happening as I speak. As a consequence: 1) there are resources that never could have existed before the Internet; 2) there are entirely new methods to find resources; 3) people are using these resources in unprecedented ways, reworking them for their own purposes; and as a result, I think it is safe to conclude that: 4) the general populace has completely different expectations and needs than ever before.

And yet, we need to realize that many of the changes libraries are facing are not really focused on us, but what we are experiencing is primarily a side effect of the sweeping changes going on in the traditional publishing industry. As a result, all libraries, from the local public library to the research library, are facing more-or-less the same pressures, or will be very soon. The changes that are roiling the traditional physical publication industries find their end point in the library.

The different publishing industries are dealing with the changes in various ways. The music publishers have been reduced to using threats and are despised by many throughout the world. In the traditional print industries, newspapers are in the vanguard, and their demise gives us a glimpse of the future of the entire print publishing industry. It is clear that the old ways are changing. So, the changes I am discussing in libraries are not necessarily focused on us: they represent a sea change in the centuries-old patterns of publication, of how people communicate among themselves, and libraries have been the end point of that process; but this too, may be coming to an end, or at least changing in fundamental ways.

Perhaps none of this is new to anyone listening; I think it shouldn’t be. Something else that shouldn’t be new is the fact that almost every library is facing major budget cuts. And nothing I have read has predicted that the “good old days” of those big, fat library budgets are going to be restored any time soon. (When were those good old days anyway?)

Oh yes, there is also this little thing called Google Books that already has more texts available at the click of a button than in many of our entire libraries! Much of this is publicly available now, but once the full text of the entire collection is made available, and it would be wise to assume that it will happen sooner rather than later--and possibly very, very soon, like next week?--it will be highly difficult to stand in the way of our patrons’ demands, and libraries will be forced into subscribing to all of this magnificent full text. Our patrons will be able to search these materials with tools that have no library input whatsoever. This too, will develop in ways that are unpredictable. No one can convince me that this will *not* have tremendous effects on the use of library resources.

From all of this, it is obvious that the library community as a whole is facing serious difficulties, but we must admit that traditional cataloging and the local catalog are in absolute crisis. Catalogers are under greater stress than ever before because the numbers of resources are higher than ever, and potentially growing at exponential rates, while the number of catalogers is not increasing and, in many cases, is going down. All this is occurring while their final product, i.e. the catalog record, is becoming less and less understandable to the public, who now have far more experience of full-text retrieval tools than traditional library tools. Of course, in the march toward the future, eventually *everyone* will be a member of the “Google generation”. If the result is that even the idea of “surname [comma] forename” is being forgotten, what does that portend for the far more complex concept of authority control?

The methods used in the library catalog are based on centuries of trial and error using technologies appropriate to an earlier time. For instance, our current procedures for creating cross-references for various forms of personal and corporate names, as well as subject heading access, are based on left-anchored textual strings and are founded on how people browsed card catalogs, something that has not happened in most libraries for 20 or 25 years. That has been quite some time now.
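
To make the idea of a left-anchored string concrete, here is a minimal sketch in Python. It is only an illustration, not how any particular library system works: the function names `browse` and `keyword` are made up for this example, and the list contains a few names mentioned in this talk, written in “surname, forename” order.

```python
import bisect

# A tiny, alphabetically filed list of name headings (surname first),
# the way cards were filed in a drawer.
headings = sorted([
    "Burke, James",
    "Johnson, Samuel, 1709-1784",
    "Latour, Bruno",
    "Panizzi, Antonio",
    "Zittrain, Jonathan",
])


def browse(prefix, entries):
    """Left-anchored browse: return the entries that begin with the typed string."""
    start = bisect.bisect_left(entries, prefix)  # jump to the right place in the file
    matches = []
    for entry in entries[start:]:
        if not entry.startswith(prefix):
            break  # we have gone past the matching cards
        matches.append(entry)
    return matches


def keyword(term, entries):
    """Keyword search: return the entries containing the term anywhere, ignoring case."""
    return [entry for entry in entries if term.lower() in entry.lower()]


print(browse("Johnson", headings))         # finds the Johnson heading
print(browse("Samuel Johnson", headings))  # finds nothing: wrong starting point
print(keyword("samuel", headings))         # finds the Johnson heading
```

Someone who types the name the way they would say it gets nothing from the browse, which is the situation the cross-references were designed to repair.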

I am not aware of any studies that have shown that our public *today*--not the public of 100 years ago but the public of today--actually wants what RDA is designed to give them. Remember, RDA is based on FRBR, and the purpose is to allow the user to “find, identify, select, obtain (what?) works, expressions, manifestations, items (how?) by their authors, titles, subjects”. In other words--and this is very important--RDA allows *nothing new* at all, because FRBR explicitly restates the same user needs that have been the underlying purpose of the catalog since at least the early 1840s at the British Museum under Antonio Panizzi. So, if we institute RDA and FRBR, our users will not be able to do anything--and I repeat *anything at all*--that they cannot do today. This at the same time as the very nature of the resources, how the resources are found, and what people can do with them, have changed in ways that have been unpredictable, and these changes are continuing at an incredible pace.

What are the actual changes our users will see with RDA? They are the type that almost no one will even notice. For instance, I am sure almost no one will notice the spelled out cataloging abbreviations or changing the dates on personal names from, e.g. 1943- to “born 1943”, or the elimination of N.T. and O.T. in the books of the Bible. Also, if RDA is implemented fully--and this requires that there be even further changes than those being considered now--the display of the works, expressions, manifestations, and items *can* be different, if libraries want, and these views will probably look very similar to displays in printed book catalogs, although they will probably be somewhat more interactive. Or they can look more or less the same as today. But it is important to understand that with RDA, patrons will find no change in searching. For example, we will not institute a “methodology” access point for scientific materials, long asked for by many, or anything substantially different because the basic purpose of RDA and FRBR is exactly the same as what exists now.

Consequently, when I compare RDA to the tremendous changes in the universe of information that I outlined earlier, I do not see how it has any relevance at all.

RDA does not address the fact that people have definite problems using our subject headings, that people almost never browse lists of names arranged alphabetically, that full-text searching and various types of sorting, such as relevance ranking, are by far the most popular types of searching that people do--even though very few people understand what relevance ranking actually means. RDA does not address how to incorporate related, non-library metadata projects on the web and how catalogers can cooperate with other creators of metadata to get the help they so desperately need. Considering higher quality records, we all know that many libraries are not able to follow the AACR2 rules today and nothing happens to them, so why wouldn’t they just decide not to follow RDA as well?

At the same time, I ask: is it morally justified for libraries, who are facing major budget cuts, to spend significant amounts on training catalogers to learn RDA at the expense of.... what? The simple fact is there will not be new funding, so there absolutely must be tradeoffs: will there be less spending on materials and resources for our patrons? Will more staff be laid off? Will more library branches close? Will more pay raises be deferred, or will more paychecks be reduced? Our British colleagues are facing some of the most draconian budget cuts I have ever heard of. Is it in their interests to cobble together the funding for training for RDA somehow? What are they supposed to give up? Other libraries, such as my own, simply do not have the budget at all for this, period.

Therefore, we see that an unavoidable corollary of RDA implementation will be a split in the library bibliographic world at a highly inopportune moment.

No one has suggested that publishers will provide us with better quality records in RDA than they do now with AACR2. Creating RDA records will not take less time than AACR2, therefore, it is difficult to even imagine how productivity could increase. It seems to me that sooner or later, someone must demonstrate a sound business case in favor of adopting RDA. I have yet to see one.

As a result, trying to force FRBR’s 19th century view of information onto our new information universe is like those people long ago who continued to insist, while ignoring all the evidence, that the earth is the center of the universe. And also, now that new tools such as Google and Google Books exist that allow each individual to experience this new universe of information for him or herself, and to use it in very personal ways, then to insist that FRBR is “what people need” when any individual can see it is not, is similar to those groups who fervently believe that the U.S. did not really land a man on the moon and that NASA has been involved in a tremendous hoax from that time!

It is important to keep in mind that obtaining the funds for a subscription to the entire full text of Google Books, when it becomes available, will undoubtedly be much easier than for RDA training, even though additional funds will most probably be required, but the benefits of the full text of millions of books will be crystal clear immediately to each and every one of our administrators as well as to our patrons. In contrast, demonstrating the advantages of RDA to administrators and patrons will be next to impossible because the business case has not yet been made.

Why does the cataloging community insist on a drastic change in their rules that will have serious backroom impacts on workflow, training and productivity, but that no one will notice in the final product? I have a few theories, one of which I mentioned in my very first podcast where I discussed “change for change’s sake,” [http://catalogingmatters.blogspot.com/2010/08/cataloging-matters-podcast-1.html] but on further reflection, I realized there is another possibility: the Black Box of Bruno Latour. [One of the best discussions was in an old Lingua Franca issue http://linguafranca.mirror.theinfo.org/9410/latour.html]

Bruno Latour is a French philosopher who has a unique, and *highly controversial*, method of studying scientists. He studies them as if they were a tribe of primitive people in the South Pacific, and concentrates not on the products of what they make, but how they do it and how they relate to one another, or what he calls “science in action”. [Latour, Bruno. 1987. Science in action: how to follow scientists and engineers through society. Cambridge, Mass: Harvard University Press.] So, he asks: what is this “thing called science” when it is being done? How are scientific theories made?

He describes that while a theory is in flux, there is a huge amount of effort, ego, money, hope, energy, and everything else you can think of, poured into proving one’s own theory, but naturally there are competing theories, and just as much effort, ego and so on is thrown into those counter theories. When one of these theories finally “wins” and becomes accepted by the general scientific community, it turns into what is essentially a “black box” where information is input on one end, and from the other end, a solution comes out. Everyone agrees that the black box works correctly and whatever it produces is “correct”.

It turns out that the longer a black box is in place, the more people have invested in maintaining that black box. This includes companies that produce and sell scientific instruments and publish information, along with various scientific departments with their individual scientists all focused on getting grant money and interested in the advancement of their careers. Therefore, those who seek to “open” that black box do so at their own peril because they will be facing many established layers of powerful vested interests.

At certain points however, when the black box simply stops working, it nevertheless must be opened. In the case of libraries, I suggest that the black box is the traditional library catalog, and it has already been opened up for quite some time. It was not the librarians who opened it initially, but computer specialists who built their own tools, such as the arXiv.org physics preprint site, the entire open access publishing movement and even ingenious kids who built sites such as Facebook.

It is my position that since this is the situation we are facing, librarians too must--and I mean absolutely must--open that black box that has been handed down to us and protected by our predecessors since at least the days of Panizzi, if not before. We must open it for ourselves so that we can reconsider *everything* in it, its purposes, how it functions, and which parts serve the needs of our patrons. For those parts that do not serve our patrons’ needs, are they necessary for librarians, or can they be repurposed in some way? We must include in our deliberations all kinds of other groups such as interested members of the public and scholars and many, many others.

Latour mentions that doing this is like opening Pandora’s box: it will be messy; it will be disheartening; it will be humiliating in many ways, and yet we have no choice except to do it because the black box of the library catalog no longer functions as it should and other groups who are far more powerful and important than librarians are reconsidering matters right now without us. I am a cataloger and such an idea is very disturbing to me. Nevertheless, we must involve ourselves or risk remaining completely ignored. I believe that doing this will be a major step in the further advance of our profession.

A lot has been done already by the general information science community and while their findings should be considered, their conclusions should *not* necessarily be accepted.

I think there is a great deal the library cataloging community can do that will have far greater advantages for the public than the cosmetic changes of RDA, while being much less disruptive for us. To take only one example, we can face up to the fact that our traditional system of subject headings simply *does not work* in the online environment. But it doesn’t follow that people do not want the *control* that the subject headings allow, or that the headings should therefore be abandoned. This would be an incorrect conclusion. In fact, this is one of those areas where the public has already opened the black box and come up with something called “The Semantic Web” which in essence, seeks to provide many of the same controls as our traditional subject headings and authority controls. An example of such a project is dbpedia [http://wiki.dbpedia.org/About]. See also, the project subj3ct.com [https://subj3ct.com/] which attempts something similar to what librarians have always done through authority control. These projects are far from perfect and just a few moments of a skilled cataloger’s time skimming over some of these projects will show how much help they need.

This is merely one area where catalogers could make important contributions to a huge, collaborative project that others can readily see and perhaps, even come to appreciate, at least appreciate far more than typing out a few abbreviations in local catalogs.

Above all, our creaky old MARC format needs to change into something more modern, plus our records need to be liberated from our local catalogs, to begin to make their own way in the world outside of library catalogs, to be reused in all kinds of ways by the public, but these records can still retain their ties to the library world through means of linked data.

As I mentioned before, many of these suggestions make me highly uncomfortable. I am sure they will make many other catalogers uncomfortable as well, along with the organizations they work for, but I feel something like this is imperative.

I think an anecdote from my own family history may be appropriate. This comes to me second hand, by way of my father. He told me a story that he had been told about his great-grandfather, my great-great-grandfather, Joel Akers, who passed away before my father was born. Here is a picture of him and his family.

This took place in a little farming town in Kansas, and my father told me how the townspeople told him that “Grandpa Akers” absolutely hated the new automobiles. People had fun remembering that whenever a car drove into town, Grandpa Akers would hobble out into the street, stamp his feet, shake his cane and cuss and yell all the time he could see the car. Folks compared him to a banty rooster. Of course, all of his anger and threats didn’t stop anything, but it gave others something to laugh about.

Although I would have liked to meet my Grandpa Akers, I confess that I don’t want to be like him. There is no use fighting these kinds of changes because it is wholly unrealistic to imagine how they could vanish so that some previous time that you happen to prefer will return. The fact is: this new world is not going away, and once this revelation is genuinely accepted, the task becomes very simple: Darwinian survival. How do we survive in such a future?

If I followed my instincts, I could let my “Grandpa Akers side” come out. I could cuss everybody and yell out: “I love books! They aren’t going away! Look how many are being published right now! These website things are crazy since anybody can put any blamed thing out there they want! And since you can’t believe what you see there, any fool who believes in those things is crazy too! Aardvark? What kind of a stupid name is that? And it links me up to some idiot out there who I don’t know, but he’s supposed to answer my questions?! What is this insanity? MARC format was good enough for my pappy, so it’s good enough for me!” While I yell this at the top of my voice, I can stamp my feet and shake my fist at anybody who is unlucky enough to come anywhere near me.

Yet, if I actually did this, what would happen today as opposed to the year 1900? Of course, there would be a very good chance that someone around me would have a cell phone. They could record my outburst, and upload it to YouTube. A video like that could easily go viral and I could become just as well known as the Dramatic Chipmunk, the Star Wars Kid, or the Dancing Baby, with people laughing at me, not just in the same town, but all over the world, and for a long time to come!
[The Dramatic Chipmunk http://www.youtube.com/results?search_query=dramatic+chipmunk, Dancing Baby http://en.wikipedia.org/wiki/Dancing_baby. Yes, the Dramatic Chipmunk actually did meet the Star Wars Kid: http://www.youtube.com/watch?v=UEO_PAeRFfc]
When I saw so many videos of myself, accompanied by homemade voice-overs and sound effects, each varying in its level of hilarity or obscenity, it just might turn out that I would learn something very special about myself.

Catalogers need a new attitude. We can see this attitude in the id.loc.gov example where the Library of Congress finally let out the subject headings in a format I did not know: SKOS (Simple Knowledge Organization System). I applauded, and still applaud this project because making the subject headings generally available has been overdue for many years. It is a great learning project, but as I myself learned to my own dismay, for several reasons, the subfield codes could not be transferred into SKOS. [http://www.w3.org/2004/02/skos/intro] As a result, a subject heading there is the entire text string with each subdivision separated only by double dashes! While I realize this is only a beginning and I certainly hope it will be developed further, I find it totally ironic that the way id.loc.gov has implemented the subject headings is essentially a replica of the catalog card itself in pre-MARC form! Still, what is so bad about it?

The power of the system of subject headings was not only in the authorized forms but, even more important I think, in the subdivisions that refine the main topic in various ways. Of course, there were always tremendous problems with library subject headings but the resulting benefits were enormous and could easily be seen by everyone who knew how they worked. Still, people faced the problem of finding the authorized form of the main heading, which necessitated (and still requires) a whole slew of cross-references. But this was only part of the problem: if you were to use the subject headings effectively, it was essential to get an *overview of the subdivisions* used under that heading, because when you did this, you discovered how the system of subdivisions actually opened up your mind to new possibilities you could not have suspected before. For instance, someone interested in horses could find by browsing
“Horses--Behavior--United States--Anecdotes”
or someone interested in Dr. Johnson could find
“Johnson, Samuel, 1709-1784--Knowledge--Manners and customs”
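
As a rough sketch of what such an overview could look like with today’s tools, the short Python example below takes headings that exist only as flat text strings with double dashes and groups the first subdivision found under each main heading. It is an illustration under stated assumptions, not a description of any existing system: the function names are hypothetical, and two of the four headings are made-up strings added only so that there is something to group.

```python
from collections import defaultdict


def split_heading(heading):
    """Split a flat heading string on double dashes into its parts."""
    return [part.strip() for part in heading.split("--")]


def subdivision_overview(headings):
    """Collect the first-level subdivisions found under each main heading."""
    overview = defaultdict(set)
    for heading in headings:
        main, *subdivisions = split_heading(heading)
        if subdivisions:
            overview[main].add(subdivisions[0])
    # Sort so the overview reads like an alphabetical browse display.
    return {main: sorted(subs) for main, subs in overview.items()}


headings = [
    "Horses--Behavior--United States--Anecdotes",                  # from the talk
    "Johnson, Samuel, 1709-1784--Knowledge--Manners and customs",  # from the talk
    "Horses--Diseases",  # illustrative string only, not checked against LCSH
    "Horses--Training",  # illustrative string only, not checked against LCSH
]

overview = subdivision_overview(headings)
for main, subs in overview.items():
    print(main)
    for sub in subs:
        print("  --" + sub)
```

Note that the single hyphen inside “1709-1784” is left alone because only the double dash is treated as a separator. With nothing but a text string to work from, however, the program cannot tell a topical subdivision from a geographic or a form subdivision; that is the information the subfield codes used to carry.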

In other words, *when used correctly* the system of subject headings not only helped you find what was in the collection, but it also revealed new ideas you would never have thought of and actively searched for on your own.

In one episode of James Burke’s excellent documentary series “The Day the Universe Changed”, he described the development of catalogs and indexes. He demonstrated the powers available through indexing that brought disparate bits of information together in novel ways and how it helped people to think. He concluded that the result from indexing achieved “1+1=3”. [http://www.youtube.com/watch?v=TZC-abOGRug] I cannot think of a better way to describe it.

It was too difficult to get such an overview from examining hundreds or thousands of cards however, and so you had to consult the LCSH red books separately to get a coherent overview of the subject heading structure, which we must admit very few people did, and anyway, this method also had its own problems. Transferring such a complex system into online library catalogs has been a complete disaster, leading to general incomprehension among the public of a tool that is potentially incredibly useful.

We need to spend our time making this system where “1+1=3” function once again using today’s tools and for today’s populace. Since it has been a failure in our online catalogs, we need other options. It is absolutely vital to retain subject subdivisions. SKOS doesn’t allow it. OK, use something else. If nothing out there works and we have to create it ourselves from scratch, that is just fine, we should do it; we *must* do it even though it may not be “perfect”. I compare this to the Ferrari racing team: if the team decided it needed something and their mechanic said, “Well, I can’t find anything like that in the car parts store”, he would be fired on the spot. That is not how you win Formula 1 races. You yourselves create the conditions for your own successes. The Ferrari team knows this very well. Catalogers too, need to adopt this kind of attitude in their work. Otherwise, they will come in dead last.

To calm our minds, we can keep telling ourselves that *we* didn’t open that box, others did, but it’s open now. All we can do is the same thing Pandora did.

How do we do this? I would like to close with a quote from Latour:
“Now that [the black box] has been opened, with plagues and curses, sins and ills whirling around, there is only one thing to do, and that is to go even deeper, all the way down into the almost-empty box, in order to retrieve what, according to the venerable legend, has been left at the bottom–yes, hope. It is much too deep for me on my own; are you willing to help me reach it? May I give you a hand?” [http://www.bruno-latour.fr/livres/vii_tdm.html]

Thank you very much for your attention. It really is a great time to be a librarian.

Consider joining the Cooperative Cataloging Rules Wiki! [http://sites.google.com/site/opencatalogingrules/]
