Saturday, April 30, 2011

Re: RDA MARC coding question

Posting to RDA-L

On 04/29/2011 09:40 PM, Vosmek, John J. wrote:
> The condescension from RDA advocates toward RDA skeptics (implying - or sometimes stating outright - that the skeptics are just too closed-minded and thinking too "inside-the-box" to grasp the revolution in thinking that is RDA) probably doesn't actually help the case. If the majority of the people who are going to have to implement it don't understand why it's so great, that probably isn't their fault.
You are absolutely right. There is no reason why anyone has to accept something on faith--people have every right to remain skeptical until things have been demonstrated to their own satisfaction. One thing I have noticed in much modern discourse is an apparent assumption that if people have similar understandings on a certain point, they will (must) necessarily agree on the conclusions. Therefore, if there is a disagreement over the conclusions, there must be a lack of understanding somewhere. Therefore, if you are to solve the dispute, someone must be "educated". I have found that some people "educate" others by talking louder and calling names! :-)

I reject this line of reasoning (although I confess I have been guilty of it myself and have had it pointed out to me). People may easily have equivalent understanding of a point, but each person has his or her own individual experiences, beliefs, likes and dislikes of all kinds, so it does not at all follow that they will agree on the conclusions as to the desirability or undesirability of a disputed point. Socrates demonstrated this to devastating effect in the Dialogues, where he showed that nobody had all that great of an understanding. For him, the only way to reach the truth was through genuine dialogue, so that all could hopefully take some tentative steps to the truth.

Concerning MARC coding, as far as I am concerned, the changes toward FRBR started from the wrong point. (For the moment, I will assume that FRBR would be a good thing to implement.) Changes started with the data (RDA) and not with the format. The first step should have been to change the communications format (MARC, or--I have to bring it up again at risk of having virtual tomatoes thrown at my virtual head--at least its ISO2709 incarnation) into something more modern, more flexible and more useful to the general metadata community. It would make no sense to try to exchange true FRBR types of records using MARC/ISO2709, so the changes should have started there. By now, something could probably be demonstrated and people, including the general public, could all begin to determine the worth of the final product (FRBR). Then whatever changes were needed to the data would have probably been clearer, and matters could have gone on from there.

But since they started with data instead of format, nothing can yet be demonstrated to the skeptics, so it all remains an argument between the believers and the unbelievers.


Friday, April 29, 2011

Re: Is cataloguing a science or an art form? (Was: Cataloging Training Needed)

Posting to Autocat

On 29/04/2011 10:48, Anna Martin wrote:
I enjoyed reading Christel Klein's comment "I think of cataloguing as both 'craft' and 'art'."
It annoys me a bit when people said that cataloguing was an art form since it is plainly a matter of following rules as closely as you can. However the more I discover about science and art, the more I wonder if there is really any difference between these two concepts. To be good at either you have to know all the "rules" and then decide when and how to apply them.
Unfortunately in my experience one comes across few cases when there is any room to be creative when cataloguing: mostly when issues arise it is due to lack of information and the cataloguer just has to find that information or follow the rules about what to do if they can't. So if I had to choose one I would still definitely say it is a science. AACR2 is like an omniscient God that is designed to cover all eventualities and is, in my perhaps limited experience, pretty effective in doing so. I think if I had to I could catalogue a carpet or a lampshade or a box of different coloured marbles using rules straight from AACR2. It would be fun to try but sadly any artistic flare most of us cataloguers might have languishing in our souls is suppressed and we only catalogue books and a few limited pre-existing  formats such as CD's and DVD's and which definitely call for "science".
It seems to me that artistry implies a certain mastery over the basics somewhere along the way. Somebody who will not learn how to play the piano but just pounds on it cannot be considered an artist. A pianist who will be considered an artist must spend long hours learning how to play, what he or she is really capable of, and not least important, they must constantly exercise their skills because they may lose them rather quickly.

Of course, just knowing the mechanics of how to play a musical instrument is not necessarily enough to be dubbed an "artist" which means something more. What is that extra? I don't know, but being here in Rome, I can state that when I go into a church filled with all kinds of statues, the ones by Bernini just pop out at you. You can almost always find the ones by Bernini and they are easy to spot, but I don't know why. Obviously, he had some kind of exceptional quality.

I have met many cataloger/artists. They display a freedom, a willingness to push the rules and practices but still manage to stay within them somehow. Naturally, this demands a deep understanding of, and experience with, the rules (i.e. the tools of the trade). I can also see the artistry in the records many produce, primarily through subject analysis and authority records, but certainly I have seen artistry in descriptive practice as well.

So, while I agree with you that "cataloging is plainly a matter of following the rules as closely as you can", I think there is still some lee-way: you can follow those rules creatively, or follow them  dogmatically. I like to think I have been on the more creative side, but maybe not.

Re: Fictitious beings as pseudonyms

Posting to RDA-L

On 04/28/2011 05:10 PM, J. McRee Elrod wrote:
But so long as we insist on Cuttering by main entry, the Chilton works will be scattered on the shelf. Finding the bibliographic records is not enough. We need to facilitate *physical* discovery. Many patrons bypass the catalogue and just browse.

Better to standardize on one entry, as opposed to departing from normal Cuttering practices, and have to deal with new items being an exception to normal practice.
How materials are placed on the shelves is primarily a local matter. It just seems to me that if current methods for shelf browsing have worked pretty well in the past, and unless there have been demonstrations and people throwing firebrands against it, which I have not heard of, I don't see any problem. Have there been complaints from our patrons about this? If so, those complaints should be addressed, but with no complaints, there is no problem.

Materials on related topics and by the same authors are scattered on the shelves all the time. This is one thing I have gone into deep discussions about with my students: while I think that shelf browsing is definitely the most pleasant activity in a library, or in a bookstore, it must be accepted that it is not a very good way to find the materials you really want and need. There is nothing new about this, and it has been the case since the library at Alexandria. Therefore, if you rely on shelf browsing to get your information, you are guaranteed to miss a lot of materials you want. Period. End of topic.

Still, if there is evidence that there have been serious problems with the arrangement of materials on the shelves, we must deal with it. But let's not fix things that are not broken. That is only asking for trouble.

Thursday, April 28, 2011

The brave new world of Push pop press

Posting to NGC4LIB

This is quite an article from Wired: "Gore, Ex-Apple Engineers Team Up to Blow Up the Book"
Look at the video. Just incredible.

The developers also claim that it's easy to make these books.
"Push Pop Press could likely undercut Adobe on price, not to mention ease of using the product. An interactive magazine, book, comic book or photo essay can be created with Push Pop Press in as little as 20 minutes, the programmers claim. ... “This is a layout tool, not a developer tool,” Tsinteris said. “It’s a little like playing with Legos.”"
They see Adobe as their competitor, and I must say, from this video, it beats Adobe hands down.

These are the sorts of innovations that may really make the book itself seem outdated and obsolete, especially to younger eyes. The program Blio is another possibility, championed by no less than Ray Kurzweil, but so far it hasn't really taken off. I've read somewhere that it has been difficult for him to get permission for an Apple app, which is what Gore and his team have done.

I think people would find it very useful if somehow, our tools could interoperate with technologies like this book, so that they could find newer information, as well as other opinions and points of view.

Re: Dr. Snoopy

Posting to RDA-L

On 04/27/2011 10:40 PM, J. McRee Elrod wrote:
This is one change I would like to see, but as an AACR2 revision rather than requiring a new set of rules.

It would be advantageous to have a single main entry for Geronimo Stilton works, and have works produced under that pseudonym brought together in the catalogue and on the shelf.

That the pseudonym is personified as a mouse or cockroach is beside the point. The author is writing under than name.
I agree that all of these changes could easily have been handled through AACR2/LCRI revisions.

I have done a little bit of looking around at the question of authorship and found an interesting article from The Indexer vol. 18, no. 2 Oct. 1992, "Name of an author!" by Anne Piternick. Traditionally, there has been focus on the idea of finding the real author of a resource and trying to add that person's name.

From my own researches previously, I discovered lots of problems originally with the concept of corporate authorship, i.e. how can the "United Nations" author anything? This item could not have been written by an entire organization but by specific individuals. I have still had to argue this with non-specialists. In the old days, anything with no specific author, e.g. a journal of a learned society, was handled as an anonymous work. Slowly, the idea of corporate author came forth (Panizzi was first, I believe) and there have been lots of changes since then.

We have also seen changes in how pseudonyms are handled, the concept of  bibliographical identities, and so on.

Concerning spirits, the author mentions them and cites a 1986 article in Nature that was said to be written by God "as revealed to Ralph Esting". She could not find the citation, but if we were cataloging this, based on the "Spirit" rules, I guess the name heading would be "God (Spirit)" which I find really bizarre, but is probably not any different from "Archangels (Spirit)" or "Heavenly Spirit (Spirit)".

I haven't found anything about why spirit writings (or "channeled books", or books written through channeling) are handled as personal names, but it seems to be a very popular topic even today, and I could imagine someone saying, "Well, who knows? This might really be the spirit of Joan of Arc. Let's set them up as personal names."

Ms. Piternick also discusses computer programs, and questions whether they can write books. She mentions the Racter program, which wrote a book, and asks who is the author: the program or the persons who wrote the program. (The book, "The Policeman's Beard Is Half Constructed", is online, by the way. LC cataloged it as title main entry with 700s for the two programmers, while poor Racter was left out completely.) This reminds me of the wonderful Postmodernism Generator that generates a completely meaningless essay about postmodernism. I hope we don't start cataloging these essays!

My own opinion of Geronimo Stilton, which is not a spirit or pseudonym but everybody can agree is a fictitious character, is that today, people will search using keyword, as in Worldcat, and when they choose a record, they should see some nice subjects with Geronimo Stilton that can lead them to lots of other books.
Stilton, Geronimo (Fictitious character) -- Comic books, strips, etc.
Stilton, Geronimo (Fictitious character) -- Juvenile fiction.
This seems to be adequate access.

In my opinion, changing a long standing rule such as this will open up a hornets' nest of associated complications that will be difficult to decide upon, and even more difficult to find a common level of consistency; all to achieve something that is of extremely limited utility to the public, if any at all, which would be similar to what I mentioned before with the changes to Russia/Soviet Union/Former Soviet. It would be better to focus our energies in other areas.

Wednesday, April 27, 2011

Re: Dr. Snoopy

Posting to RDA-L

Jonathan Rochkind wrote:
But in cases where it is obvious what's going on.... it seems to me it would be preferable for the cataloger to act upon that. I am not a cataloger. What would they have do under AACR2?
The rules are not in AACR2 since fictitious characters were/are set up as subjects. The current rule is in SCM H1610, where we see the example of Doctor Dolittle:
150 __ $aDolittle, Doctor (Fictitious character) 

The rules for Cartoons are in SCM H1430, which I cannot find online unfortunately.

By definition, at least in AACR2, only real people or groups of people can author something. Otherwise, they are considered to be subjects that cannot author anything, similar to events such as WWII, or other topics, such as Love. This is different from a real human being writing under an assumed name, which could be a possibility for Charles Schulz writing under the pseudonym of Snoopy, but with a pseudonym, there is normally some attempt to hide the real author's name, not as in this case where his name appears on the chief source.

Again, why have this change? Where is the utility either to librarians or the users? The reasoning for such a change, along with so much of RDA, escapes me. Switching to an entity-attribute-relationship model is fine and I am all for it, although I have my problems with WEMI.
Nevertheless, these sorts of changes have nothing to do with that and certainly add no utility to anyone that I can see.

I have nothing against change, but change for the better is what should be the aim instead of change simply to do something.

Re: Cataloging Training Needed

Posting to Autocat

This has been an especially engaging thread for me, since I have been out of the US. Certainly, people need to understand what a catalog record is and how it works, but if the purpose of library school is to prepare students (i.e. relatively young people) for an entire career, it does seem as if there is a need to prepare them, not only for traditional library cataloging, but for the entire metadata infrastructure. I think we all know that our formats will change relatively soon (that is, in the next 10 years I hope!), and that our rules may change very soon (although RDA is not much of a change). As authors create more and more resources for electronic access through the world wide web, or whatever it evolves into, I think that what library students will need more than anything else is to learn to be flexible, because nobody knows what is going to happen.

For the moment, library cataloging is still based on AACR2/MARC21 and even ISO2709! And so for the moment, people are still needed to create these kinds of records.

People can be trained/taught to be catalogers that can get a job today, but that is not enough, although it was fine 25 years ago, which was a period of much less change. Today, the future is completely unpredictable. Who predicted on January 1st of this year the problems in the Middle East that have toppled governments that had always seemed immovable? Who could have predicted that newspapers would be threatened with extinction? Who can predict what the information environment will be in 10 or 15 years?

In my opinion, what is vital is that library students learn how to be important parts of this change, otherwise it will all take place without us. To me, the idea that everything will be decided by the Googles and Yahoos and Bill Gateses and Steve Jobses and Silvio Berlusconis is the real nightmare.

In the meantime, traditional library cataloging needs to be continued, and students need to have a good understanding of it. But in order to be a part of the coming changes, they need to understand how people really find information (which is absolutely *not* as it is laid out in FRBR!) and what they expect to do with it. In this sense, there is just as big a need for reference work with the public, and it will be far more important for the librarians of the future to keep up with the latest research.

Librarians need to create a genuine, semi-united community that may actually make a difference.

Re: Google can't be trusted with our books

Posting to NGC4LIB

Going back to the Guardian article, the author discusses something different. He talks about how Google was intending to simply *delete* all of Google Video, a huge resource that people have relied on for a few years now. Google decided not to delete it only when people started complaining. Their reasoning was that they wanted to concentrate on "search" instead of hosting content--a rather strange reason that I suspect may betray more intentions, since Google has been buying up all kinds of hosting resources for several years.

The author compared this with Google Books, saying that based on such reasoning, a private profit-making corporation could not be trusted with such an incredible resource. As he says,
"As a private sector company, the core aim of Google is to make money. The Google Videos situation shows that in order to lower expenditure and adjust its priorities, Google was willing to delete content entrusted to it by users. Libraries have trusted Google with millions of documents:  many of the books scanned by Google are not digitised or OCR-processed anywhere else and, with budgets for university libraries shrinking year after year, may not be digitised again any time in the near future.
Google acted admirably by listening to users and working to save the videos but entrusting such vast cultural archives to a body that has no explicit responsibilities to protection, archiving and public cultural welfare is inherently dangerous: as the situation made clear, private sector bodies have the ability to destroy archives at a whim."
Naturally, the long-term purpose of the Google Books project was for Google to make money, not from any altruistic motives. They do *not* do it all for us. ;-) If it looks as if they will not make money, it will wind up being a drain and what will happen to the project? That is why I mentioned the absolute need from Google's viewpoint (and unfortunately, our viewpoint necessarily) to monetize the book project somehow. While I haven't read anything, it wouldn't surprise me if Google is pinching pennies now along with everybody else in this down economy. If it were bad enough, Google would probably be willing to jettison some of their holdings (is that the idea of closing down Google Video?), including selling the book scans, but if those scans are illegal, they cannot be sold. As my father would have said, it's like spending money on a dead horse. It's a real dilemma for Google but the main losers would be us, the public.

Sooner or later, the books in our libraries will become available electronically because as people become more and more used to accessing materials electronically, the more distasteful they will find the labor, the wait, and the general hassle of getting a physical book to be able to hold in their hands for only a couple of weeks. The non-electronic materials will slowly begin to go ignored, just as happened before, when the texts in manuscripts that were not printed were ignored and forgotten. It would be a genuine tragedy for our entire civilization if the scans are not made available.

By the way, while I very much appreciate HathiTrust, I cannot download the public domain books and place them on my ebook reader. I must read them one page at a time on my computer, which I will not do. I discussed this in a blog entry earlier.
Therefore, I look for a downloadable version on either Google Books, or I prefer the scans at the Internet Archive. There are lots of other sites out there with some excellent book scans, though.

Re: Snoopy, Dr.

Posting to RDA-L

I have asked this before. If fictional characters are now handled as if they are real people, what does this mean for groups of fictitious characters such as X-Men, the Justice League of America, or the Fantastic Four? Are they now going to be handled under the rules for corporate bodies? And if so, are we going to have to find materials published by their organization to find how the form of name appears on the chief source of information, which I guess would be one of their comic books? And I must say that the heading "Fantastic Four" does not imply to me a corporate body, so it would have to be qualified in a different way from the current heading "Fantastic Four (Comic strip)" to "Fantastic Four (Firm)"?

I will simply take it for granted that these kinds of changes are being instituted because of numerous complaints from members of the public, who have experienced major problems finding such materials and it has been demonstrated that these are the kinds of changes that will help our patrons find the materials they need.

Google can't be trusted with our books

Posting to NGC4LIB

Here are some important comments in an article from the Guardian: Google can't be trusted with our books / Simon Barron. (Guardian online 26 April 2011)

"Google announced last week that it would be deleting the content of the Google Videos archive. After a public outcry, it said it would work on saving all the video content and making it available elsewhere. In this instance, the public managed to change Google's mind and stopped the mass deletion of a unique digital archive but the situation raises concerns about data under Google's control, including the unique archive of Google Books."

The author goes on to discuss some other points, including metadata. My own opinion is that these issues can and will be solved once Google can "monetize" its books project. As it stands now, it is just an expensive, interesting financial exploration.

Monday, April 25, 2011

Re: Linked files

Posting to RDA-L

On 04/25/2011 06:20 PM, Jonathan Rochkind wrote:
Seriously, it is a fundamental idea in identifier management, decades old, that you should not change your identifiers, and for this reason you should not use strings you will be displaying to users as identifiers. One way this idea is expressed, for instance, is that you should not use a 'natural key' as a 'primary key' in a relational database. You can google on those terms if you want. In the sense that an rdbms pk serves as a kind of identifier, that is just one expression of the fundamental guideline not to change your identifiers, and thus not to use things you might want to change as identifiers.

I am seriously not sure why you are arguing this, James. This is a pretty fundamental concept of data design accepted by every single contemporary era data/database/metadata designer. This is probably my last post in this thread, this is getting frustrating to me. Perhaps it's my fault in not being able to explain this concept adequately, in which case I don't think I can personally do any better then I've done. Otherwise, I am not sure why you are insisting on arguing with a basic principle accepted by everyone else doing computer-era data/database/metadata design -- which has been proven in practice to be a really good prinicple. It's not a controversial principle. At all. Anywhere except among library catalogers, apparently.
The reason I am arguing this point is that it is something that can be done now, relatively inexpensively; otherwise, *nothing gets done at all*. For example, there is all this discussion about RDA and how it promises the New Atlantis, and so on, but for the foreseeable future, the public will notice absolutely nothing.

I confess that these kinds of discussions get frustrating for me as well. Instituting URIs would create a library tool that could be used by the entire community, who may actually find it useful--perhaps extremely useful. All the powers-that-be would have to do is ensure unambiguous access to current and earlier forms of headings. I *know* that that could be done easily enough, and I'm sure you do too. But no, everybody in the world has to be expected to add and change their headings to the identifiers of the powers-that-be, because otherwise things don't conform to the way they are supposed to work. How much incredible labor and expense does this entail? It is simply unrealistic to expect this to happen in the current situation and possible future, so the consequence is: nothing will get done. And who gets hurt? The patrons, and by extension, us, because we are seen as dinosaurs.

Of course I understand how identifiers are supposed to work, but *I don't care* how the system is "supposed" to work. I am by training a historian, and when I see that something "should never change" I just smile. Of course things will change and this must be built into *any system*, otherwise it is guaranteed to fail.

Right now, we need something that functions and that will make a substantive difference to our patrons. The cataloging profession sorely needs some successes, and that means coming up with creative solutions that people will see and--hopefully--appreciate. 70% or 80% today is certainly better than what we have now. It is frustrating to see some solutions, even temporary ones, and not have them.

It's a h*** of a way to run a business!

Re: Linked files

Posting to RDA-L

On 04/25/2011 05:56 PM, Jonathan Rochkind wrote:
If you maintain the "preferred display form" as your _identifier_, then whenever the preferred display form changes, all those links will need to be changed.

This is why contemporary computer-era identifier practice does NOT use "preferred display form" as an identifier. Because preferred display forms change, but identifiers ought not to. The identifier should be a _persistent_ link into your database for the identified record.
So long as the link from your database links unambiguously to the  resource you want to link to, that is all that matters. There are different ways of allowing that. This function is most efficiently handled by the database you are linking into, instead of the single database expecting everybody in the world to change their own databases to add their URIs. For example, I could add a link for the NAF form of Leo Tolstoy to dbpedia to interoperate with it. If they had a special search for exact NAF form, like in the VIAF, it would definitely be unambiguous.

My point is: this is something that is achievable. Probably through a relatively simple API, it could be implemented in every catalog pretty easily. There is just no hope that each catalog will add URIs within any reasonable amount of time.
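To make the idea a bit more concrete, here is a minimal sketch in Python of what the receiving database might do: it keeps one persistent record identifier, and it accepts the current heading or any earlier form as a way in, refusing to build the index if two records claim the same form. Everything in it is an assumption for illustration only: the record IDs, the headings (an invented author whose death date gets added), and the function names are all hypothetical, and this is not how VIAF, dbpedia or any existing catalog actually does it.

```python
from dataclasses import dataclass, field


@dataclass
class AuthorityRecord:
    record_id: str                      # persistent identifier (hypothetical value)
    preferred: str                      # current display form of the heading
    earlier_forms: list[str] = field(default_factory=list)  # superseded forms


def normalize(heading: str) -> str:
    """Fold case and collapse whitespace so trivial variations still match."""
    return " ".join(heading.casefold().split())


def build_index(records):
    """Map every current and earlier heading form to its record identifier.

    Raises ValueError if the same form would point at two different records,
    since the whole scheme depends on the lookup being unambiguous.
    """
    index = {}
    for rec in records:
        for form in [rec.preferred, *rec.earlier_forms]:
            key = normalize(form)
            if key in index and index[key] != rec.record_id:
                raise ValueError(f"ambiguous heading: {form!r}")
            index[key] = rec.record_id
    return index


def resolve(index, heading):
    """Return the record identifier for a heading, or None if unknown."""
    return index.get(normalize(heading))


# Invented sample data: the heading changed when a death date was added.
RECORDS = [
    AuthorityRecord(
        record_id="rec-0001",
        preferred="Example, Author, 1920-2005",
        earlier_forms=["Example, Author, 1920-"],
    ),
]

INDEX = build_index(RECORDS)
```

With this in place, a catalog that still carries the old heading and one that carries the new heading both land on the same record: `resolve(INDEX, "Example, Author, 1920-")` and `resolve(INDEX, "Example, Author, 1920-2005")` return the same identifier, and nobody outside has had to change anything in their own database.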

Certainly, if we were creating things from scratch, we could redo everything that would be better for us (there is no doubt in my mind that future information specialists/catalogers 80 years from now will be complaining about whatever we make), but you must play the cards you are dealt and be creative with what you have. Perhaps it wouldn't be perfect, or maybe it would, I don't know, but in any case, it would be vastly better than what we have now and people could start discovering and using our records in new ways.

Re: Linked files

Posting to RDA-L

On 04/25/2011 04:27 PM, Jonathan Rochkind wrote:
I agree entirely, controlled headings from authority files ARE a sort of archaic version of identifiers and should be considered as such.

The thing is, that they aren't all that succesful as identifiers in the modern environment. For instance, just as the most obvious example, you NEVER want to _change_ an identifier. Yet, our authority file headings sometimes get changed (from a rename of an LCSH heading, to adding a death date to an author). Violates pretty much the first most basic rule of modern identifiers.

It's no surprise that an identifier system our community invented nearly a hundred years ago before computers really existed do not perform very well as identifiers in the present environment. But it's still the truth. I think you're absolutely right that we should understand these legacy controlled headings as a sort of identifier -- that will help us understand better how to use them and convert them in the modern environment. But important to remember they are a sort of ancient identifier system, which is ill-suited in several ways for the contemporary environment.
So long as the URI links unambiguously to the correct concept, it should not matter. In the new environment, it only makes sense that one conceptual resource could have many URIs. With the VIAF for example, we see how each heading is unambiguously linked in a variety of ways based on their own forms. In a correct system, the label that people see will be controlled by the searcher him or herself.

To see how it could work, look at dbpedia for Leo Tolstoy with all of the redirects. That is the dbpedia URI. So long as our superseded forms are handled in an unambiguous way, as they are now (with very few exceptions I think, essentially for forms that take on qualifiers, but I am sure these could be handled unambiguously too), the system should still work.

I think it is important to get something to demonstrate ASAP. If we expect that everyone is supposed to add URIs in their own databases, then that will take a very, very long time and is not realistic. It will not happen, or at any rate, by the time it does happen, the public will have moved even farther away from anything we make. Doing something now that is "quick and dirty" (and not so dirty, I suspect), plus relatively inexpensively, to provide the public with something that they may actually find useful, even though it may not be perfect and need some kind of updating, would certainly be far more practical than expecting the public to just wait until everybody adds the URIs.

Because it is clear that people will move on, farther away from us, and just ignore our tools.

Re: Abbreviations (was: Display space)

Posting to Autocat

On 04/24/2011 11:55 PM, donna Bair-Mundy wrote:
I liked what Dr. Hill said about the overuse of abbreviations. Having taught graduate students for over a decade has thoroughly disabused me of any notion that even educated adults are familiar with the abbreviations we use. Our students do not come to us with a knowledge of Latin or Greek. The abbreviations s.l. and s.n. mean absolutely nothing to them.

It's not that they are uneducated. It is that the knowledge set of the current generation is vastly different from that of previous generations. For example, our younger students are far more technologically sophisticated than the students of ten years ago.

Our abbreviations often reflect the assumptions of an earlier era. And yes, I'm from the pre-space flight, pre-personal computer, pre-ATM era myself.
We should be viewing the "problem" of the public understanding cataloging abbreviations from some different points of view. First, cataloging and catalogers are facing a huge number of highly serious problems right now that many think are far more important than abbreviations. Some of those problems threaten the very existence of catalogs and cataloging. Yet, I shall grant that some of the abbreviations catalogers use can be difficult for the public, but the public has lots of problems using the catalog. Is the "problem" of abbreviations so difficult that it should come before solving the other problems, such as the public understanding of what bibliographic records are, what the headings are and how they work, or the need for catalogers to increase productivity? These are all far more serious than the rather meaningless "problem" of abbreviations. But OK: let's assume that abbreviations do merit attention ahead of the many other problems.

Then comes the second consideration: there are lots and lots of abbreviations people see in the catalog, not only catalogers' abbreviations but also those used by the authors themselves: abbreviations for drugs, for political parties, for all kinds of things. If we are going to say that abbreviations are a problem for our public, and that it is such a serious problem that we must put it ahead of the many other problems facing us, then why are we also ignoring all of the abbreviations used in corporate names and in the rest of the records?

Are we saying that it is only our cataloging abbreviations that people have trouble with?

Third, and most important, what about all of the millions of other records in the catalog that use the abbreviations? I have heard of no plans to convert those records (I certainly hope we don't since that would be a ridiculous waste of our resources) so if we don't update them, the public will continue to see those abbreviations in *every search* they do, and the public will have to deal with them forever.

Everyone simply must admit that changing a rule from one day to the next *will not* keep the public from seeing the abbreviations that catalogers have been using for such a long time. They will *always* have to see them.

Are there real solutions to this kind of problem? I believe automated solutions exist and at any rate should be tried. For instance, it should be child's play to write the display program so that if a string [s.l.] exists in a 260$a field, it can display as "No place of publication" in any language we want. This could be done for every abbreviation in every field since there are so few of them. As for the abbreviations in the rest of the record, a possible solution would be to implement an abbreviation-lookup service that has an API, which could work by letting you run your mouse over an abbreviation and showing you some of the possible meanings.
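To make the display idea concrete, here is a minimal sketch in Python. It assumes the abbreviations sit in the publication field (260 $a and $b in MARC 21) and that the display layer sees each subfield as a plain string. The names LABELS, expand_abbreviations and display_field are invented for illustration, and the Italian wording is only a placeholder for whatever a real catalog would choose.

```python
import re

# Hypothetical display labels, keyed by language code.
# A real catalog would load these from its own translation files.
LABELS = {
    "en": {"s.l.": "No place of publication", "s.n.": "No publisher named"},
    "it": {"s.l.": "Senza luogo di pubblicazione", "s.n.": "Senza editore"},
}

# Matches "[s.l.]", "[S.l.]", "[s.n.]" and so on, with optional inner spaces.
PATTERN = re.compile(r"\[\s*(s\.\s?l\.|s\.\s?n\.)\s*\]", re.IGNORECASE)


def expand_abbreviations(value, lang="en"):
    """Replace bracketed cataloging abbreviations with a readable label."""
    labels = LABELS[lang]

    def replace(match):
        # Normalize "S. l." or "S.l." to the dictionary key "s.l."
        key = match.group(1).lower().replace(" ", "")
        return "[" + labels[key] + "]"

    return PATTERN.sub(replace, value)


def display_field(tag, subfield, value, lang="en"):
    """Expand abbreviations only where they are expected (260 $a and $b)."""
    if tag == "260" and subfield in ("a", "b"):
        return expand_abbreviations(value, lang)
    return value


if __name__ == "__main__":
    print(display_field("260", "a", "[S.l.] :"))
    print(display_field("260", "b", "[s.n.],", lang="it"))
```

The stored record is never touched; only the display changes, so the millions of older records would benefit without any conversion project.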

These would be real, 21st century solutions to the problem. We should not delude ourselves that by simply changing a couple of rules and ignoring everything that came before, records that our patrons will be seeing *every day* in *every search*, is any kind of a solution at all.

The problems will still be there for everyone to see and deal with. Once we recognize this reality, we can decide upon the importance of abbreviations and try to do what is best for the public and for ourselves.

Re: Linked files

Posting to RDA-L

On 22/04/2011 18:33, Karen Coyle wrote:
Quoting "J. McRee Elrod" <mac@SLC.BC.CA>:
So these identifiers link to *inhouse* files? "Shakespeare" once rather than repeated as author, added entry, and subject, in multiple bibliographic records?
Again, separate linking from identification. Identifiers identify. That's what they do. There is great importance in identification for sharing, as we know from library authority work. The difference between identifiers and creating authorities, however, is that authorities in libraries today are represented with text strings, thus making them language dependent. Also, if you wish to change your display you also change the string that identified the entity -- and that breaks a fundamental rule of identification, which is that the identifier must not change.
There is another way of looking at our headings: not solely as textual strings, which is not entirely correct, but rather as identifying something *unambiguously*. This is exactly what our headings are designed to do. An identifier does not have to be composed only of numbers; it can be any string. This is why I have suggested reconsidering our headings *as* identifiers, since catalogers have worked very, very hard for a long time to keep them unique, or unambiguous. We can see how this works in VIAF, which allows you to search by exact name in the LCNAF form for Tchaikovsky's heading, e.g.,%20Peter%20Ilich,%201840%201893%22+and+local.sources+any+%22lc%22&stylesheet=/viaf/xsl/results.xsl&sortKeys=holdingscount&maximumRecords=100,
or for the Swedish form,,%20Pjotr,%201840%201893%22+and+local.sources+any+%22selibr%22&stylesheet=/viaf/xsl/results.xsl&sortKeys=holdingscount&maximumRecords=100

This works for all forms. I think it would only take a change in mindset for this to work.

There is a VIAF API right now, and I would like to try to implement it. Has anyone done so?

Friday, April 22, 2011

Re: Procedural Guidelines for Proposed New or Revised Romanization Tables

Posting to Autocat

On 22/04/2011 16:31, Aaron Kuperman wrote:
The issue we should be discussing is whether we should Romanize at all. Romanization is inherently a problem since a systematic Romanization needs to be based on a single dialect, and there are conflicts with how the language is Romanized in practice. For example, an Arab or Jew who uses English as a second language will Romanize Arabic radically differently than one who was taught French as a second language (e.g. "sh" as opposed to "ch").

While some Romanization is necessary so monolingual Americans can identify the books, perhaps that can be limited to headings under authority control and the short title. As it is, a large amount of the record is Romanized, which is a tremendous waste of cataloging resources, and of no benefit to end users (who, after all, prefer the original rather than the Romanized text).
I agree with this. My own experience is that library technicians need it, along with students of the language who need help. Also, someone who doesn't know the language can often figure out a lot by looking at the romanization, e.g. when I would catalog a translation from e.g. Chinese, lots of times I could get an idea of the original title from the romanization. Of course, I still needed someone to confirm my suspicion!

In these discussions I would also want to study the incredible new possibilities available through tools such as Google Translate, e.g. "Literaturnaia gazeta" in Russian and through Google Translate. I introduced this tool into the AUR catalog, where, if you look in the right-hand column, you will see the Google Translate tool and you can change your language. It still seems like magic!

But still, with these kinds of tools, what is/will be the purpose of transliteration?

Thursday, April 21, 2011

Re: [RDA-L] Linked files

Posting to RDA-L

On 21/04/2011 19:33, J. McRee Elrod wrote:
If locally cached, fine. That's what we did with linked UTLAS files.

But no OPAC should depend on offsite files being available to display a complete bibliographic record, no matter how "redundantly" available. The present NAF is not redundantly available, nor is it likely to be.
This is the very idea of "the web" however. I agree that this is a frightening, "brave new world" we are entering, but it is the new world nevertheless, and I am fully convinced that we either enter the world of "the web" or we do not. In my opinion, we need to enter that world or suffer the consequences. This involves a loss of a lot of the controls that we have grown used to.

This includes the NAF. It is, at least in theory, US government information and therefore, part of the public domain, along with all of the other LC created documentation.

Re: [RDA-L] Linked files

Posting to RDA-L

On 21/04/2011 18:40, Adger Williams wrote:
Let me see if I get this straight.
Ideal "linked data" architecture has links, based on some unique code of some kind.

The links connect to a server for an authoritative national level database.

They also connect to a local image of the part of the national database that the local OPAC needs. (The image gets updated periodically from the national database.)

The local image drives the display for the OPAC.

Is the local image where one might choose to make display choices like choice of alphabet, et cetera? This might also be where the choice of how much of the information from name authority records (addresses or other personal information) would get displayed. Are there other functions that the local image of the national database would serve?
That is one way it could be done, but there are many, many possible architectures, where bits and pieces can come in from anywhere. In the AUR catalog, the individual record display brings in information from all over the place: there is information brought in from Google Books, Worldcat, Google Translate, Amazon, and from a site called AddThis.

Something very similar could happen with the headings, if everything were linked. For example, if I had hooked up VIAF correctly, I could display the French or German headings instead. Or, the patron could decide which format to see.

Again, for this to work, it is important to keep in mind redundancy so that there are backups in case of failure.
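As a rough illustration of what redundancy could mean in practice, here is a small Python sketch. It is only one possible arrangement, not a description of how the AUR catalog or VIAF actually work: the function resolve_heading, the idea of passing in a list of lookup functions, and the identifier "example-id-1" are all assumptions made for the example. Each source is tried in turn, and the local copy is used only when every remote source fails.

```python
def resolve_heading(identifier, sources, local_cache):
    """Return a display heading for an identifier.

    sources: lookup functions tried in order; each takes an identifier and
             returns a heading string, returns None, or raises OSError
             (which covers connection failures and timeouts).
    local_cache: a dict acting as the local image of the remote data.
    """
    for fetch in sources:
        try:
            heading = fetch(identifier)
        except OSError:
            # This source is unreachable; move on to the next one.
            continue
        if heading is not None:
            # Refresh the local image so it can serve as a later backup.
            local_cache[identifier] = heading
            return heading
    # Every remote source failed: fall back to the last known local copy.
    return local_cache.get(identifier)


if __name__ == "__main__":
    def offline_service(identifier):
        raise ConnectionError("service unavailable")

    def mirror_service(identifier):
        return {"example-id-1": "Tchaikovsky, Peter Ilich, 1840-1893"}.get(identifier)

    cache = {}
    print(resolve_heading("example-id-1", [offline_service, mirror_service], cache))
    # With both services down, the cached copy is still displayed.
    print(resolve_heading("example-id-1", [offline_service, offline_service], cache))
```

The point of the design is that the OPAC display never depends on any single offsite file being available at the moment the patron searches.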

Wednesday, April 20, 2011

RE: Conference names : use of annual, etc.

Posting to RDA-L

Amanda Xu wrote:
Adopting and implementing RDA standards and technologies is different from fixing "a broken motorcycle or automobile." In the latter case, you have to replace parts that are not working or need to be modernized. The pipes or wires that connect to the parts may not be durable any longer.
In an ideal RDA implementation scenario, we will extract what is working out of the old engine, transform and load them into the new platform or simply add a plug-in or appliance between the old and new interface if we can't afford the new platform.
If we did right, the end users shouldn't even be aware that such transformation had occurred except some nice UI features. It happened just like when we migrate from Windows XP to Windows Vista or later version.
The new arrival collections will mount directly into the new platform under the new standards and technologies governed by RDA as the universal content model for the discovery of library resources, and the support of user tasks as defined in FRBR/FRAD. That is why we are learning from each other and the RDA toolkit right now.
Backward-compatible tools, e.g. from RDA to AACR2, are needed as well for those libraries that cannot afford the upgrade. This should not be a surprise at all for those of us who have years of experience with the info ecology systems.
Technically speaking, the implementation of RDA is not about fixing. It is about extracting, transforming, loading (a.k.a. ETL), etc. Of course, to do it right, doctor-like diagnosis and analysis of the library's resources, workflows, etc. is critical.
The question being discussed was: what is the purpose of adding the frequency to a conference name? This does not involve extracting, transforming, or loading. I mentioned that I could not see any reason for adding the frequency that would be useful either for patrons or librarians, and I asked: what is broken that this proposal attempts to fix? I pointed out, in fact, that such a rule would very possibly make it more complicated for end users to find conference names and some sort of way would need to be found to bring the variant names of the same conference together. This would be getting really complicated and somebody, somewhere should have to say why we need to add the frequency. Where is the evidence that end users or librarians are having problems that this would fix? If there is no evidence, there is no problem and therefore, nothing to fix. We should not assume that just because there is a proposed rule change in RDA, or in any cataloging rules, that it is either useful or needed. Evidence should be given.

Many of the changes with RDA seem similar in their utility to the end users and to the librarians, at least that's how they seem to me.

I have nothing against extracting, transforming and loading, but this is not what I see the changes of RDA doing. Those would be tasks mainly for systems librarians. Such tasks will demand changes in our formats and some level of interoperability with other systems out there, while questions of abbreviations, eliminating O.T. and N.T., changes in conference names, and others, have nothing whatsoever to do with ETL.

And I must repeat that if the end users don't even notice these changes except for some UI features (that can be debated whether those features will be nice or awful), the usefulness of the final product will be highly questionable. I won't mention again my doubts about the validity of the FRBR "user tasks". There needs to be some major user research and a business plan that makes sense.

Tuesday, April 19, 2011

RE: Conference names : use of annual, etc.

Posting to RDA-L

Hal Cain wrote:
Yebbut-- the hardest problems of achieving consistency usually arise from the inconsistencies found in the resources themselves. Regularizing such inconsistencies will infringe on the principle of representation: there should be a clear match between the resource and how it is described (and, I add, consistency in how we provide access) -- and what searchers bring to the catalogue often starts with a citation, formal or informal, created by someone looking at the resource. You can't get away from the thing in hand (or on screen, etc.) and suppress those inconsistencies.
Some of the wisest advice was given me a long time ago by an unforgettable fellow, who was a member of one of those motorcycle gangs that get violent occasionally. This fellow was pretty nice though and very "colorful". His advice is certainly nothing new to anyone, but it was to me at the time, and it comes back to me occasionally. He said, with a lot of feeling: "If it ain't broke, DON'T FIX IT!" But he did mention that figuring out exactly what is broken on a motorcycle or automobile can be very difficult and can turn out to be completely different from what you thought at first. You fix what is broken, otherwise you may be taking everything apart, changing parts that don't need it and perhaps wind up making the engine run worse than before.

So, I look at the rule changes of RDA, such as this one for conferences and immediately wonder: "What is broken?" I confess that this one is a mystery to me. While I readily agree that members of the public experience problems finding conference names, I can't imagine that adding the frequency to the conference name could be any kind of a "solution". So, the public doesn't need it; I don't think librarians have problems with conferences that would be solved by such a rule. I think most of the problems people have with finding these names (and other authorized forms) have far more to do with the inability of library catalogs (or at least most of them?) to search authority files and bibliographic files at the same time using *keywords*, which is how everybody searches today.

Monday, April 18, 2011

RE: Conference names : use of annual, etc.

Posting to RDA-L

On 18/04/2011 19:34, Adam L. Schiff wrote:
I think what will happen in RDA is that we will create authority records for each conference, rather than one record to represent the continuing conference.
I think you are right, but then our patrons will demand that somehow, these separate conferences all come together. People have plenty of problems already with conferences--one of the worst is the idea that it is a conference *name* and not a conference *title*. I don't know how many hours of my life I have argued this with people, where I have to show what is a corporate body, etc. Anyway, that's why I mentioned the "superwork" idea, but in this case, it would be a "superconference".

It's hard to decide how all of this "superstuff" will turn out though. In my own opinion, it is evidence that something, somewhere is wrong.

RE: Conference names : use of annual, etc.

Posting to RDA-L

Danskin, Alan wrote:
Treating "events" consistently is a simplification of the instructions. The decision to include "frequency" in the name of the event is justified by the principle of representation if the event represents itself as an "Annual Conference" or the "Biennial Festival".
I have been following this discussion with some interest. While I fully agree that this may be a simplification of the instructions, it does seem possible that the final product, i.e. the work of the catalogers and of the researchers who need to find these conferences, could very probably be made more complex. In fact, it could wind up becoming very complex as conference names change from one to another based on frequency.

It occurs to me that perhaps the concept of the "superwork", aimed primarily at serials, will also be suggested for conferences, so that people who are interested can find the various forms of the potentially multiplying conference names.

Wednesday, April 13, 2011

Comment to "more on the point of RDA"

Blog comment to "more on the point of RDA" at Bibliographic Wilderness

First, I would like to say that I have enjoyed our private email exchanges since I have learned so much from them.

My basic reason for suggesting XML over ISO2709 is, in short, that we are going to have to switch to XML someday anyway since XML is the language of the web. There is one use, and one use alone, for ISO2709, and that is to transfer complete records (i.e. catalog cards) from one library catalog to another library catalog. The only difference is that the cards are not printed out now. Therefore, I see absolutely no reason to keep ISO2709 and many reasons to get rid of it. The quickest, lossless method would be to go to MARCXML, but I have admitted this may not be the best for the public, and perhaps MODS or even DC simple would give developers at least something to work with, compared with what they have now, which are ISO2709 records that they need to process in various ways before they can be used. Still, let’s offer many formats instead of only textual ones and ISO2709. Plus, so long as we follow the “roundtrippability” (quite a word!) of MARCXML-ISO2709, we will never be able to grow beyond the limitations of ISO2709, which was designed to create cards.

I believe that for Jonathan though, the problem is not so much ISO2709, but MARC itself, and this is where I think we disagree. This is probably because I am a cataloger. So, for example, he writes: “Where do the instructions go to catalogers to tell them how to fill out this new thing?” and then mentions how difficult it is to use the documentation. I think this is a misapprehension of what the task of cataloging is.

Cataloging is the intellectual task of taking a resource, describing it using standard methods so that the description means the same thing to everyone (the basic idea of ISBD), and then ensuring that description can be found in various ways–also reliably–within a greater intellectual framework. The description and access points are then placed in whatever tool you have for people to use: a card catalog or a database.

Cataloging is *not* filling out forms–this is only the “instantiation” of the catalog record while it is in the process of taking on a realized form, in other words, it is data entry. In FRBR terms, the “catalog record” first takes place only in a cataloger’s mind (work), then data entry (where it becomes an expression), the view of the record in a catalog (manifestations/items). In more realistic terms, I compare it to Plutarch’s example of watching a carpenter work and concluding that the job of a carpenter is to saw boards and pound nails, but in reality, you’ve missed the real meaning of what the carpenter is doing: the carpenter builds homes for people to live in, and places of business for people to work in. Returning to cataloging, it is easy to mix up the task of data entry with the task of cataloging.

Therefore, cataloging is a logical process where one part logically follows from another. For instance, the very first thing you must do when cataloging is to determine something called the chief source of information. This is a strange idea for the untrained, yet the rest of the description and even some access points are based on this. Many times, a different chief source of information can result in a radically different description and access points, so it is important that everyone, as much as possible, start from the same “chief source of information”.

So, while it would be nice to be able to include “everything” on a topic in the rules, e.g. form of name of corporate body, that is difficult to do because so many procedures rely on so many earlier decisions; in this case, the form is normally based on the form found on the chief source of information of one of the body's own publications. From this one example, we can begin to see how cataloging decisions are based on other cataloging decisions, and it should suffice to show that the job of cataloging is not filling out forms.

Still, saying that the problem is with complex documentation seems a little strange to me, but OK, I’ll go along with it. There are many options for documentation today. For example, when I created the Cooperative Cataloging Rules Wiki, Alexander Johanssen suggested that, instead of a wiki, I should use DocBook, which I had not heard of. Although I chose the wiki, I did it because I wanted to get something useful out there for people more or less quickly and did not want to spend time on learning something completely new, but when I saw DocBook, I realized that it was a much better solution since it is completely in XML and so allows complete freedom.

If people believe the problem of documentation is so difficult, perhaps the CCR Wiki community could start thinking in terms of DocBook for the rules and opening it up to the wider community.
Reordering the current rules, bringing them together on a wiki or Docbook-type solution, would solve what you mention. I see no reason for creating an entirely new rule set and changing some rules in weird ways in the process.

In my own opinion, the underlying problem is not with the complexity of the rules, which are complex not to ensure employment for catalogers, but because the materials we receive for cataloging are indeed complex. The *really* hard job is to get people to actually follow the rules in the first place, no matter which ones we choose, thereby assuring some level of standardization, as I discuss in my latest podcast.

But that is another topic.

Tuesday, April 12, 2011

Cataloging Matters Podcast no. 9: Standards, Perfection, and Good Enough

Hello everyone. My name is Jim Weinheimer and welcome to Cataloging Matters, a series of podcasts about the future of libraries and catalogs, coming to you from the most beautiful, and the most romantic city in the world, Rome, Italy.

In this installment, I want to explore the idea expressed more and more often that catalog records need to be Good Enough. What in the world does that really mean and what are the consequences when and if we accept it? Have people already done so?


Even though the debate is as old as libraries and their catalogs, it seems that more and more often in the library literature and blogs, I run across the idea that the quality of catalog records only needs to be “good enough” but there is little discussion about what this “good enough” actually consists of. If you do a Google search with the keywords “cataloging good enough” the result is quite interesting: at least in the results I get, the term “good enough” is almost always juxtaposed with the term “perfection”. This is shown very nicely in the number 1 result for me (which does not mean it is number 1 for everyone), which is the interesting article “An Essay on Cataloging” by Daniel CannCasciato in Library Philosophy and Practice, November 1, 2010 (There are links to this article and everything I mention from the Transcript)

This article offers some excellent quotes:
'The biggest question we have to ask ourselves,' [Jay] Schafer said, is 'What's good enough?' Is a nearly perfect catalog record worth the cost of achieving that goal?' Or, as a different speaker put it, … "At the root of these processes are two powerful beliefs. One: the cult of perfection. And two: cataloging is about how print books are arranged on the shelf..."
In his article, Daniel goes on to say that the question itself is wrong: that cataloging can now be an ongoing process, i.e. the catalog record is no longer a printed catalog card that disappears into a drawer, but modern catalog records are in a publicly accessible database that, if everything is set up correctly, can be accessed later and updated on a cooperative basis, even automatically or semi-automatically. This can happen because modern systems allow people to work together to build resources up gradually to the benefit of all. The author also mentions Michael Gorman:
“To accept the "good enough" rallying cry relegates patrons of today and the future to a lesser status than previous generations, a slight that, as Gorman wrote, might not be noticed for years.”
I would also like to quote from the report “Rethinking How We Provide Bibliographic Services for the University of California” The University of California Libraries, 2005 [available at p. 25], where, under the section “Automate metadata creation” the authors write:
“We must adapt and recognize that “good enough is good enough”, we can no longer invest in “perfect” bibliographic records for all materials.”
While I honestly sympathize and mostly agree with all of these ideas, there seem to be several assumptions hidden. For those who do not accept the idea of “good enough”, the assumption is that what came before the implementation of “good enough” was “better” and that if we do things substantially differently, the results will be for the worse. Yet, those who argue in favor of “good enough”, normally pair “good enough” against “perfection”. As the first article points out, comparing “good enough” with “perfection” is not the only possibility; I will go on to state that it is not even a normal sort of comparison, and ultimately it is not fair. In fact, “good enough” has nothing at all to do with perfection. Yet, what is this idea of “good enough”?

In this installment of Cataloging Matters, I would like to discuss “good enough” a bit more thoroughly to determine what it can mean and what it does mean in other contexts. Obviously, “good enough” has completely different meanings to the various people quoted before, so what is going on? Does “good enough” really mean nothing in particular, or can it mean something much, much more specific?

It isn’t like the concept of “good enough” does not exist in our society; in fact, it exists everywhere we turn. Our society could not exist without “good enough”. But what does “good enough” really mean in this wider sense?

It turns out that it means exactly what it says: that with the specific item, process, or whatever you are talking about, you can be assured that it will indeed rise to a certain level of quality, or in other words, it literally is “good enough”. “Good enough” in this sense, which is also a more normal sense, means standardization, and standardization in turn means reliability, which provides everyone concerned with levels of quality that all can count on.

Seen in this way, standards are a type of guarantee. “Good enough” or anything that follows minimally-accepted standards does not mean that everyone has no choice but to accept whatever the producer feels like throwing at them; it means exactly the contrary: you don’t have to accept it. In fact, there are many options when goods or services come to people that do not meet the minimal agreed-upon standards, but standards also have nothing to do with some vague ideal of perfection.

So, if a manager in charge of a public water plant says that the water quality meets the standards, or in other words, is “good enough”, that manager means something very specific and is saying nothing negative at all. Here “good enough” is not some insider code for “nobody cares”, or a warning that whoever wants to use the water from this plant should beware and is expected either to hold their nose and take their chances, or to filter out whatever yuck goes through the pipes and throw in chlorine tablets before they drink it. It means that the water that goes out of the plant literally is guaranteed to be “good enough” for people to use reliably and safely, as determined by experts in the field.

Why? Because the management at the water plant in essence guarantees that the water conforms to highly specific technical standards. For example, in the transcript you can find a link to the standards about water quality issued by the U.S. Environmental Protection Agency. I won’t quote anything from these standards because I don’t even know how to pronounce most of the words! Still, if the water quality does not conform to these standards, it is considered a highly serious matter. While the actual quality of the water may rise and fall from day to day or hour to hour, there is a limit below which the water quality cannot be allowed to fall.

It also doesn’t mean that all the experts agree with all the standards. Nor do the non-experts all agree. Both of these groups--the experts and non-experts--have all sorts of motivations for their opinions, from professional to ethical to monetary to political to who knows what else. As a result, some of the minimal levels may be highly contentious. Right now, there is a lot of discussion in the press about the safe levels of radioactivity in the drinking water and food produced in Japan. Some experts maintain that there “is no safe level” since DNA can mutate with any level of radioactivity. On the other hand, the U.S. conservative pundit Ann Coulter said that there is a growing body of evidence that radiation in excess of what the government says is the maximum amount we should be exposed to is actually good for us and can reduce cases of cancer. Although both sides seem rather extreme to me, I am no expert and realize that there is probably a difference of opinion, and in view of the Japanese crisis, these differences may flare up again. Nevertheless, that doesn’t mean that those responsible for running nuclear plants should just stop caring about the amount of radioactivity being released and give up trying to follow the current standards that are in place. If they did, I hope they would be punished, and punished very severely.

There are different methods of enforcement and punishment when standards are not followed, for example, if the quality of the water falls too low. The managers of that water plant I mentioned will be very interested in keeping the water quality “good enough” because otherwise, they may wind up fired or find themselves in prison for several years!

Modern society would disintegrate without this level of reliability. I am sure we all want to be able to light our stoves without them blowing up in our faces, to drive our cars without the steering wheels coming off in our hands, or to open a can of corn and not find ourselves in the hospital for the next few weeks with ptomaine poisoning. If any of those things happened to us, we would want to see whoever was responsible punished in some way because the product obviously was not “good enough”.

The number and types of standards are truly amazing. Just considering the standards from one organization, ISO, there are over 18,000 standards! But there are lots of standards organizations, e.g. Wikipedia lists hundreds, but there are many more.

In the library literature however, the words “good enough” have meant something quite different from this use, depending on who is discussing them. For the traditional cataloger (including myself, I confess), “good enough” in librarian-speak actually means “inferior and not good enough”, i.e. precisely the opposite of what it proclaims to be, while for others who are often more administrative or IT based, it seems to mean, “I really don’t care.”

To be honest, I can’t fault those who don’t care. After all, the moment you do care, you automatically find yourself surrounded by an almost impenetrable thicket of hair-raising technical information and jargon that in just a few minutes will force all but the most stalwart screaming from the room. Imagine for a moment that you really cared about the standards for water quality from the EPA I mentioned earlier. How much time would it take you to learn enough just so that you could begin to understand what those standards describe? Then, how much more would you have to learn to have an informed opinion about how good the standards are?

Who wants to get involved in all of that? Yet, if our society cares about the quality of our water, somebody has to get involved, and that is the price of expertise. This is because the final product must be reliable for everyone concerned but the general public should not have to immerse themselves in the details. So, while I expect the welds in the buildings I enter to be “good enough” so that bits don’t start breaking off and falling on my head, and I want the welders themselves to be able to work safely with their equipment, I have absolutely no interest in reading their standards.

But I sure want the welders themselves to know the standards very well and, more importantly, to follow them to the letter. If it turns out that a company does a bad job at welding and their welds break, I actually want that company punished in some way because if there is no enforcement, then it means that companies and welders can do anything they feel like and consequently, whether or not a weld is secure would become a matter of sheer luck. While I have a lot of respect for welders and builders and electricians and mechanics, I do not think it is wise for society to rely solely on their “higher ethical sense of professional responsibility” for the quality of their work. I am sure that many welders indeed have a very high ethical sense; I am just as sure that many others do not. This is why standards exist in the first place: to replace personal trust with genuine, enforceable guarantees. If everyone could see that welders could get away with inferior work and do just as well as the competent welders, the result would be to teach the competent welders that their “higher ethical sense” is useless. If we take away the enforcement, some very important standards that each and every one of us relies on every single day would all become jokes.

In the case of welding, would I prefer those welds to be “perfect”? Sure, but I don’t even know what a perfect weld would look like, and in any case, I think most of us will settle for welds that are “good enough”, that is, so long as they really are good enough.

Therefore, it seems obvious that if libraries want the general public to use their catalogs and catalog records, they must provide some level of reliability. The words “good enough” must therefore mean something just as precise as they do in other professions and should not be compared with some unreachable “perfection”.

Now, I would like to change tack and ask: does this mean that the records catalogers have been creating for the last few decades should be considered the minimal standard, or in other words, that they define what is “good enough”? In addition, the information universe is changing radically, and when practically everything is digitized, which may happen much sooner than we think, what does this portend for the very purpose of the catalog itself, a tool that is becoming less and less understandable to our patrons? The controlled terminology, although people tend to like it when they understand it, is less understood, and in any case, people have to fight with it to make it work since it was designed for a completely different environment. As a result, if people do indeed come across controlled terminology that is useful to them, they do it more by happenstance than anything else. Therefore, isn’t it logical to conclude that it would better serve the needs of the patrons if we were to move the resources away from creating a semi-obsolete catalog record and toward digitizing our resources?

I think this is what is really behind the administrative/IT plaint of “good enough” and I admit it’s a very good question. This is where I think the discussion becomes genuinely interesting.

There is no doubt that what catalogers create must change; it must change not out of some sort of “inherent need” for change, but as adaptations to the fundamental changes taking place in the information environment. If we were living in 1965, there would be no need for any real changes except for normal managerial efficiencies, as occurred with the adoption of ISBD. The pace of change in libraries was much slower back then, and if a specific change took a few years, maybe it was unfortunate, but it was still OK. This was because the only way to get at information during those days was through the catalog, and if people couldn’t find things because of backlogs or whatever, although there might be a little huffing and puffing from a couple of researchers, it was not that big of a problem since people had no choice except to wait until the library got around to it.

This has been completely turned around today since there are other, very attractive choices for our patrons through Google, Yahoo, and many other databases where they can easily find and use some wonderful websites that are not in the library’s catalog. I am sure this must be very confusing for the public and I can imagine they could easily conclude: I just found this wonderful resource on the web using Yahoo and there is no record in the library’s catalog for it. If no record at all in the catalog is “good enough” for these excellent sites, then there seems to be no reason for catalog records for anything digital at all. Besides, using a library catalog today is just plain weird. Conclusion: resources should be devoted to digitizing what is not already digitized; then we would have a real, lasting solution, while creating catalog records is just continuing the practices of the past.

Such thinking is very logical and is accepted by many.

Are there standards in libraries currently? Of course: there are standards for the construction and wiring of the library building; for shelving and how much weight a floor can handle. There are standards for binding, for storage and disposal of chemicals used in conservation departments; standards for accounting and on and on. But these are different from the bibliographic standards those same libraries claim to follow. How are they different? In quite a number of ways, actually.

I have tried to demonstrate that normal standards create a series of technically-defined minimal levels of quality that experts consider to be “good enough”. Of course, there is nothing to prevent any organization from devising products or processes that are almost totally different from all other similar products, so long as what they create meets those minimum levels, or they may rise far above any or all of the minimal levels, but the standards remain in place as guarantees both for the producer and the consumer; the producers can be assured that their products represent a quality product and/or can work with other items, so a producer of television sets does not have to produce electrical cords, but can buy them reliably and safely; while the consumer can rest easily knowing they are buying a reliable product.

Library-bibliographic standards are quite different, however: they seek to create records that are as identical as possible; in essence, library bibliographic standards seek to define a template that determines what the final product will be. True, there are different bibliographic levels, e.g. full, core, minimal, CONSER and so on, but even here, each level is in essence a template, defined by what it will and will not contain.

How could a standard from the normal world, based on guaranteed minimums, work in the bibliographic world? What would such a standard look like? To imagine one, we can consider the current AACR2 rule of three for authors, which, in short, says to create entries for the first three authors of a resource and if there are four or more, make an entry only for the first one. This is very specific. Compare this with the proposed RDA rule that says to make an entry only for the first author, and then, curiously enough, singles out translators, and illustrators of children’s books! The rest are up to “cataloger’s judgement”.

If we relate this to the earlier discussion about normal standards, we can see that this RDA rule defines a minimum level of quality and that a book with three authors and an editor only needs an entry for the first author to be fully compliant with RDA. In effect, the RDA standard says that one author is “good enough”. It is difficult to figure out what else such a rule means. Relying on “catalogers’ judgement” is the same as relying on the “higher ethical sense of professional responsibility” of mechanics and welders. It doesn’t work in the real world; why should it work in the library world?

To make this more like regular standards, the rule could rather say something like, “make entries for at least the first three authors of any resource”. This would allow for a better minimal level of quality, while allowing for all different kinds of variations and additions that any organization would want to add.
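The contrast between the three rules can be put in a few lines of Python. This is only a sketch under stated assumptions: the function names are hypothetical, each rule is reduced to the short paraphrase given above (RDA's special cases for translators and illustrators are left out), and a record is modeled as nothing more than a list of author names and a list of entries made for them.

```python
def aacr2_rule_of_three(authors):
    """Paraphrase of the rule of three: up to three authors all get
    entries; with four or more, only the first author gets one."""
    return list(authors) if len(authors) <= 3 else list(authors[:1])


def rda_first_author_only(authors):
    """Paraphrase of the proposed RDA rule: only the first author is
    required (translator/illustrator cases are omitted here)."""
    return list(authors[:1])


def minimum_floor(authors, minimum=3):
    """The 'normal standard' reading: at least the first `minimum`
    authors are required; anything beyond that is allowed."""
    return list(authors[:minimum])


def meets_floor(authors, entries, minimum=3):
    """True if every required author has an entry. Extra entries
    never cause a failure, since the floor is a minimum, not a template."""
    return all(name in entries for name in minimum_floor(authors, minimum))


authors = ["Adams", "Baker", "Clark", "Davis"]  # illustrative names only
print(aacr2_rule_of_three(authors))
print(rda_first_author_only(authors))
print(minimum_floor(authors))
print(meets_floor(authors, ["Adams"]))
print(meets_floor(authors, authors))
```

For a four-author book, both the rule of three and the first-author rule require only one entry, while the floor requires three and still accepts a record that lists all four.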

How would the idea of minimal levels work in other areas of the record? With subjects, perhaps we could guarantee a certain number of headings for certain types of materials. With description, I would think that ISBD offers lots of areas for minimum levels. Setting up real standards takes work, discussion and compromise. I confess that right now I’m not really sure on many of the specifics, but I have no doubt it can be done since there are standards in place for all kinds of processes, and what’s more: it must be done. In today’s shared information environment, if library rules become more like normal standards and focus on creating minimal levels of quality instead of defining everything that goes in and is left out of a record, I believe the tasks catalogers are facing and their solutions will become clearer.

Therefore, neither AACR2’s maxims of “thou shalt” and “thou shalt not”, which strive to create records that are as similar as possible and where any deviation is seen as a flaw, nor RDA’s rather naive assumption that as much as possible can be left to a notoriously fickle “cataloger’s judgment”, offers a real solution. Normal standards would allow matters to become more flexible while still providing a level of genuine reliability.

The other major difference between normal standards and library bibliographic standards is something less pleasant to discuss: the matter of enforcement. Obviously, few people will agree that catalogers should be led off to jail for messing up a publication date or doing superficial subject analysis. (OK, I’ll admit that there have been a few times when I saw the thoroughly lousy copy records of some libraries and I thought..., but I won’t go there!) Still, there are options that other professions use: for example, there is the process of certification and the need to renew that certification. Certification can be applied in a few ways: to the product, i.e. where each bibliographic record would get some kind of label of guarantee that it reaches specific standards, or there can be professional certification, where individuals are certified and their certification is not forever, but needs to be renewed.

Of course, this would mean a radical change from what exists today. A lot would need to happen before something like this could take place, including getting support from our institutions to help catalogers get and maintain their certification. Still, if we are to change from the empty library concept of “good enough”, which has meanings that range anywhere from “uncaring” to “not at all good enough”, toward the more normal idea of standardization that “good enough” really is good enough, something has to change, and minimum levels of reliability need to be guaranteed in some way for our records.

Finally, if standards are to succeed, they have to be realistic, that is, they cannot be based on wishes. A standard that would mandate all automobiles get a minimum of 250 miles to the gallon would be currently impossible, so standards must be based on what is genuinely achievable.

In this sense, I suspect that even AACR2 may not provide a realistic standard since there have been so many complaints and problems of record quality (I am a highly vocal critic), the difficulty of training and so on. Of course, RDA is not any simpler than AACR2 so if RDA is accepted, exactly the same difficulties will remain.

How have libraries dealt with the problems of records that fall below the accepted level of quality? With the incredibly inefficient method of spending time upgrading the records locally and complaining among themselves. Yet, nothing happens to the offenders and they just keep making the same sub-standard records that everyone is supposed to upgrade. It turns out that anybody can make records, from a student with an hour’s training to a paraprofessional to a master cataloger with 20 years’ experience. Unfortunately, this has resulted in a tremendous number of substandard records needing upgrading. Years of this practice, and now the economic crisis, have forced many libraries to just give up and accept anything they get. Obviously, this is not good for record quality, and the standards we claim to follow become a joke.

One final point: for standards to make any difference, they also have to be aimed at products that people really want. If somebody created detailed standards for the creation of feather pens or wagon wheels, they would have absolutely no importance today.

If libraries are to come up with enforceable standards, they must be aimed at something that people want and need. And now comes the huge question where there is very little agreement: what is the public doing in this new information universe? Does the public find our bibliographical records today useful? If yes, that’s great, but if no, what parts of the record are the most useful and how can they be improved?

In short, it should be clear from what I have said, that I think that the issue of library bibliographic standards is a problem with a very long history that has been mostly ignored, and it is not possible to ignore it any longer. It doesn’t matter if we accept RDA or not, if we decide to stay with AACR2 or even go back to AACR1 or take Cutter’s or Panizzi’s rules. None of it makes any difference if record creators are always able to ignore any standards that exist. In my own opinion, some kind of certification will be necessary sooner or later if there is to be any hope that library catalogers, and their records, will be taken seriously.

It is truly unfortunate though, that these concerns are coming to a head now, when we are facing the serious economic crisis plus the need to fit ourselves into the greater metadata universe. Big changes are coming (as if we haven’t gone through enough change already!) but I feel that something has got to give: either we take the issues of real, genuine, useful, enforceable standards seriously, and just as seriously as they are taken in other professions, or our standards may be headed for the trash can.

Even this does not end the discussion, however. There is also the question of creating standards for an “expert system”, that is, a system not primarily for use by untrained people, but for experts. Experts of all kinds need standards for their own tools as well. For example, there was a recent article in the Globe and Mail about the New York publishing consultant Mike Shatzkin, “Mike Shatzkin in Montreal: Libraries don't make sense anymore”. Mr. Shatzkin maintains that although libraries make no sense in the future, “there will be an ongoing need for librarians, however; their skills will continue to be in demand, as will those of editors.” Of course, librarians are no different from any other profession: in order to be effective, they need specialized tools. A dentist armed only with a toothbrush, toothpaste and a pair of pliers would not be very effective as a dentist. Librarians are just as reliant on their specialized tools, which allow them to “do their miracles” of finding information and resources that the untrained cannot. When a librarian is stuck only with the Google interface, with no controlled vocabulary or structures, they are just as helpless as anyone else... almost.

I would like to discuss this, but I have gone on long enough. Perhaps in a future installment, I will talk about the need to create an expert system aimed not at the general public, but at “information experts” of all kinds.

It has been a bit of time since my previous podcast. There are two reasons for this: I have had some health problems that are now pretty much over, but more significant is the fact that I am no longer Director of the Library at the American University of Rome. I resigned that position to take advantage of some other opportunities. Right now, I have been taken on with the Food and Agriculture Organization of the United Nations as an information specialist, attached to FAOStat in the Statistics Division. After that we’ll see what happens. In any case, I have lots of ideas!

I have decided to close this segment with Palestrina’s Agnus Dei, or Lamb of God. Giovanni Pierluigi da Palestrina worked here in Rome, primarily at St. Peter’s during the 16th century. This is a wonderful example of Renaissance polyphonic choir music.

Incidentally, I can’t resist saying that I think it’s too bad that the Google Books-Publisher agreement fell through, yet I still realize that all of those books will be made available sooner or later; all that has changed is that now we know it will be later. These are the sorts of technological innovations that cannot be stopped forever.

That’s it for now. Thank you for listening to Cataloging Matters with Jim Weinheimer, coming to you from Rome, Italy, the most beautiful and the most romantic city in the world.

[See also: The Truth about Standards (European Committee for Standardization)].


Posting to RDA-L

Kevin M. Randall wrote:
> James Weinheimer wrote:
>> I don't think I am missing the point of RDA, and the abbreviations are a great example. Do we really believe that a simple rule change will "solve" whatever "problems" the public supposedly has with abbreviations in the catalog? Sorry, but I find that very naive.
> Did you read the rest of my post? This response shows that you still do not understand at all. The "simple rule changes" are *NOT* the changes that are significant in RDA. What is significant and has great potential is the entire concept behind RDA, creating a framework that brings metadata into the current age of information technology.
Well, if you insist so strongly that I don't understand, it certainly must be true! :-)

But please bear with me and let me insist that I do understand. The underlying structure of RDA, which tries to envision the FRBR structure, is still something that is highly debatable. First, there is still no evidence that this structure is wanted or needed by our patrons. Every FRBR-type project I have seen is of limited use for our patrons, in my professional opinion, since our patrons are moving away from such things into the world of "search". Certainly people will look at and play with the displays, but there is still no evidence that it provides what they need, especially WEMI. And WEMI is one of the main products we will be making. I think very, very few people need that. The second point, and of course the most important, is the business case that demonstrates that all this is actually worthwhile in the business and financial sense. Both of these points have been advanced over and over throughout the years and certainly didn't start with me.

In any case, I think the library world has to demonstrate some kind of substantive advances, and I think we have to demonstrate them soon since the information world is moving away from us faster and faster, and along with that world goes a lot of the funding. Instead of swallowing the promises of a future Eldorado, the powers that be are starting to ask: what can you do now? This is why I mentioned the abbreviations "problem" and the changes to the Russian headings. We can change everything in our new records but there is still a massive amount of legacy data out there that our patrons will be seeing and working with every day just as much, or probably far more, than with the newer records we create. So, whether it's some completely insignificant rule change about abbreviations, or something bigger with new frameworks and structures, it all comes down to the same thing: our patrons will be working with both every day in every single search they do. This is why I say we have to look at it through their eyes, and not ours. From that point of view, things look much less revolutionary.

Now, we can either convert the older records, or we could place those older records into a separate database, in essence, archive it all. This is one idea that I might go along with: we would then start fresh with a brand-new format, rules, and so on, and the task would be to get the two databases to interoperate as closely as possible. Of course, all this assumes that RDA and FRBR are useful and needed by our patrons AND that they are worth the costs.

I am certainly not saying that I know what people want when they search for information. That can only be discovered after research, especially in times as dynamic as our own. To begin creating an FRBR/RDA structure on the assumption that it provides people with what they need (otherwise why would you create it?), without any evidence for it, is unwarranted. So, the FRBR/RDA structure may be revolutionary and great, or it may just be a continuation of the 19th century structures, placed into the 21st century, which is my own opinion.


Posting to RDA-L

Myers, John F. wrote:
> One could argue interminably the pros and cons of abbreviating or not. I can see merits to both sides, as well as to native language representation of missing date issue. (That is, the replacement of [s.l.] with [place of publication not identified], where [s.l.] replaces the earlier [n.p.] for "no place".) I am however adequately convinced by the machine processing crowd to hold my reservations in abeyance.
>
> The Bible heading changes would happen regardless of RDA -- they were the last proposal to change AACR2 and were rolled into RDA rather than causing a new update to AACR2 in the middle of the RDA development process.
>
> If by "lack of $b in titles" you mean that the "Other title information" element is not part of the core elements of RDA, I would point out, insofar as AACR2 had core elements which I will equate with the "first level of description" articulated at 1.0D1, it is neither a core element of AACR2. The equivalence of the RDA core element set as a "Full level" record is an undesirable possibility, but is a consequence of policy implementation not of RDA itself.
What I am asking is do we honestly and truly think that these tiny, insignificant changes are going to make any difference at all to our patrons? These changes certainly won't help anybody find anything--they are just changes in display: the elimination of O.T. and N.T., spelling out abbreviations, the subfield b. The only possibility of added access is getting rid of the rule of three, but that could just as easily reduce access since the only mandated access point is the first author. (Oh yes! Plus translators and illustrators of children's books!) Will the public suddenly like our records and find them useful after these changes? Why?

We need to look at matters from *their* viewpoint, not ours. Look what they can do using other tools. Some are saying that search is going away altogether. While I don't necessarily agree (and I hope not, as people can hear my views on search from my last podcast), these people, e.g. from Bing, have a *far bigger* voice in the information universe than any library cataloger, or group of library catalogers. We must do something, and something big.

Kevin Randall wrote:
> The questions above indicate that the questioner is missing the point of RDA entirely. Abbreviations could be limited or eliminated entirely by a very simple amendment to AACR2.
I don't think I am missing the point of RDA, and the abbreviations are a great example. Do we really believe that a simple rule change will "solve" whatever "problems" the public supposedly has with abbreviations in the catalog? Sorry, but I find that very naive. To be fair, I was guilty of exactly the same attitude back when the Soviet Union and Eastern Europe fell apart. I and my team at Princeton struggled mightily to fix all of the--who knows how many subject and corporate name headings in the catalog, but we did it. It was one of the tasks I took special pride in and the heading browses looked great!

But then came the retrospective conversion project of the cards, and the "beautiful" displays of the headings were utterly spoiled by being inundated with zillions of obsolete headings! I was so mad until... I realized that what I was looking at was only the reality that confronted our patrons every day. When the patrons used both the cards and the OPAC (which they did constantly of course), everything had always been split, but for me as a cataloger, I was concentrating only on the OPAC and the cards were somehow "outside". I had been ignoring the complete reality of the situation. Suddenly, I was confronted with what the users saw every day. I didn't like it, but it was a humbling moment.

Abbreviations are precisely the same thing: while new records will have their abbreviations spelled out, the old records will not. Our patrons will still have to work with those abbreviations, that is, unless some retrospective project is created, but what a waste of resources that would be! In any case, thinking that making a new rule is going to "solve" a problem, when millions of records that will not be fixed will still be in people's faces every day, in every search result, shows one of the reasons why technical services librarians are often held in such low regard by other library divisions. So many times, technical services people see only what they want to see, just like I did with the Soviet Union/Eastern European headings.

We shouldn't delude ourselves that these insignificant changes (for our users, but not insignificant for us) will make any difference in the scheme of things. As Mac said, the problem is not the rules; the problems we are facing are in other areas.