Wednesday, June 2, 2010

RE: AACR2 and RDA sample records from LC

Posting to Autocat

Myers, John F. wrote

<snip>
Isn't this a bit of a strawman? I cataloged for over a decade, and successfully so, before I finally saw MARC in its raw form as cited below. The only reason I came to learn that MARC was not the nicely formatted display in OCLC or my ILS was that a co-trainer trotted out this raw form to have fun with the students in our class. And yes, it is a little bit scarier and a lot less readable in raw form than a MARCXML rendering. But having recently compared records in the two formats, it is clear that neither is serviceable without an interface.
On top of that, MARCXML requires more space (on the order of 10 times more). After comparing the two formats, I am even more impressed by the genius of Avram's efforts. There are valid arguments for the obsolescence of MARC, but indecipherability in its raw form is not one of them.
</snip>

I wish you were right, but the fact is that MARC in its ISO2709 format is used for *record transfer* (i.e. as a communications format), and this has lots of consequences. It is the format that others get when we share our records, whether we or they like it or not. Certainly they can go through the hassle of parsing it, but they ask: why should they have to jump over hurdles? And they won't do it. Yes, if you have the correct software, such as in library catalogs, everything gets parsed out, but not everybody has, or wants to have, this parsing software. That's one reason why it's all obsolete. From that interesting article mentioned by Allen, "If Libraries had shareholders", it's clear that if we want to join the larger world of information, we will be the ones who will have to change, not everybody else. Others are not standing around helplessly waiting for us. If we don't change, then our materials--and consequently our libraries and librarians themselves--will increasingly become a backwater.

But as I showed, even if somebody has the option, understands it, and chooses the MARCXML format, the fixed field information still needs to be parsed, and still nobody is going to do that.
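To make the fixed-field point concrete, here is a minimal Python sketch. The 008 value is invented for illustration and does not come from any real record, and the helper name parse_008 is hypothetical. The positions used (07-10 for Date 1, 15-17 for place of publication, 35-37 for language) follow the MARC 21 layout of the 008 field. Even inside MARCXML, the 008 arrives as one undivided 40-character string, so somebody still has to slice it by position:

```python
# A MARCXML <controlfield tag="008"> is still a single 40-character string;
# each piece gets its meaning only from its position.
# NOTE: this value is made up for illustration.
SAMPLE_008 = (
    "100602"      # 00-05: date entered on file (yymmdd)
    + "s"         # 06: type of date (s = single date)
    + "2010"      # 07-10: Date 1
    + " " * 4     # 11-14: Date 2 (blank here)
    + "nyu"       # 15-17: place of publication
    + " " * 17    # 18-34: material-specific positions (left blank here)
    + "eng"       # 35-37: language
    + " "         # 38: modified record
    + "d"         # 39: cataloging source
)


def parse_008(value):
    """Slice a few common positions out of an 008 string (hypothetical helper)."""
    if len(value) != 40:
        raise ValueError("an 008 field must be exactly 40 characters")
    return {
        "date_entered": value[0:6],
        "date_type": value[6],
        "date1": value[7:11],
        "place": value[15:18].strip(),
        "language": value[35:38],
    }


print(parse_008(SAMPLE_008))
```

Nothing in the XML markup itself says where the date or the language sits; that knowledge lives entirely in the consuming program.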

As I wrote in my original message, there are a hundred limitations held in that horrifying list of numbers:
01142cam 2200301 a
450000100130000000300040001300500170001700800410003401000170007502000250
009204000180011704200090013505000260014408200160017010000320018624500860
021825000120030426000520031630000490036850000400041752002280045765000330
0685650003300718650002400751650002100775650002300796700002100819

These incomprehensible numbers define the record--not only how long the record as a whole is, but they define the length of each field, and where each starts and stops. If you change a field, e.g. to add "written by" in a 245$c, almost all of these numbers have to change as well.
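As a rough illustration, the following Python sketch reads the numbers quoted above as an ISO2709 directory: after the 24-character leader (which ends in "4500"), each 12-digit entry is a 3-digit tag, a 4-digit field length and a 5-digit starting position. The function names are hypothetical, and the 11 extra bytes for "written by " are an assumption about how the 245 would be edited. It is a sketch of the bookkeeping only, not a full MARC writer:

```python
# The four wrapped lines of digits, copied from the record above.
RAW_LINES = [
    "450000100130000000300040001300500170001700800410003401000170007502000250",
    "009204000180011704200090013505000260014408200160017010000320018624500860",
    "021825000120030426000520031630000490036850000400041752002280045765000330",
    "0685650003300718650002400751650002100775650002300796700002100819",
]
# The first four digits ("4500") are the tail of the leader, not the directory.
DIRECTORY = "".join(RAW_LINES)[4:]

LEADER_LENGTH = 24


def parse_directory(directory):
    """Split the directory into (tag, field_length, start_position) tuples."""
    entries = []
    for i in range(0, len(directory), 12):
        chunk = directory[i:i + 12]
        entries.append((chunk[0:3], int(chunk[3:7]), int(chunk[7:12])))
    return entries


def grow_field(entries, tag, extra):
    """Recompute the directory after the first `tag` field grows by `extra` bytes."""
    result, shift, grown = [], 0, False
    for t, length, start in entries:
        start += shift              # every later field moves along
        if t == tag and not grown:
            length += extra         # the edited field itself gets longer
            shift, grown = extra, True
        result.append((t, length, start))
    return result


def record_length(entries):
    """Leader + directory + directory terminator + fields + record terminator."""
    return LEADER_LENGTH + 12 * len(entries) + 1 + sum(e[1] for e in entries) + 1


entries = parse_directory(DIRECTORY)
updated = grow_field(entries, "245", len("written by "))
changed = sum(1 for old, new in zip(entries, updated) if old != new)

print("record length:", record_length(entries), "->", record_length(updated))
print(changed, "of", len(entries), "directory entries differ")
```

The length of the 245 changes, every field after it gets a new starting position, and the record length in the leader has to be rewritten as well.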

This is why doing global changes was so terrifying: so many things needed to change. Take changing Russian S.F.S.R. to Russia (Federation) (one I remember well): the former took 16 positions and the latter takes 19, so the definitions of the fields in these numbers all have to change as well. As a result, changing a MARC ISO2709 record is very complex. Global changes in particular demand some serious computer resources, and in the old days there was a real possibility of a crash in the middle of one, which made you shudder even to think about. Therefore, you would just do it one by one, as we did.

The new formats get around all of this and add a wealth of possibilities we haven't had before. My argument is that although almost no library catalog stores these records in this format, we still *transfer* them in this format, and as a result we are all still stuck with those same ISO2709 limitations, while everyone still needs special software to do anything with them. This automatically cuts us off in many ways from the modern information community, who may want to work with our records but obviously refuse to use them in this format. And you can't convince them that MARCXML is much of an improvement.

One huge advantage of XML over ISO2709 is that bits and pieces can be taken out instead of the entire record. Someone could get a list of just titles for a mashup without having to download entire records, parse them all, extract the titles and display the results, or could easily get basic info for an RSS feed, along with a link to the record if people want to see more. In other words, with other formats you can work with information live, and browsers are even designed to work with many of these formats independently. With ISO2709, there is a lot of work to do first before you can use it in any way at all.
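A minimal sketch of the "just the titles" case, using only Python's standard library. The two records are invented sample data and the function name titles is hypothetical; the namespace is the standard MARCXML one. This assumes the XML is already in hand, and how it would be fetched from a catalog is left open:

```python
import xml.etree.ElementTree as ET

# NOTE: invented sample records, for illustration only.
MARCXML = """<collection xmlns="http://www.loc.gov/MARC21/slim">
  <record>
    <datafield tag="245" ind1="1" ind2="0">
      <subfield code="a">Example title one /</subfield>
      <subfield code="c">by A. Author.</subfield>
    </datafield>
  </record>
  <record>
    <datafield tag="245" ind1="0" ind2="4">
      <subfield code="a">The second example title.</subfield>
    </datafield>
  </record>
</collection>"""

NS = {"marc": "http://www.loc.gov/MARC21/slim"}


def titles(xml_text):
    """Return the 245 $a of every record, ignoring everything else."""
    root = ET.fromstring(xml_text)
    path = ".//marc:datafield[@tag='245']/marc:subfield[@code='a']"
    return [el.text.strip() for el in root.findall(path, NS)]


for title in titles(MARCXML):
    print(title)
```

A generic XML library and one path expression are enough here; no directory arithmetic and no MARC-specific software is involved.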

I personally thought that OAI-PMH was to be the solution, but it was nixed by the new information behemoth, Google, and so we have to find another solution.

And concerns over length of record are unimportant today, especially for a communications format. The storage format will always be different, and may even vary from database to database.

To me, it's obvious that changing the rules to RDA will change absolutely nothing in this scenario. Abbreviations? Capitalization? Changing to more modern formats, however, would be one important step toward bringing us into the modern world of information exchange. But it is only one step. There will be many more if we want to change those frightening trends in library usage shown in "If Libraries had shareholders" and in other places, too.
