Tuesday, August 31, 2010

RE: Feedback on RDA

Posting to RDA-L

J. McRee Elrod wrote:
But I do think it is fair to say that much of the feedback has not been heeded.

One of the most important suggestions came from no less than the Library of Congress Working Group on the Future of Bibliographic Control, which suggested suspending work on RDA since it doesn't do what is needed. See: http://www.loc.gov/bibliographic-future/news/lcwg-ontherecord-jan08-final.pdf

I think they summed it up as well as anyone on p. 27-28 [pdf p. 32-33] with a few of my own comments in brackets:

"The Working Group has a number of concerns about the current direction of RDA, concerns that have been echoed by many in the field. Indeed, many of the arguments received by the Working Group for continuing RDA development unabated took the form of "We've gone too far to stop" or "That horse has already left the barn," while very few asserted either improvements that RDA may bring or our need for it.

The business case for moving to RDA has not been made satisfactorily. The financial implications (both actual and opportunity) of RDA adoption and its consequent, potential impact on workflow and supporting systems may prove considerable. Meanwhile, the promised benefits of RDA--such as better accommodation of electronic materials, easier navigation, and more straightforward application--have not been discernible in the drafts seen to date.
[Nor is it discernible now--JW] It is unclear how metadata created according to RDA will align with existing metadata, and how well library and related automation systems will or can handle metadata created according to the new standard. There is dissatisfaction at the apparent abandonment of the ISBD structure. There is distress over the opaqueness of the language used, over the organization of the rules, over formatting decisions (such as appearance of examples), and with perceived difficulty in navigation. Many fear that RDA will be more difficult to use and understand than is the current code, and that this, in turn, will lead to problems with education and training, in addition to increasing the likelihood that the code will not be utilized by anyone outside the library community. Finally, although RDA is being based on FRBR principles, FRBR itself is still evolving [and suspicious in its own right--JW]."

I agree completely with this. Have these issues been resolved? Of highest priority now, given the economic problems, is making the business case, which, to my knowledge, has yet to be done.

Perhaps these questions are indeed coming a bit late in the process, and while that may be regrettable, for the past several years many of us have been dealing with vastly increased workloads, and I don't see them decreasing anytime soon. Still, raising these issues late is certainly better than not raising them at all. Besides, there was this little matter of the economic crash, which has changed a lot of assumptions we may have cherished only five years ago.

No matter what, this is not merely a theoretical argument; it has to do with the future of our profession and the future of our careers. Libraries are certainly not seen as "forward-looking" by the rest of society. We face many huge problems that cannot be ignored indefinitely and will be dealt with one way or another. If we mess this up in a big way now, we may lose the credibility we still have.

Monday, August 30, 2010

Cataloging Matters No. 3: The Functional Requirements for Bibliographic Records, a personal journey

Cataloging Matters, podcast No. 3


The Functional Requirements for Bibliographic Records, a personal journey

Hello everyone. My name is Jim Weinheimer and welcome to Cataloging Matters, a series of podcasts about the future of libraries and cataloging, coming to you from the most beautiful, and the most romantic city in the world, Rome, Italy.

Since my first podcast, which was about FRBR, I have received questions about what FRBR is and several requests to discuss it further, so I conclude that relatively few people feel they understand it. This surprised me, and I wasn’t expecting to talk about anything like this; still, I am very happy to do my best, but I will do it on my own terms. By this I mean that I will seek to describe FRBR as objectively as I possibly can, and afterwards I will provide my own personal opinions about it. Along the way, I would also like to talk about my own personal experiences with it, since I think this may hold more meaning for people and at the same time make it more interesting.

Also, I sincerely hope that this will help more than it confuses. If it confuses you more than before, I deeply regret it. My purpose here is not to give a complete overview of FRBR, but instead to demystify it, and to let catalogers know that FRBR is not all that new from what they are doing now. Ultimately, I hope this discussion may encourage people who want to know more to go through some of the excellent FRBR presentations and workshops that exist on the web. 

I am very active on listservs and have written about FRBR many times, so I will be repeating myself here on several occasions, but not on everything. For those who realize that I am repeating myself, please bear with me, but in this podcast, I will take nothing for granted, except that my listeners are experienced catalogers.

To begin, I would like to outline my personal journey with FRBR, which proceeded through the following stages:
Determination (to understand it)
--Incomprehension (not understanding anything)
----Humiliation (not telling anyone about my incomprehension)
------Renewed Determination (to understand it)
--------Joy (at the first glimmers of understanding)
----------Comprehension (full success)
------------Consternation (the first questions)
--------------Serious questioning
----------------Serious doubts

(By the way, believe it or not, this has turned out to be a twelve-step process, but I must state that any comparisons must stop there, or at least I hope so!)

Before talking about my journey, I believe some background is necessary.

To come to grips with these changes we are facing, it must be accepted from the outset that when people come to a library, they do not come for the librarians. Although they may be friends with the librarians, they actually come for something else: they come for what we have, that is, for our collections, and these people will decide how good a job we do by how well our collections respond to their needs. No matter what else, people come to us for our collections, and not for us personally. Even though this is the way it has been from the very beginning, fundamental changes were set in motion when worthwhile resources became generally accessible through the World Wide Web, and those changes continue. Once we accept that fundamental changes are going on, questions inevitably arise: do we really want to maintain that the library’s collection, that is, the sum total of the information that is available to our patrons, is limited only to those materials that the library has paid for and organized? Is it wise to try to persuade our patrons that this is correct? I think it is not, because they know that this is obviously not true. Therefore, I believe that it is vital for librarians to accept that the very idea of the “library’s collection” has changed forever and, in many ways, has evolved beyond traditional library controls.

Given this scenario, how does the local library catalog fit in? Can it at all?

Additionally, the library, the library’s collection, and the catalog cannot be disconnected, even though people want to do it all the time. Patrons still ask all the time, “Where are your books on business? Where are your books on ancient Greece?” The answer is: everywhere, and that’s why you must use the catalog. If your library only has a shelf or two of books, that may be one thing, but when people come to a library that has thousands or millions of different kinds of items, the only way into all of that is through some kind of tool to find the materials that interest them, that is, through what I call a “catalog”, no matter what form the catalog might take. This “catalog” can be a traditional one--and there have been many, many forms of catalogs throughout the millennia that differ radically from one another--or it can be something very new such as the search engine of Google Books and whatever comes in the future. Nevertheless, there must be some sort of organized, semi-useful way into the collection, whatever form that collection takes, otherwise the collection itself will not be useful.

In fact, the World Wide Web would be useless if there were no tool such as Google, something that does the job of finding resources and presenting them to a searcher in ways that people find useful. If there is no organized, semi-useful way into a collection, people will not use that collection and will go someplace else that will be easier to use, especially in today’s Internet world.

None of this should be surprising and merely reflects the reality of almost any other service we use every day. When people go to a butcher shop, they go not because of the butcher, but because of the meat they will buy and take home. They may be close friends with the butcher and know his family; everyone may agree that he is the best butcher in the city, but if the meat is rotten, no one will buy it, nor should they. Also, even when the meat is good and people want to purchase it, if they encounter terrible difficulties actually getting it, for example, if they have to stand in line for hours waiting their turn, or the butcher is surly and foul-mouthed, they will go somewhere else.

Of course, if there is no other butcher in town, people will just have to accept whatever the butcher decides to give them or become vegetarians, but if a new butcher comes to town, watch out!

Consequently, to get people to use our libraries, our collections must be useful to them, plus whatever tools we provide for finding relevant materials must not present a barrier. If one falls, the other does; both are intimately connected by their very natures, even though many would like them to be separate. Before the existence of the internet and the world wide web, libraries were like the single butcher in town, but those times are over, so we had better watch out!

With that prologue out of the way, I would like to proceed on to my personal journey with FRBR.

When FRBR came out, I was still working at Princeton University, and since I had been responsible for the web presence of the Catalog Department and Technical Services, it became my job to be the “metadata expert” for the library as well. Therefore when FRBR came out, I took on as my personal responsibility the task to learn and understand FRBR as completely as I possibly could: to understand what it was, and to try to imagine how it could be put to the best uses. I was relieved that FRBR had appeared since I was beginning to sense the first changes in the information world and was happy the library field was responding.

When the physical volume came in (there were no decent ebooks in those days!), I got it as soon as I could, checked it out to myself, and began to read it. I confess that I read FRBR from cover to cover very closely (well, I didn’t actually read the index, I only perused the bibliography, but I did read the t.p. verso completely!).

As I neared the end of the book, my anxiety level began to rise. Even today, I remember my feelings very clearly when I closed that book; how I looked up and had no choice but to admit to myself that I hadn’t understood anything at all!

This was a devastating revelation but I had no choice except to admit it. I felt like the dumbest person on the face of the earth and was at a complete loss about what to do next. Of course, I couldn’t tell anyone what had happened, and all I could think of was: to read that horrible book again! It was humiliating, it was awful in every way, but I could see no other choice. I waited a few weeks, and dove in again.

It can be amazing how things turn out. Almost immediately as I began to read it the second time, I realized that... I knew it all already! In fact, there was actually nothing much new in it at all!

Here I would like to pause before I continue, to let those of you who are listening know: if you are a cataloger--and not necessarily even an ISBD/AACR2 cataloger, but if you have worked as a cataloger--you already know much, if not almost all, of FRBR. Remember that. The only differences are some very strange vocabulary and a weird structure of the records, which has some unexpected consequences that can range anywhere from the surprising to--what I think is--dismaying. But in any case, calm down: you already know it and you do it every day right now.

When you see those strange hieroglyphics in the text of FRBR, realize you are seeing what you have always done, only described using completely different terminology and methods. So, when you see things like w1, e2, m3, and so on, this is not nuclear physics although that is certainly what it looks like in FRBR. These are not mathematical formulas, but attempts to describe in a different way what you are dealing with now. You know this stuff.

The main change is that FRBR represents a different viewpoint from the traditional cataloger view, where you start from the item on your desk, describe it, and then fit it into the rest of the collection. (Electronic resources are different and I will try to talk about them in the next installment.) FRBR, on the other hand, starts with the collection and works its way down from there. Let’s see how this works in practice.

Currently, a cataloger starts from the item itself and gradually fits it into the rest of the collection, doing extra work when necessary. This is illustrated by the actual structure of AACR2, which in many ways, follows the workflow of most catalogers. You begin by describing this “thing you are holding in your hands”: finding the chief source of information, then transcribing the title and statement of responsibility, the publisher, paging, and so on. This is the ISBD part of the record, or the manifestation. Once you are done with this, you add any authors’ headings, creating any new headings when required.

If it is then necessary to fit this “thing you are holding in your hands” into the collection more specifically, e.g. let’s imagine you have a book that is an edition of Dante’s Divine Comedy, the cataloger will have to deal with various types of uniform titles, and this is the work. Then, if it is further necessary, e.g. this book has selections from an English translation of the Divine Comedy, the cataloger must do further work with the uniform title (language and perhaps selections, if not something more), and may even have to further distinguish the precise version in hand from other variants: the translator (maybe), maybe an editor. In fact, the cataloger may even have to work with the publication history for printers and dates, and so on. All of this is the expression. Subjects and a call number can be added at various points in this process. Finally, if you have a barcode that relates to the specific “thing” in your hand, plus some other information, this is part of the item.

Of course, if the book you are working with is not a translation or an edition or version of something else in some way, which is the vast majority of materials, you don’t need to worry about a lot of this.

Comparing the current situation to FRBR, it’s best to think in terms of things called “attributes,” which can be thought of as similar to the subfields in MARC. Attributes are the tiniest bits of information, just as they are in MARC, e.g. in the 260 field, there are the subfields a, b, and c, and each subfield needs the 260 tag to be understood. In FRBR, there are no subfields--only the equivalent of fields, but this is no real problem since you can have the equivalent of 260a, 260b, and 260c, that is, where every bit carries all the information it needs to make it independent.

So, imagine all of the different subfields taken out from MARC in this way: 245c, 300b, and so on, plus all of the subfields from MARC authorities, and each of these “attributes” is independent of one another. There are a lot of them. Now, let’s also imagine that the MARC field and subfield codes are put into human-readable language, e.g. instead of 245c it would actually be “statement of responsibility.” Keep in mind also that this is not one-to-one since there are many subfield “concepts” in MARC that are not in FRBR, and there are some concepts in FRBR that are not in MARC. That is one of the purposes of RDA. But anyway, we now have all of these independent attributes. How do we group them?
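
The flattening described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the `LABELS` table, and the function name `flatten` are hypothetical, the sample values are illustrative, and none of this claims to be how MARC or FRBR systems are actually implemented.

```python
# Hypothetical, simplified MARC-like record: tag -> {subfield code: value}.
record = {
    "245": {"a": "Geometric dissections", "c": "Harry Lindgren"},
    "260": {"a": "London", "b": "Van Nostrand", "c": "1964"},
}

# Illustrative labels only; a real MARC-to-FRBR mapping is not one-to-one.
LABELS = {
    ("245", "a"): "title",
    ("245", "c"): "statement of responsibility",
    ("260", "a"): "place of publication",
    ("260", "b"): "publisher",
    ("260", "c"): "date of publication",
}


def flatten(record):
    """Turn each tag+subfield pair into an independent, labeled attribute."""
    attributes = {}
    for tag, subfields in record.items():
        for code, value in subfields.items():
            # Fall back to the raw "245c"-style code when no label is known.
            label = LABELS.get((tag, code), f"{tag}{code}")
            attributes[label] = value
    return attributes


attributes = flatten(record)
for label, value in attributes.items():
    print(f"{label}: {value}")
```

Each resulting attribute carries its own name, so it no longer depends on a surrounding field to be understood; what remains open is how to group the attributes.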

In MARC and AACR2, all of these things are grouped by fields; for example, the 260 a, b, and c, mentioned earlier to create publication information, or the 245 a, b, and c to create title/statement of responsibility. But FRBR groups these attributes in a different way, by creating what are called “entities”. There are three groups of entities. I’ll save group 1 for later because that is the hard part. Group 2 includes the name entities, which pretty much equals what we have today for our name authority records. Group 3 is similar to our subject records as they are now. What is really different is group 1, which is where the work, expression, manifestation, and item exist, those bibliographic concepts we have all come to know and love so much!

To explain Group 1 entities, we need to return to the example I used before. While traditional cataloging starts from the item in hand and works up to the collection, FRBR works differently: with Dante’s Divine Comedy, FRBR starts from the top down, that is: it starts from the work (Divina commedia), then it goes on to the expression (the translation information, the specific edition, or whatever), and only then to the manifestation and then the item, the thing you are holding in your hands. Exactly how the practical workflow for the cataloger will change is still very unclear at this point, at least it is to me.

Still, I think that when you keep this in mind, something like the following example from the text of FRBR will make more sense (you may want to look at the transcript at this point). W means work, e means expression, and m means manifestation, with each entity including its own attributes.

w1 (the first work) Harry Lindgren's Geometric dissections
  • e1 (the first expression) original text entitled Geometric dissections (how it first appeared)
    • m1 (the first manifestation) the book published in 1964 by Van Nostrand in London
  • e2 (the second expression) revised text entitled Recreational problems in geometric dissections ....
    • m1 (the first manifestation of the second expression) the book published in 1972 by Dover in New York

To quickly put this into FRBR public displays, it could be something like this, and here I am using ISBD punctuation:

Geometric dissections / Harry Lindgren.
    • [Book] London : Van Nostrand, 1964
  • Recreational problems in geometric dissections .... [Rev. ed.]
    • [Book] New York : Dover, 1972

In this system, it is easy to see how another edition of Lindgren’s Geometric dissections, e.g. one published in New York by Knopf in 1968, would fit in. It would be a second manifestation (m2) under the first expression of the work, and the patron would see:

Geometric dissections / Harry Lindgren.
    • [Book] London : Van Nostrand, 1964
    • [Book] New York : Knopf, 1968
  • Recreational problems in geometric dissections .... [Rev. ed.]
    • [Book] New York : Dover, 1972

As we can see, more than anything else, FRBR defines the multiple view of catalog records from the top down. Otherwise, it’s very similar to what we do today.
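
For those who think more easily in code, the top-down grouping in the Lindgren example can be sketched as nested Python objects. This is a rough sketch, not a FRBR implementation: the class and field names (`Work`, `Expression`, `Manifestation`, `carrier`, `note`) and the `render` function are my own hypothetical choices, the item level is left out, and the 1968 Knopf edition is the made-up one used in the example above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Manifestation:
    carrier: str
    place: str
    publisher: str
    year: int


@dataclass
class Expression:
    title: str
    note: str = ""
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    title: str
    creator: str
    expressions: List[Expression] = field(default_factory=list)


def render(work):
    """Return a top-down display: work, then expressions, then manifestations."""
    lines = [f"{work.title} / {work.creator}."]
    for expression in work.expressions:
        # Only show an expression line when its title differs from the work heading.
        if expression.title != work.title:
            suffix = f" [{expression.note}]" if expression.note else ""
            lines.append(f"  • {expression.title}{suffix}")
        for m in expression.manifestations:
            lines.append(f"    • [{m.carrier}] {m.place} : {m.publisher}, {m.year}")
    return "\n".join(lines)


e1 = Expression(
    "Geometric dissections",
    manifestations=[Manifestation("Book", "London", "Van Nostrand", 1964)],
)
e2 = Expression(
    "Recreational problems in geometric dissections ....",
    "Rev. ed.",
    [Manifestation("Book", "New York", "Dover", 1972)],
)
w1 = Work("Geometric dissections", "Harry Lindgren", [e1, e2])

# The hypothetical 1968 Knopf edition slots in as a second manifestation of e1.
e1.manifestations.append(Manifestation("Book", "New York", "Knopf", 1968))

print(render(w1))
```

Adding the new edition touches only one list; the work and expression information is entered once and shared by everything beneath it.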

Any cataloger knows that many catalog records carry a lot of the same information, so for example, multiple editions of a certain book can repeat exactly the same title, subtitle and statement of responsibility, if not a lot more, so in a computerized environment, it makes sense that this kind of repeated information is placed one time separately where it can be used when necessary. To a certain point, this is what happens in many catalogs today that use relational database structures for the name, title, and subject headings; for example where the heading for Shakespeare is not typed in 2,000 times, but only one time. When you find Shakespeare’s name in your local authority file, his heading is not actually copied into your record, but a link is made, so that when your record displays, the patron will see his name displayed from this separate authority record, which appears along with everything else on one screen. This is normal database practice, where repeated information is entered only one time. The purpose is to make both maintenance and searching much easier and faster.

FRBR attempts to do something very similar, but extends this practice to the Group 1, 2, and 3 entities. This means that there will be separate records for each of these things. (There is a problem with the concept of the “record” but we will discuss that in another podcast. For now, we will call it a record since the final product is the same.) As we have already discussed, some of these records already exist: Group 2 entities (the name headings) and Group 3 entities (subjects) are pretty much what we have today. What is really different is in the group 1 entities, which posits that there should be separate entities that can be linked to for the work, for the expression, for the manifestation, and for the item. These entities can get a little more complicated since each can have links as well. To continue with our example, if you are cataloging a version of Dante’s Divine Comedy, instead of adding links to Dante’s heading, and maybe the uniform title, you would link to the “work record” for Dante’s Divine Comedy, but this record in turn needs Dante’s heading and therefore the work record would link to Dante’s name in the Group 2 entities through a special “responsibility relationship”. This way, everything could all be imported at one time.
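
The chain of links just described can be sketched in Python as follows. Everything here is an assumption made for illustration: the identifiers (`n1`, `w1`, `r1`, `r2`), the dictionary layout, the `creator_id` field standing in for the “responsibility relationship”, and the sample descriptions are all hypothetical, and the expression level is skipped to keep the sketch short.

```python
# Group 2 (names): each heading is stored exactly once, under an identifier.
names = {"n1": "Dante Alighieri, 1265-1321"}

# Group 1 work record: it links to the name instead of copying the heading.
works = {"w1": {"title": "Divina commedia", "creator_id": "n1"}}

# Records for the things in hand link to the work record (hypothetical data).
records = {
    "r1": {"work_id": "w1", "description": "English translation (hypothetical)"},
    "r2": {"work_id": "w1", "description": "Italian edition (hypothetical)"},
}


def display(record_id):
    """Resolve the links at display time so the patron sees one screen."""
    record = records[record_id]
    work = works[record["work_id"]]
    creator = names[work["creator_id"]]
    return f"{creator}. {work['title']}. {record['description']}"


for record_id in records:
    print(display(record_id))

# Changing the heading once changes every display that links to it.
names["n1"] = "Alighieri, Dante"
for record_id in records:
    print(display(record_id))
```

This is the same mechanism as the Shakespeare heading in a relational catalog, only applied one level further: the bibliographic record points to the work, and the work points to the name.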

The final product will work almost exactly like the Shakespeare heading works today in relational databases, as discussed earlier, except FRBR will also do it for the works, expressions, manifestations and items. I think there are very obvious problems here that will immediately make an experienced cataloger a little suspicious. Still, in theory--and I stress in theory--it can be imagined that such a structure could lead to a great savings in database design and record creation.

Next come more specific relationships among everything we have discussed, so FRBR defines all different kinds of relationships: work to work relationships, work to expression, whole/part expression to expression, manifestation to manifestation, and so on and so on. These get rather involved, but actually, they are no more involved than what a cataloger does every day. So, none of this is really new.

Now come the user tasks, that is, what people want from bibliographic records, both from searching the entire catalog (i.e. multiple records) and from single records. From all of this finally emerge the Functional Requirements, which state that people want to Find-Identify-Select-Obtain specific parts of the group 1 entities (work-expression-manifestation-item), finding them by their group 2 entities (name headings) and/or by their group 3 entities (subjects).

So, while you may want to find bibliographic records by their subjects (e.g. find resources about evolution), you do not want to obtain all of their items, which could number in the hundreds of thousands, if not more. Before deciding which item(s) you want to obtain, you first need to identify certain resources, and then select more precisely what you want. At the very end, FRBR declares that if bibliographic records are to function correctly, they are required to achieve these tasks and thus we have the Functional Requirements for Bibliographic Records. If records do not do this, then they do not function correctly.

As a result, we can still fulfill Cutter’s Objects, or at least most of them!
(this is a quote)
  1. To enable a person to find a book of which either
             (A) the author  }
             (B) the title   } is known.
             (C) the subject }
  2. To show what the library has
             (D) by a given author
             (E) on a given subject
             (F) in a given kind of literature.
  3. To assist in the choice of a book
             (G) as to its edition (bibliographically).
             (H) as to its character (literary or topical).
(end of quote)

I have to admit that when I finally understood all of this, I found it comforting to know that Cutter’s Objects would still be fulfilled. After all, that is what we have all been working toward for over 100 years!

In this segment, I have tried my best to present FRBR as simply and as honestly as I can, and with a minimum of bias. I would like to point out once again that the purpose of this part of the podcast is simply to demystify FRBR and make catalogers aware that they already know most of it, but not to provide a complete description. Still, it is interesting to note that FRBR will provide the same access as we have now, and the emphasis rather is placed toward providing a more coherent view to our patrons than the multiple card displays we have today that pretty much replicate how the card catalog worked. This means that instead of seeing 100 completely separate bibliographic records under Shakespeare’s Hamlet, where each record needs to be viewed to understand what it is, users will be able to see a more comprehensive and coherent view of the range of materials of Hamlet authored by Shakespeare, or materials about Shakespeare’s Hamlet, or still other resources based on Shakespeare’s Hamlet.

I have decided to stop here and save the rest for the next podcast. There is plenty to deal with in this one and I genuinely want others to make up their own minds before I relate the rest of my personal journey with FRBR, including the struggles with my own doubts, my despair and hope. Oh yes! And there is RDA to discuss as well, which is the practical implementation of FRBR.

To sum up, at this point in my twelve-step process, I had gone through several of the steps, and was in the stage of Comprehension and perhaps a little smug with my success. But very soon will come my Consternation stage, and matters begin to snowball from there. There are various reasons why this happened but it’s not that I think I am any smarter than anybody else. Primarily, I think it was because my own horizons expanded far more than ever before: I was living in a different culture; I started working in different organizations that didn’t use MARC, AACR2 or even ISBD; I began to work more with computers and saw the power of other formats, and I learned a lot when I began to do reference work with the public. But perhaps most important of all, I examined myself and saw that what I had always done with information had begun to change in fundamental ways.

The music I would like to close with is a short piece by Luigi Boccherini, the passacaglia from his “Night music from the streets of Madrid.” This recording is from the Internet Archive, and as before, there is no information about the performance itself, but this time, I am including the entire piece since it is so short, and so much fun. http://www.archive.org/details/Boccherini

That’s it for now. Thank you for listening to “Cataloging Matters” with Jim Weinheimer, coming to you from Rome, Italy, the most beautiful, and the most romantic, city in the world.

Thursday, August 26, 2010

RE: Elementary errors (Was: Rule of three--gone??)

Posting to Autocat

Bryan Campbell wrote:

Given, as you mentioned, that "the rule of three is beaten into our heads so mercilessly," I would think more comments would see this "error" as a brief and welcome break from current tradition. We have complained for years in this forum about the rule of three; in this case, there should be more cheering, less jeering. Sure, it's an error under the current rules, but honestly, it's not one that I would worry much about because no real harm results from retaining the additional names. If it relieves one's anxiety, switch the 100s to 700s, validate the names, and move on.

As I see it, the real error in practice is that we even allowed the rule of three to persist for so many years. Assuming RDA is implemented, I fear that because "the rule of three is beaten into our heads so mercilessly," too many catalogers will gravitate toward the RDA option that allows them to continue it in practice.

Concerning the rule of three and the problems of maintaining high standards, I think eliminating the rule of three may actually make things more complicated. What about dictionaries? What about wikis? What about serials? What about materials put out by international agencies with 20 or so corporate names on them? I'm sure others out there could come up with other materials. As I wrote in an earlier post, "The number of exceptions to the rule would doubtless skyrocket and the practical cataloger would probably rely on the rule of 'do as many as I feel like doing today, based on the amount of other work waiting on my desk and, let's face it, I had a hard night last night.'" As a result, I believe that something that seems as simple as eliminating the rule of three actually winds up *eliminating* a standard.

Still, none of this means that we can't simplify matters tremendously, increase access *to a point* and not hurt our productivity too much, which I think should be of the highest importance today.

But getting back to the topic, I would also like to believe that current standards are not too high, but reality may be showing us something different. If standards in any field are to work, they must be practical to implement. For example, there must be trained people. The standards must be readily available. You should also be able to discuss problems. There should be retraining made available at regular intervals. All of this requires funding (except for making the standards readily available, which today *can* be achieved for free over the internet if something like the cooperative cataloging rules were implemented; perhaps even the need for discussion of problems could be met if Web 2.0 tools were used).

If there is no funding available for following and maintaining standards, or only a small, restricted part of the field can afford it (i.e. research libraries and other special institutions), the standards *cannot be implemented*, and it doesn't matter if these standards are in food production, health care, building construction, or anything else. We must ask: what is achievable practically, especially in today's increasingly austere financial climate? For example, it would be great to enact a standard that says that all automobiles should get at least 200 miles to the gallon, and maybe one group of people out there can achieve it. But if it can't be implemented generally in a practical and financially viable way, that standard will get ignored.

I believe that the quality of records is lower (although I would be happy to be proven wrong), and does anybody out there really think that RDA will be simpler?

Elementary errors (Was: Rule of three--gone??)

Posting to Autocat

I don't know about everyone else, but it seems as if the number of extremely elementary errors, such as this one, is rising lately, and I have tried to figure out why. I mentioned this in a posting to NGC4LIB, and I suggested a few possible reasons:
1) inadequate training
2) adequate training, but inadequate time to do a decent job
3) adequate training, perhaps with adequate time, but low morale to do a good job
4) one that I wonder about: are the current standards simply set too high?

I'm sure there are other reasons as well. Also, whenever I have seen these sorts of elementary errors, I worry that the person may be doing it on everything, and who knows what other mistakes they may be committing? No matter what, though, the rule of three is beaten into our heads so mercilessly that I find it difficult even to imagine that someone could make a mistake on it. After all, I am sure I am not the only cataloger to have been relieved, on seeing a "nightmare" personal name or the translated name of some foreign corporate body, to find that there are four names, so that the rule winds up saving a lot of work.

And then, I ask: I haven't heard anyone argue that RDA is simpler than AACR2. If there are these obvious bloopers now, and I--at any rate--believe their numbers are rising, what can we expect when the added complexity of RDA comes in?

Tuesday, August 24, 2010

RE: all catalog queries are reference questions, but not all reference questions are catalog queries

Posting to From the Catalogs of Babes

I just saw this comment here from Shawne Miksa (http://catalogsofbabes.wordpress.com/2010/04/28/all-catalog-queries-are-reference-questions-but-not-all-reference-questions-are-catalog-queries/#comment-684) about one of my postings, where I stated that the “local collection” must be redefined:

“James and I have gone ’round on this issue before on that listserv. This whole issue is misunderstood. To focus on the local collection does not automatically mean the librarian or cataloger or whoever is thinking narrow-mindedly. It’s rather insulting and ignorant, in my opinion, to even suggest it.”

and later,
“Tell me the logic in declaring a librarian isn’t thinking broadly just because they may still maintain and enhance a local catalog. I do not accept the argument that a library catalog has to be entirely redefined simply because we have a tool that allows access to resources outside the library. I do not say “re-engineered”, just redefined. It is always good to improve a useful tool to get the most efficient outcomes, but that does not mean we have to throw out the baby with the bathwater.”

While I don’t want to be insulting, although I will be the first one to confess that I may be ignorant, to me, the “catalog” must give access into something, and that “something” has always been the local collection. When that collection has changed in fundamental ways, or technical improvements arose, the libraries and their catalogs have changed in reply. So, when photographs began, or later when computer files began to be created, the catalog changed. When a collection takes on a huge, other collection, the catalog changes as well to contain the contents of that other collection. For example, if a library takes on an entirely new collection, such as a children’s collection of 100,000 items, or if the manuscripts division takes on the entire archives of some defunct organization, creating the catalog, or working it into the collection somehow, can take a long, long time. Naturally, OPACs have had their impact as well.

With the world wide web materials, the very nature of a local collection has changed fundamentally, and it is out of our control in many ways. For example, if I am interested in Panizzi and his catalog at the British Museum, I probably want to see a copy of his catalog. Perhaps I can find it through my local catalog, but probably *only* if I am at one of the great research libraries (which most people are not). Several of these items probably would not be available through ILL because they are fragile, so I may have to pay for microfilming and/or scanning, plus pay for the shipping. My library would probably not pay for all the costs.

I would have to want this information very badly indeed, and would probably forget it because of the price. But today if we look beyond the local catalogs and Worldcat, I know that I can even download my own copy at http://books.google.it/books?id=cE0MAQAAMAAJ, and not only that, there is this book that is critical: http://books.google.it/books?id=hS_RgCVnUwQC plus lots more out there. And it costs me, and the library, nothing. This demonstrates that the “collection of information” available “locally” to my users has changed, and changed irrevocably. And thus, we have a new type of “local collection”.

No library catalog does this now that I know of. So far as I am concerned, this is in everyone’s interest: certainly the users’, the reference librarians want this, and the CFO would love it, especially in this time of tight budgets. But for the cataloger it would be a nightmare: adding all of these resources (and Google books certainly does not represent everything) to each of the local catalogs would be an extraordinary amount of extra work for the catalogers, who would buckle under the strain.

If the only solution is to expect everyone to search multiple times in multiple catalogs, we must admit that this simply will not happen. Such so-called solutions are not solutions at all but backward thinking. Why? Mainly because we need to think in more global terms: once we stop concentrating on the idea of multiple local catalogs with multiple records, all kinds of new options open up.

Thursday, August 19, 2010

RE: Google/Verizon policy framework

Posting to NGC4LIB

I have just come back from a short vacation and am starting to go through some of the previous posts, so pardons in advance if I rehash some older discussions. This thread caused a lot of debate and I can't find that my own opinions on this came out anywhere.

To me, it is obvious that this agreement is an effort to stop Net Neutrality and to "enclose the commons" as I mentioned in my latest podcast. To me, to argue the opposite simply does not make any sense, so any arguments can only be that either Net Neutrality is not a good thing, or that it needs to be "adjusted" for more "important" uses, such as the example I read of offering health services, like wireless cardiac monitoring. After all, Net neutrality could obviously get in the way if your doctors, who need your readings which will save you from a possible heart attack, have to wait while some unknown person out there downloads the latest burping baby video on Youtube. I also read the example of the streaming 3D opera performances from the Met, so that people should be able to get this "higher-class" offering before some lonely guy out there can watch his porno movie.

To me, these examples are simply inane.

Since I am not an expert in these things, I cannot say that a change *only in the last part*, that is, the ISP connection to the larger internet (which the agreement says will not be changed from Net Neutrality), will make much difference in how fast a 3D performance from the Met would come through, although the importance of this agreement seems to indicate that changes at that point would make that much of a difference. Nevertheless, it does seem to me that if some things will come through noticeably faster, then other things are definitely going to come through far more slowly and therefore, somebody, somewhere is going to have to wait. This is what I think is far more interesting and what librarians should concentrate on: not on the things that are easier to get, but everything else: what will be harder to get.

So, if I had a Verizon account with Net Neutral access (which I imagine I would be able to buy) and another person has the one for the 3D opera, it would seem to me that I would still have to wait for the dude to watch his opera before I can read my news or my email, unless perhaps I were willing to pay more for faster access. Rupert Murdoch's materials will be much easier to access than Democracy Now. It's important also to keep track of reality and although it may be sad, it is nevertheless obvious that 3D porno will be vastly more popular than 3D opera. In this agreement, I ask: who are the people that will have to wait, and exactly what will they have to wait for? Is this what people want? How does this fit in with the traditional values of librarianship?

Wednesday, August 18, 2010

RE: comprehensive bibliographic database of "open" resources?

Posting to open-bibliography

Additionally, with the WorldCat Rights and Responsibilities for the OCLC Cooperative (2 June 2010) at http://www.oclc.org/worldcat/recorduse/policy/default.htm,
they begin by stating:
"The purpose of the policy is to define the rights and responsibilities associated with the stewardship of the WorldCat bibliographic and holdings database by and for the OCLC cooperative, *including the use and exchange of OCLC member-contributed data comprising that database*." (my emphasis-JW)

Later in the "Glossary" section, we find something even more interesting:
"WorldCat Data. For purposes of this policy, WorldCat data is metadata for an information object, generally in the form of a record [my emphasis-JW] or records encoded in MARC format, whose source is or at one point in time was the WorldCat bibliographic database.

You have received WorldCat data when (1) you have extracted it directly from the WorldCat database using one of OCLC's services for members (e.g., Connexion, WorldCat Cataloging Partners, CatExpress, the OCLC Z39.50 Cataloging Service, Batchload services) or under the terms of a non-member agreement with OCLC; or (2) you have extracted it from an online catalog or another source to which extracted WorldCat data has been transferred or made available.

Identifying WorldCat as the source of data that has been transferred or made available downstream of the initial extraction from WorldCat can sometimes be complex. A combination of the following data elements in a bibliographic record can help determine if the record was initially extracted from WorldCat:

* An OCLC Control Number along with
- the 001 field that includes value characters "ocm" or "ocn" and/or
- the 035 field that includes the value "(OCoLC)" and/or
- the 994 field"
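
To make the quoted test concrete, here is a minimal sketch of how someone might check a record for the data elements the policy lists. It is only an illustration, not anything OCLC publishes: the function name `worldcat_markers` and the plain-dictionary record layout (MARC tag mapped to a list of field strings) are assumptions made for the example, and reading "includes value characters" as "begins with" is likewise an assumption.

```python
def worldcat_markers(record):
    """Return which of the policy's listed data elements appear in a record.

    `record` is a hypothetical plain dict: MARC tag -> list of field strings.
    This is an illustrative layout, not the format of any real MARC library.
    """
    found = []

    # 001 field whose value carries the "ocm" or "ocn" characters
    # (assumed here to appear as a prefix of the control number).
    if any(v.strip().lower().startswith(("ocm", "ocn"))
           for v in record.get("001", [])):
        found.append("001 ocm/ocn prefix")

    # 035 field that includes the value "(OCoLC)".
    if any("(OCoLC)" in v for v in record.get("035", [])):
        found.append("035 (OCoLC)")

    # Presence of any non-empty 994 field.
    if any(v.strip() for v in record.get("994", [])):
        found.append("994 present")

    return found


# Example: a record copied from another library's catalog that still
# carries its OCLC control number in the 001 and 035 fields.
example = {
    "001": ["ocm12345678"],
    "035": ["(OCoLC)12345678"],
    "245": ["An example title"],
}
print(worldcat_markers(example))
```

A record that returns an empty list shows none of the listed elements, which says nothing certain about where it came from; the policy itself only says these elements "can help determine" the source.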

I think all this must be read together. Even though your library is not, and has never been, an OCLC library, you may still be in possession of what is defined here as "WorldCat Data" and therefore subject to this policy. This also clearly includes single records. Although I am not a lawyer, from what I read here, it seems that once something has touched OCLC in any way at all, and no matter what you have done with it, OCLC claims ownership (i.e. that it is WorldCat Data) and that it falls under this policy.

How does this deal with, e.g., a record created by the Library of Congress, perhaps even as CIP (i.e. public domain), and then downloaded and updated by another library? If, finally, I were to take this record directly from, e.g., Yale through Z39.50 and update it myself, then according to this the record would still fall under OCLC's policy.

Would this hold up in court? Who knows? Although I am a fan of many of OCLC's projects and services, this seems to be overreaching to me. And as Karen points out, it could be that metadata, even including subject analysis and standardized headings, are considered to be facts and are not subject to copyright.