Friday, October 29, 2010

Re: Displaying Work/Expression/Manifestation records

Posting to Autocat
Hal Cain wrote:
<snip>
I am astonished to find FRBR described as a "19th century view" -- though I guess I may be grateful to have been spared that absurd "card catalogue" slur that appears from time to time! "Work" is a concept implied by the organization of some 19th-century catalogues; against that, the FRBR distinction between manifestation and item (which maybe defies justification in the context of electronic documents automatically assembled to be downloaded from a source for consultation) is not to be found in the 19th century, nor throughout most of the 20th -- all too frequently in my garnering of retrospective records I find copy-specific or institution-specific information within the shared record; that of course I can agree is, or is a legacy, of card-catalogue practice (or book-catalogue practice maybe), when sharing of records was not a criterion for record content.
</snip>
Sorry Hal, I can't agree with this. Considering manifestation/item, our predecessors had plenty of those in their catalogs. Take a look here at "The catalogue of books in the Boston Public Library in Franklin Place" (1844), and see how many places they note "another copy" http://tinyurl.com/3xhahmx. You can also see "another set". In this regard, I remember reading complaints from people from that time, who said they were sick of looking at the words "another copy" over and over again.

By the way, I noticed several instances where catalogers back then treated as copies what we would consider different editions, e.g. at the bottom of p. 252 http://tinyurl.com/3anc24f: Smollett's "History of England," both in 5 vols. but with different publication information. The one just above that, however, appears to really be another copy.

They certainly knew about editions and copies theoretically as well. Jewett gives a nice discussion of work/edition/copy in his "On the construction of catalogues of libraries," on p. 11+ at http://tinyurl.com/3ab2a2v.

The work, as you say, was handled through organization of records, and we can see this in printed catalogs, too. For example, if we take "Index to the catalogue of books in the Bates hall of the Public Library of the city of Boston" (1866) http://tinyurl.com/2v8wojh and look under Cicero (p. 115+), we can see how the records are arranged to bring the works together. Again, in a printed catalog, there is no reason for printed headings as we have with cards and in our records today because they are different tools. Only a single heading is necessary, e.g. under Cicero, M.T., we see "Philosophical works, etc."

This all looks very nice and handy, but transferring these methods to the card catalog had troubles. If we look at Princeton's scanned card catalog http://library.princeton.edu/catalogs/supplementary.php, look up Cicero, click on his name, and we can see that the arrangement of the cards began with Works, then Selections, then Correspondence, etc. before going to the individual works, just like in a printed catalog. So, if you click on Speeches, you will see collections in Latin, then English, and then other languages.

In order to use this very handy arrangement however, people had to look at lots of individual cards (i.e. unit records) often in languages they didn't know or care about at all. Additionally, subjects are sprinkled throughout this arrangement (criticism of his speeches as a whole, criticism of individual speeches, etc.), while subjects for Cicero himself are placed at the very end of all of his works. Such an arrangement is highly complex and this is why many libraries opted for separate name and subject catalogs, but both had problems. The catalogers understood all I have mentioned, but realized that they couldn't have it all, at least not using the tools that they had.

Still, once you got the hang of it, the card catalog was not such a bad arrangement because it made a lot of sense, although still clearly inferior to the printed catalog display. But when the OPAC arrived, the whole thing broke down completely because of the mindless alphabetical arrangement in the OPACs. So, instead of Cicero's Works coming at the beginning of the browse followed by Selections, Correspondence etc., today they display in alphabetical order: Works come only under "W" where no untrained person would ever think to look; Selections under "S" and so on. Just as bad is the arrangement of subjects, e.g. "United States--History" where Queen Anne's War comes after the Civil War, History, Military and so on and on, which is just ridiculous. Certainly, things need to be fixed.
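To make the difference between the two arrangements concrete, here is a minimal sketch in Python. The list of titles and the ranking table are invented for illustration and are not taken from any filing rules or from an actual OPAC; the point is only that the card-style order is a sort with a custom key rather than plain alphabetical order.

```python
# Hypothetical filing ranks for collective titles under an author heading.
# The names and ranks are illustrative assumptions, not real filing rules.
COLLECTIVE_RANK = {"Works": 0, "Selections": 1, "Correspondence": 2, "Speeches": 3}


def filing_key(title):
    """Collective titles file first in a fixed order; individual works follow alphabetically."""
    rank = COLLECTIVE_RANK.get(title, len(COLLECTIVE_RANK))
    return (rank, title.casefold())


# Made-up sample of titles filed under one author.
titles = ["De officiis", "Works", "Brutus", "Selections", "Correspondence", "Speeches"]

# What a purely alphabetical browse produces: "Works" falls to the end.
alphabetical = sorted(titles, key=str.casefold)

# What the card arrangement produced: collective titles first, then individual works.
card_style = sorted(titles, key=filing_key)

print(alphabetical)
print(card_style)
```

Nothing here requires changing the records themselves; only the sort key differs.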

I understand this and how it came about. I also have *great* respect for those people who did all of this, but I have to ask: those nice printed displays we saw above for the works of Cicero: is this mainly what people want today? Of course, we should not forget that today, the number of materials by and about Cicero will be far more massive than the relative pittance we see in the catalog from 1866, when that catalog above was printed. That is why I like to point to Fiction Finder, which unfortunately no longer operates, but we saw FRBR in action, and OCLC did a great job. People could see the work of e.g. "Kidnapped" and navigate through all of the huge number of expressions, going through the manifestations and finally land at an item. It was interactive too.

But I found it terrifying. And if that's the way I feel, I am certain that a non-cataloger would feel the same way, if not worse. Also, I can't believe that something like this would really be practical, in the sense that people would find it useful or that it would answer any of the real questions they have. People might play with it for a while, but then abandon it as useless for their purposes.

FRBR essentially envisions a more or less complete bibliographic history of each work, just as in those catalogs of the 19th century, and while very few people may find complete bibliographic histories useful (I am one, but I am weird!), I think that for the average patron, such a tool would provide nothing more than what they get today, just in a different display that may horrify them once in awhile. And we know people are abandoning our catalogs now for newer tools.

I think we can do more. For one thing, we can spend our time doing *something* to make the displays that have been based on browsing more comprehensible to our patrons instead of just ignoring the situation, like the inane "United States History" example. We don't get many complaints, if any at all. Why? Because people don't browse like that today! Anyway, those displays are shot and have been for a long time. Do we fix them to make them work as they did in the card catalog? That would seem to me similar to figuring out how to incorporate horse shoes into an automobile tire--matters have changed too fundamentally. Can we come up with something genuinely new, innovative, and maybe even a little bit cool?

These are the things that I believe can really make a difference to our patrons and our field.

Brief records

Posting to Autocat

As Mac pointed out in his analysis of the sample RDA records from LC: "Thanks to the efforts of Richard Violette and Jim Bowman, I've been looking at more RDA records in the LC catalogue.  Below are only brief records, in interest of brevity.  And let's face it, most patrons will only see brief records."

Of course, he is right, but therein lies a thought that has troubled me for a long time. If what Mac says is true--as I think it is--then why are we adding all this additional information found in our long displays, to our records? It seems inescapable that IF most patrons are only looking at the brief records, then what we are doing is adding all this additional information for a small minority, or perhaps, primarily for ourselves. Even if we say how important this information is for reference librarians to do their jobs, which I hope is true, the uncomfortable fact is that the number of reference questions is decreasing and crashing through the floor.

To be fair, we need a lot of specific information to do our jobs as collection managers, but the brief display has always bothered me. I understand why it has been implemented: during the days of the card catalog, when browsing through a bunch of cards, everything was full display. Yet, through the arrangement on the card, formalized in ISBD, people could concentrate on the tops of the cards (headings and body of the entry) and move on quickly. If they wanted the "full display" it was simply a matter of dropping their eyes down and reading more deeply. This was too unwieldy in an OPAC, and so brief displays were introduced.

I still don't like them however, and I have felt they are actually a disservice to our patrons. In fact, when I realized that the multiple display in my catalog contained the call number, and that people were bypassing the full display, I decided to, in essence, force my users to look at the full display, with all of the headings, summaries, notes, etc. if they want an item. I did it simply by suppressing the call number and links in the multiple display, and not even having an option for a "brief display". If someone wants to get the call number or the link to the electronic resource, in my catalog they must look at the long display.

Perhaps that is awful of me, but the labor of a click to a long display does not seem to be too onerous for people, and I have received no complaints. And in any case, in my professional opinion, I do not see how a patron can begin to understand how a library catalog works, or sense the potential power within a library catalog, if they never even see any of the controlled vocabulary, except for this incomprehensible concept (for them) of "main entry". If our patrons see only brief displays, how can they have much respect for what we do? And why would they continue to fund us? If the records we are making are aimed only at the small minority and ourselves, I think it may be increasingly difficult to maintain that what we are making is "important".

I am not saying that my solution is the only one. There may be some kind of automated solutions for this, e.g. "onmouseover" events, where if you just run your mouse over a specific area without clicking, you could see the long display, or something like that. But I think people need to be very aware of the full displays and how these displays can help them.

RE: Cost of Cataloguing?

Posting to Autocat
On Thu, 28 Oct 2010 10:05:26 -0600, A Louie wrote:
>Someone else suggested that it could be 1 book an hour = 7 books a day, taking into consideration hourly pay.
>
>How much time does it take for each of you to make one original catalogue for a new item using MARC or AACR2, etc.?
Of course, it all depends on the item and how much work it takes: you may be able to create a record for a new edition of Gulliver's Travels very quickly, but a new work of non-fiction, which needs in-depth subject analysis and a few names set up can take a lot longer. Classification may even be the sticking point. Still, the original non-fiction may be just as quick as Gulliver's Travels if it is just a new edition of something already in the database. Original records are *not* all the same. Plus, if something is copy cataloged, if that record is lousy, it can take more time to fix it than doing an original record.

Also, a lot depends on your system--if it has automated much of the process of creating new authority headings and so on--and if you are involved in cooperative projects such as NACO or SACO, which have their own demands. Finally, how much work a record will demand can only be determined at the time of cataloging, and not before. Many times, I am sure we have all experienced taking a book off of the shelf and thinking, "Wow! This one is going to be tough!" only to discover that the subject is easy, or that a weird corporate name had already been set up by someone else. On the other hand though, I have also taken an innocent-looking book off the shelf that has literally exploded in my face! What seemed a simple name, for example, may take a long time, or the subject can become a nightmare! (I am thinking of the Russia/Soviet Union/Former Soviet republics problem.)

When I started, it was considered 1 original record per hour, but it seems as if this has changed. As Mac pointed out, his people are expected to do a lot more. If we are to maintain any kind of relevance as we move into this "brave new information world," I believe that one of the most important tasks is to raise productivity by a significant amount. Standards of production show that the main points are to automate wherever possible and adhere to shared standards.

Many of the tools we use to do our jobs are still based in the 19th century and must be rethought. The actual amount of work an individual does is less important in this sense. For example, a farmer using a scythe on his wheat is working himself very hard and can accomplish only a limited amount. A big, fat modern farmer sitting in his air-conditioned combine, smoking, drinking a beer and listening to music, isn't working nearly so hard, but is accomplishing much, much more.

Adherence to shared standards is just as important, to keep everyone from having to rework the same things over and over.

RE: Another take on Wikipedia and (academic) libraries

Posting to NGC4LIB
Laval Hunsucker wrote:
<snip>
Some may find interesting this (two-page) article by Corinna Nohn, dated Monday 25 October, published on sueddeutsche.de :
http://www.sueddeutsche.de/kultur/wikipedia-kompilationen-bullshit-amen-okay-1.1015680
"Fehlkauf mit System: Immer mehr aus Wikipedia-Artikeln kopierte Bücher finden sich in Uni-Bibliotheken. Die "Enttarnung" gestaltet sich schwierig."
etc.
</snip>
Thanks so much for bringing this to everybody's attention. While I probably shouldn't be surprised about this, I still find it shocking. I remember reading somewhere several years ago that in the 1600s a man in London was printing the same books under different titles repeatedly, and the booksellers forced him to sign an agreement that he would never print another book again! (Sorry, but I can't find where it was)

I found those publishers in the WorldCat database too. I wonder: how could a next-generation catalog help here? Librarians cannot be expected to check everything against Wikipedia. What could a next-generation library do to help people not only avoid reading it but, more importantly, avoid buying it?

Thursday, October 28, 2010

RE: Displaying Work/Expression/Manifestation records

Posting to Autocat
On Wed, 27 Oct 2010 10:59:49 -0500, Joel Hahn wrote:
>Thus, while it most certainly *can* be done, doing it *right* with existing records has some difficult hurdles to overcome, and is much more difficult than simply using existing title or name/title authority records as "Work" records (generating new records where needed from the 1XX, 240, & 245) and existing bibliographic records as "Manifestation plus" records, and letting the Expression level be empty--an approach which also has issues.
>
>However, if records as they are now would be beefed up with more linking fields, better tagging of what are now headache-inducing complex cases, and similar improvements (none of which requires adoption of RDA to accomplish, for the record), that would make the process significantly easier and more accurate.
Absolutely correct. Reconsidering our records as they stand now could make significant differences for our patrons in how they interact with our catalogs--differences they could really see and perhaps, even appreciate. Certainly cracking our brains to force everything into this 19th century view of the world of information called FRBR is a lot like forcing a square peg into a round hole. It just doesn't fit. So, instead of whittling the peg down, or boring out the hole, we should open our minds to new horizons because it's a big world out there.

As a result, doing it "right" in an absolute sense is a goal that will most probably never be reached. The achievable goal should simply be to make our catalogs better than they are now. This is what Google and the other tools do: just make it better, perhaps in tiny, little ways, but do it constantly. The public has come to expect constant "improvements" in these types of tools and it would be wise for us to think in these ways as well.

There is a thought-provoking and chilling article in the latest issue of Illuminea from Oxford University Press http://www.oup.com/illuminea/ which is an excerpt from an article by Rick Anderson, "If I Were a Scholarly Publisher" EDUCAUSE Review, vol. 45, no. 4 (July/August 2010): 10-11 http://tinyurl.com/2wn6xlh. The Illuminea version has some comments from a publisher and librarian. (How do these versions fit into FRBR? Why do we have to force it in there? There will doubtlessly be several more "versions/variants/discussions/LordKnowsWhat" placed on the web and elsewhere concerning Mr. Anderson's article, and people will need to be aware of them. Is FRBR the only solution? The best solution? Is it a solution at all?)

He discusses the future of libraries in very real terms and the impacts this will have on scholarly publishers, who rely very, very heavily on sales to libraries, sales which are most probably drying up for an indefinite amount of time. Therefore, what are scholarly publishers to do? His suggestions are, simply put, chilling. Everyone should read this. It represents some of the enormities and seriousness of the challenges we face.

Tuesday, October 26, 2010

FW: [open-bibliography] Library support and REST

Posting to open-bibliography
Christopher Gutteridge wrote:
<snip>
OK. Open is very important, but most people won't do extra work for the common good. I prefer carrots to sticks, but maybe appealing to librarians isn't the only approach...
Peter Murray-Rust wrote:
> I had hoped to find some feeling among librarians that they cared about this but I haven't seen any - I've blogged, tweeted, etc. and I know these get around.
It may be that individual libraries don't feel they will see the return on investment of the training in new techniques, retooling of data and risk of changing their licenses.
</snip>
My own opinion is that most libraries are extremely bureaucratic places, and that the very concept of "time" in a library environment is much more akin to geologic eons ("You can't change that! That precise matter was discussed at a meeting between the Head of the Library and the Dean of the Faculty in 1965, and it was decided that...."). If we compare this attitude to our normal idea of time, and especially to the hyper-fast idea of time on the world wide web, where anything from 2 years ago (or less) may as well have come from the Assyrians, we can begin to understand the problems.

Also, I believe that matters have changed enormously for libraries since the arrival of the WWW. Reference questions have declined drastically; libraries have already lost the science, technology and mathematics people; the social sciences are leaving; and all that is left is the humanities. Now, with the budget cuts, I don't know if the problem is that librarians don't care or if they are just terribly depressed, feel they have no control over anything, and prefer to look away.

I was personally hoping that open source and the entire open movement would be the key to excite the field of librarianship again, since I personally believe the open movement is exactly where librarianship belongs, but it hasn't seemed to happen. Libraries (as opposed to individual librarians) are highly conservative and consequently very slow to change. It seems that in the present time of decreasing budgets, even more conservative movements and a real retrenchment may be what is in store for us. I can only hope not.

Yet, if the field of librarianship were to get behind the open movement (and some libraries and librarians are, to be fair), it would be a tremendous advance. I guess this is a rather abstract statement though, and what is needed are some real prototypes where administrators can see the possibilities.

RE: Displaying Work/Expression/Manifestation records

Posting to Autocat
 
Joel Hahn wrote:
<snip>
There may be evidence that people want FRBR-type displays, but it's kind of hard to do that sort of research in the absence of comparable bibliographic metadata designed to support FRBR-type displays, let alone the absence of a truly bleeding-edge FRBR-type display.
</snip>
I cannot agree that we don't know if people will want something until we make it for them. Businesses do this kind of research all of the time. I haven't seen any studies at all, but I have seen the statistics, and people tell me how much they love the new tools and that they are using library catalogs less and less. My experience working with young people is that they have conceptual problems with the idea of searching "metadata/summary records," which is much more abstract for them than the Google-type searching they can do, which leads them right into the full-text. I don't think young people are stupid at all; rather, they are having trouble relating to what we make because it is becoming more and more distant from what they use much more often: full-text searching. The task is to fit what we make into their world and not vice-versa.
<snip>
But I can't count how many times I've heard catalogers mention here and elsewhere that they follow local practices like cataloging mass-market paperbacks and trade hardcovers and multiple editions of each on the same record, just so that patrons can find all print versions of that work in one place and place one hold that can be filled by any version. (And often e-books and large print versions, too.)
I have also heard several times from ILL librarians who've had to deal with patrons who were absolutely irate that the version of a book that came to fill their request wasn't the version they wanted. (Different editor, different afterword, different edition, different CD-ROMs, print size too small, etc.) That is likely a case of a loud minority getting the most attention, but any library metadata scheme that isn't granular enough to also support bibliographic research isn't doing its patrons any favors, whether they realize it or not.
</snip>
If there are problems with holds because of versions, that is a separate issue from FRBR and RDA. But concerning ILL: that has to do with the quality of the metadata, where I have seen a lot of problems, too: wrong editions attached to the wrong resources, and don't even get me started on lousy subject analysis! Again, this is a problem with training of staff who are supposed to ensure that the description (and subjects?) follow agreed upon standards and genuinely represent the item. Enacting FRBR/RDA will only be yet another standard everybody can ignore.
<snip>
That has already been done to some extent--OCLC's doing it right now in worldcat.org, for example, and I saw a VTLS demo at ALA shortly after the original FRBR report was published of FRBR "tree" displays--but without the proper direct links in the metadata itself, the process of grouping related records is somewhat error-prone and not as useful as it theoretically *could* be, and the programming required to try to work around the inherent blind spots is very laborious to come up with. (It's also difficult to programmatically split up existing MARC bibliographic records into three levels--but a split into two levels can be done without nearly as much difficulty, by co-opting and, where necessary, generating title & name/title authority records for the "Work" level and the existing bib record for the "Manifestation" level.)
</snip>
But is it less laborious to redo all of the cataloging, retrain people all over the world, change all the systems and documentation, etc. etc. etc.? While I am sympathetic to the poor programmers, that is, after all, their job, no matter how laborious it is. It is their job to save our labor, and not the other way around. For example, the records themselves really do not need to be split in order to have novel displays of multiple records. That can readily be seen in all kinds of information databases.
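As a rough illustration of grouping without splitting, the following Python sketch clusters records at display time using the fields Joel mentions (1XX, 240, 245). The records are made-up samples reduced to plain dictionaries keyed by MARC tag; real records have indicators, subfields and messy variant forms, so the matching rule here is a deliberately naive assumption, and the function names are hypothetical.

```python
from collections import defaultdict

# Made-up sample records, flattened to {tag: value}. This flat form is an
# assumption for illustration; it is not how MARC data is actually stored.
records = [
    {"001": "rec1", "100": "Stevenson, Robert Louis", "245": "Kidnapped"},
    {"001": "rec2", "100": "Stevenson, Robert Louis", "240": "Kidnapped", "245": "Kidnapped : an adapted edition"},
    {"001": "rec3", "100": "Stevenson, Robert Louis", "245": "Treasure Island"},
]


def work_key(record):
    """Author (100) plus uniform title (240) when present, else title proper (245)."""
    title = record.get("240") or record["245"]
    return (record.get("100", "").casefold(), title.casefold())


def group_by_work(records):
    """Cluster record IDs under a work-level key; the records stay untouched."""
    groups = defaultdict(list)
    for record in records:
        groups[work_key(record)].append(record["001"])
    return dict(groups)


groups = group_by_work(records)
for (author, title), ids in groups.items():
    print(author, "/", title, "->", ids)
```

The grouping lives entirely in the display layer. Where the 240 is missing or the headings vary, the clusters will be wrong, which is exactly the error-prone part Joel describes.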

RE: Displaying Work/Expression/Manifestation records

Posting to Autocat

On Sat, 23 Oct 2010 02:40:01 -0400, Hal Cain wrote:
>On Fri, 22 Oct 2010 13:50:49 -0500, Mike Tribby wrote:
>
>>As to Amazon's willingness to make changes in their procedures, they have definitely been apprised of the shortcomings in their search results that we have discussed on this forum as well as the problems their publishers have getting edition and ISBN changes into their system. I don't know of any complainants who have received more than perfunctory answers about these issues, but that doesn't mean no one has.
>
>This also means they can't (or can't truthfully) say that "nobody has complained" about their inadequacies.
People always complain--they have complained about the library catalog for years for all kinds of reasons with few results, and the fact they have been abandoning it for other options speaks volumes.

But returning to FRBR: has anybody ever seen any research or evidence that people want FRBR-type displays? Remember, OCLC found that less than 20% of everything in their database has some kind of variant that would make any difference at all with FRBR. I wonder what percentage of that is for obsolete textbooks that almost no one will ever look at again? In the complaints I have read about the metadata problems in e.g. Google Books, they discuss accuracy of data, but I have never seen a word about not being able to get works, expressions, manifestations and items.

Additionally, I have never seen any attempts that have shown that FRBR-type displays cannot be made with the records we have right now. I personally see no reason why they could not be generated from records as they are now.

I still maintain that above all, we should be working to provide what people want. If research shows that people say they need FRBR-type displays so badly, then OK: figure out how to do it in the most efficient ways possible, but I honestly doubt if the results would come out this way.

Friday, October 22, 2010

RE: WorldCat Knowledge Base to Provide Full-Text Access to OA e-books and Articles

Posting to NGC4LIB
Eric Lease Morgan wrote:
<snip>
On Oct 21, 2010, at 7:41 PM, B.G. Sloan wrote:
> "In what appears to be an expansion of its Direct Request for Articles project, OCLC announced the launch of its WorldCat knowledge base, through which it will be providing one-click access to open-access full-text ebooks and articles in WorldCat Local search results." -- http://bit.ly/crDHMT
From the article:
The WorldCat knowledge base includes material from large open-access journal sources, such as the Directory of Open Access Journals, PubMed Central, and BioOne, as well as freely available content from the HathiTrust digital repository and the Internet Archive. Notably, open-access material from licensed platforms, such as Elsevier, Wiley, Springer, and Nature Publishing Group will be accessible.
This is exactly the sort of thing I've been advocating for quite a while. Based on the content of one's existing catalog (and therefore one's local collection development policy), crawl and harvest content from the Web, mirror it locally, update the local catalog to point to the locally harvested content, index it, and provide services against the result. Such a process addresses many of the traditional library activities (collection, preservation, organization, and dissemination) and manifests them in the current digital milieu. The library profession does not need (nor require) a for-profit company (or any other third party) to do this for us.
</snip>
Yes. The technical problems should not be that difficult today, and it has always interested me why this hasn't happened long before. Somewhere I read that the real problems are with the quality of the metadata. Perhaps this article: http://tinyurl.com/2w5nsp6 "But both agreed that to advance the service; to provide the improvements needed to make the data more uniform, e.g., reconciling alternative data formats;..." Of course, dealing with variations in data formats is a huge issue, but there is also even greater variation within the data itself, e.g. forms of names. Both of these issues are solvable in various ways, e.g. some kind of crosswalks, and URIs with shared concept servers.
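A very small sketch of the "shared concept server" idea, in Python: variant forms of a name resolve to one identifier, and matching is then done on the identifier rather than on the text string. The lookup table, the example.org URI and the `resolve` function are all hypothetical; a real service would be far larger and maintained cooperatively.

```python
# Hypothetical table mapping variant name forms to one shared identifier.
# The URI is a placeholder on example.org, not a real authority service.
NAME_URIS = {
    "cicero, marcus tullius": "https://example.org/names/cicero",
    "cicero, m.t.": "https://example.org/names/cicero",
    "cicéron": "https://example.org/names/cicero",
}


def resolve(name):
    """Normalize spacing and case, then look up the shared identifier (None if unknown)."""
    normalized = " ".join(name.split()).casefold()
    return NAME_URIS.get(normalized)


# Two differently formed headings resolve to the same identifier.
print(resolve("Cicero, M.T."))
print(resolve("CICERO,   Marcus Tullius"))
# An unlisted form is simply not matched.
print(resolve("Tully"))
```

The hard part is of course building and maintaining the table, not the lookup.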

But you mention the local catalog, and it seems obvious to me that the purpose and even the very idea of a "local" catalog will have to change in some way as more and more digital resources become available, since people can (or will be able to) "obtain" these new resources, to use FRBR terminology, with a click of a button, which will make them easier to get than the items on our shelves. No less important, these new items will be far easier for people to manipulate than ever before. Of course, much of this is premised on some sort of decent portable digital reader, which may be here already.

It will be interesting to see how this develops.

RE: Displaying Work/Expression/Manifestation records

Posting to Autocat

On Wed, 13 Oct 2010 09:37:00 -0400, Williams, Ann wrote:
>I posted months (years?) ago about needing more information on catalog records such as additional subjects and suggested even adding subjects for chapters. I think when I posted this before, someone objected commenting to the effect that why not trace the whole index. Well, look at what Google books does: does having everything including the text be searchable hurt the patron searching for items, or help him/her? Sure many of our fields and our controlled vocabulary are missing, but is the tradeoff worth it? What if Google decided to add some of our fields (subjects, genre) and authority records for authors? I believe in enhancing OCLC records, but is what I'm doing enough for patrons? Will they recognize the record as relevant if only a small part of the content is? Is there any investment a library can make in a self-designed OPAC that will end up in engaging users more than Google and its like?
You have been asking some great questions as to the realities of what we are facing. A comparison of our records to Google Books is enlightening, and depressing. Take a look at the record for this book in WorldCat http://www.worldcat.org/oclc/62127840 and compare it to the metadata page in Google Books http://books.google.com/books?id=bWEV__6BYPgC which has a huge variety of information, plus you can search the full text, and maybe in a few months depending on the judge, even see the full text. Go through the whole thing, since it takes a while.

Which would you choose? If you were at the reference desk, which would you recommend? Google doesn't give a d*** about RDA, FRBR, or anybody's theories. Google is interested in constantly improving the user's experience. I just noticed they added the automatic citations at the bottom. Nice touch. The Google record is a great example of metadata and how it can be brought together to make something that never existed before and is better.

And yet, we keep insisting on the FRBR user tasks as to what people *really* want, insisting so much that we will change everything we do to achieve those tasks, and of course we shouldn't forget that people can do those tasks in the catalog right now! Does this make sense?

This brings to mind the story we all know, probably apocryphal, of Nero fiddling while Rome burned.....

RE: Displaying Work/Expression/Manifestation records

Posting to Autocat
On 10/12/2010 11:36 AM, Janet Hill wrote:
>> I think we do know what people search for. They search for information. That information has properties that they may not consciously categorize, but which can be categorized -- at least vaguely -- either in advance/theory, or in retrospect.
>>
>> It's information ....
>>
>> information about: What it's about; what it means; who did it; where is it; can I get it; is it animal, vegetable or mineral; is it bigger than a breadbox and smaller than a cow; etc.
>>
>> information about: People, places, things, existence, relationships
>>
>> (Some people may click at random, or type in search strings at random, just to see what comes up. But since they are random, and in pursuit of nothing that can be defined, it's fruitless to plan for them, and pointless to feel inadequate for not having provided for them)

I guess I am fated to be the eternal skeptic, but I don't know if even this is correct about people searching for information. In this sense, it is similar to the question: why do people go to the library? Is it to search for information? Partly, but I think we all know that they go for other reasons as well: for diversion, for novelty, for inspiration, if you're lucky, to hear other people talk about something you had never thought of before, and so on. This is not random, but something else. I think people approach many websites, and even the search box in the same way, and are doing even more with modern tools (and will doubtless continue to develop in the future). Who knows what tools such as Facebook, or Mendeley will turn into?

Again, if we insist on shoehorning everything into FRBR, we probably can, just as in my previous example of the horse & buggy person saying that a car and a horse & buggy are really the same. The horse & buggy guy may even be able to convince another horse & buggy guy that horses & buggies are the same as cars, but nobody else outside the horse & buggy industry will believe it, especially those people in the car industry.

Tuesday, October 12, 2010

RE: Displaying Work/Expression/Manifestation records

Posting to Autocat

Sorry to appear obtuse, but I don't believe I have missed the point. I said that sometimes people may be searching for authors, titles, and subjects, but FRBR makes a statement that this is what people do and it is clear that people are doing other things. It is my experience that people are primarily doing those other things. What those things are, I do not know, and no one knows, but these things that they are doing now go beyond the FRBR user tasks, which are based in the 19th century, if not earlier. I think Kevin's example of "grapes of wrath" is an excellent case in point. When someone searches that, I do not know what they have in mind: they may want Steinbeck's book, they may want things about Steinbeck's book, they may want the Battle Hymn of the Republic, they may want to find how such an allusion has been used throughout the years, and how it has changed. These are some of the more realistic "user tasks and needs" of today, and perhaps people have been doing this all along. In any case, this is what we have to adapt ourselves to because this is what our public is expecting from information tools.

The entire purpose of RDA is to enact the FRBR user tasks, which allows people to FISO WEMI by their ATS (shorthand again). In other messages, I have made it clear that it is not the fault of those who created FRBR to have come up with this: it was just that FRBR came out too soon, before the real explosion of massive full-text searching by keyword and relevance ranking appeared and changed matters in ways that had been impossible to foresee. Everything must be rethought in light of these new developments, which are no longer so new.

I do not believe that people *never* want the FRBR user tasks and in fact I haven't said that, but it is my experience, and research that I have seen shows that this is only a tiny part of what the public is doing now, and is increasingly less relevant to what they want and need. Now, perhaps after some research, we could find out that people really are primarily doing the FRBR user tasks, although there seems to be a lot of evidence that shows precisely the opposite. I may personally agree that what is happening is good or not--that is completely irrelevant. FRBR describes *how the traditional catalog works.* It does not logically follow that this is also *what people want or need*, and that is where I believe lies the fallacy. Such a declaration of what people want and need must be demonstrated.

We can either accept the fact that people are not following the FRBR user tasks, or ignore it. If we accept it, we can either deal with it and adapt, or we can try to explain it away. I feel that we need to adapt to the new ways, otherwise we will find ourselves in a very lonely spot. People in the hard sciences and technical sciences have found little use for the catalog for quite some time now, while the social sciences are following suit. All we have left are the humanities, who still say they need us, but even that percentage is going down.

In spite of this, I still maintain that there is a tremendous power lying dormant in our catalog records, but that power must be released. To do this requires free flights of the imagination, which means to free ourselves from the FRBR user tasks.

RE: Displaying Work/Expression/Manifestation records

Posting to Autocat

Kevin M. Randall wrote:
<snip>
There is a logical fallacy in play in the argument above. Searching BY author/title/subject and searching FOR author/title/subject are two entirely different things. Just because Google doesn't allow you to search specifically BY an author's name, or a title, or a subject does not mean that you cannot search FOR that author or title or subject.
Can you really say with certainty that someone typing "john steinbeck" is NOT searching FOR that author's name? Or that someone typing "grapes of wrath" is NOT searching for that title? Or that someone typing "dust bowl fiction" is not looking for something on that subject? I will grant that there may be a few instances in which the searcher is really looking for something else, but surely one can safely assume that the searcher is likely looking for that author or that title or that subject.
The absence of a search index limited to specific data type does not mean that no one ever searches for that data type.
</snip>
I do not know what people search for; I have thought a lot about what I search for myself and I am still unsure. Sometimes it's subjects, or authors, much more rarely for titles. With me I found that I look a lot for quick information that helps me solve a particular problem. I hesitate to call that author title or subject. I don't know what I would call it. People are finding information in brand new ways that we cannot imagine because they have tools they have never had before. I have still not seen any research that says that our public wants FRBR user tasks, but I have seen lots of other research.

I still maintain that it is illogical to claim that people "are really" searching by author title or subject in tools where they cannot, by definition, do it, such as Google. To me, that makes as much sense as some horse and buggy maker saying that working with an automobile *really* is like working with a horse and buggy. Both have to intake some kind of fuel, both have an exhaust, both make noises, both allow you to get back and forth to work, both need maintenance, etc. Of course, a horse and buggy is not a car. To insist on something like this, when the new tools make possibilities almost endless, limits our own imaginations terribly, and this will have serious consequences for our profession.

I do agree with Mark Presnar that "selection" is very important, and in this sense, becomes very similar to Ross Atkinson's "control zone." I plan to discuss this in a future podcast. I believe there are many possibilities open to the use of our records in novel and important ways, but we must keep our minds open and not limited to 19th-century methods.

RE: Displaying Work/Expression/Manifestation records

Posting to Autocat

Vosmek, John J. wrote:
<snip>
Jim,
You are always telling us how users aren't searching for authors, titles or subjects. Do you mean that they aren't searching for controlled author/title/subject headings? That is the only thing I can imagine, yet that distinction seems irrelevant to me. Even when people are keyword searching, they're still searching by authors (in the broadest sense, of course, including names of corporate bodies, etc.), titles and subjects (again, in the broad sense "I'm looking for something about..."), aren't they? What else are they searching for if not these? Please give an example if you can. I'm baffled by your assertion.
</snip>
I have answered this several times on this list and others. I must say that I am just as baffled by the opposite assertion. From the moment that OPACs offered keyword searching, people began to flock to it and the old methods began to become less used. And the moment that Google/Yahoo-type relevance ranking began to become dominant, the old ways began to seem very strange to anybody who was not brought up under them, and even those who were old enough began to forget. Several times, I have had to explain how and why the old ways worked. When someone uses Google primarily, the mere idea of searching by author, or title, or subject is very strange because you cannot do it. In any case, asserting that "people search for works/expressions/etc by their authors, titles and subjects" is completely unwarranted and flies in the face of how everyone today searches for information.

How can we search by author, title, subject in Google? We cannot, yet people love Google. How can we explain this away? Well, either by proclaiming that the Google result is not useful, or we can simply ignore Google. But if we accept that the Google results are useful (as 95%+ of the populace would probably say) and we do not ignore it, then the conclusion seems inescapable that people do *not* find WEMI by their ATS (resorting to shorthand), but they are doing something else. In fact, to believe anything else seems completely illogical. (I believe that this realization could possibly have major repercussions for the history of cataloging as well, but that is another issue.)

What are people really searching for? I don't know and I wish I could say, but there are some real efforts being made to find out what people are doing with information. I believe that until the cataloging world accepts these facts, there can be very little progress.

At the same time, I will state once again that I believe people would love to be able to search by ATS, although WEMI is much less important to them. Yet, we must reconsider how to do this in the emerging information world, because it is pretty clear that this world will not wait for us.

Thursday, October 7, 2010

FW: Displaying Work/Expression/Manifestation records

Posting to Autocat

On Thu, 7 Oct 2010 10:11:43 -0400, Myers, John F. wrote:
>And yet the American University of Rome, of which James is the Director
>of Library and Information Services, perversely maintains a catalog in
>the face of these user behaviors. Why bother, when he readily
>acknowledges that his patrons don't want the information there (or
>perhaps more accurately don't seem to want the organization it provides
>for the information it contains)?
>
>What do his patrons want if they don't want to use
>authors/titles/subjects? Is he developing an alternative system that
>meets those needs? What is an effective means for describing and, more
>importantly, accessing the atomized snippets that he has previously
>reported as the focus of his students' interests, for describing and
>accessing the resulting mash-ups?
>
>How effective are his patrons' information seeking behaviors actually?
>Is it the case that "what they want" is serving their best interests
>with respect to information retrieval and research? Or would catalog
>structures (based on Cutter and updated in the FRBR model) serve them
>better, if only they could be educated in the use of those structures?
>
>Perhaps more provocatively than I ought to be, but genuinely curious
>about these questions,
I can only do so much, and I will try to discuss this in more depth in my next podcast. I only have so many computer skills, but one thing I can do, and have done, is to implement my Extend Search, which searches all kinds of things. You can use it yourself in my "perverse catalog" which you can enter by going to any page there, and clicking on "Search other collections". There are other ways of implementing the Extend Search, and I have tried to implement this by using the information in the catalog record. I have always believed--and still fervently believe--that the individual catalog record, if made according to high-quality standards in various ways (going beyond AACR2 or RDA), could become extremely powerful *if correctly utilized*.

My students (and that includes lots of students studying abroad in Rome from the US) very rarely even understand the concept of searching by author, title, or subject, and I won't even mention works/expressions/et al. (i.e. WEMI). They search like they search in Google. They have no real idea of controlled vocabulary, or of anything relating to a syndetic structure. Are they stupid? Of course not, but they believe that the system should do this for them, and I think they are right.

The simple fact is that people find the searching capabilities of keyword very useful, and much more useful than traditional cataloging functions. Do they want or need WEMI? In my experience: sometimes, but it is very very rare when they need this kind of information. Why? Well, for one thing, OCLC discovered that less than 20% of the database is even involved with variant editions of any kind. Sometimes they want a specific edition, e.g. Hobbes' translation of Thucydides, or Heaney's translation of Beowulf, or "the latest edition". A simple keyword search with modern sorts handles this far more efficiently and easily than FRBR foresees.
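
As a rough illustration of what "a simple keyword search with modern sorts" could look like, here is a minimal Python sketch. The records, field names and years are invented for illustration only, and the function `keyword_search` is hypothetical; it is not how any particular OPAC or search engine works.

```python
# Hypothetical records; field names and data are illustrative only.
records = [
    {"author": "Thucydides", "title": "The Peloponnesian War",
     "note": "translated by Thomas Hobbes", "year": 1989},
    {"author": "Thucydides", "title": "History of the Peloponnesian War",
     "note": "translated by Rex Warner", "year": 1972},
    {"author": "", "title": "Beowulf",
     "note": "translated by Seamus Heaney", "year": 2000},
    {"author": "", "title": "Beowulf",
     "note": "translated by Burton Raffel", "year": 1963},
]

def keyword_search(recs, query):
    """Return records containing every query word anywhere, newest first."""
    words = query.lower().split()
    hits = []
    for rec in recs:
        # Flatten all text fields into one searchable string.
        text = " ".join([rec["author"], rec["title"], rec["note"]]).lower()
        if all(word in text for word in words):
            hits.append(rec)
    # The "modern sort": most recent year first, so the latest edition leads.
    return sorted(hits, key=lambda rec: rec["year"], reverse=True)

# A specific translation: two keywords are enough to isolate it.
hobbes = keyword_search(records, "thucydides hobbes")
# "The latest edition": search on the title and take the first hit.
latest_beowulf = keyword_search(records, "beowulf")[0]
print(hobbes[0]["note"], "|", latest_beowulf["year"])
```

Note that the sketch only works because the translator's name happens to be somewhere in the record; it says nothing about records where that information is missing or entered inconsistently.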

What people do want is other materials on the web that are not being cataloged by catalogers and will not be cataloged, at least not within any time frame in which they would be useful, e.g. think tank publications, public lectures and videos, up to the minute newspaper articles, materials in the hugely growing open archives, and so on and on and on.

And please, I want to make it very, very clear that I think people really do need the information in library catalogs because I think they need to find resources *reliably* by their authors, titles and subjects, but it must be implemented completely differently than how it is done today. First, people need to begin to understand and see how powerful it can be, but currently, it just doesn't work and is totally unconvincing. Do I know how to do this? I would never be so presumptuous (obviously, people think I am presumptuous for even daring to question the wisdom of FRBR!), but I do have some ideas that could be tested and ultimately proven correct or incorrect. I'm sure lots of others have their own ideas. And I would bet you the baby's shoes that Google would have some very interesting ideas! I doubt if the Google-guys have much attachment for FRBR.

I do question how seriously patrons need WEMI, and I suspect that perhaps it has always been based more on library inventory control than anything else--and I am *absolutely not* dismissing the absolute need for a library to keep inventory control--but perhaps it is more important for libraries than for users. In any case, while occasionally someone may require this kind of information, certainly, I have seen absolutely no need for anything approaching the FRBR displays, which really are "so 19th-century"!

It's imperative that catalogers get away from the dead hand of the traditional user tasks, and how the traditional catalog has functioned. The information universe has changed completely, and we have to deal with it or be left behind.

You mention "snippets" and what should we do? Should we catalog "snippets" of information? I certainly hope not, but I have considered it and believe that any solution would be to incorporate keyword results in some way, much as in the print world, people work with tables of contents and indexes thereby bypassing entire items, which they often do not want.

RE: Displaying Work/Expression/Manifestation records

Posting to Autocat

This has turned into a very interesting thread indeed! I will only point out once again that there is the Cooperative Cataloging Rules at http://sites.google.com/site/opencatalogingrules/, so there is a choice! For many libraries out there, there is no choice since they no longer have a budget that could cover the costs of redoing everything.

Concerning:

On Wed, 6 Oct 2010 16:03:59 -0500, Kevin M. Randall wrote:
>Mac Elrod wrote:
>> Seems to me we are thinking in terms of philosophical categories, not  patron needs.
>
>The philosophy behind FRBR is quite definitely geared toward meeting patron needs!
This idea that FRBR is geared toward meeting patron needs requires serious rethinking. Why do we believe this? I have seen absolutely no evidence for it, but it seems it is just accepted. Are our patrons demanding to find works/expressions/manifestation/items by their authors/titles/subjects? Not my patrons. They want something else. In fact, in the many reports I have read discussing what people want and expect from information today, I have never seen mentioned anything similar to FRBR user tasks.

And yet, the cataloging sector of the library world insists that people want the FRBR user tasks. In examining this, we must first recognize that the library catalog *right now* allows people to do the FRBR user tasks. The current library catalog *right now* allows users to search by varying types of uniform titles, authors, subjects, and to bring together *right now* the works, expressions, manifestations and items, as shown in FRBR. FRBR does not posit *anything new* (except for some possibly useful new attributes, e.g. extent of an expression). FRBR describes in another way exactly the same thing that we do now and have done for a long time. This fact needs to be accepted, understood, and examined.

What FRBR defines as "new" are the *displays*; it posits nothing new in the way of access, and in essence, eliminates the unit card (or unit records). So, if I have 100 different versions of Beowulf in various translations and editions, the user can find all of these right now using traditional catalogs, but in an FRBR world, would not have to look at 100 or so different unit cards, and they will get the FRBR displays, which are almost exactly (if not exactly) the same displays as those found in 19th century printed catalogs.

Once we recognize that FRBR does *not* bring in anything new to the matter, but is a matter of display, and changes nothing in access (aside from eliminating the rule of three, which is only a single rule change that does not need an entirely new cataloging code), other things start to make sense, e.g. typing out abbreviations, or changing "1962-" to "born 1962", which are also matters only of display.

What would be new are the attempts to eliminate ISO2709/MARC21 and to put catalog information in RDF, or any other type of XML format. But this could be done today and you do not need FRBR/RDA to accomplish this.

Research has shown that fewer people are finding the library catalog useful, and are turning to other tools. The scientists left some time ago; the social scientists have been leaving for a while; and now even the humanities are starting. These problems that people find with our catalogs: are they based on "display"? Or do they lie elsewhere? I ask, does FRBR merely restate the functions of the traditional catalog in late 20th-21st century terminology, or does it offer something new? To me, it is obvious that it offers nothing new. And to be fair, I don't think that FRBR even suggests that it does.

I am not claiming that we give up, or conclude that the records we create are useless. I believe completely the opposite! But we must rethink what we are doing in this new reality (which isn't even so new anymore!), and this can be a terribly daunting path to embark upon.

This is what I am trying to discuss in my podcast series of my "personal journey", by the way. (Apologies for yet another bit of self-promotion. Series at: http://catalogingmatters.blogspot.com/search/label/podcast)

Monday, October 4, 2010

The Functional Requirements for Bibliographic Records, a personal journey Part 3

Link to pt. 2



Transcript
Hello everyone. My name is Jim Weinheimer and welcome to Cataloging Matters, a series of podcasts about the future of libraries and cataloging, coming to you from the most beautiful, and the most romantic city in the world, Rome, Italy. This installment continues my personal journey with the Functional Requirements for Bibliographic Records (or FRBR).

This series: The Functional Requirements for Bibliographic Records: a personal journey, has gone on for two previous podcasts. I believe that this installment will make very little sense without the first two, so I strongly suggest that you listen first to them, in order. Links to the earlier podcasts are available from the transcript.

I also want to mention that I added a Google Translate widget to my blog, so that those who want it, can now get an instant translation of the transcript, or any of my other postings, in a surprisingly large number of languages. Still, we are all aware of the problems of automatic translation, so keep in mind that Google Translate is designed only to provide help, but translations may turn out to be faulty. If Google Translate has me say something stupid, it is not necessarily my fault!  

Now, to continue my journey:
Let me sum up where I am in my twelve step process, and pardon me for repeating the steps yet again, but I feel it is important:
Determination
--Incomprehension
----Humiliation
------Renewed Determination
--------Joy
----------Comprehension
------------Consternation
--------------Serious questioning
----------------Serious doubts
------------------Disillusionment
--------------------Despair
----------------------Hope

After encountering initial failure in my first efforts to understand FRBR, I now understood it, but based on my widening experience, I was being faced with questions without apparent answers. These questions I could not simply ignore, and consequently, I had entered my Consternation phase. Those first questions had to do with some rather uncomfortable facts concerning the manifestation, or as I was used to calling it, the edition. This had been an aspect of traditional cataloging that I had always taken for granted, and yet it became crystal clear to me that different communities viewed the same physical item in quite different ways. These differences were actually based on definitions, and I saw how those definitions could vary widely among different bibliographic communities: you could be a late-20th century AACR2/ISBD cataloger, a rare book cataloger, someone cataloging for FAO of the United Nations, a cataloger from the late 19th century, and so on, and each person would look at and relate to the same physical object in a unique, and often quite different, way.

I myself had experienced how on one day, a book was a new manifestation or edition, but because of a new LC Rule Interpretation, on the very next day, that same book was an item or copy. And this concerned a concept that to me had earlier seemed so fundamental and solid: whether this thing I am holding in my hands is a copy of something else or not! If there was no agreement on a point such as that, how could there possibly be any agreement on anything at all? (Jumping ahead for just a moment, this is the sort of reasoning that would eventually lead me to my Despair phase)

Still, it is important to say that I had no doubts as to the ultimate correctness of FRBR since its principles were based on Cutter’s Rules, which have served as the foundation of modern catalogs since the latter 19th century. These rules represent the solid base that we could all rely upon, and therefore, as far as the problems that I saw, I could safely put them on a shelf in the back of my mind labelled “Snags”, since I knew--or at least I had faith--that the problems I saw were either some kind of imperfection in my own understanding, or these were examples of some minor anomalies that would be worked out in time. As a result, I felt relatively comfortable and assured while I was in my Consternation phase.

But I couldn’t ignore it all forever and I was quickly entering my Serious Questioning phase. When I worked at the Food and Agriculture Organization of the United Nations (or FAO), I found myself looking at all kinds of cataloging from different communities, such as those I myself was working with, but I also saw other records from journal and book publishers who wanted to share our data, and still other communities who were making online videos, the geographic mapping community, and from various others around the world. At the same time, Google Books was just starting to get off the ground; I could see fabulous resources in the Internet Archive, and all sorts of digitized books were becoming available through different projects ranging from the University of Virginia to the University of Heidelberg to think tanks to fabulous antiquarian map sites from dealers. FAO itself was placing very important information online; not only books and documents, but conferences, videos and images, entire workshops, statistics and so on, and I saw how many other organizations and universities were doing the same things.

I discovered that few of these projects followed widely recognized standards, and often, some of these communities had no standards at all, or nothing that you could really label as standards. For example, someone might say that their records “followed the Dublin Core standards”, a peculiar idea that actually meant that the computer coding may have followed Dublin Core: i.e. creator, date, relation, and so on, but the information within the coding, (i.e. what I will call here the actual cataloging information) followed no standards at all, i.e. something authored by the “United Nations” could be entered as “United Nations” or “UN” or “U.N.” or “ONU” or “Naciones Unidas”, or a host of other possibilities.
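
The difference between standardized coding and standardized content can be sketched in a few lines of Python. Everything here is hypothetical: the records, the `AUTHORITY` table and the function `group_by_creator` are invented for illustration, and real authority control is far more involved than a lookup table.

```python
from collections import defaultdict

# Hypothetical records: the *coding* (the field names) is uniform,
# but the *content* of the creator field follows no standard.
records = [
    {"title": "Report A", "creator": "United Nations"},
    {"title": "Report B", "creator": "UN"},
    {"title": "Report C", "creator": "U.N."},
    {"title": "Report D", "creator": "ONU"},
    {"title": "Report E", "creator": "Naciones Unidas"},
]

# Hypothetical authority table mapping variant forms to a single heading.
AUTHORITY = {
    "united nations": "United Nations",
    "un": "United Nations",
    "u.n.": "United Nations",
    "onu": "United Nations",
    "naciones unidas": "United Nations",
}

def group_by_creator(recs, authority=None):
    """Group titles by creator, optionally normalizing via an authority table."""
    groups = defaultdict(list)
    for rec in recs:
        name = rec["creator"]
        if authority is not None:
            # Fall back to the form as entered when no heading is known.
            name = authority.get(name.strip().lower(), name)
        groups[name].append(rec["title"])
    return dict(groups)

uncontrolled = group_by_creator(records)
controlled = group_by_creator(records, AUTHORITY)
print(len(uncontrolled), "separate headings without authority control")
print(len(controlled), "heading with authority control")
```

Without the table, the same body is scattered under as many headings as there are spellings; with it, the five reports come together under one.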

And yet, at least these communities were semi-organized. Not the least important of these communities was the burgeoning open archive movement, where everything was supposed to be “self-managed” in some sort of way that was completely unclear to me.

Open Archives are a major topic and will be the subject of a future podcast, so I will avoid the discussion of them for now. For those who are interested however, I have placed a link in the transcript that goes to additional information. I suggest, and link to, Peter Suber’s Open Access Overview http://www.earlham.edu/~peters/fos/overview.htm. For my purposes here, I simply want to point out that I felt I was witnessing an exponential growth in the immediate access to materials in open archives, that is, to materials that are highly important to patrons of any library.

So far, my prediction of the rate of growth of these open archives has turned out to be true, although I think it was always a pretty safe bet. Links to the statistics are available in the transcript.
(Statistics on growth in open archives: ROAR:
http://roar.eprints.org (click on graphical analysis to generate it yourself) and OpenDOAR:
http://www.opendoar.org/onechart.php?cID=&ctID=&rtID=&clID=&lID=&potID=&rSoftWareName=&search=&groupby=r.rDateAdded&orderby=&charttype=growth&width=600&height=350&caption=Growth%20of%20the%20OpenDOAR%20Database%20-%20Worldwide)

At the same time as I realized that the numbers of these new, easily accessible sources of important information had the potential to become genuinely overwhelming, and since I was at FAO and in communication with managers of different web sites, I also realized that these overwhelming numbers of resources did not necessarily imply a complete absence of metadata, since I discovered that metadata was continually being created at practically every level.

This metadata (I cannot find it within myself to call it “cataloging information”) was being created by publishers, journal databases, and sometimes by the authors themselves, as they added their materials into the open archives, but the vast majority of it was not being created by library catalogers. Much of this metadata was created for internal management and workflow purposes; for example, the publisher or editor could find out how far along a specific author may be on a chapter of a book, or if an item has been assigned an ISBN, and so on.

As a result, instead of a problem of quantity, the difficulty seemed rather to lie in the quality of the metadata. I had seen that there was quite a bit of metadata being produced according to accepted standards, such as our metadata that followed the AGRIS standards, but these standards did not follow AACR2 or ISBD. For example, the basic rule of “exact transcription of the title page” does not exist in the AGRIS rules, and catalogers are directed to enter only corrected titles. There are also some bibliographical concepts that do not exist in ISBD or AACR2. Therefore, although our AGRIS descriptions were standardized, the headings were standardized, the subjects were standardized and so on, they were all uncoordinated with respect to AACR2 and vice versa.

There were still other standards that were unrelated to any other standard, and the result was that the same items were cataloged over and over because the standards were not shared. Much of the rest of the metadata I saw did not adhere to any type of standards at all.

While it was certainly my opinion that AACR2-type cataloging was the “best”, I could not deny that there were many other standards that had been around for a long time, and that thousands of people, if not far more, had found them highly useful; therefore, I recognized that my personal bias in favor of AACR2 could just as easily be explained by the fact that my initial library training took place in the United States because I just happened to be born there, and not in some other country.

As an added complication, there was that damned Google that kept drawing me back like a moth to a flame, and doing a pretty good job of finding a lot of the information I wanted, so long as I wasn’t looking for anything in depth. As a result of these considerations, I found myself falling deeper and deeper into my Serious Questioning stage, although FRBR was not in my conscious thoughts in any focused way, as I had other tasks to attract my attention: practical cataloging and now, systems.

My time at FAO was when I made my first major strides into systems. I had made several rather extensive websites earlier, such as those of the Cataloging and Technical Services documentation at Princeton University Library, and several specialized cataloging manuals, one of which was my own Slavic Cataloging Manual, but databases had always remained beyond my abilities. While I had wanted to learn about databases and XML earlier, and had read books and asked people for help, I just could not understand it and nothing worked until I met two colleagues at FAO, who ultimately became my close friends. I am forever in their debt because they sat me down and showed me how to build simple databases and how to actually use XML.

I had been studying XML for some time, and one of the most important things I learned at FAO was that the way I had been approaching XML was completely wrong. XML is short for Extensible Markup Language, and its native format is terrifying to behold. For me, I had focused on creating these terrifying XML files of bibliographic records and the most that would happen was that I would run it through a program that would tell me whether the XML document was “well formed” or not. If it wasn’t, then I had more work to do, but sometimes it would say that it was “well formed”, and ... that was it! Since nothing else happened, it was tough to get excited over the “success” that my XML was “well formed”, and therefore, it had always been completely anticlimactic. As a result, I could not grasp how there could be any practical advantages in converting our records to XML format, and therefore, I was exceedingly skeptical over whether libraries should change to XML from native MARC.
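
For readers who have never seen such a check, here is a minimal sketch using Python's standard `xml.etree.ElementTree` parser. The two sample records and their element names are invented for illustration; they are not MARCXML or AGRIS. It is not the program I used at the time, only an example of the same kind of test.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the text parses as XML, False if the parser rejects it."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# Hypothetical bibliographic records with invented element names.
good = "<record><title>Beowulf</title><translator>Heaney, Seamus</translator></record>"
bad = "<record><title>Beowulf</translator></record>"  # closing tag does not match

good_result = is_well_formed(good)
bad_result = is_well_formed(bad)
print(good_result, bad_result)
```

The first record is reported as well formed and the second is not, and, just as described above, that is all that happens.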

At FAO, I discovered that while the XML format is certainly very important, that’s not the fun part. The cool options come from something called XSL-Transformations and related tools that work with XML. So long as your XML document is well formed, then with XSL-Transformations, you can actually transform your XML record into anything you want.

Think about that for just a moment: anything you want.

So, I learned how to take an XML record (in my case, an AGRIS bibliographic record in XML format) and turn it into another format, as I did by turning it into MARC21; or I could make it into a PDF document, or an MS Word or OpenOffice document. I could transform XML-MARC records into web pages. I even found I could change batches of records into Excel sheets, where I could get some new views that helped with quality control.
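To give a feel for the idea of one source and many outputs, here is a small sketch. Please note the assumptions: it uses plain Python rather than XSL-Transformations, the record structure (`records`, `record`, `title`, `creator`, `year`) is invented for the example and is not the AGRIS or MARC schema, and the sample data are placeholders.

```python
import csv
import io
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical records with placeholder data (not a real schema).
RECORDS_XML = """<records>
  <record><title>Example title B</title><creator>Author Two</creator><year>2004</year></record>
  <record><title>Example title A</title><creator>Author One</creator><year>1998</year></record>
</records>"""

FIELDS = ("title", "creator", "year")


def parse_records(xml_text):
    """Read each <record> into a plain dictionary, one key per field."""
    root = ET.fromstring(xml_text)
    return [{f: (rec.findtext(f) or "") for f in FIELDS}
            for rec in root.iter("record")]


def to_html(records):
    """Render the records as a simple HTML list for a web page."""
    items = "".join(
        "<li><b>{}</b>, {} ({})</li>".format(
            escape(r["title"]), escape(r["creator"]), escape(r["year"]))
        for r in records)
    return "<ul>" + items + "</ul>"


def to_csv(records):
    """Render the same records as CSV text that a spreadsheet can open."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


records = parse_records(RECORDS_XML)
by_year = sorted(records, key=lambda r: r["year"])  # one of many possible re-sorts

print(to_html(records))
print(to_csv(by_year))
```

The source file is written once; each function is just another view of it. A real XSL-Transformation does the same job with a stylesheet instead of program code.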

The first few times I did it, I thought it was magic. I figured that probably I could have even converted those records into a movie if I had wanted to badly enough. What does this mean in the real world? As one example, a newspaper encoded in XML can be published simultaneously (or transformed) as a printed document and as a website by simply applying a new XSL-Transformation, and therefore the newspaper itself only has to be created one time. Once again, this same XML file could probably even be transformed into a video.

So, XML on its own is not very exciting, but when paired with complementary technologies such as XSL, it can provide radically novel displays and can sort and re-sort records in a whole variety of ways. I kept telling myself: with XML, you can transform the record into anything you want. This is not something that can be fully understood immediately since it is so expansive, and I am still coming to grips with it myself. Anything means anything.  Now, as an aside, I will admit that probably it isn’t really anything, but I think it’s important at this point to assume that it literally is anything, so that we can eventually find and learn the limitations.

There are other technologies based on XML documents that I have not worked with, such as XLink, XQuery, XForms and so on, but I am sure even these are not the end and that there will probably be other developments in the future. This is one reason why I believe it is vital for the library world to shift to XML formats of some kind, so that our records can be transformed. I believed (and still believe) that such a capability will represent a fundamental break with the previous cataloging traditions and will have profound consequences for librarians and others, probably both good and bad, many of which we cannot foresee today.

When you add all of this to the possibilities arising from the complementary power of modern browsers and other systems to display and actually bring in information from distant databases on the fly using what are called web services, where everything can display on a single computer screen, and where the user can interact with it in various ways, the possibilities are literally endless. While I didn’t know how to make all of these things I am talking about, I could do a little bit, and that little bit helped me to understand more, and consequently to imagine possibilities that had never occurred to me earlier. For those who are interested, I have added links to some simple videos about XML and web services to the transcript. I especially recommend “What is a Mashup?” from ZDNet, which shows how fun it can be http://www.youtube.com/watch?v=ZRcP2CZ8DS8 and IBM’s “An Introduction to XML: The Basics”, which is more technical, but I don’t think overly so http://www.youtube.com/watch?v=pPKV6dBZ5n0

Now that I was concentrating on digital resources, I saw their numbers increasing at an unheard-of rate. What were the consequences? In libraries, I had seen and heard stories of major reorganizations, but while I heard specifics in some cases concerning the reorganization of catalog departments, I did not hear anything at all about how the number of catalogers would increase. In fact, I heard the opposite. Slowly, slowly, these realities of how the creation of, and access to, information were changing began to work their way into my brain, which led me even more deeply into my Serious Doubts phase.

Although at that time I was much less involved in the U.S. cataloging world, there was discussion about it at FAO, and I gave some presentations on FRBR to my colleagues. While I did my very best to describe FRBR, my subconscious doubts began to rise to the surface: What does the creation of works/ expressions/ manifestations/ items accessed by their authors/ titles/ subjects have to do with the enormities of the problems I saw? We were being faced with an avalanche of information from everywhere at once, so it seemed that what should take primary importance was to increase the number of catalog records by a quantum factor in some way. Otherwise, although we can claim that we have some type of “control”, it will be control over a very quickly diminishing percentage of worthwhile resources, until it becomes practically infinitesimal and useless. How in the world could you keep a straight face when you declare that you have control?

But on the other hand, how could productivity be increased like that? Catalogers were working hard already, and no one was even suggesting that new catalogers would, or even could, be hired in numbers large enough to make an appreciable difference.

But I remembered that the problem did not seem to be with the quantity of metadata, but the quality, and so to me, a major part of any solution was obviously to work together somehow. Yet, I realized that this innocent-sounding little phrase “work together” held a vast number of consequences and troubles and fights that seemed so insoluble that I personally did not want to think about them.

How could everyone possibly work together? There wasn’t even agreement on what a manifestation was, so what did this portend for expressions and works and everything else? At the same time, when resources were available with just a click, for me it was pretty much irrelevant whether that click opened a PDF file or an HTML file that might include an image or even a video.

Another concept I learned at FAO is sustainability. It is very important in that institution, I believe it also exists in other fields, and in those fields where it does not exist, it should. In agriculture, sustainability is demonstrated in the saying we have all heard: give me a fish and I eat for a day, but teach me to fish and I eat for a lifetime. This is absolutely true, but the key is to realize that such an idea is not limited only to fishing or agriculture. In all fields, there are quick-fix solutions and long-term solutions. Quick-fix solutions may be necessary, but by definition, they deal only with emergencies and cannot be relied upon in the long term. Therefore, such solutions are not sustainable.

On the other hand, a long-term solution is just what it says: a solution that will not necessarily last until the end of time, but at least it will for the foreseeable future and consequently, such a solution is sustainable. To take a specific example, there can be emergency solutions for villages faced with a temporary shortfall of rain and the locals need water for simple survival, so everybody gets together and takes them water. But in the long term, if global warming is turning the area into a desert, other solutions, far more drastic, will be needed.

Emergency solutions normally do not engender too much resistance because after all, everyone is facing an emergency, but long-term solutions inevitably cause tensions because in those cases, people are contemplating genuine, unavoidable changes, that is, changes that will last, for all practical purposes, forever. In such situations, there will be winners, and there will be losers. The winners in the current situation may not be the same after the imposition of the long-term solutions, and they may fear that they will turn out to be the losers; therefore sometimes they will oppose the long-term solutions. Nevertheless, changes will be unavoidable, since in the example above, the desert is advancing inexorably, and it is vital to avoid an endless number of increasingly severe emergencies, each requiring any number of quick-fix emergency solutions, that in any case must all end up in failure and possible catastrophe.

After my time at FAO, I felt that sustainability should be an important concept for catalogs and cataloging, as well.

Around this time, fully in my Serious Doubts phase, I left FAO to become Director of the Library of the American University of Rome, a small, undergraduate institution. For the first time in my library career, I would begin to work both regularly and extensively with the public as a reference librarian, and almost immediately as I started work with undergraduates and faculty on their research, I fell into my Disillusionment phase, followed rather quickly by Despair.

At this point, I shall stop once again, and save the rest of my journey for yet another podcast, part 4. I will do my best to finish in the next installment, but I can’t promise anything because there is still quite a bit to cover: my experiences with the public, and seeing how they work with information and what they expect to be able to do with it, plus my struggles with Information Literacy, led me toward new phases. But I’ll talk about them later.

The music I have chosen to end this segment is an excerpt from a fun piece by Marco Uccellini, called Aria quinta sopra la Bergamasca, or the Fifth tune for the Bergomask, performed by the group Il Giardino Armonico. For everyone’s information, I discovered that a bergomask is a dance that made fun of the people from Bergamo, a city in northern Italy, who were supposed to be notoriously bad dancers. For example, in Shakespeare’s A Midsummer Night’s Dream, the clowns dance a bergomask. The piece here is an excerpt; if you would like to listen to the entire piece, the link is available from the transcript. http://www.youtube.com/watch?v=ca_9FpzcLds&feature=fvw

That’s it for now. Thank you for listening to “Cataloging Matters” with Jim Weinheimer, coming to you from Rome, Italy, the most beautiful, and the most romantic, city in the world.