Thursday, September 30, 2010

New Google Books Options (Was: Disappearing cities)

Posting to Autocat

On Wed, 29 Sep 2010 11:36:48 -0500, Ted P Gemberling wrote:
>I thought that was an interesting example of how keyword searching keeps one "stuck in the box" of individual character strings. Actually, I think it shows the need for controlled vocabulary. David Weinberger (in Everything is miscellaneous) thinks that keyword searching liberates us from structured search systems, but in reality, it puts us in a different box. Our new "box" is not the subject heading system we're using, but the text strings of our language. If you do a subject search in the SAF, you'll find that Sunrise, Florida has been set up as a heading since 1983. If you use LCSH, you're not going to confuse Sunrise with Sarasota.
>
>I'm not saying that we should stop doing keyword searches on Google. But as Karen suggested, libraries are the place to go if we can't find information that way.
I agree. There are many traditional tools that would be highly useful to the new services such as Google. In a lot of ways, Google admits this now, with the additional methods of refining the search that you find in the left-hand column, by dates, sites with images, shopping, and that eerie Wonder Wheel, which I find completely baffling!

Incidentally, I discovered something new in Google Books I haven't seen mentioned anywhere. If you do a search for e.g. the Venerable Bede: http://www.google.com/search?q=venerable+bede&hl=en&safe=off&tbo=1&tbs=bks:1,bkv:f,sbd:1&prmd=b&source=lnt&sa=X and look down the left hand column, you will see subjects extracted from what appears to be the metadata attached to the records. In this case, we can see what look like our own subject headings, e.g. Great Britain/ History/ Anglo Saxon period, 449-1066, along with what are perhaps BISAC descriptors, e.g. Gift books. When you click on one of these, it refines your original search, "venerable bede", by that term, e.g. venerable bede --> cathedrals. But if you then click on another subject from within that set of records, e.g. "Education", it appears to give you not venerable bede --> cathedrals --> education, but venerable bede --> education. Of course, I keep saying "appears" because I am not sure at the moment.
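For anyone who wants to compare these URLs, here is a minimal Python sketch that pulls the search terms and the "tbs" settings out of the address quoted above. The function name parse_tbs is my own, and what each setting (bks, bkv, sbd) actually means is only a guess from watching the interface; the code just splits them apart.

```python
from urllib.parse import urlparse, parse_qs

def parse_tbs(url):
    """Split the 'tbs' query parameter into a dict of key -> value."""
    query = parse_qs(urlparse(url).query)
    tbs = query.get("tbs", [""])[0]
    # Each setting looks like "key:value"; settings are separated by commas.
    pairs = (item.split(":", 1) for item in tbs.split(",") if ":" in item)
    return dict(pairs)

url = ("http://www.google.com/search?q=venerable+bede&hl=en&safe=off"
       "&tbo=1&tbs=bks:1,bkv:f,sbd:1&prmd=b&source=lnt&sa=X")

search_terms = parse_qs(urlparse(url).query)["q"][0]
settings = parse_tbs(url)
print(search_terms)
print(settings)
```

Running the same function on the URLs produced by the "full view" and "preview and full view" limits would show which settings change between displays.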

In this way, it seems they are creating a Worldcat-type interface, although so far it is only with subjects. It also doesn't work completely in every display, and the results are strange, e.g. a search for "julius caesar" provides only the subjects: drama, fiction, literary criticism. But if you limit it to "full view" (we can see much less full view in Italy than in the U.S.), the subjects switch to: gaul, great britain, rome. But if you limit to "preview and full view" you get: drama, fiction, history, literary criticism.

Rather incoherent at the moment.

Also, the url for Google Books was always books.google.com. Although this still works, you are switched over to a new url: http://www.google.com/search?tbs=bks:1

I just noticed this yesterday.

Wednesday, September 29, 2010

RE: Why do they hate us?

Posting to: Why do they Hate Us? from the Chronicle of Higher Education

I think the reason there is so much anger out there is because many people feel they are being taken. Let's face it: the real purpose of a university today, in spite of John Newman's missive, is to provide certification. While I am sure there are some spiritual seekers out there who really are in school for "education", "enlightenment", and so on, who will later be happy with their Master's or Ph.D in philosophy while they mop floors or sell socks, while they pay off their student loans over many years, I suspect there are many, many more who want to end up with a good job.

While the press is replete with stories about how people with a college degree are faring--not so well, but less badly--than those with only a high school degree, I wonder how skewed such statistics are in favor of higher education.

But beyond that, just imagine that after you have read those statistics, you glance over to see your unemployed son asleep on the couch, whose education cost you as much as a few brand-new cars, or even much more. I hope people will question those statistics that show how "well" people do with a higher education.

To quote Chico Marx: "Who you gonna believe, me or your own eyes?"

Disappearing cities

Posting to Autocat

“US city erased from Google maps”
http://news.bbc.co.uk/2/hi/programmes/world_news_america/9038870.stm

This is a report from the BBC about a town in Florida called Sunrise that "disappeared" from Google Maps. The effects on the town were devastating because Google Maps has become, apparently, one of the main ways that people find local businesses today.

The search that they show, "sunrise florida car dealers", illustrates how almost everyone searches today. It's very logical, but as this demonstrated, the result can easily go to a somewhat better known town (i.e. higher hit rate?), such as Sarasota, which has a street called Sunrise Boulevard. The reporter at the end of the news clip holds up a yellow pages telephone book for Sunrise, Florida, where someone would look for (I guess): Car dealers and hope that they would find a cross-reference to, perhaps, Automobiles-Retail. But let's be realistic about it: who is going to do that when so many people are using Google Maps? If you don't find something there, it may as well not exist; and if you are that car dealer in Sunrise, Florida who is losing business, you won't be too interested in theoretical disputes or someone saying, "Well, everybody just needs to be trained to use the yellow pages." They won't. Those days are over.

I think this is a great example of the disconnect between the traditional and the modern methods of searching and shows what people expect today. In this case, Google did one of their famous/infamous "tweaks" and "fixed" it. But it occurs to me that when they tweak it for one purpose, perhaps they mess it up for somebody else who may not even understand the problem. I wonder what the car dealers and florists located on Sunrise Boulevard in Sarasota, Florida would think about Google's tweak?

Incidentally, in some of the tools I have made for my patrons, I use several "canned searches" in Google, and I have discovered that when I check them, the nature of the results changes all the time, and I have to "re-tweak" my queries, too.

Monday, September 27, 2010

RE: OPAC design, philosophical

Posting to Autocat

On Fri, 24 Sep 2010 09:56:54 -0400, Murphy-Walters, Angela wrote:
>I wonder, where are the children, young adults, adults -- students, professors, researchers -- who want to learn? Are there really no longer enough people interested in learning, in mastering a subject or a topic within a subject, to make it worth our while to provide full catalog records and the systems to access them fully? Is it really true that every, or nearly every, student who walks into a library is interested in getting just enough material to complete an assignment without regard to what is best or most interesting? Do researchers, whether in academia, corporations, the media, etc., no longer want comprehensive information from which they can glean what is relevant for their work? That seems to be the argument made by many on Autocat and elsewhere the past few years and I find it very disheartening--if not frightening.
>
>There have always been people who will do the bare minimum needed to get by in their school and work assignments. Is it okay to simply do what is best for them and make it a hurculean chore for the very people who may stand a chance of improving our future to do real research? Is there no way that catalog librarians, despite limited resources, can serve the greater good?
While I agree with these sentiments, there is at base an assumption that must be analyzed, and the analysis may not lead to conclusions we would like. This assumption is that our tools provide better access than these other tools. Is this assumption correct? And in any case, we need to determine what "better" means. I agree a serious consideration of these, and related, assumptions may be disheartening and frightening--at least at first--but I think it is still absolutely necessary to do it.

In my own experience, I cannot conclude that I get "better" results using traditional library tools instead of newer tools. I get "different" results. I also cannot conclude that one is easier than the other: getting useful results from e.g. Google Scholar takes me a lot of work and time because there are no controls at all, whereas when I use a library catalog, I understand how it works.

This makes me quite different from a regular user of these tools, who rarely understands how a catalog works, and doesn't realize how difficult it really is to search the new tools, that is, until they are doing something serious such as research for a paper, and their favorite tools and methods begin to break down.

Concerning those people who do the bare minimum, I have several times mentioned a paper by Marcia Bates. I think it would be most efficient for me to just point to a former post of mine at: http://catalogingmatters.blogspot.com/2010/07/fw-trust-online-young-adults-evaluation.html, which is relevant to this topic and has a link, plus a quote where she summarized decades of research as the "Principle of least effort": "People do not just use information that is easy to find; they even use information they know to be of poor quality and less reliable--so long as it requires little effort to find--rather than using information they know to be of high quality and reliable, though harder to find."

The conclusion, in my opinion, is not to just throw up our hands and give up, but to make reliable, quality information much easier to get. Expecting the mass of people to work harder to get information, to learn how to use unwieldy systems and so on, is just unrealistic, I think. But there is a lot we can all do to make the information that is *really* available to people (i.e. not just through the library's local collection), much, much easier to find.

There are a whole number of discussions and arguments hidden within this statement, but I think it is essential to open them and realize that the traditional methods are not better than the newer ones, but they are different, and if correctly merged, could complement one another very nicely to provide something that is genuinely new *and* better than ever.

Friday, September 24, 2010

RE: Interpreting MARC: Where’s the Bibliographic Data?

Comment to Interpreting MARC: Where’s the Bibliographic Data? by Jason Thomale, Code4lib Journal, Issue 11, 2010-09-21

This is a great article, and I am sure I will refer to it repeatedly. But it does show something more about the differences between a cataloger and a programmer: the cataloger knows the rules and looks at the record as a complete entity, whereas the programmer looks at each field separately. So, in the records you show, you must go beyond the 245 field to get a true grasp of the situation. Here is an excerpt of a record from LC, taken from one of your examples:
==================================
100 1_ |a Bach, Johann Sebastian, |d 1685-1750.
240 10 |a Partitas, |m harpsichord, |n BWV 825, |r B major; |o arr.
245 10 |a Partita no. 1, BWV 825 |h [sound recording] ; |b Englische suite no. 3, BWV 808 = English suite = Suite anglaise ; Franzosische suite no. 2, BWV 813 = French suite = Suite francaise / |c Johann Sebastian Bach.

700 12 |a Bach, Johann Sebastian, |d 1685-1750. |t Englische Suiten. |n Nr. 3.
700 12 |a Bach, Johann Sebastian, |d 1685-1750. |t Franzosische Suiten, |n Nr. 2.
==================================

With our cataloging rules, the parsing you mention has always been performed manually, using separate 700 author/title analytic entries, plus the 240 field.
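To illustrate where the already-parsed titles live, here is a rough Python sketch that reads the display lines of the excerpt and collects the titles from the 240 and 700 fields, ignoring the 245 entirely. It assumes the "|a" style of subfield display used in the excerpt, not real MARC communications format, and the function names are my own.

```python
import re

# The controlled fields from the excerpt, as displayed (not raw MARC).
RECORD = """\
100 1_ |a Bach, Johann Sebastian, |d 1685-1750.
240 10 |a Partitas, |m harpsichord, |n BWV 825, |r B major; |o arr.
700 12 |a Bach, Johann Sebastian, |d 1685-1750. |t Englische Suiten. |n Nr. 3.
700 12 |a Bach, Johann Sebastian, |d 1685-1750. |t Franzosische Suiten, |n Nr. 2.
"""

def parse_line(line):
    """Return (tag, indicators, [(code, value), ...]) for one display line."""
    tag, indicators, rest = line.split(" ", 2)
    subfields = [(code, value.strip())
                 for code, value in re.findall(r"\|(\w) ([^|]*)", rest)]
    return tag, indicators, subfields

def controlled_titles(record):
    """Collect the titles carried by the 240 and 700 fields."""
    titles = []
    for line in record.splitlines():
        tag, _, subfields = parse_line(line)
        if tag == "240":
            # The whole 240 is the uniform title.
            titles.append(" ".join(value for _, value in subfields))
        elif tag == "700":
            # In an author/title added entry, |t and |n carry the title.
            titles.append(" ".join(value for code, value in subfields
                                   if code in ("t", "n")))
    return titles

titles = controlled_titles(RECORD)
for title in titles:
    print(title)
```

The three works on the recording come out as three separate titles without any parsing of the transcribed 245 at all.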

The fundamental purpose of the 245 field is not so much for search and retrieval, as to provide a reliable transcription of what appears on the title page, including all of the typos, etc. It is there for description and identification purposes.

Access to the item has always been through controlled fields. When keyword searching was introduced, it added to access in certain ways, but from a traditional viewpoint it also “threw a spanner in the works” in others.

But yes, your basic point is correct: in many ways, MARC records are very akin to textual markup language.

RE: Will RDA make OPAC design better?

Posting to Autocat

On Thu, 23 Sep 2010 14:25:30 -0600, Janet Hill wrote:
>Molti anni fa, I was working at Northwestern. And they were just organizing a media library. The folk responsible for the media collection wanted Cataloging to create skeleton records so that things could circulate, but they told us that no one ever searched by anything but title, so not to bother.
>
>But we couldn't help ourselves, so we provided real records, and traced directors, actors, screenwriters, composers (etc.) that appeared prominently.
>
>AND LO! Use of the material skyrocketed, AND the materials started being used by all kinds of people outside Film Studies ---- English, Drama, History, Art ....
>
>It turned out that "they" wanted to find materials in all sorts of different ways for all sorts of different reasons.
>
>Skeleton records waste money. They waste the money we spent on purchasing the materials in the first place, because they don't exploit the various facets of those materials.
This is correct, but it needs to continue and discuss what is possible today with the new tools. In the card catalog world, and through the days of early computerization, the rule was: "Catalog it once, do it right, move on." This was a correct attitude since there were very few instances when an institution could actually afford to pay people later to find the skeleton records, get the books from the shelves, research the books, update where possible and desirable, and finally put everything back. This required funding projects of significant size to create entire teams of people to redo the work of former days, and of course, all done at the expense of the new materials. Emphasis should be on the new materials, in order to avoid finding yourself in the absurd loop of having to do skeleton records for the new materials, then having to correct all of those records by still future projects, and so on ad infinitum. Therefore, if you wanted to maintain high standards and reliability, the above rule made lots of sense.

Still, what that meant for the public was that significant amounts of materials they wanted were lying uncataloged in dusty backlogs waiting for the catalogers to get to them, so it is understandable that the public wanted at least some kind of access instead of none at all, and as a result, there was a push for skeleton records, etc. etc.

But there are new possibilities today, such as systematic updates of selected records through queries of distant databases, and this age-old conundrum can begin to change. Today, it is *possible* to catalog a book in such a way that it can be tagged as a skeleton record, and if the local database is configured correctly, it can query outside union catalog(s) automatically to discover if some other library has updated the record. If so, the new information can be added to the local record, all without any human intervention.

There are an almost infinite number of ways of accomplishing something like this, but the fact is that today, the cataloging process *does not have to end* where it used to, and therefore, a catalog record, once it leaves the hands of the cataloger, can still be a dynamic entity, bringing information in from all over the place.

And yet, for this to have even the slightest chance of working efficiently, I believe that adherence to cataloging standards becomes even more critical than it has been in the past. For automatic searching and matching to occur correctly, the information in the local skeleton record must match what is in the remote database. It could be done through ISBN (which has known problems), perhaps something like the International Standard Text Code (ISTC) http://www.istc-international.org/html/ (but this is something brand new), or even better in my opinion: close adherence to ISBD, which is actually designed for optimal matching.
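As one of those many possible ways, here is a bare-bones Python sketch of the idea: a local record tagged as a skeleton is matched against a union catalog and, if a fuller record exists, is updated automatically. Everything here is hypothetical: the Record class, the enrich function, the placeholder ISBN and subject, and the union catalog, which is just a dictionary standing in for a remote database. Matching is done on ISBN only because it is the simplest to show, not because it is the best choice.

```python
from dataclasses import dataclass, field

@dataclass
class Record:
    isbn: str
    title: str
    subjects: list = field(default_factory=list)
    skeleton: bool = True  # tagged as a skeleton until fuller data arrives

def normalize_isbn(isbn):
    """Strip hyphens and spaces so the same ISBN always compares equal."""
    return isbn.replace("-", "").replace(" ", "").upper()

def enrich(local, union_catalog):
    """Copy fuller data into a local skeleton record when a remote match exists.

    Returns True if the local record was updated.
    """
    if not local.skeleton:
        return False
    remote = union_catalog.get(normalize_isbn(local.isbn))
    if remote is None or remote.skeleton:
        return False  # nothing better is available yet; try again later
    local.title = remote.title
    local.subjects = list(remote.subjects)
    local.skeleton = False
    return True

# Stand-in for a remote union catalog, keyed by normalized ISBN.
# The ISBN and subject below are placeholders, not real data.
union_catalog = {
    "0000000000": Record(isbn="0000000000",
                         title="Sister Carrie / Theodore Dreiser.",
                         subjects=["Placeholder subject heading"],
                         skeleton=False),
}

local = Record(isbn="0-00-000000-0", title="Sister Carrie")
changed = enrich(local, union_catalog)
print(changed, local)
```

The sketch only works because normalize_isbn makes the two ISBNs identical. That is the point about standards: whatever the match key is, both sides have to record it in the same way.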

The times, they are achangin'!

Thursday, September 23, 2010

RE: Interpreting MARC

Posting to Autocat
On Thu, 23 Sep 2010 03:14:40 -0400, Hal Cain wrote:
<snip>
>Well, I see it rather as a need to REGULARIZE the data and distinguish every separate element. RDA can do that, more or less; but RDA in MARC is not going to do it; not unless we formulate and push proposals for changing MARC. Comments of mine elsewhere about adding subfields brought the response that in some fields all available letters have already been used; and when I suggested moving to UPPERCASE letters, which is technically possible, I was told that was judged impracticable for cataloguers to use. Well, after getting abreast of fixed-field coding and $w in authority references, I personally think that is a very bad argument to justify continuing the kind of mishmash of data elements that Jason Thomale so clearly points to.
</snip>
AACR2 can do that right now, too--you don't need to change to RDA. There are several problems with MARC, as Jason Thomale's excellent article points out. One of the problems, as he described so ably, is that programmers are now used to seeing the meaning of the fields within the coding, e.g. <title> as opposed to 245a. But of course, 245a does not mean title, but rather title proper. 245b does not mean subtitle, but other title information. And 245c has that little addition: statement of responsibility, *etc.*, which conceals a whole number of things.

A programmer should not be expected to understand these things, so, even if we changed the coding of 245c to <StatementofResponsibilityEtc> it still wouldn't solve the problem of parsing everything out "correctly", as he pointed out very clearly.
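To make this concrete, here is a small Python sketch using the 245 field of a Bach sound recording from LC, one of the examples in Thomale's article. Relabelling the subfields does not help, because subfield b here holds two further titles, each with parallel titles; they can only be separated by reading the ISBD punctuation (" ; " between works, " = " before a parallel title). The function names are mine, and the splitting is an assumption that holds for this record only, not a general parser.

```python
import re

FIELD_245 = ("|a Partita no. 1, BWV 825 |h [sound recording] ; "
             "|b Englische suite no. 3, BWV 808 = English suite = Suite anglaise ; "
             "Franzosische suite no. 2, BWV 813 = French suite = Suite francaise / "
             "|c Johann Sebastian Bach.")

def subfields(field):
    """Map each subfield code to its text (assumes no repeated codes)."""
    return {code: value.strip()
            for code, value in re.findall(r"\|(\w) ([^|]*)", field)}

def titles_in_b(value):
    """Split subfield b on ISBD punctuation into works and parallel titles."""
    value = value.rstrip(" /")  # drop the slash that introduces subfield c
    return [work.split(" = ") for work in value.split(" ; ")]

parts = subfields(FIELD_245)
works = titles_in_b(parts["b"])
for work in works:
    print(work)
```

So "245b" in this record is neither a subtitle nor simply "other title information": it is two more works, in three languages each.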

Is the solution to implement even more fields, adding all the uppercase letters and perhaps including the entirety of the Unicode character set? I think this would make the problem far more complicated than it is now and the result would be something much worse and completely unwieldy. UNIMARC "solved" this by making many of their subfields repeatable, while adding a few: 200d Parallel Title Proper, and 200g Subsequent Statement of Responsibility. [See the Descriptive Information Block in the UNIMARC Manual at: http://archive.ifla.org/VI/3/p1996-1/uni2.htm]

Of course, neither FRBR nor RDA addresses either of these problems, but concluding that everything we have is a mishmash doesn't make sense to me since we know that what we have made has been very useful for lots of people. So what do we do?

My own suggestion is to not create a problem where there may not be any at all. For example, I haven't heard of a popular outcry against the problems Jason describes, except from other programmers. On the other hand, I think everybody who catalogs realizes that you are very often mixing title information into statement of responsibility. In spite of this, people have managed all right, and it seems to me that most people just search keyword anyway, since that is what they are used to in Google, online catalogs, databases, YouTube, and what have you. That's what I do and what I have seen everyone do, and then they narrow their search. For example, in Worldcat I can search for "Sister Carrie" by keyword, http://www.worldcat.org/search?qt=worldcat_org_all&q=sister+carrie and then narrow down my results in a whole bunch of ways, by format, by author, by date, etc. In just a second or two, I can narrow it down to books by Dreiser from 1954.

Google has different, but similar options available. http://www.google.com/search?q=sister+carrie
I hope this is how people are searching because it makes the most sense. You search for a huge batch of resources, and *the system* helps you narrow it down. It is lightyears more powerful than the traditional card catalog and is how I tell my patrons to search, if the options are available.
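The "search big, then let the system narrow it down" pattern is easy to sketch. In the Python below, the result set is made up for illustration (it is not real Worldcat data), and facet_counts and narrow are my own names, not anybody's API. The system counts the values found in each facet so the searcher can see what is there, then filters by whatever is chosen.

```python
from collections import Counter

# A made-up result set for a keyword search on "sister carrie".
RESULTS = [
    {"title": "Sister Carrie", "author": "Dreiser, Theodore",
     "format": "Book", "year": 1954},
    {"title": "Sister Carrie", "author": "Dreiser, Theodore",
     "format": "Book", "year": 1900},
    {"title": "Sister Carrie", "author": "Dreiser, Theodore",
     "format": "Audiobook", "year": 1998},
    {"title": "A study of Sister Carrie", "author": "Example, Critic",
     "format": "Book", "year": 1954},
]

def facet_counts(records, facet):
    """Count how many records carry each value of one facet."""
    return Counter(record[facet] for record in records)

def narrow(records, **chosen):
    """Keep only the records that match every chosen facet value."""
    return [record for record in records
            if all(record[key] == value for key, value in chosen.items())]

formats = facet_counts(RESULTS, "format")
hits = narrow(RESULTS, format="Book", author="Dreiser, Theodore", year=1954)
print(formats)
print(hits)
```

Note that the narrowing depends entirely on the author, format and date being recorded consistently in every record, which is where the cataloger's work comes back in.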

So my own question is: if we assume that these types of methods indicate the future of searching, are the problems that Jason described so well all that important? I think it is clear that MARC coding has to change, not for theoretical purposes but for practical purposes in the future. This--to use Darwinian terms--means it has to *adapt* to the new changes in the environment. What should this be?

RE: Will RDA make OPAC design better?

Posting to Autocat

On Wed, 22 Sep 2010 10:58:34 -0500, Scott Piepenburg wrote:
>RDA does not (not really), nor should it, address retrieval, manipulation, display, etc. of information in an ILS. That is a function of system design and what it is supposed to do. While RDA (or any cataloging code) has a hand in it by determining what MUST and/or SHOULD be in the data record, how that data is used is ultimately up to system designers and us as librarians.
and Carolyn Baker wrote:
> Does the library have what I want/need?
> Where is it?
> Can I check it out and take it home?

First, I want to thank Tony La Luzerne for his vote of confidence. I really appreciate it, and while I fear that he may be correct about "too little, too late," I believe it is far better to fail after your very best attempt than just to shake your head and say that it's hopeless.

Concerning the claim that RDA does not address issues of retrieval, display, etc.: although that may be its stated position, I don't believe it is correct. Embedded in RDA is the FRBR world view of work/expression/manifestation/item (WEMI) using controlled access points for authors/titles/subjects. An embedded world view occurs also in MARC21, where we see the world view of AACR2, which in turn is based on the cards in the card catalog (e.g. single main entry, as only one example). RDA provides no more access than what we have right now (aside from the elimination of the rule of three, which I am not sure is such a great idea, and which anyway could just as easily have been done through a new Rule Interpretation).

RDA is designed to present the FRBR world view (WEMI), which we must admit, in the ultimate scheme of things, is not that much different from what we have right now, except it essentially devalues or even eliminates the unit record, and in its stead envisions a more coherent multiple display of different variants; a display that is extremely reminiscent of those found in 19th century printed catalogs. But in the end, even if we assume that our patrons really do want to find/identify/select/obtain works/expressions/manifestations/items by their authors/titles/subjects, we must recognize that they can do *precisely the same things right now*, today. The difference in FRBR and RDA is in the multiple display. In fact, I suspect that current MARC records, encoded in XML, could probably generate these displays now. At a superficial level I see nothing preventing it, but I could be proven wrong.
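As a very superficial test of that suspicion, here is a Python sketch that takes flat, unit-style records and groups them into a work-level display using nothing but the main entry (100) and the uniform title (240), falling back on the 245 when there is no 240. The sample records are invented, the dictionary keys simply borrow MARC tag numbers, and real clustering (expressions, translations, records without main entries) would be far more involved than this.

```python
from collections import defaultdict

# Invented unit records; keys borrow MARC tags for readability.
RECORDS = [
    {"100": "Homer.", "240": "Iliad", "245": "The Iliad", "date": "1950"},
    {"100": "Homer.", "240": "Iliad", "245": "The Iliad of Homer", "date": "1990"},
    {"100": "Dreiser, Theodore.", "245": "Sister Carrie", "date": "1954"},
]

def work_key(record):
    """Identify the work by main entry plus uniform title (or title)."""
    return (record["100"], record.get("240") or record["245"])

def group_by_work(records):
    """Gather unit records under the work they belong to."""
    groups = defaultdict(list)
    for record in records:
        groups[work_key(record)].append(record)
    return dict(groups)

groups = group_by_work(RECORDS)
for (author, work), versions in groups.items():
    print(f"{author} {work}")
    for version in sorted(versions, key=lambda r: r["date"]):
        print(f"    {version['245']} ({version['date']})")
```

The grouping only works where the 240 has been assigned consistently, so the multiple display depends on the same controlled access points we already create.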

Concerning Carolyn Baker's:
> Does the library have what I want/need?
> Where is it?
> Can I check it out and take it home?

This is true to a point. But additionally, (at least I believe) the catalog should allow the patron to explore the intellectual contents of the library. This means that we must move further back in the process: I'm not even sure of what I want or need; can someone or something help? This is where the subject heading arrangement that allows for all kinds of exploration of concepts through its subdivisions, plus the syndetic structure of the main headings (BT, NT, RT, and notes) could be much more valuable to the public than it is now. So, I think I may be interested in "Dogs" but discover that what I am really interested in is "Dogs--War use--Vietnam--History--20th century." or "Hiking with dogs." In this way, the card catalog arrangement helped us to think. This undoubted power of the card catalog, which can be easily demonstrated, is something I have not encountered anywhere else and has been lost in the conversion to electronic form because it has not been adapted to the new technology. This is something that needs to be reconsidered from the beginning, in order to bring back a power that, I think, is sorely missed.
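A toy Python sketch of that kind of exploration follows. The three headings are the ones mentioned above; the related-term link between "Dogs" and "Hiking with dogs" is an assumption made only for illustration and is not taken from the authority file. The explore function is my own invention: given a term, it shows the subdivided headings filed under it plus any related headings, which is roughly what browsing the card drawer did for us.

```python
HEADINGS = [
    "Dogs",
    "Dogs--War use--Vietnam--History--20th century",
    "Hiking with dogs",
]

# Hypothetical related-term (RT) links, for illustration only.
RELATED = {"Dogs": ["Hiking with dogs"]}

def subdivisions(heading):
    """Break a heading string into its main heading and subdivisions."""
    return heading.split("--")

def explore(term, headings=HEADINGS, related=RELATED):
    """Show what a searcher would find filed under and around a term."""
    subdivided = sorted(h for h in headings if h.startswith(term + "--"))
    return {"subdivided": subdivided, "related": related.get(term, [])}

found = explore("Dogs")
print(found["subdivided"])
print(found["related"])
print(subdivisions(found["subdivided"][0]))
```

A keyword search on "dogs" would retrieve all three headings in no particular order; the arrangement is what suggests the next question to ask.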

But beyond this, we must reconsider the question: what does the library "have" today? Is it only the materials that the library pays for, i.e. the books on the shelves and the expensive databases? If our catalogs allow people to search only the materials we pay for, they are automatically limiting. Any person who uses Google will see this immediately, and the result is that many conclude our tools are defective.

Of course, including resources that are essentially out of the library's control (on the world "wild" web), means changes in selection, workflow, reference, etc. Yet, I think it is inevitable, since these are the directions the world is taking.

Wednesday, September 22, 2010

RE: Whiny and demanding; rude and arrogant; clueless and uninformed

Posting to NGC4LIB

Mitchell, Michael wrote:
<snip>
Jim,
Perhaps if you held Tech Services and Systems people in higher regard you'd be able to provide your students with off-campus access to your academic databases. I'm sure Tech Services and Systems could easily be able to provide that if they wanted. How 20th century is "These databases work only from on campus"? For someone who's so eager to cast stones, you certainly have room to grow in your own library's current offerings. Or does only the future count for the elites? If you can't or won't max out today's systems how are you going to have any clues for the future? You can hardly talk about student information needs and wants if you're artificially hindering their access. I'm reminded of standing in a little used section of one of my first libraries with my library director. I was looking at the few pitiful outdated books on the shelf and suggested we should get some more newer books. Of course she said, "Well nobody every uses this section" and I observed, "No wonder." Time to get some imagination for the present too I'd say.
</snip>
Interesting that you chose these points. You assume that I do not hold Tech Services and Systems people in high regard, apparently because I raise some difficult points. I maintain that I hold Tech Services and Systems in very high regard, and the majority of my personal work experience is precisely there. I am merely making some observations from a long-time insider's viewpoint that may make others uncomfortable: I see that real, fundamental change is inevitable. And that goes for everyone: deep changes for you and for me and for everyone else. Many, if not most, of these changes will be very unwelcome to many of us. That is a fact, and cannot be ignored.

In your example of the outdated books on the shelves and the fact that they are not used, you are absolutely correct, but buying new books is not the solution, although it may have been the only one possible 20 years ago. Today, there are possibilities of getting videos of book talks, open archive materials, full-text in Google Books and many other digital projects, cooperatively built sites, academic blogs, think tank publications, entire courses from MIT or Berkeley, and on and on. Those things are just as important as any book to my patrons, and many of those materials are much more exciting, dynamic, and up to date than a new book. So yes, we need books (still), but that in no way ends the issue and in many ways, is only the beginning. The situation is clear: we can either try to get some level of control over these materials, and that is the beginning of a long discussion, or we can say we can not or will not control these materials, and leave them to the Googles and Yahoos. I think choosing the second option is tantamount to eventual oblivion.

Concerning the off-campus access you mention, that is also very interesting.

If you really need to know, I work in a tiny university with a tiny library. When I worked at Princeton University for many years, and at the United Nations here, it was great to rely on lots of experts in all kinds of areas. But now, I am in a tiny university library that has a very small collection. What in the world would compel me to do such a thing? There are many reasons that I will not go into in public, but one is that it has always seemed to me that the growth of the web provides the greatest opportunities and challenges for the *smallest* collections. That is, the existence of web materials for, e.g. Princeton or Harvard is often simply making a new format available of something they already have. For small libraries however, it can literally be the difference between night and day. Incidentally, my library has been battered terribly in the last couple of years by the economic crisis, and I am not alone in this.

People who have worked only in larger libraries may believe that working in a smaller one is easier, but I assure you that this is totally untrue: you have to be able to do everything, or almost. But no matter what, you can't do everything. In my case, I can catalog, I can do reference, I can select, I can run the library, I can manage staff, I can do budgets, I can do some rather simple programming and other computer tasks, etc. but I cannot set up a proxy server. For systems people out there, something like this may be elementary, but for me, it is something I cannot do. This is beyond me and I need other areas of the organization to help me, but those areas are also under tremendous pressure. So, just as other places have backlogs, I have my own problems.

In spite of this, because of the limited technical knowledge I do have, I have managed to build upon an open source catalog (Koha) to create something that the computer experts at Princeton had actually proclaimed "impossible to create for 100 years" when I described it to them: my Extend Search, which, however imperfect, still makes hundreds of thousands or even millions of worthwhile materials available as easily as I have been able to figure out. In fact, we even got our Middle States Accreditation in March of this year, which demands a *lot* of work, I assure you. So, at least at one very important level, my Extend Search has proven itself.

But in any case: yes, I am trying to get the relevant people to set up a proxy server. Yet, this is the reality of working in a small organization.

RE: Whiny and demanding; rude and arrogant; clueless and uninformed

Posting to NGC4LIB

A short reply back:

Daniel CannCasciato wrote:
<snip>
> What is my evidence that catalogers don't have the imagination? They *STILL* insist that people primarily want . . .

that isn't evidence, since there's no data out there proving your assertion. Take a browse into the RDA or Autocat discussions on the matter and it would appear the assertion is baseless.
</snip>

Which proves my point: you can't prove anything like this using only catalogers. Autocat and RDA are lists for catalogers. You must go outside. Please point me in the direction of any kind of research that says that our patrons really do want the FRBR kind of access so desperately, as opposed to other needs. One article I find particularly interesting is: "The Pragmatic Basis of Catalog Codes: Has the User Been Ignored?" Jon R. Hufford http://wendolene.tosm.ttu.edu/bitstream/handle/2346/510/fulltext.pdf?sequence=1 I agree with the conclusions, but will point out that the time when user needs were really debated was during that Royal Commission that we all learned about in library school, where Panizzi defeated everyone. After that, later catalogers pointed back to the Commission, as if the matter were settled for all time. I think it is long past time to genuinely open up the debate again, and assume that we know nothing. Apparently however, such a stance is considered by many to be just too controversial.

As I have pointed out many times: the debate is being reopened now, but much of it is not necessarily in the library field. A lot of research is being done now concerning scholarly communication; but I think that it is even more important for people to just sit with a regular user and do some reference work for a while. Simply talk to people who need information and try to help them. It is very difficult, and incredibly enlightening. I have learned a lot from every patron: both negative and positive. Unless you are really lucky and you get a day where everybody just happens to ask for editions of something that we already know about (What is the latest version you have of the Iliad?), you will discover for yourself how little use FRBR is for answering real, everyday questions. Reference librarians see it every day, and they have known it for a long time already.

This is a sad fact to accept. The reason I point out feelings is that this realization has hurt my own feelings. After all, I have spent years working on improving my cataloging abilities, and this is very hard to accept, but it is a fact. Once we face up to it though, the question becomes: where do we go from here? In my own experience, I think that's when we become free. That is where we discover that our skills and our records are, and still can be, immensely useful (or at least, I hope so), but we must repurpose them. It sounds as if Christine Schwartz is a great example in this way. I agree with her completely. I have tried my best to follow the same track.

My next podcast will probably prove interesting.

RE: Whiny and demanding; rude and arrogant; clueless and uninformed

Posting to NGC4LIB

Daniel CannCasciato wrote:
<snip>
Jim Weinheimer wrote in part:
> At the risk of being too blunt and making myself the object of general derision, I think that at the level these people will be discussing, perhaps it would be best that if technical services people attend, they should only be there as observers ... what is needed now is *imagination* and this imagination should not be limited by what the technicians immediately consider to be impracticable.
> . . .
> After all, when the Ferrari Formula 1 racing team is figuring out what they need to do to win races, they do not want the mechanics saying, "Well, that just can't be done."
I sincerely doubt both those premises. Cataloging doesn't make one unimaginative, just as being a mechanic doesn't. (And I bet engineers do want to know what works in the field and what doesn't, especially in a time-constrained environment such as racing.) Actually, I'd think systems folks are more akin to the mechanics in this analogy while catalogers are more akin to engineers, but either way I'm against a blanket elitist exclusion of thousands of practitioners by using them as scapegoats for others failures. In the case of the catalog service, it has been failed by leadership in librarianship -- collectively we've let our software under-utilize and under-serve our patrons. Whatever we need, I doubt that fewer informed ideas is the answer for success.
</snip>
Well, I figured that I would hurt everybody's feelings, but the beginning post & title of this thread (that everybody outside of tech services is whiny, etc.) plus the idea that technical services people should be there so that they can nip "impracticable" ideas in the bud just did not (and still does not) seem to be a step forward.

What is my evidence that catalogers don't have the imagination? They *STILL* insist that people primarily want to find/identify/select/obtain works/expressions/manifestations/items by their authors/titles/subjects. And yet, that is EXACTLY what people can do right now in our catalogs, what they have been able to do since the 19th century, and our patrons are turning away!

In fact, catalogers believe people want the FRBR user tasks so much that they are willing to spend scads of money to redo all their cataloging rules, rebuild systems and even split the library world in order to achieve it. And they do this without doing any research into what their clientele base needs or wants. Even questioning this wisdom is considered practically outrageous. I am not the only one who believes this rather obvious fact, but I am willing to put my opinions out there. At some point, the cataloging community is going to have to face up to this.

As I have pointed out many times before, catalogers are immensely important, but they are not trained to think in ways such as, "Should there be a system that allows dbpedia to interoperate with our authority files to make a better searching experience for our patrons?" or "How can we use Google Trends & Analytics to find out how our tools can fit in better?" or "How can we completely rethink workflow to include RSS feeds for automatic updates?"

Catalogers think in different ways: in ways that follow the current standards, such as: I have this person's name appearing in three different forms on this item. Everything also conflicts with names already set up in the NAF. What do I do?

That is their job. It is very important and difficult. It is not to come up with novel ways outside the standards, just as it is not the job of a mechanic to come up with some novelty that is outside of the way a car is supposed to work. That's why standards exist: to be followed so that the entire system can function. Any cataloger who would say, "Well, I'll set up the heading this way, just because I like it better than the other way, and mine is cooler" should be fired on the spot.

You can follow standards unthinkingly, routinely, methodically, or brilliantly. Believe me, sometimes it takes a lot of imagination to solve those sorts of cataloging questions, but it's different. Catalogers looking at the questions above will think, "How can I do this according to AACR2 when all the forms are different?" But on another level, that is not the problem. Once it is decided to implement something like building a dbpedia-authority file interoperability module (which should have been worked on for quite some time now), catalogers will become much more important.

I hate to hurt people's feelings, but what is important for this conference is *not* adherence to the standards, which is what catalogers are very good at, or at least they should be.

I take very serious issue with:
<snip>
...but either way I'm against a blanket elitist exclusion of thousands of practitioners by using them as scapegoats for others failures.
</snip>

We must understand that the failure lies with us: not with others. The library catalog, as it exists today, is obsolete. That may be very tough to accept, but until this simple fact is accepted, there can be little advancement. Yet, I want to emphasize once again, that this is a failure of *the catalog* and not of *the catalog records* which currently display--I believe--only a tiny fraction of their potential power. Still, we must get away from the FRBR user needs straitjacket if there is to be a chance.

But I maintain that nothing, absolutely nothing, that is suggested now can be considered "impracticable" until after exhaustive tests are made. What seems to be "impracticable" can turn out to be child's play with just a slightly different point of view.

Naturally, there are catalogers who can be more open to what people really want, which is what this conference is about. I like to consider myself among their numbers, but perhaps I am shot, too.

Tuesday, September 21, 2010

RE: Whiny and demanding; rude and arrogant; clueless and uninformed

Posting to NGC4LIB
Janet Hill wrote:
<snip>
You'd better include technical services folks (catalogers, metadata specialists, whatever you want to call them) in your collaborative group, since they are the ones who are most intimately acquainted with the content, various ways to manipulate it, what kinds of things it can do to display, and in some cases which things are impracticable (though possibly desirable), and which cool things might be possible (if only .....)
</snip>

At the risk of being too blunt and making myself the object of general derision, I think that at the level these people will be discussing, perhaps it would be best that if technical services people attend, they should only be there as observers, that is, to sit, watch, and listen. It seems obvious to me that what is needed now is *imagination* and this imagination should not be limited by what the technicians immediately consider to be impracticable. With the power of modern tools, which can bring in content from myriads of different sites and databases, not only from the traditional tools created by libraries, but including new cooperative tools created by non-librarians, often by true experts in the fields, e.g. musicologists, cartographers, publishers, agronomists, astronomers, physicists, and yes--even the public at large, while the importing itself can be done in a whole variety of ways--it will take a very long time of experimentation, trial and error, and so forth, to reach the point where anyone could truly declare a desirable function to be "impracticable".

Far more could be done with the tools that exist on the web right now--right this second, but what we need are both an exceptionally wide vision that goes far beyond the traditional tools, and genuinely fresh ideas (which abound all over the web). Our focus should be on building the tools people want and assume that nothing is impracticable until it is proven to be so. Even in those cases when it is determined to be impracticable to implement something completely, perhaps 80% could be achieved, or a new tool could appear tomorrow that would make it all possible.

After all, when the Ferrari Formula 1 racing team is figuring out what they need to do to win races, they do not want the mechanics saying, "Well, that just can't be done." It is the mechanics' job to try their utmost to implement what the drivers and engineers determine is needed, and that means a lot of highly innovative thinking on everyone's part.

Otherwise, they do not win races and they may as well all go home.

Saturday, September 18, 2010

RE: An earlier time of transition in our profession

Posting to Autocat

Classified catalogs can be very powerful, but they always had to have supplementary indexes. A quick Google Books search provides several examples, e.g. http://books.google.com/books?id=zpAIAAAAQAAJ "Catalogue of the Liverpool Library" (James Smith, 1814), the copy from the Bodleian.

The main descriptions (very light) are arranged in a classified manner, and you browse everything under the main heading. For example, I am looking at p. 216, Education, where the individual books are arranged alphabetically by a type of catchword title. When (I suspect) the main title is more or less copied as on the book, the catchword begins with the possessive form of the author's last name, as for instance "Chesterfield's letters to his son"; but when the catchword reflects the title less perfectly, they change it to a form such as "Cornish on classical learning", which may not match the title of the original item very closely at all. Of course, it would take time to check whether my supposition is correct. On p. 337, you have an actual author index (supplemented by anonymous authors). Finally, there is a list of the classes on p. 46. These supplements are critical.

But that was not the only way to do it, and if you look at the "Catalogue of the San Francisco Mercantile Library" (1854) http://books.google.com/books?id=4pkQAAAAIAAJ, you will see precisely the opposite arrangement: the main descriptions (more substantial) of the books here are arranged under author (main entry), while the "Analytical Catalogue" (i.e. the classified arrangement) starts on p. 137.

With card catalogs, things changed a bit again. In the U.S., libraries tended to arrange their cards in alphabetical arrangements because they were seen to be serving the common people more than an educated elite. Therefore, a regular person may be interested in Dogs but not have a clue where to begin browsing for it in a taxonomic arrangement. So, they put it under "D", while the arrangement of books themselves served as the classified arrangement for the public, with the shelflist--which was mostly off-limits to the public--serving as the old classed catalog. Many European libraries opted for classified catalogs with indexes. A lot of the reason for this was that most were closed stacks anyway. Both methods had their advantages and disadvantages.

While all this may be interesting historically, it is history, and that is where it should remain. These historical considerations should be there to help us imagine new possibilities but not to limit us in any way. I am sure there were many other arrangements of the bibliographic descriptions out there just waiting to be found, and they should be brought to light, so that they could help us imagine even more possibilities.

With today's tools, there are many, many ways of arranging, resorting, redoing everything, so multiple arrangements of the same materials are possible. So, in a correctly configured system, you could arrange the same materials by Dewey arrangement, LC, Bliss, Colon, UDC, any arrangement the patron would want, while new arrangements could probably even be made on the fly. For example, the Mendeley software (free! http://www.mendeley.com/) will take the documents on your topics, do a semantic analysis of them, and do the searches for you, along with social networking and other incredible things. It's not so great yet, I don't believe, but it does have promise.
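As a minimal sketch of what "multiple arrangements of the same materials" could mean in practice, the Python below re-sorts one small set of records by whichever classification scheme is asked for. Everything in it is an assumption made for illustration: the record structure, the field names `ddc` and `lcc`, and the class numbers themselves are placeholders, and plain string sorting stands in for the scheme-specific filing rules a real system would need.

```python
# Hypothetical records: each one carries class numbers from more than one scheme.
# The numbers are illustrative placeholders, not checked against the real schedules.
RECORDS = [
    {"title": "A book about dogs", "ddc": "636.7", "lcc": "SF427"},
    {"title": "A book about education", "ddc": "370", "lcc": "LB1025"},
    {"title": "A book about Anglo-Saxon England", "ddc": "942.01", "lcc": "DA152"},
]


def arrange(records, scheme):
    """Return the records ordered by the class number of the chosen scheme.

    Ordering is by plain string comparison, which is a simplification:
    real call numbers need scheme-specific parsing to file correctly.
    """
    return sorted(records, key=lambda record: record[scheme])


def titles(records):
    """Return just the titles, in the order given."""
    return [record["title"] for record in records]


if __name__ == "__main__":
    # The same three records, shown in two different classified arrangements.
    for scheme in ("ddc", "lcc"):
        print(scheme.upper(), titles(arrange(RECORDS, scheme)))
```

The point of the sketch is only that the arrangement is a view chosen at display time, not a property fixed in the records; adding another scheme would mean adding another field and nothing else.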

I have gone back and forth over the value of classification for online resources. Currently, I think it could be fabulously useful. While some out there would probably argue that classification is old hat now, I think that if it is to be useful, it could (and must) be taken to a new level using these powerful tools. How something like Mendeley and DDC, LCC, UDC, Bliss, etc. numbers would work together, I don't know, but I think it could improve matters for everybody.

Wednesday, September 15, 2010

Cataloging Matters Podcast no. 4: The Functional Requirements for Bibliographic Records a personal journey Part 2

The Functional Requirements for Bibliographic Records
a personal journey
Part 2

Go to Part 1

Transcript

Hello everyone. My name is Jim Weinheimer and welcome to Cataloging Matters, a series of podcasts about the future of libraries and cataloging, coming to you from the most beautiful, and the most romantic city in the world, Rome, Italy.

In the previous podcast I began a description of my own personal journey with FRBR, and as it wound up being quite a bit longer than I had anticipated, I decided to stop and continue it with a second part. For those of you who have not yet heard the first part, I believe that this installment will not make nearly as much sense without the first one, so if you are interested, I would strongly encourage you to listen to that one first. The link is available from the transcript.

Before I continue describing my journey however, I believe I need to explain something about the approach I have chosen, since I fear that otherwise, I may be misunderstood.

While I fully realize that such a personal approach may seem strange, out of place, and perhaps even uncomfortable when discussing something as esoteric as library catalogs, I believe that being honest is very important in these matters. Besides, it is vital to keep in mind that the argument over the future of information storage, retrieval, and use is not only a technical discussion, it is necessarily a political issue.

What do I mean? In this regard, I believe that many of the events of the last few years bear out the wisdom of what James Madison wrote as far back as 1822 in his letter to William Barry: (and I quote)
“A popular Government without popular information or the means of acquiring it, is but a Prologue to a Farce or a Tragedy or perhaps both. Knowledge will forever govern ignorance, and a people who mean to be their own Governors, must arm themselves with the power which knowledge gives.”
(end of quote)

I would like to continue the line of reasoning from this famous quote of Madison: If information equals power, then in many ways for the vast majority of human history, this has meant having access to, and the use of, libraries: whether those libraries were ancient, medieval, or modern, subscription, public, government, or private, or whatever form they happened to take. Consider also that by definition, people who live under dictators or absolute monarchs have no power, and consequently, these unfortunate people have little need for information since their wishes and opinions make no difference in the scheme of things anyway. But when those people decide to take the power, as they have in our modern republics, then everything changes and every single person is faced with certain responsibilities: the individual citizens have the responsibility to inform themselves on the issues of importance, and also, the organized mass of people, primarily through their governments, have the responsibility to ensure that the citizens have the means to inform themselves.

How can anyone possibly be expected to vote even half-way intelligently if they don’t have access to some kind of reliable information that is superior to whatever gossip and superstition is available locally, or if they simply vote as some local political boss or populist journalist says? If that happens and the people do not have enough information to make a reasoned decision, that is when it all becomes a farce or a tragedy, or perhaps both.

In the United States, as in other countries, the public library movement became absolutely critical to the development of a modern democratic society, so that the citizens could understand enough to be able to make informed choices.

And yet, while thinking about Madison’s quote in this way may make us feel good about our forms of government and our forms of society, it doesn’t exhaust the possibilities in today’s world: we are rediscovering that the issue of access to information is intensely controversial not only in national politics, where we hear different political groups today claim that the media is slanted toward the views of “the other side”, but concerns toward information are penetrating ever more subtly into people’s lives, and can even have major international consequences--true, sometimes with a strange, postmodernist quality to them. To bring up only one well-known example, I am sure many of you remember those cartoons satirizing Muhammad that were published in a Danish newspaper a few years ago. For those who would like some more information, in the transcript, I have posted a link to the incident in Wikipedia. [http://en.wikipedia.org/wiki/Jyllands-Posten_Muhammad_cartoons_controversy]

I would like to set aside the moral, legal and cultural issues of the cartoons for the moment, and only mention that if those same cartoons had come out in 1955, probably no one outside of the metropolitan area where the newspaper was published would have known about them since everything would have been locked away in newsprint; a few copies would have wound up in libraries, and afterwards they would have been so exceedingly difficult to access, or even know about, they would have been forgotten quickly; but when the cartoons were published on the Internet 50 years later in 2005, the results were immediate: riots in various countries on the other side of the world; Danish embassies were stormed; people were killed; and a huge international scandal ensued, with aftershocks felt even today. In the transcript, I put a link to the latest news stories dealing with the cartoons in Google News. http://www.google.com/search?q=muhammad%20cartoons%20denmark&tbs=nws:1&source=og&sa=N&hl=en

I mention this incident not to blame or find fault with anyone, but to simply provide an example to show how truly complicated such matters are today, and that the lesson we learn is the simple fact that for better or worse, the Internet has changed the age-old information structures forever. Because concerns over information and access to it are changing in such fundamental ways, it turns out that when we debate issues of information storage, access, and retrieval, we find ourselves discussing changes in societal power, and there could very well be far greater consequences than what we can imagine now.

Even though it may be true that the vast majority of Internet users currently spend their time watching the latest YouTube video of some boy wiping out on his skateboard, there are also new types of information sharing going on that are of greater significance and that will have far more profound consequences. For most of human history, libraries were the hubs of information sharing, but as technology forces changes to this traditional structure, we are witnessing changes that are out of our control in many ways.

So, I believe that any discussion of libraries and access to information is not merely a dry, theoretical exercise without consequences, but the discussion is actually very practical, and when solutions are adopted by libraries, which have traditionally been the focal point for information, they will have widespread consequences for human society, whether what we make succeeds and libraries flourish, or our tools fail and society discovers that it has less and less need for libraries and our skills, and thereafter relies on the Googles and Yahoos and whatever else appears on the web. Whatever the results, I believe that such a discussion is very, very important to everyone.


With that out of the way, let’s return to my journey, and recap the twelve-step process I experienced:
Determination (to understand it)
--Incomprehension (not understanding anything)
----Humiliation (not telling anyone about my incomprehension)
------Renewed Determination (to understand it)
--------Joy (at the first glimmers of understanding)
----------Comprehension (full success)
------------Consternation (the first questions)
--------------Serious questioning
----------------Serious doubts
------------------Disillusionment
--------------------Despair
----------------------Hope

After experiencing initial failure in understanding FRBR, I had overcome my problems and entered the stage of Comprehension, and felt pretty good about it. But what effects did entering the Comprehension stage have on me? Actually, not much at all, since nothing practical existed and everything was just completely theoretical. I may have given a presentation or two to my cataloging colleagues about FRBR, but everything went on as normal.

As I remember it now, my feelings toward FRBR at that time were associated primarily with relief that I actually understood it. In more specific terms, I didn’t have problems with the group 2 or 3 entities--they were just about the same as the names and subjects we already had, but those doubts about the group 1 entities that had been nagging at me from the beginning remained at a barely conscious level. It was only quite a bit later I realized that down deep, I was thinking: “Everything has a work? Everything has an expression? Really? What does this mean?” But consciously, I was happy that nothing much was different.

It turns out that my initial serious questions were not directed toward FRBR itself, but rather toward general cataloging practice dealing with copy vs. edition, or what FRBR calls item or manifestation. Essentially, it comes down to a very practical question: “I have this ‘thing’ that I need to add to the collection. Is this thing something new, and if so, I need to make a new record, or is this ‘thing’ a copy of some other thing that already has a record in the catalog, and so I do not make a new record?” It almost goes without saying that this is the most basic question, i.e. Do I make a new record or not?, that must be answered before a cataloger can begin to do anything at all.

To explain once again, the manifestation or edition is supposed to describe “the thing you are holding in your hands”, and this is also the point of ISBD, i.e. the international standards detailing how to describe an item. This standard also implicitly defines variant manifestations by telling us exactly what aspects of the item we must choose to describe and how to do it. I understood this and was fine with it, but I had seen myself how updates to some of the rules resulted in some major changes. For instance, I remember the consequences of the update to LCRI 2.5B9, about counting plates, which now says “If the leaves or pages of plates are unnumbered, give the number only when the plates clearly represent an important feature of the book. Otherwise, generally do not count unnumbered leaves or pages of plates.” Before this update, we had always counted the unnumbered plates.

Clearly, this was to save cataloger time for greater productivity and is simple enough, but it turned out that at just about that same moment, I had a book with unnumbered plates (I believe it was “The New Russians” by Hedrick Smith, issued both with plates and without), and with the updated LCRI, I was not supposed to add the plates to the record since they were not “clearly important”. Consequently, what on one day would have been a new edition or manifestation suddenly became, on the next day, a copy or item. I also considered highly dubious the update to LCRI 2.5B8 to use 1 v. (various pagings) for complex paging much more often than before, which would have to lead to the same consequences as not counting the plates and thus merge what had previously been different manifestations.
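The effect of such a rule change can be shown with a small sketch in Python. It is only an illustration under stated assumptions: the `Book` fields, the two matching functions, and the sample data are all hypothetical, and real cataloging decisions weigh far more than these five elements. What it shows is that the answer to "new record or copy?" depends on which elements the rules tell us to record, not only on the books themselves.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Book:
    """A hypothetical, much reduced description of the thing in hand."""
    title: str
    publisher: str
    year: int
    pages: int
    unnumbered_plates: int          # leaves or pages of plates with no numbering
    plates_important: bool = False  # cataloger's judgment under the updated rule


def key_old_rule(book):
    """Before the update: unnumbered plates are always counted and recorded."""
    return (book.title, book.publisher, book.year, book.pages,
            book.unnumbered_plates)


def key_new_rule(book):
    """After the update: unnumbered plates are recorded only when important."""
    plates = book.unnumbered_plates if book.plates_important else 0
    return (book.title, book.publisher, book.year, book.pages, plates)


def count_manifestations(books, key):
    """Count how many distinct descriptions (records) the rule produces."""
    return len({key(book) for book in books})


# Two issues of the same hypothetical title: one with plates, one without.
WITH_PLATES = Book("Example Title", "Example Press", 1990, 300, 16)
WITHOUT_PLATES = Book("Example Title", "Example Press", 1990, 300, 0)
ISSUES = [WITH_PLATES, WITHOUT_PLATES]

if __name__ == "__main__":
    print("old rule:", count_manifestations(ISSUES, key_old_rule), "records")
    print("new rule:", count_manifestations(ISSUES, key_new_rule), "records")
```

Under the old rule the two issues produce two different descriptions, so the second one gets a new record; under the new rule the descriptions are identical, so the second one is added as a copy. Nothing about the books changed, only the definition.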

This bothered me, but of course, I did it.

At about the same time, I became interested in rare book cataloging, and discovered that those people consider an edition or manifestation in a much more detailed way than regular catalogers ever did, figuring out states and issues of the text, looking for points, the condition of the book, plus a dust jacket(!) all of which general catalogers ignored, while they threw out the dust jackets. (When I learned about this, I looked at dust jackets a little differently, but still threw them away) It turns out that these seemingly tiny variations can lead to an exponential difference in price that can literally blow your mind. If you have a first edition, first issue, first state, first everything and in mint condition of Fitzgerald’s The Great Gatsby--and remember: don’t throw out the book jacket!--why, you might even be able to buy that house you’ve been wanting. See, e.g. the website Modern First Editions http://modernfirsteditions.net/ (the link is available from the transcript)

In any case, in rare book cataloging, it was indisputable that they really were much more concerned about the variants in the actual text than we were. We never compared text, but focused our attention on the areas outside the text itself, e.g. the prominent areas and preliminaries (in cataloging terms). It was clear why the situation was the way it was: regular catalogers had to deal with far greater numbers of materials than our colleagues in rare books, and we just didn’t have the time. But after talking with several scholars, it became obvious that they were not aware of these subtleties in the catalog, where a rare book record describing the different states for copies of Huckleberry Finn could be seen side by side with the records made by us.

Yet another factor that added to my discomfort was the circumstance that Princeton University Library was in the midst of a huge retrospective conversion effort.

Now, let me assure those of you who do not know what the remarkable term “retrospective conversion” means in the library catalog sense, that it has nothing whatsoever to do with the Mormon religion and how they can baptize people who are long dead. (Just a joke) What “retrospective conversion” means in libraries is the process of taking the information in the card catalogs and transferring it to the computerized environment. It is invariably a huge undertaking, and my own involvement in the project spurred me into doing research into the history of Princeton’s catalog. It was surprising that it went back to 1755, then its first printed catalog appeared in 1760, and from that time, all kinds of changes took place. It turned out that studying the history of the Princeton catalog was very similar to studying the history of library catalogs in general, both the good practices, and the bad.

One thing I noticed however, was that there had been several previous “retrospective conversions” at Princeton over its 250 or so years, and it was fascinating to see how everything fit together logically, or did not fit together very well at all. In any case, through my historical studies of Princeton’s catalogs and other catalogs as well, I saw even more ways of cataloging materials, some that differed radically from what I was doing, while some aspects of it, not everything, but some aspects, I thought were superior to what I was doing.

Some other factors: for several years I was the moderator of a huge email list for the ASIS&T special interest group for Information Architecture. While there were some librarians on that list, there were extremely few catalogers, if any at all, and in any case, their tiny voices would have been crushed under the weight of so many web developers and new-fangled “information architects”, each trying to come up with new ideas. Also, I was moderately involved in the Dublin Core initiative, which also had different ways of looking at information.

While these considerations did not bother me all that much, I stored them away and I looked at the records in the catalog we were making much more critically. It was clear that the idea of the edition/manifestation is not something that is inherent to the “thing you are holding in your hands”; rather, it is a matter of definitions and who you happen to be. I realized that there were obviously many ways of looking at “the same thing”, and this would become even clearer to me later.

I was fully ensconced in my Consternation Stage.

I can’t deny it: I was rather happy in my Consternation stage. It didn’t trouble me very much at all since, although I saw some of these problems, it was even easier to look away and I didn’t really think about them. At a subconscious level, I suspect that I assumed that the problems I saw were not really problems, but just “matters waiting further refinements,” or in other words, while there were certainly a few anomalies, nothing is perfect and such problems should be expected. Nevertheless, the overall purpose and structure remained sound. As I said in part 1: everything followed Cutter’s principles, and that is what we had been aiming at for over one hundred years. How could there possibly be a problem? Even though I had discovered that there were lots of other ways of viewing the information universe: Dublin Core, rare books, the information architects, angry faculty members from the 1820s forced to double as librarians, and so on, modern catalogers had the answers. We knew what was right because we had the weight of the history of cataloging and we could point to undeniable successes for a long, long time.

When my wife and I moved from the United States to Italy in 2001, I had to confront many changes with respect to the course of my former life. Several assumptions I had were shattered. As only a single curious example, one of the first of my assumptions to fall away came from how the U.S. press portrays Prime Minister Silvio Berlusconi. In the United States, I had the idea that he is a clown and a fool. Once here however, I discovered that he is far from a fool, but in fact, a deeply clever person in many ways, whether or not you happen to agree with his policies. As another example, the ways in which people relate to food and alcohol are totally different from the United States, and I found that I like the Italian ways a lot. It was in this state of mind that I began work at the Food and Agriculture Organization (FAO) of the United Nations, not long after I left the U.S.

At FAO, I did not work in the library, but in the documentation unit, which was responsible for the collection, cataloging, and indexing of all the documentation produced by FAO, anywhere in the world. What genuinely surprised me was that the cataloging standards that both my unit and the library followed were something I had never heard of before: the AGRIS standards supplemented by our internal FAO standards. The AGRIS standards had been around since the early 1970s and did not follow AACR2 or even ISBD, but nevertheless are used by libraries the world over.

Those rules are different. When cataloging and indexing at FAO, I discovered that I had to approach materials differently than I did as an AACR2/ISBD cataloger; I looked at them differently; I had to consider different aspects I had not dealt with before, and ignore other aspects that previously had been vital.

Finally, I need to mention something else that was quite new and that would become important only after FRBR came out; a tool I think everyone would agree changes everything, or almost: the website with that very silly name, “Google”. In spite of all my misgivings, I was forced to admit that this strange contraption could actually work!

I was quickly entering my Serious Questions stage.

At this point however, I would like to stop and save the rest for yet another part, part 3. I regret the inconvenience to anyone listening of going on so long, but that’s just the way I am. Probably, this second installment will prove to be a bit more controversial than the first one, since in the first I tried my best not to say anything new and concentrated on describing FRBR, while in this one, I am actually raising some questions that proved to be uncomfortable to me. From this point on however, the number of questions will grow and grow, while they also become progressively more uncomfortable--as I said, at least they are to me.

Still, I am going to jump ahead for a moment and emphasize that I think there are many excellent grounds for hope, so in my opinion, all is not lost. The key is to discover and define the problems facing catalogs today, avoiding theory as much as possible, and then to focus our efforts there. Otherwise, I fear that we are simply avoiding the questions: instead of facing the very serious issues before us, we are trying to force this new universe of information into forms where we feel more comfortable, whether this new universe fits or not, and whether anyone wants what we make or not. If we don’t deal with the serious problems facing us, there is a danger that we will end up spending a huge amount of time and energy creating tools that will be irrelevant to our users and to society at large.

The music I have chosen for the end of this segment is the allegro of Corelli’s Sonata for violin and basso continuo in E major Op. 5 No. 11. For a change, I do have some information on this recording: it was performed by the Locatelli Trio, with Elizabeth Wallfisch, violin, Richard Tunnicliffe, cello, and Paul Nicholson, harpsichord. If you would like to listen to the entire recording, the link is available from the transcript: http://www.youtube.com/watch?v=NwY0kp8PnE4

That’s it for now. Thank you for listening to “Cataloging Matters” with Jim Weinheimer, coming to you from Rome, Italy, the most beautiful, and the most romantic, city in the world.

RE: Interesting conversations about RDA and FRBR ...

Posting to RDA-L

Bernhard Eversberg wrote:
<snip>
For classical music, it is indispensable. Apart from this, I think, one must certainly retain it for prolific authors, difficult though they are to define.

LibraryThing, from the outset, had no such notion. Later, however, they realized that some kind of "grouping" was badly needed, and they came up with the "Canonical title"! They invented the notion of the "Series" as well, after realizing that this kind of grouping can be very useful, and their understanding of it is even wider than ours.
So, they uncannily re-invented the bibliographic wheel. Can we go ahead and abolish or even neglect it, make it square or something?
</snip>
But who are the ones doing the reinventing? As only one example in music, there is the Classical Archives at http://www.classicalarchives.com/ and their searching module is quite interesting. Play with their advanced search and see what you get--something that would be difficult to get out of traditional library catalogs that I think the public most probably likes. The Internet Movie Database is very useful too. When (and if) libraries put their records online in a more accessible manner, they will be the last ones, and it will be very difficult to know precisely who will be doing the reinventing.
<snip>
But we cannot base decisions solely on what average or even above-average patrons know or instinctively want or what we believe they want. As soon as they start thinking and consciously working with bibliographic data, the LT lesson teaches us, they start re-discovering and re-inventing.
</snip>
I also believe it is difficult to know, but FRBR/RDA make precisely those same assumptions. Still, when things are reconsidered independently, there may be a rediscovery, but it is rarely the same as the original knowledge--more often than not, there are several new and important twists provided for the new people.

But instead of pitting it as "us" vs. "them", (Us vs. Classical Archives & IMDB) another way of looking at it is that we are all in it together, and we are doing the same work over and over and over. This is the sort of thing that I think could be improved by working together and sharing this kind of information (OK, the Classical Archives is a paid service, but they aren't the only ones out there) so that everyone can benefit. If there were different choices as to the "clicks" selected in the Classical Archives, with some of the choices coming to our materials, that would help us and our patrons, too.
<snip>
But first of all, liberate works that are now incarcerated inside all sorts of "collections" or "multiparts" (whose "workness" is somewhat dubious). Here, the notion of the (physical) "item" is really not the best of concepts, in terms of usability of the catalog, to base a description and a record on.
</snip>
A terrifying possibility, but one that I agree is probably necessary, although libraries do not, and will not, have the resources to do it. I remember working on single volume conference publications that could take days because each one had dozens of individual papers, and instead of one item, the single volume became 40 or 60 or more records. I think the only way it could be done practically would be through some kind of crowdsourcing.

Also in this regard, with the recent, and very positive, DMCA changes and the possibilities to remix, the very notion of implementing FRBR-type structures for these materials is staggering. See: http://www.eff.org/press/archives/2010/07/26

RE: Interesting conversations about RDA and FRBR ...

Posting to RDA-L

Bernhard Eversberg wrote:
<snip>
It also goes well with the paradigm of all known retrieval systems, based as it is on the idea of the "result set", resulting from a query that uses attributes of various kinds, and all of them can be viewed as attributes of items. Certain combinations of attributes define subsets of items - some of these subsets can be called "manifestations", "expressions", or "works".
The identification of the work, however, remains the open question. It has to be done somewhere. Traditionally, it was pinned down by the "uniform title", and many of our records have this as a distinctive attribute. Add to it the language, date, form, medium, numeric designation, key, coordinates, etc. - and you single out the crucial subsets that FRBR views from the top down.
</snip>
I don't know if I agree that the identification of the work has to be done somewhere. Perhaps in some formats (I am thinking primarily of music), it is more important than in others, but even then I don't know if people are able to find what they want using the newer tools, e.g. iTunes and YouTube. But in library catalogs (both OPAC and cards), very few people I have met even understand what a uniform title is, much less are able to work with it. This is not to say that searching by work is unimportant, but people must first be made aware that it is even possible, while the very concept of controlled vocabulary (even personal name control) is being forgotten among the general population.

What I like from those comments, and especially the thread of Karen Coyle's, is that people there seem to be approaching the problem in a fresh, new way, instead of saying that first of all anything must fit into this WEMI pattern. At least from my understanding of the thread, what is especially forward-looking is the focus on the individual attributes without grouping them into a prearranged structure. Each community should be able to group them as they wish; which they will anyway!
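The idea of attributes defining subsets can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the records, the attribute names and the `group_by` helper are all made up for the example (the publishers are invented too), and it is not how any existing catalog is built. The point is simply that "works", "expressions" and "manifestations" fall out as different groupings of the same flat item attributes, and that another community could group by something else entirely.

```python
from collections import defaultdict

# Hypothetical item records: flat sets of attributes, with no built-in WEMI hierarchy.
ITEMS = [
    {"id": 1, "author": "Shakespeare, William", "uniform_title": "Hamlet",
     "language": "eng", "date": "1992", "publisher": "Publisher A"},
    {"id": 2, "author": "Shakespeare, William", "uniform_title": "Hamlet",
     "language": "eng", "date": "1992", "publisher": "Publisher A"},  # a second copy
    {"id": 3, "author": "Shakespeare, William", "uniform_title": "Hamlet",
     "language": "ita", "date": "1985", "publisher": "Publisher B"},
    {"id": 4, "author": "Shakespeare, William", "uniform_title": "Macbeth",
     "language": "eng", "date": "2001", "publisher": "Publisher C"},
]


def group_by(items, attributes):
    """Partition items into subsets sharing the same values for `attributes`."""
    groups = defaultdict(list)
    for item in items:
        key = tuple(item.get(attr) for attr in attributes)
        groups[key].append(item["id"])
    return dict(groups)


# The same items, grouped in different ways by choosing different attributes.
works = group_by(ITEMS, ["author", "uniform_title"])
expressions = group_by(ITEMS, ["author", "uniform_title", "language"])
manifestations = group_by(
    ITEMS, ["author", "uniform_title", "language", "date", "publisher"]
)

# A grouping that does not correspond to any WEMI level at all.
by_language = group_by(ITEMS, ["language"])
```

Nothing in the data says which grouping is the "right" one; that decision is made at query time by whoever is asking.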

Free the attributes!

RE: Thank you for an excellent day of RDA discussions!

Posting to alcts-eforum concerning the forum: Preparing your library for RDA

All,

I have been following this excellent discussion as closely as possible. It has obviously proved to be a very interesting topic for librarians! I have learned a lot from the responses as to how individual libraries plan to deal with the implementation of RDA, so I want to thank the moderators! I particularly like the practical questions people raise as to productivity and workflow. It goes without saying that these areas will be hit, but there is still no real indication what the real consequences will be. I have read vague promises that productivity will increase, but I have never seen any ideas as to exactly how or why it should: I don’t believe anyone could say that RDA is any simpler; at its best it will probably be just as easy, or as difficult, as AACR2, but I have found it pretty tough. Yet, we are supposed to simply have faith that it is a step into the future.

It is interesting that Tom Hickey did research on WorldCat and discovered that over 80% of the records reflect only a single manifestation, and therefore, FRBR really applies to less than 20% of what is out there.

(By the way, the original site seems to have disappeared, and I can only find citations to it, with broken links, e.g. p. 8 of Barbara Tillett’s “What is FRBR?” where we find:
Hickey, Thomas; & Vizine-Goetz, Diane. Implementing FRBR on large databases [online]. [Dublin, Ohio]: [OCLC], 2002 [cited 31 December 2002]. Available from: http://staff.oclc.org/~vizine/CNI/OCLCFRBR  files/frame.htm
Trying to find this reminded me of looking for Aristotle’s lost book on comedy, as we see in Umberto Eco’s “The Name of the Rose”!)

In any case, RDA and FRBR seem to be a strange adaptation of Pareto’s 80-20 rule, where we are building tools aimed primarily at the part that is less than 20%!

For those who are interested, I would like to point out the possibility of the Cooperative Cataloging Rules at http://sites.google.com/site/opencatalogingrules/

Tuesday, September 14, 2010

RE: New ed. of Chicago Manual of Style

Posting to RDA-L
Hal Cain wrote:
<snip>
Quoting "Adam L. Schiff":
> I was sitting at lunch today, reading our weekly alternative newspaper The Stranger, and lo and behold they have a book review of the new (16th) edition of The Chicago Manual of Style:
http://www.thestranger.com/seattle/hyphenate-this/Content?oid=4830760
>
> There are a number of changes to the style manual mentioned in the review that have implications for RDA instructions and examples.
>
> RDA A.10: The guidelines for English-language capitalization basically follow those of the Chicago Manual of Style.(1) Certain guidelines that differ have been modified to conform to the requirements of bibliographic records and long-standing cataloguing practice.

Why should cataloguers (as evidenced and prescribed by RDA) follow styles which differ from the leading English style guides in the various English-language countries?

We're constantly being told that the data we craft may be employed not only in bibliographic catalogues of the kind we're used to but elsewhere, in as many different contexts as people can imagine. While I doubt some of those claims, I think some of the differences of style we're used to are retained for no good reason.
</snip>
Of course, every style guide is different, and champions of each format think that *their* form is best and will fight to the very end to keep what they have.

One of the "new" needs the users have from our bibliographic records that they didn't need back in Cutter's day is to be able to get automatic citations. People want them badly, and the reference librarian part of me wants them badly too, because people always mess up citations. It would be great if there were only a single citation format (or, by the way, a single way of cataloging!) but there isn't and won't be for a long, long time, if ever. Fortunately, modern tools are powerful enough to make everybody more or less happy, and OCLC is doing a fine job of it.

For example, take a record from my catalog, http://www.galileo.aur.it/cgi-bin/koha/opac-detail.pl?bib=7144, and click in the right-hand column "Get a Citation". A box will open up with citations ready to copy & paste. This is made available through OCLC and some very innovative RSS feeds (not MARC format!), with OCLC doing some additional filtering behind the scenes; e.g. take a look at the forms of authors' names in each of the formats. The problem is, there are lots of duplicate records in OCLC, and multiple possibilities result, as we see here. I could easily limit this to the first five, but the first five do not necessarily seem to be the best choices, so I am letting it "all hang out".

Still, this is a great tool that has helped my patrons immensely and I applaud OCLC for this! It should also help catalogers understand how text in a database can be reworked for different display; e.g. we see the first names reformatted (capitalized, only initials, etc.). There are many ways of accomplishing these transformations and all of this allows for many possibilities.
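To make the point about reworking text for different displays a little more concrete, here is a minimal Python sketch. It assumes a name stored once in the inverted "Surname, Forenames" form; the function names and the style labels ("full", "initials", "direct", "upper") are my own inventions for illustration, and I am not claiming this is what OCLC does behind the scenes or that the labels match any particular citation style.

```python
def parse_heading(heading):
    """Split a 'Surname, Forename(s)' heading into its two parts."""
    surname, _, forenames = heading.partition(",")
    return surname.strip(), forenames.strip()


def initials(forenames):
    """Reduce each forename to a capitalized initial followed by a period."""
    return " ".join(f"{name[0].upper()}." for name in forenames.split())


def display_name(heading, style):
    """Render one stored heading in one of several hypothetical display styles."""
    surname, forenames = parse_heading(heading)
    if not forenames:  # a single-part name has nothing to rearrange
        return surname.upper() if style == "upper" else surname
    if style == "full":      # as stored: Surname, Forenames
        return f"{surname}, {forenames}"
    if style == "initials":  # forenames reduced to initials
        return f"{surname}, {initials(forenames)}"
    if style == "direct":    # natural word order
        return f"{forenames} {surname}"
    if style == "upper":     # surname in capitals
        return f"{surname.upper()}, {forenames}"
    raise ValueError(f"unknown style: {style}")


for style in ("full", "initials", "direct", "upper"):
    print(style, "->", display_name("Cutter, Charles Ammi", style))
```

The stored text never changes; only the display does, which is why one record can feed several citation formats at once.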

I would like to point out that although automatic citations are a "new" need in library terms, in absolute terms, they are pretty old. According to Wikipedia, the first citation management software (Reference Manager) came out in 1984, and EndNote came out in 1988, which was some time ago, so in many ways, we are catching up here, too.

Friday, September 10, 2010

RE: Elementary errors (Was: Rule of three--gone??)

Posting to Autocat

On Thu, 9 Sep 2010 16:02:14 +0000, Laluzerne, Anthony wrote:

>John,
>
>I think what Jim is trying to say (and Jim, correct me if I am wrong) is that the majority of searchers don't distinguish Google from our catalogs, or even care that they are different products. And, this will increase exponentially as the number of intermediaries (librarians, namely) to point out those differences are becoming fewer and fewer because more information is accessible without the aid of an intermediary (checking the web at 1 am in your pajamas).
Yes, this is exactly what I mean. Thank you. And as others have pointed out, the library catalog is being used less and less for people to find information. Already, it is of little use for the hard sciences and technology; it is being used less and less for the social sciences such as business; the humanities still use it, but it seems dubious to hitch up to them because they appear to be in trouble themselves (see the article in the Chronicle: "Can the Humanities Survive the 21st Century?" http://chronicle.com/article/Can-the-Humanities-Survive-the/124222/). Plus, I shall state (once again) that the only place where library use is up is with ILL, and the moment the Google Books become generally available, people will not need ILL nearly as much, and the last "success" will plummet.

Research has also shown, and my own experience completely concurs, that the average person rates their searching abilities as either "very good" or "expert". I suggest that catalogers sit down with someone (most preferably an undergraduate student) to *watch* and *listen* (i.e. don't tell them what to do and how to do it) to what they do when they are searching for information. How do they search? How do they react? What do they take for granted? What do they like and what do they dislike?

The way I look at it, catalogers have their theories, codified in the 1840s, defended "under fire" during the Royal Commission, and pretty much accepted after that. Whether those theories were right or wrong, or true or false, is a matter for historians to decide. The fact is that the basic underpinnings of what we do are still based on what was figured out back then. Deciding that those guys figured things out for "all time" is akin to saying that Aristotle said it all on physics or Ptolemy on astronomy, or Marx on labor history, or that all modern literature is just a rehashing of what Shakespeare wrote. We need to figure things out for ourselves, today, utilizing the power of the tools we have at our disposal, which our predecessors, although they were very good, could never have imagined. As an example of the limit of their imagination, I suggest Cutter's "The Buffalo Public Library in 1983" (written in 1883) and available at: http://en.wikisource.org/wiki/The_Buffalo_Public_Library_in_1983

In this regard, there is a very interesting paper "The Pragmatic Basis of Catalog Codes: Has the User Been Ignored?" by Jon R. Hufford http://wendolene.tosm.ttu.edu/bitstream/handle/2346/510/fulltext.pdf?sequence=1. I agree with this essay, but I will point out that the testimony to the Royal Commission was all about user needs and how the users felt their needs were not being met, but after the Royal Commission, it seemed to be all cut and dried.

In the first edition of Cutter's rules (1876), he discusses the users: http://tinyurl.com/3x4xpom (p. 526-527) and lists the questions they ask that the catalog can answer. The remainder flows logically from this, but I ask: isn't it time to genuinely reconsider all of this? He says, "There are two sets of probable inquiries, the first asking what books the library contains; the second relating to the character of the books." He then provides some sample questions. Are these the kinds of questions we receive today? Are these really the types of "probable inquiries"? Can a cataloger decide if this is correct, or a reference librarian? Also, notice in all of his questions from users the word "book" is repeated. Note also that he says, "what ... *the library contains*". Many assumptions underlie these important statements of Cutter.

Again, I state categorically that catalogers *cannot know* what the users want because they are not the ones in contact with the users. Catalogers have another, very important, job. The reference librarians and selectors, who actually deal with the public, are the ones who have an idea what they want. Some catalogers wear these different hats, but they get their idea of user needs not from their cataloging duties, but from their public duties.

It is vital for us at this time to be humble, to watch and listen and learn. I have to admit that I don't even know what my own needs are as a user, and I've probably thought about it as much as anybody out there. All I know is that they are changing, as new plugins come out (Zotero, Diigo, etc.), as new services come out (e.g. Mendeley, Connotea), plus all of the open archives that make outrageous amounts of material available (e.g. Mr. Hufford's article above, plus zillions of other useful things), though many times it is not so easy to find. All of these new tools and types of access have made tremendous changes to the way I use, access, and become aware of information and, not least important, to what I expect from it. Catalogers should be vital parts of this. How? I don't know, but it will be different from how it is today.

I can't believe I am the only one who has seen real changes in how I relate to information.

Thursday, September 9, 2010

RE: Elementary errors (Was: Rule of three--gone??)

Posting to Autocat

Kevin M. Randall wrote:
<snip>
James Weinheimer wrote:
> As I wrote before, I still believe that people want the traditional information that library catalogs provide, but that we should no longer believe that people primarily want to "find/identify/select/obtain: works/expressions/manifestations/items by their authors/titles/subjects".
>
> This is still useful access for many, but we really need to go beyond these traditional tasks, which is why I believe FRBR is already obsolete.
You are continually making this assertion, but I have yet to see any kind of example of what you are talking about. What exactly is it that people are trying to do, in regard to bibliographic metadata, that FRBR doesn't address? I would like to see even just one example.
</snip>
I reply: look at how you, yourself, search in Google. Are you doing the FRBR user tasks? No, because it is impossible to do them since there are no options for author, title, or subject, while retrieving the WEMI is completely out of the question, and yet nevertheless, the results can be quite good, and many people out there vastly prefer those results over traditional catalog access.

I recently addressed this up to a point on the RDA-L list; a copy of my post is on my blog at: http://tinyurl.com/2wxhuch. The main problem is that the FRBR structure of WEMI is based on printed items that do not change (taken almost completely from Cutter in 1876, which is actually quite amazing!), and it really does not fit virtual materials, which can be incredibly complex. Just a bit of working with the public as a reference librarian shows that people have incredible difficulties with the catalog. And yet, we must admit that people have always had trouble with a library's catalog. It is nothing new and has been with us for a long time.

The public searches the library catalog like they search Google, and necessarily they get very poor results. That's one reason why many prefer Google. The very idea of searching by author is strange to these people, while the concept of authority control is almost impossible for them to grasp. The rare times when they do listen long enough to really understand it, they tend to like it a lot, but the difficulties of explaining the idea of authority control to people who are not interested (i.e. the vast majority of people) should not be underestimated.

Several non-librarians, non-catalogers, and even some catalogers that I have conversed with, have concluded that since this is undeniable, what we make is obsolete and should be dispensed with.

I want to state categorically that I *do not agree* with that at all, but we must recognize that there are these deep problems. The problems are *not* with the public, as if people who just do not know or understand how to search the catalog were somebody else's problem; the problem lies elsewhere, and in many cases, with us. Exactly where? I'm not sure, but certainly the traditional catalog interface and searching methods are from another time. There is a lot of research going on right now, trying to find out how people are relating to and using information, and what they expect. One problem (or promise?) is that we are in a time of transition, where new tools come out almost daily, so anything discovered today may turn out to have changed in only a couple of years or so.

At one time, catalogers built their tools and the public was expected to adapt their behavior and "needs" to them or do without. Now, this situation has turned around: we are the ones who must adapt to them, or ....

There are many other issues that I won't go into now, because I am saving them for my podcasts.

Wednesday, September 8, 2010

RE: Why we need authority control

Posting to Autocat

On Tue, 7 Sep 2010 17:22:09 -0400, Brenndorfer, Thomas wrote:
>There is one aspect in these RDA examples from JSC that I find particularly interesting with regard to how our record structures have been rooted in the past:
>
>http://www.rda-jsc.org/docs/6JSC_RDA_Complete_Examples_(Bibliographic)_revised.pdf
>
>To indicate the primary relationship between the work and the manifestation, there is a field for the controlled access point for the work, derived from RDA 17.8, with the footnote "No equivalent encoding in MARC21." The first example is "17.8. Work manifested. Munro, Alice, 1931- . Lives of girls and women"
>
>The absence of this kind of field I noted in many OPACs with hyperlinks to related works-- there is no hyperlink usually from the 1XX + 245 fields, which is the combination we use to otherwise indicate the heading for the work entity as organized in a card catalog, or single entry listing.
>
>At the Library of Congress catalog, http://catalog.loc.gov/cgi-bin/Pwebrecon.cgi?DB=local&PAGE=First, the "Author/Created Sorted by Title Browse" is the closest match to the card catalog display. In some respects, because variant titles from see references in authorities are also interfiled, this is also the closest to the IMDB display of showing variant titles of films together with the "authorized" titles in a result list. But the Library of Congress display for author/title is a little awkward to use, and does not adequately reflect the nature of the relationships between works, expressions and manifestations.
>
>The key point here, especially with the example from the JSC examples for RDA, is that our records are rooted in a card catalog era that have not allowed for the future flexibility of online displays.
I see it more that our present records and rules are designed for a printed environment (not just cards), and this requires *different types of pre-coordinated browsing*, name headings, subject headings, corporate body subdivisions, and the rules still show this, e.g. the number of cross-references for multiple surnames, rules for subject subdivision order, etc. Most of the time this browsing was alphabetical (at least in the U.S.), but I think it's time to admit that the new keyword environment is not conducive to the traditional sorts of browsing.

You mentioned that:
<snip>
The absence of this kind of field I noted in many OPACs with hyperlinks to related works-- there is no hyperlink usually from the 1XX + 245 fields, which is the combination we use to otherwise indicate the heading for the work entity as organized in a card catalog, or single entry listing.
</snip>

It is a very simple programming problem to have the catalog search with both the 1xx and the 240 when a 240 is present. I have always wondered why they made the uniform title as they do in the bibliographic file with a 1xx/240 combination, but in the authority file, they put it all together (as it should be):
100 10 |a Shakespeare, William, |d 1564-1616. |t Hamlet. |l Italian. |k Selections
I don't know. This question has come up before, but I don't remember how or if it was answered. I imagine that when they created MARC, they must have had problems with online displays back then and had no choice except to split the string, but I may be wrong.
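Just to show how little programming is involved, here is a rough Python sketch of putting the 1xx and 240 back together into a single heading that could be hyperlinked or searched. The record is a simplified dictionary that I made up for the example (it is not real MARC parsing, and `work_heading` is a hypothetical helper, not part of any ILS). Falling back to the 1xx plus the 245 $a when there is no 240 is my own assumption about what a catalog might do.

```python
# Name main-entry tags: personal, corporate, meeting.
NAME_TAGS = ("100", "110", "111")


def work_heading(record):
    """Build one 'name. title' string from the 1xx and the 240 (or the 245 $a)."""
    name_field = next((record[tag] for tag in NAME_TAGS if tag in record), None)
    if name_field is None:
        return None  # title main entry: no name to combine with
    parts = list(name_field.values())
    if "240" in record:
        # Uniform title present: use all of its subfields in order.
        parts.extend(record["240"].values())
    elif "a" in record.get("245", {}):
        # Assumed fallback: title proper only.
        parts.append(record["245"]["a"])
    return " ".join(parts)


# Simplified, hypothetical records: tag -> {subfield code: value}.
hamlet = {
    "100": {"a": "Shakespeare, William,", "d": "1564-1616."},
    "240": {"a": "Hamlet.", "l": "Italian.", "k": "Selections"},
    "245": {"a": "Amleto"},
}
munro = {
    "100": {"a": "Munro, Alice,", "d": "1931-"},
    "245": {"a": "Lives of girls and women"},
}

print(work_heading(hamlet))
print(work_heading(munro))
```

The first record yields the same string as the authority heading above, minus the subfield codes, so the catalog could run the same search from either file.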

While I agree that:
<snip>
The key point here, especially with the example from the JSC examples for RDA, is that our records are rooted in a card catalog era that have not allowed for the future flexibility of online displays.
</snip>

I don't think we need RDA/FRBR structures to achieve these novel displays, since the modern formats such as XML are so much more flexible. In any case, any online displays we consider must come from an analysis of what appeals to users and what they need from our records, probably involving a certain amount of trial and error, and not from theory. The FRBR displays I have seen that show the work, expressions, manifestations, items are certainly not the answer to our patrons' prayers.