Monday, May 31, 2010

RE: AACR2 and RDA sample records from LC

Posting to Autocat

"Mark K. Ehlert" wrote:

<snip>
LC is now offering some comparative sample bib records constructed under AACR2 and RDA rules: http://www.loc.gov/catdir/cpso/RDAtest/rdaexamples.html
Linked to from LC's RDA Test documentation site: http://www.loc.gov/catdir/cpso/RDAtest/rdatest.html
</snip>

These are interesting examples that show how little RDA is changing anything of substance. If the general public thought catalogers were weird before because we were obsessed over our semicolons and dashes, this new focus on abbreviations and capitalization will not make us look much better. Of course, with the exception of eliminating the rule of three (which has both definitely positive and definitely negative points), none of this has to do with access, but simply with how records display.

This reminds me of a debate I saw many years ago on the MacNeil-Lehrer news show, where one man had "translated" Shakespeare into modern English, while another man didn't like the translation. It turned out that the second man had no problem in general with translating Shakespeare, but he insisted that the point of a translation was to make the work more understandable to people today. They focused on Juliet's famous line, "Romeo, Romeo! Wherefore art thou Romeo?"

This had been translated as "Romeo, Romeo! Wherefore are you Romeo?" and the second fellow said this was wrong. People don't really have much trouble understanding "are you" for "art thou". The problem in this line lies with another word: wherefore. Most people assume it's just another way of saying "where" but it really means "why". If you think Juliet is just wondering where Romeo is at that moment, the later soliloquy makes little sense, but when you realize she is saying "why are you Romeo?" things become clearer:
"Deny thy father and refuse thy name;
Or, if thou wilt not, be but sworn my love,
And I'll no longer be a Capulet."

And she continues on to,
"What's in a name? That which we call a rose
By any other word would smell as sweet."

So, a better translation would have been something like "Romeo, Romeo! Why are you named Romeo?" or even "Romeo, Romeo! Why art thou named Romeo?"
The second fellow convinced me!

LC's RDA examples illustrate a similar dilemma, in my opinion. Because of changes in technology from the last 20 years or so, people have lots of problems understanding and using both our catalogs and catalog records, both of which are based on an earlier time. But the changes in RDA will not lead to a better understanding among the public of how our catalogs function or how to use the catalog records to do things you can't do in Google. We need to concentrate on the real problems facing our patrons and not just deal with the display.

RE: FRBR examples

Posting to Open-bibliography

Ben O'Steen wrote:

<snip>
I think there is a mismatch between what exists and the FRBR compartmentalisation. What we really have are items with multidimensional* relationships to other items that only make sense in context. Forgive my RDF-biased metaphor but I see items as being any node in a graph and that graph containing much of the information on the people, publishers, sources of data, book holdings, and so on.
</snip>

I agree that "item" is readily understandable (even with these crazy mashed-up web sites!) but "work" is not a thing but just one way to arrange those items. "Expression" is a sub-arrangement of "work," and "manifestation" is actually the public display of the "items". This is the traditional method of the catalog. Still, it does not follow that this traditional arrangement also describes how information itself is structured, and therefore we, and everyone, must rework what we are all doing.

I think that one of the main problems with FRBR is that different communities view it in different ways. Scenario 1) sees FRBR as a model of traditional information structure, as organized and described in library catalogs. Scenario 2) takes FRBR as a broader model in more universal and philosophical terms: how information is structured and used, no matter what kind of information it is or where it resides.

Therefore, traditional library catalogers, once they begin to deal with some differences of terminology, will readily understand the FRBR datamodel because with perhaps a few exceptions, it is what they have always done.

But the other question remains: does FRBR offer a model in the universal context of scenario 2? I have never seen this demonstrated anywhere, although I have seen it expressed both openly and accepted tacitly. This basic tension between FRBR as a description of the traditional way library catalogs have been arranged and how they work vs. the broad philosophical statement that this same traditional model also describes the essence of how "information is structured" demands far more proof than I have seen. Add to this the fact that it conflicts with practical reality, and it all becomes especially important as libraries attempt to share their work more widely with other communities.

It seems as if it would be much more useful and profitable both to the library community and to those we wish to share with, if we were to concentrate on scenario 1: to make sure that others understand our traditional datamodel, what they will find there and how they can best extract it, so that when we share our information, others can take it in the best ways for their needs. But this repurposing of our work to fit into this increasingly dubious FRBR model actually seems counterproductive.

For example, for a cataloger, creating a uniform title for a book he or she is cataloging is relatively easy now, but to turn this into a separate "entity" for the work and/or the expression is a completely different matter demanding more labor, while the usefulness to our patrons remains completely unclear, and it certainly doesn't make our catalogs any more "shareable" than right now.

Friday, May 28, 2010

RE: Not sure what this means

Posting to NGC4LIB

Laval Hunsucker wrote:

<snip>
There have always been dumb users and clever users [ and a fluent continuum in between ], and there always will be. I have known, and know, quite a few in the latter category ; some of whom are students, some faculty members, some neither.
</snip>

I don't want to label people who do not know how to use the traditional tools "dumb"; rather, the task is for us to build something that will be relevant to the needs of today's users. Expecting people to search and use materials the same way they did back in the 1960s is unrealistic.

When automobiles arrived, they did not shoot all of the horses, as Shawne pointed out, but fewer and fewer people needed to know about horses, their peculiarities, their noises and smells. Eventually, almost nobody needed to know any of that and horses ceased to be a factor in their lives. Instead, people had to learn the peculiarities of automobiles, and their noises and smells. It would be unfair today to expect people to know how to ride a horse or hitch up a team to a wagon, and to label as "dumb" those who don't know these things.

The fact is, times have changed. For the moment, people still have no choice except to use a library catalog if they want to access the books within local collections, so the majority use them more or less incompetently. (I was almost totally incompetent before I became a librarian, but I didn't realize it.) But when the Google Books-Publishers agreement is approved eventually (which could be any moment!), and 80% or more of what they need is online at the click of a button, while the rest will be very easy to ignore, then *everything* can and probably must change.

How will the library world react to the eventual approval? Tough to say, but judging by the glacially slow movement of FRBR and RDA (and I won't criticize them here), the library world will not be able to adapt quickly enough and may be overwhelmed. Perhaps people will still demand paper copies, but I think most will be satisfied with what they can get at the click of a button. To me, it's obvious: librarians must turn their focus away from paper since their patrons have.

I don't mean to be alarmist, but to me, the fact there will be major changes is absurdly easy to predict and it seems best to prepare instead of suddenly "being surprised". There's a lot we can do, if we just change our focus.

FW: [open-bibliography] FRBR examples

Posting to Open-bibliography

Karen Coyle wrote:

<snip>
That is exactly what the OL page does. The page is derived from the data in the database. Whether or not it shows summaries or first lines is really up to the implementation, but it is essentially what you say: a query that brings the relevant records together. At that point, users can move in various directions. If you click on the subject http://openlibrary.org/subjects/ship_captains_in_fiction from that page you go to a subject page that has a publishing timeline, and a lot of choices for next steps for the user. I think this should be the point of the library catalog (or any catalog for that matter), which is to help people discover things they might not have originally thought about.
</snip>

This is good since this display is generated and demands no real extra work to create from scratch. Of course, the formal FRBR work attributes will necessarily get thrown out. I still question the utility of the summary for Moby Dick. If you take that out, and the Lucene/Zebra indexing is applied to extract the headings, what are we left with? A traditional list of catalog records. It's nice; it's certainly an improvement over the traditional displays; but does this sort of display fit people's needs today?

The only way to find out would be to conduct research on different populations, but I would hazard the guess that mostly, it does not. For example, I can't imagine how my students would find this type of display to be of any practical help to them. Yet, when I ask myself what really would be of use, I find it far more difficult to find an answer.

I think I am too much into the traditional methods to figure this out and I would have to defer to outsiders. Nevertheless, the FRBR user tasks seem to me more irrelevant than ever.

RE: Not sure what this means

Posting to NGC4LIB

--- On Thu, 5/27/10, Ed Jones wrote:

> I received my weekly e-mail bulletin from the Times Literary Supplement this morning. Browsing its contents, I came on a summary concerning a published collection of old photographs of London. Clicking on it, I was taken to the corresponding review in the TLS. The review mentioned by title a 17-volume 1902 survey of life in London and, curious, I copied and pasted the title into my browser's Google Books search box to see what I would find. It returned the complete set, readable online since they fall within the public domain. This all took less than a minute from opening the e-mail to perusing a volume in the set. I've grown so used to such sequences, that I had to stop to consider how remarkable it was.

This is exactly what my Extend Search does, except it goes far beyond Google Books, which I think is extremely important. I have just managed to incorporate this as a Firefox plugin. You can see the announcement on my blog http://tinyurl.com/2ujvbxr. Of course, it can be improved in a thousand ways. In a practical sense however, often students come to me saying they can't find anything on their topic. When I show them the Extend Search, it almost always happens that within 10 minutes they are complaining they have too much! That's a problem, but a completely different problem.

I agree that what you did was remarkable and demonstrates the power of what can be done today. I know that our public wants to be able to do these same things, especially when ebook readers finally begin to take off.

I think it is important to rethink the purpose of what we are doing and, in library terms, reconsider the definition of "the collection" to include what is *really available* to people, to include sites such as what you showed with the 1902 set found in Google Books. But there is a *lot* more and no one can convince me that our users don't want this. Yet, it is horribly complicated to discover what is really out there, and this is where the Google-type tools falter.

Once this redefinition is accepted, then the purpose of the catalog, i.e. the method of finding resources in "the redefined collection" begins to assume another dimension. This is when management comes in: to determine *in practical terms* what this catalog-tool should allow people to do, figuring out how best to build it and who should do it, and how this can all function with the tremendous bibliographic legacy we have now.

These are some of the directions that we need to take today and I maintain that if we would build a tool that achieves this, however imperfectly, the huge amount of information that is now available would make a difference in society. It certainly has in my own life.

Thursday, May 27, 2010

RE: FRBR examples

Posting to Open-Bibliography

Karen Coyle wrote:
<snip>
And in response to Jim, the Work level is one that is very useful for user services. Most users come to the library asking for a Work, not a manifestation. They want to read Moby Dick or Alice in Wonderland or the latest book in the Twilight series, and it is the story they are interested in, and that is the Work.
</snip>

Sorry Karen, I have never in my life known anyone who wanted the "work" of Moby Dick or Alice in Wonderland. They may be interested in a choice of specific language versions (expressions), but while you may be interested in all English translations of the Bible, and need some specific books and verses, I haven't met anyone who also needs Bulgarian and Chinese and Korean and Cherokee. What they (and I) would do was browse through the cards (filed, as I later discovered, under the uniform titles for these famous works), and "ooh!" and "aahh!" at seeing War and Peace in German, French, Chinese, Japanese, ... But I didn't need any of them.

Again, in the card catalog, the "work" was just one way for the cards to be arranged so that they could be browsed. They had to be in there somehow, and that was one way. But taking this old arrangement and declaring it to be an "entity" with different "attributes" is a fallacy in my opinion. It's not--it's an arrangement of cards (manifestations). Taking the concepts of "work" and "expression" beyond arrangement distorts them into something they are not, resulting in all kinds of strange problems, as people are discovering now and as your work has demonstrated.

To continue, simply because catalogers arranged the cards in the drawers in certain ways does not mean that people ever "wanted" these arrangements. This "work" arrangement was probably most useful for the catalogers themselves, who needed an inventory of the collection.

This is why I continue to maintain that we need to build something for our users *today*. What I am thinking about are some really new ideas, e.g. Google's displays, e.g. http://tinyurl.com/2uu77gs, which has all kinds of cool things, although I don't know how useful several of them are. I think people may find the word cloud useful, but I'm not sure if the Google API includes it. I have a feeling that Facebook may come up with something if they haven't already (for better or worse).

I don't know how people search and what they want; I am learning about how I myself search, and it is almost totally different from what I did 25 years ago. I don't think anybody knows right now since it is changing constantly. But there is a lot of research being done on "information seeking behavior" and "scholarly communication".

RE: FRBR examples

Posting to Open-Bibliography

Christopher Gutteridge wrote:

<snip>
Thanks, Karen. This pretty much rubs in that there is no "quick start" for this model, and the person creating records is expected to understand the details of the data model. This may work well with fully trained library staff, but in the author-deposits world I work in, that's impractical.

Part of the goal of open bibliographic data is that it should be reasonably easy for people to create and consume (and discover, but that's a separate discussion).

This model seems to describe the 'truth' better than some, but does it facilitate or hinder creation and consumption?
</snip>

I will point out again, that even if we make the effort to somehow create a coherent FRBR datamodel and manage to get most of our records into it, I still have not seen any advantage it will have over what we have now. I still do not see how it will add a single access point that we do not have presently, how it will help metadata creators (professional or amateur), how it will increase productivity, or anything at all. Karen's examples show the enormous difficulties we face to even retain the capabilities we currently have, even in the extremely simple examples she uses. I fear matters could become far, far more complex when applied in a comprehensive way to the real world.

Putting records into a more accessible XML format or RDF format would be a good first step, but it seems increasingly clear that FRBR is strictly an idealistic intellectual model that falls apart in the practice of the real world, while practical implementation appears more and more unwise.

Perhaps one way of beginning to approach the problem would be to clear the mind of all FRBR, RDA, AACR2, MARC and everything else, and ask: if there were no other formats or records of any sort, what would you do? This would be to approach the problem strictly as a user, which is I think the most important task, and focus on the needs of users. Only then can you begin to build tools that attempt to fulfill these needs, and then decide how best to incorporate the past records with the future records, and so on.

RE: Garrison Keillor on self-publishing

Mike Tribby wrote:
In a column appearing today in the Chicago Tribune, Garrison Keillor writes about changes in book publishing, specifically the advent of self-publishing.
A highlight:
"And that is the future of publishing: 18 million authors in America, each with an average of 14 readers, eight of whom are blood relatives. Average annual earnings: $1.75."
In the Tribune's continuing efforts at online helpfulness, they provide a link to information on World War I apparently because Keillor mentions Kaiser Wilhelm II and "his coterie of plumed barons" in passing. I love those helpful links almost as much as Amazon's zany suggestions of stuff I might like!
The full article is here:
http://www.chicagotribune.com/news/columnists/ct-oped-0526-keillor-20100526,0,1253174.column

Thanks for sharing this. This idea that Keillor mentions of what I will term “every man his own publisher” actually seems to be a continuation of what we have had for a long time. Take a look at LC’s catalog for titles “every man his own …” http://tinyurl.com/36325jz and on Google for “every man his own …” http://tinyurl.com/36z6rtv

It’s interesting that when I search more specifically “every man his own publisher” in Google, I only get one result, where it is used as some sort of paraphrase. Maybe it’s a brand-new topic for research and publication!

While I sympathize with Garrison Keillor, the days he talks about are over. When printing first appeared, lots of people prophesied the end of civilization, and certainly the printing press changed many things forever in some fundamental ways, yet humanity survived. And lots of people think it even got better ;-) I think the possibility to realize very clearly that you are not alone in your thinking and opinions, but that there are many out there who not only share your ideas but are actually interested in what you have to say, signifies a fundamental advance in our society. You can choose to be something other than a passive recipient, silently consuming what others decide to give you like cattle feeding at a trough: with these new tools on the web, you can participate, and maybe even… make a difference!

If the for-profit world of publishing is going to survive, I think it needs to become much better than it is now and what it has been for several decades. Just as libraries are having to face real competition from various new sectors and entities, so does the world of for-profit publishing. I think people are slowly realizing that while they (and I) want to read what Garrison Keillor writes and there are lots of people out there who want him to have the freedom to write more, they don’t care much about his publisher. Of course, the publishers want us to assume that writers need publishers if they are to survive, but that assumption is changing today. This is the change where, I believe, lots of opportunities await.

As matters work themselves out, I think what will come will be good, and far more exciting, for everybody!

Wednesday, May 26, 2010

RE: question - cataloging theory -- it's all about browsability

Posting to Autocat

>Julie Hankinson writes:
>
>>>Case in point: Books on the environment.
>
>>>Books on environmental theory are in GC. Books on environmental technology are in TD. Books on global warming are in QC. Books on solar, wind, etc. are in TK. Books on sustainable architecture are in TH. Do you know how confusing this is to patrons?
>Of course this is the nature of an interdisciplinary topic, to be scattered all over the Library.
>
>Sounds like the time for a pathfinder through the possible LC classes with a few pertinent examples. Unless you want to inspect and reclass everything as received. Sometimes you can do that in a small library.

In my experience, it is far too much to expect people to understand LC classification, and besides, as I tell the students in my information literacy classes, although browsing is a very pleasant activity and it is certainly much more productive to spend your time browsing books in a library than it is to go off to a party and do drugs or beer bongs, nevertheless, browsing is one of the worst ways to find information. Ever since the Library at Alexandria, when catalogers saw a single papyrus with a work of Aristotle on politics, a work of Euclid on geometry along with excerpts from the Iliad, there was *no choice* except to create what we now call metadata records and put them into a separate catalog. Those lucky enough to be able to use an ancient library really liked the contents, but they liked the catalogs just as much. Therefore, ever since those ancient times, when someone is interested in, e.g. the writings of Euclid, they really can find them but they *must use* that catalog. The only other option would be to have massive numbers of duplicates shelved all over the place.

Later, in the 19th century when journals really started coming out, individual libraries tried to catalog each article of each journal issue but quickly found themselves overwhelmed. Poole took up the slack and created his index, which was quickly followed by many others. In this way, ever since the mid-19th century, if someone wants to know what articles are in all of the massive numbers of journals in a library collection, they have had to look into these indexes, which may or may not have cumulations. A real pain to be sure, but much better than not having any indexes at all and being forced to browse each issue of each journal on the shelves. People felt lucky to have these tools.

So to me, browsing has always been oversold. When you rely on it, you are guaranteed to miss a huge amount.

This is not to say that there is no purpose to browsing, since it does serve a very important purpose, but I'm not exactly sure in my own mind what that purpose is. To me, it is primarily psychological and is most useful in the beginning phases of research, when you are still figuring out your topic, what you are interested in, and so on. It is the most effective way I know of to just find something I want to read. Somehow, I think the quiet environment of a library; being surrounded by all kinds of mental creativity stored in the books; the tactile experience of touching them, all have a calming influence and can help you think.

In the new information environment, it may become extremely important to try to recreate this kind of environment online, but I wouldn't have a clue how to do it. How do you create a virtual space on the web that is conducive to calmness and contemplation?

Somehow, Muzak doesn't seem like part of the solution!

Tuesday, May 25, 2010

Google and Trust

Posting to NGC4LIB

Here is an interesting article that is relevant to the discussion about the importance of library ethics, from the NY Times, titled: "Sure, It's Big. But Is That Bad?"
http://www.nytimes.com/2010/05/23/technology/23goog.html?ref=companies

A few excerpts:

"Google says its mission is to give users the information they're looking for even if that means giving its own content priority and de-emphasizing sites it believes offer poor experiences. "Telling a search engine that it cannot innovate and show results in a way that benefits users would undermine the very goals of our competition laws," says Matthew Bye, a Google lawyer."

"Google also says that linking prominently to its own services over those of rivals is good for consumers and not malicious. Its famous search algorithm, conceived by one of the founders, Larry Page, at Stanford in the 1990s, uses a series of complex and opaque formulas to rank the sites within a set of search results. The algorithm is responsible for what Google calls the "organic" listings that appear on a search results page."

"Mr. Erickson said that Google executives thought they were doing the right thing for consumers and the Internet, and that simply by educating lawmakers on Google's good intentions, they would ultimately win the day."

These attitudes throw into very sharp contrast the differences between library ethics and business ethics. The idea of giving "users the information they're looking for" i.e. in essence, doing the users' thinking for them, vs. the idea of showing people what is the range of information available to them within a specified collection becomes somewhat clearer here. Certainly, the library injunction that we cannot profit at our patrons' expense is shown to be especially pertinent. While Google may be very sincere that it is in the public's interest to give "its own content priority and de-emphasiz[e] sites it believes offer poor experiences," it does ring rather hollow since they are emphasizing their own products, which they naturally consider to be "better".

The idea that it is important to "educate" lawmakers into believing that Google's "intentions" are "good" demands a great deal of blind trust on the part of the public. But it has been shown that many people really do trust Google; for some examples, see: http://cornellsun.com/node/23886 and http://www.sciencedaily.com/releases/2007/08/070821153921.htm, or even http://www.bizjournals.com/triangle/stories/2010/05/17/daily10.html. These results mirror my own experience working with the public where I have discovered how deeply they trust Google.

Of course, you can never know someone else's intentions, and especially so if that other "someone" is a corporate entity, which must, by law, be focused on maximizing profit for its stockholders. That can be achieved in different ways, and of course, one way is to convince the public that "We do it all for you," "Don't be evil" and so on.

But from this NY Times article it appears as if at least some members of the public are concerned about trusting Google too much. Perhaps it's time for libraries to make a statement that we respect people's privacy, cannot make money on someone's information needs, and we cannot emphasize our own personal political and moral agendas? The general public does not know this.

A few thoughts.

Monday, May 24, 2010

RE: Is FRBR too complicated?

Posting to Open-bibliography

Dan Matei wrote:

<snip>
> I've been trying to understand / implement FRBR and have constantly felt the domain model here is poorly thought out especially in how it relates to non-book media (e.g.

I feel that poor FRBR model needs a defendant :-)
</snip>

I agree. The more I work with FRBR, try to understand it, and consider its possible uses with organizations outside of libraries, the more useless it seems. I cannot imagine that anyone outside of libraries would ever consider implementing it because it bears little relation to their work and needs, and they certainly wouldn't do it on the "authority" of librarians.

I view FRBR as a theoretical framework that attempts to continue a conception of the structure of information and how people access it, as it was seen among Anglo-centric librarians of the 19th century. Traditionally, the "work" and "expression" were merely points of collation in the catalog, be it printed, card, or later OPAC, where people who browsed could find the records collected together in certain ways. The actual utility of e.g. the uniform title was questioned by many (how many readers need to know all of the translations in all of the languages and all of the variants of Homer's Iliad?) and even establishing uniform titles for series was abandoned by LC a few years ago.

The expression, also merely a point of collation in the catalog, was at least seen as of more use (lots of people want to know of all the different English translations of the Iliad). Still, it is important to keep in mind that what they were doing was *arranging cards* (i.e. manifestation records) in a card catalog (or in a printed catalog). They would create a card for an item (i.e. manifestation) and the cataloger had to put it into the catalog somewhere. They were not creating "work records" or "expression records". They made records only for real things and arranged them; plus they had some additional files to help them arrange the cards. Different catalogs could be arranged in all kinds of ways. Certainly not all of them were arranged by WEMI principles.
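
To make the distinction concrete, here is a minimal sketch in Python of "work" and "expression" as points of collation rather than as entities. Everything in it is an assumption made for illustration: the `Card` fields, the `collate` function and the sample cards are hypothetical and are not taken from any catalog, format or cataloging rule.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Card:
    """One manifestation record, i.e. one card describing a real item (hypothetical fields)."""
    uniform_title: str
    language: str
    title_proper: str
    year: int


def collate(cards):
    """Arrange cards by uniform title, then by language, then by year.

    No "work record" or "expression record" is created here: both
    levels are only grouping keys computed from the cards themselves.
    """
    drawer = defaultdict(lambda: defaultdict(list))
    ordered = sorted(cards, key=lambda c: (c.uniform_title, c.language, c.year))
    for card in ordered:
        drawer[card.uniform_title][card.language].append(card)
    return drawer


# Illustrative sample cards only.
cards = [
    Card("Iliad", "English", "The Iliad of Homer", 1951),
    Card("Iliad", "German", "Ilias", 1793),
    Card("Iliad", "English", "The Iliad", 1990),
]
drawer = collate(cards)

for language, group in drawer["Iliad"].items():
    print(language, [card.title_proper for card in group])
```

The only stored objects are the cards. A different catalog could arrange the same cards by a different key (author, date, shelf mark) without any record having to change, which is the sense in which the arrangement is not itself a thing with attributes.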

So, one of my theoretical problems is that FRBR has taken what had been mere organizational points for cards in a traditional catalog, and transmuted them into strange things called "entities" that have all kinds of attributes. The result is a lot of extra work as we have to scurry around to create records for these things called "works" and "expressions" and find all of these attributes that are of only passing curiosity--if that.

What is even more curious is that the underlying purpose of all of this is to continue a 19th-century view of information organization and retrieval.

It just doesn't make much sense to me. I think there are far more interesting and productive ways to go.

RE: Experiment with Extend Search

Posting to Autocat, NGC4LIB, MetadataLibrarians

I would like to share an experiment I have implemented using an Extend Search function attached to my local catalog, based on an open-source Koha catalog.

I won't give a history of the development of my Extend Search here (although I should write it down somewhere), but in essence, it has always been my opinion that since a well-made catalog record will help you find related materials within a specific catalog (using the heading structure), then with the WWW, the information in this same record should help you find related materials "elsewhere" on the web. With this in mind, a few years ago I implemented what I call the Extend Search within my own catalog as demonstrated in, e.g. http://www.galileo.aur.it/cgi-bin/koha/opac-detail.pl?bib=25893 (a record at random) where you can select or highlight any text and an Extend Search box pops up (javascript), where you can click on it and search in specific databases in specific ways (very simple php). When my users discover this option, understand what it does, and I describe its powers and limitations, they find it very useful and easy to do. In its current incarnation, I have managed to make the Extend Search into a completely separate module from my catalog, except for adding a bit of javascript.
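
For readers who want a feel for the mechanism, here is a rough sketch of the URL-building step only, written in Python rather than the javascript and php the real tool uses. It is not the actual Extend Search code: the target names, the URL templates (on example.org) and the `extend_search` function are all hypothetical placeholders.

```python
from urllib.parse import quote_plus

# Hypothetical targets: these URL templates are placeholders, not the
# databases or query formats the real Extend Search uses.
TARGETS = {
    "books": "https://books.example.org/search?q={q}",
    "articles": "https://articles.example.org/find?query={q}",
}


def extend_search(selected_text, as_phrase=False):
    """Turn text selected in a catalog record into one search URL per target.

    Whitespace is collapsed first, since a selection can span line breaks.
    With as_phrase=True the text is wrapped in double quotes, the usual
    way to request an exact-phrase search.
    """
    text = " ".join(selected_text.split())
    if not text:
        return {}
    if as_phrase:
        text = f'"{text}"'
    query = quote_plus(text)
    return {name: template.format(q=query) for name, template in TARGETS.items()}


links = extend_search("Ship captains\n in fiction")
for name, url in links.items():
    print(name, url)
```

Keeping the targets in a single table is what would let each library swap in its own databases without touching the rest of the code.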

On another topic, I have written several times concerning my fears for libraries when the Google-Publisher agreement is eventually implemented. My fear is that there will be so much to work with on the Google-type sites (the millions of materials on Google Books and Scholar, plus I am sure there will be many new initiatives) that it will be very logical for our users to start with the Google tools. They will find so much there that they will go to the library resources less and less, although they will *actually* need librarians more and more.

I *may* have found the beginnings of a solution.

I have felt for a few years now, once I began to understand the possibilities of the newer browsers, that a solution could involve a browser plugin, but my experiments in this regard have been notably unsuccessful. I have worked quite a bit with the Hyperwords plugin for example, to recreate my Extend Search, but everything I have done has wound up a failure. A couple of days ago however, I got another idea: rather than recreate my Extend Search in Hyperwords, I could merge them. I believe this is a successful step forward, and in an amazingly simple and quick way, it brings the entire web under the control of my Extend Search. Whether it is ultimately successful or not is a matter of debate, but I find it very powerful. Not too many of my patrons have seen it yet, but so far everyone has liked using it and they especially like its simplicity, while one has even called it frightening(!). Of course, each library or institution, or whatever, could write its own.

It is completely open and anybody can use it. I have written a short procedure to implement this at: http://www.galileo.aur.it/opac-tmpl/npl/en/libweb/AURHyperwords.html. It only works on Firefox. Apparently if you are running Linux, you may have to save the file I created to your own machine to load it.

To understand the Extend Search feature in my catalog, please see the short discussion at: http://issuu.com/j.weinheimer/docs/extendingthesearch?mode=embed&viewMode=presentation&layout=http%3A%2F%2Fskin.issuu.com%2Fv%2Fcolor%2Flayout.xml&backgroundColor=61A900&showFlipBtn=true

How could this be used? Each library or each discipline (whatever) could make its own Hyperwords links going to the tools and databases they select, e.g. a special set of links for Classics, or Renaissance Architecture, for Agriculture, Computer Science, or anything at all. So, a student who is researching Thucydides could load the Classics one from, e.g. the Warburg Institute, (if not a special "Thucydides" one!), then when working on another paper on Computer Science, load another special set from MIT, and so on and on. All these could be incorporated with local ones as well. Therefore, there would also be *many* possibilities for cooperation among librarians.
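To make the idea of loadable, shareable link sets a little more concrete, here is a minimal sketch in Python. It is not how Hyperwords stores its menus; the set names and URL patterns are hypothetical, and the only point is that a discipline's set and a local set are plain data that can be combined.

```python
# Hypothetical link sets; names and URL patterns are illustrative assumptions.
LOCAL_LINKS = {"Local catalog": "https://catalog.example.edu/search?q={query}"}
CLASSICS_LINKS = {"Classics database": "https://classics.example.org/find?text={query}"}
COMPUTER_SCIENCE_LINKS = {"CS preprints": "https://cs.example.org/search?query={query}"}


def combine_link_sets(*link_sets):
    """Merge several link sets; earlier sets win when two sets share a label."""
    combined = {}
    for link_set in link_sets:
        for label, pattern in link_set.items():
            combined.setdefault(label, pattern)
    return combined


# A student working on Thucydides loads the local set plus the Classics set.
thucydides_session = combine_link_sets(LOCAL_LINKS, CLASSICS_LINKS)
print(list(thucydides_session))
```

Letting the earlier set win on a clash is just one possible design choice; it keeps local decisions in control when a borrowed set uses the same label.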

As a practical example of how it could be used, a recent report discovered that students turn to course readings very heavily when they are doing their research. ("How College Students Seek Information in the Digital Age" http://projectinfolit.org/pdfs/PIL_Fall2009_Year1Report_12_2009.pdf) When you implement this tool, the course readings can be used actively and easily, once the citations are online. For example, here are the readings for a course at MIT "American Soap Operas" and using this tool, the readings can be searched using my Extend Search very easily. http://ocw.mit.edu/OcwWeb/Comparative-Media-Studies/CMS-603Spring-2008/ReadingsandViewings/index.htm

I searched for Brown, Mary Ellen. "Motley Moments: Soap Operas, Carnival, Gossip and the Power of the Utterance." in the section Articles and Open Archives, and using Google Scholar, found 31 citations of it, plus lots of other things.

A new, better plugin similar to Hyperwords but made specially for libraries could be the equivalent of the reference librarian, or the expert searcher, that our patrons could call on by simply selecting text or right-clicking or whatever. They could choose from links to help screens, IM or Skype calls to reference librarians, or other selected databases; the possibilities are almost endless. And best of all, with a plugin, we would be *wherever the users would be*, that is: the moment they open their browsers. Even if they would go directly to Google Books, for example, they would only need to select text (or however) and the library would be there to help.

By the way, there are some amazing things you can do with the Hyperwords plugin right now. Look at: http://www.worldcat.org/title/aflam-al-misriyah-ka-masadir-lil-malumat-dirasah-fi-al-dabt-wa-al-hifz-wa-al-itahah/oclc/466857179, select the Arabic text, and in the Hyperwords menu, select Translate and then Arabic to English. I don't know how good the translation is since I don't know Arabic, but it's almost magic!

I would appreciate comments on this. Is it useful? Naturally, I can imagine many, many areas of improvement. Thanks in advance!

Please share this with others who may be interested.

Friday, May 21, 2010

RE: RDA a "done deal" at ALA

Posting to Autocat

On Thu, 20 May 2010 12:43:50 -0700, Ed Jones wrote:

>I guess this is why I tend to view the introduction of RDA with more equanimity than many. Been there, done that.

I began my library career later, after the introduction of AACR2, but I have worked a lot with all kinds of earlier library rules as well as with rules outside libraries.

While I have no doubt that catalogers can deal with learning and implementing RDA if they are directed to do so, it still hasn't been made clear exactly how this is supposed to help anybody. AACR2 and its massive changes really did open up the possibility of record sharing on an international basis--not just the ISBD areas, but taking the entire record became possible as the headings themselves became shared. That was a substantial improvement.

While I understand RDA as a theoretical construct, I personally have never seen any practical advantages of the RDA/FRBR theoretical datamodel over what we do today. I still do not see how it will lead to greater international sharing of records among libraries, and especially among metadata groups outside of libraries, who will *never* accept RDA or the datamodel, which really is a bizarre creation. Many of the RDA rules still follow card usage: a single main entry, emphasis on browsing headings (still atavistically termed "access points" when each word in the entire record now is equally a point of access), plus the refusal to deal in a constructive way with the power of full-text searching, the economic crisis and so on and so on. All of this leads me to believe that RDA will only split the cataloging world at a very inopportune moment. It will not improve access or increase productivity, nor will it be implemented by metadata creators outside the normal library community. Plus of course, you have to pay to access the rules.

Again, this is not to blame anyone. The world of information really is changing that quickly, and RDA is mired in the practices of the past. The problems with catalogs do *not* lie with the rules and guidelines for input; changing them will change nothing at all--I think we all realize that. The real change with the WWW is that people have discovered new ways to search for information and what they expect to do with the information once they have found it. The traditional library catalog no longer fills that need. Instead of dealing with this fundamental problem, FRBR and RDA still maintain that people want to: find/identify/select/obtain --> works/expressions/manifestations/items by their authors/titles/subjects. Yet, this is so outdated that I don't need to discuss it. Is that how anybody on this list really searches when they are doing their own research?

Certainly the people I work with do not. Most people under 35 or 40 know only Google's single text box and consequently, they don't even know that they can search by author/title/subject! While I do search for authors/titles/subjects sometimes, it certainly is the exception rather than the rule, and certainly it is *extremely rare* when I actually care about works/expressions/manifestations/items. Since I am an historian and real book-lover, at least I do want this kind of information once in a very great while, but for 99.99% of our public, I doubt if they want it at all.

I am not saying that what catalogers make is useless--quite the contrary. Our records and methods represent a control over information description and access that is not replicated anywhere else. That needs to be recognized up front. But since the way the public interacts with information has changed so fundamentally, the way this same public interacts and uses the records we make must change as well. Otherwise, they will go elsewhere.

I am all for change, but not for the sake of change in itself *if* it demands a lot of labor and funding. There must be substantial advantages shown: that productivity will increase substantially; we will gain a lot of new and very useful records for copy; the need for training will go down substantially; even if access were improved significantly, that may be a reason. Whatever the advantages will be, they must be laid out in very practical ways, not just discussions of horrifying graphs that fill your soul with despair, along with vague promises of a bright future someday. RDA has not shown these things while AACR2 was clear in the improvements it promised.

That's why I again suggest for people to consider the Cooperative Cataloging Rules at http://sites.google.com/site/opencatalogingrules/

Tuesday, May 18, 2010

Update to the Library News Feeds

For those of you who use the tool I created to keep up on library news at http://www.galileo.aur.it/opac-tmpl/npl/en/pages/news/librarynews.html, I thought I would add a section on Library Jobs. It all works pretty well except for the one called "Academic Careers" where you must select "Library administration, librarian" for it to work.

I thought this might be handy for everybody.

RE: newbie question - cataloging theory

Posting to Autocat

Madeline wrote:

<snip>
Is there such a thing as "strict" cataloging vs. "loose" cataloging? I don't know if I'm using the correct wording here. A librarian I know of will make no changes to the call number in a record, even though it makes much more sense to have these books in a different location. What would you call her? What would you call a cataloger who is more flexible in their cataloging?
Thank you.
</snip>

I guess I look at this somewhat differently from the others. To me, "strict" cataloging means strict adherence to standardized rules and guidelines, while "loose" would grant much more reliance on "cataloger's judgment". I've written about cataloger's judgment before, in short: I don't really understand what "cataloger's judgment" means. If one person's judgment is just as good as another's judgment in a given situation, there is a huge element of randomness allowed, which would mean that whenever cataloger's judgment is allowed, the result is a "looser" form of cataloging. The proposed RDA rules appear to leave much more elbow room to cataloger's judgment than earlier rules.

I see "strict" or "standardized" cataloging as essential in a network that wants to share records, and I think of it as similar to the world of mass production. For example, if I have a company that makes bolts and nuts, do I want to make bolts and nuts that can be used only on one make of automobile, or make bolts and nuts that can be used on 500 makes of automobile? The principle of "interchangeable parts" means that I should go for the latter, that is, if I want to make bolts and nuts that are genuinely useful to the most people, with the result that I can make more money. Additionally, if I am making a new automobile, I should probably use nuts and bolts that everybody else is using, even though I believe that a unique nut and bolt that I design may be "better". If I opt for the unique one, then it will be a huge pain for everybody else because getting a new, unique nut and bolt will be much more difficult and they will tend not to buy my automobile.

In a similar way, the more I make records that are unique to my own collection--and therefore "better"--and I downgrade the importance of the "interchangeable parts" aspect, the less useful my record will be for others outside of my collection, and consequently, the less useful records outside my collection will be when I want to bring them in. As a result, if I have my own practices on, e.g. series tracings, and I go against the standard decision, I will have to revise every copy record with a series statement that comes into my catalog. That results in a lot of work.

I may agree with those standardized decisions or disagree, but if I disagree, it means that I must take the responsibility to revise every record that comes into my catalog, which quite possibly will result in so much work that it will end in exhaustion and eventual capitulation, where you just give up and say that we'll take whatever comes in. This is especially true today in times of perpetual budget crisis. (I guess it's obvious that I come down much more on the "strict" side in the sense of adhering to shared standards)

Still, it is important to note that this scenario is changing completely in the new information environment we are entering, and we get strange aggregators such as Google Books, which mash all kinds of different metadata records together. Take a look at a Google Books metadata record. It's really bizarre. When people go first to Google Books (which will happen once the settlement with the publishers is eventually approved), what will be the purpose and function of the "local catalog" in that kind of environment? I personally do not know how standardization will work in this strange, new world, but nevertheless, it remains my feeling that standardization will be as necessary as ever, if not more so.

Monday, May 17, 2010

Future Libraries

From a reply to a private message

I think physical collections of printed-type materials will slowly go away. It is truly sad, but probably inevitable in the long-term. The publishers can't stop the digital revolution forever, and they will be forced to provide the materials people want, and if they refuse, more and more authors will just do it themselves, because it really is easy today. Yet, when this happens I fear it is going to be devastating for physical libraries. Once the vast majority of materials that people use (not everything) is online, fewer and fewer people will need to come to physical libraries, except as a place to meet people, do their email, take a class or two, and managers are going to have to reconsider the purpose of maintaining the very expensive physical collection.

I personally think it is a bit too soon to get rid of 75% of a collection, but I could see that if there is a digital copy on Google or the Internet Archive or one of the umpteen-zillion digital sites on the web, you could very probably get rid of your own copy. So, I could see that physical libraries will become archives, maintained and protected for a time of emergency, if something happened and all of Google Books were sold to the Chinese or all of these other sites got wiped out somehow. All of this could happen very, very quickly once they come out with a good, cheap ebook reader and when the Google Books agreement is ratified, especially if it takes place during the economic crisis, when tough decisions will have to be made.

Still, as I keep pointing out, it doesn't necessarily mean that librarians will fade away with libraries. Those who see their jobs as primarily maintaining a physical collection are true dinosaurs, but if you think of librarians as those who select and organize information and help people find it, I think there will be lots of opportunities. I remember I worked with a graduate student in architecture at Cornell, and he asked me what I thought the library of the future would look like.

I said that I thought it would be like the ancient Roman baths, where people would go to work out, get a massage, bathe, meet people, eat a little bit, shop a little bit, but there was always a library attached. If you look at Wikipedia on the Baths of Trajan, they have a picture of the library: http://en.wikipedia.org/wiki/Baths_of_Trajan

I could see something similar in the future: the librarian would have offices with computers hooked up to the internet in some mall or metroplex or wherever lots of people go, and people could go to them for quality help they can rely upon, and they can be assured that these are not people just trying to pick their pocket. Maybe we could do some massaging on the side!

RE: Consolidated ISBD

Posting to RDA-L

Bernhard Eversberg wrote:

<snip>
ISBD, however, is not a code of cataloging rules.

The introduction says:
"The International Standard Bibliographic Description (ISBD) is intended to serve as a principal standard to promote universal bibliographic control, that is, to make universally and promptly available, in a form that is internationally acceptable, basic bibliographic data for all published resources in all countries. The main goal of the ISBD is, and has been since the beginning, to provide consistency when sharing bibliographic information."
</snip>

I'm trying to understand how ISBD is *not* a code of cataloging rules, or as I prefer to think of it: standards for input of bibliographic information.

<snip>
The printed records were thus conceived, at that time, as a communication format for the transmission of structured information. No verbal or numeric tagging could be employed in printed bibliographies, as goes without saying, but the punctuation had to do double duty for that purpose.
</snip>

While I can understand this idea that the primary goal was to communicate structured information, and the only way of doing that in a print world was through punctuation, I think that this obscures the fact that the focus was still on the information to be communicated, and the punctuation was less important. My evidence is a comparison of the ISBD with the user guide for Dublin Core (http://dublincore.org/documents/usageguide/elements.shtml). So for example, the DC guidelines for "Title" are (in their entirety):

-------------------
4.1. Title
Label: Title
Element Description: The name given to the resource. Typically, a Title will be a name by which the resource is formally known.
Guidelines for creation of content:
If in doubt about what constitutes the title, repeat the Title element and include the variants in second and subsequent Title iterations. If the item is in HTML, view the source document and make sure that the title identified in the title header (if any) is also included as a Title.
Examples:
Title="A Pilot's Guide to Aircraft Insurance"
Title="The Sound of Music"
Title="Green on Greens"
Title="AOPA's Tips on Buying Used Aircraft"
-------------------

Contrast this to the in-depth ISBD guidelines for title (available through http://sites.google.com/site/opencatalogingrules/isbd-areas) and anybody can see immediately DC gives practically no guidance when compared with ISBD. This is not to criticise, but merely to point out that one has standards for input (cataloging rules) and the other does not.
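To show how little the DC guideline actually asks of the person doing the input, here is a minimal sketch in Python that follows its one instruction: when in doubt, repeat the Title element. The wrapper element name and the second title variant are made up for the example; only the Dublin Core element namespace is taken from the standard.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dc_titles(titles):
    """Build a wrapper element holding one dc:title element per title variant."""
    record = ET.Element("record")  # hypothetical wrapper element, not part of DC
    for title in titles:
        ET.SubElement(record, f"{{{DC_NS}}}title").text = title
    return record


# The first title comes from the DC examples; the second is an invented variant.
record = dc_titles(["Green on Greens", "Green on greens (invented variant)"])
print(ET.tostring(record, encoding="unicode"))
```

Notice that the code accepts any string at all: nothing says where the title is taken from, how it is transcribed, or how it is capitalized, and those are exactly the matters the ISBD rules spend their time on.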

In many ways, I see the current discussions as very similar to those in the later 19th century when libraries wanted to exchange catalog cards. The problem was: each library had their own size card and cabinets, and a uniform size card was absolutely necessary if they were going to be exchanged. It was also one of those debates that you either won completely or lost completely, since if your size card was not accepted, you had to recatalog everything, which was a terrifying prospect even then. So, you were either a big winner or big loser but in the end, they discovered that all they had agreed upon was an empty card with a hole in the same place!

While that was important, it paled in comparison with the need for and the complexity of sharing the information on the cards in some kind of coherent way--which was the entire purpose. It was *not* about just sharing cards, but sharing the information on those cards. Figuring out a standardized empty card was only the first, and relatively easiest step.
(As an aside, at Princeton University the cards were too big and Ernest Richardson, then the librarian, tried having his catalogers cut down the cards and then write somewhere else on the card what was cut off. That one didn't succeed!)

Certainly we should not have to enter punctuation by hand today. Not that it's so difficult to learn to do (pretty much the easiest part of ISBD) but it's a little bit like plowing a field with an ox and plough. There are better and more productive tools available.

And concerning displays, we must emphasize the possibility of multiple displays. I think having a standardized one, primarily for use by librarians, is a good idea, but other displays are much more useful for our public, e.g. citations they can copy and paste, exportable records for personal reference databases, and others. I have also felt that the displays of multiple search results could be made far more useful for both users and catalogers than those I have seen.
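Both points can be sketched together in Python: if the elements are stored separately, the machine can supply the punctuation and produce more than one display from the same record. This is a simplified sketch, assuming only a title area and a publication area and a typed ". -- " between areas; the field names and the sample record are hypothetical, and the citation display follows no formal citation style.

```python
from dataclasses import dataclass


@dataclass
class Record:
    title: str
    other_title: str = ""
    responsibility: str = ""
    place: str = ""
    publisher: str = ""
    date: str = ""


def isbd_display(r):
    """Supply simplified ISBD prescribed punctuation from the stored elements."""
    area1 = r.title
    if r.other_title:
        area1 += " : " + r.other_title
    if r.responsibility:
        area1 += " / " + r.responsibility
    area4 = r.place
    if r.publisher:
        area4 += (" : " if area4 else "") + r.publisher
    if r.date:
        area4 += (", " if area4 else "") + r.date
    # Areas are separated by point, space, dash, space (typed here as two hyphens).
    return ". -- ".join(area for area in (area1, area4) if area)


def citation_display(r):
    """A plain copy-and-paste citation built from the same elements (no formal style)."""
    parts = [r.responsibility, r.title, r.publisher, r.date]
    return ". ".join(p for p in parts if p) + "."


sample = Record("Sample title", "a subtitle", "Jane Doe", "Rome", "Example Press", "2010")
print(isbd_display(sample))
print(citation_display(sample))
```

The cataloger keys in the six elements once; the punctuation and the choice of display are left to the system.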

Sunday, May 16, 2010

RE: Digital Information Seekers: How Academic Libraries Can Support the Use of Digital Resources; Briefing Paper

Posting to NGC4LIB

Laval Hunsucker wrote:

<snip>
You make it sound terribly rational and empirical. I don't believe that for a moment. I'd even say that characterizations such as "sometimes mistakes occur" and analogies with something like "to build a space shuttle" amount to ludicrously pretentious hyperbole.
</snip>

"Ludicrously pretentious hyperbole"? Let us examine. According to Wikipedia (the easiest place to get this type of information), the design of the space shuttle began in the early 1970s and the shuttle consists of approximately 2.5 million parts. http://en.wikipedia.org/wiki/Space_Shuttle

Princeton University's library (the one I know the best) began in the 1760s and currently has 6.9 million books and 6 million microforms (http://www.princeton.edu/main/library/), and there's a lot more than that when you count the manuscripts, papers, maps, and many other items. This is only one library in the world, and far from either the biggest or the oldest. In the US, libraries have been building a "cooperative machine" since the beginning of the union catalog at LC in 1901. Today, here are the latest statistics from Worldcat http://www.oclc.org/worldcat/statistics/default.htm. Right now, there are 183,028,917 records and 1,566,411,480 holdings. Of course, Worldcat represents only a fraction of what all libraries in the world control.

I don't know how many people have worked on the space shuttle project, but I would guess it is in the tens of thousands. According to the LC report "Study of the North American MARC Records Marketplace" http://www.loc.gov/bibliographic-future/news/MARC_Record_Marketplace_2009-10.pdf there are currently 8000 *original* catalogers in North America. This does not include a much higher number of support staff. How many people this would translate into since 1901, I don't know but doubtless, it would be quite a large number.

There is also extensive documentation that someone must learn before they can add records to our "machine" (i.e. the catalog) to create author, title, subject access points, plus the whole realm of description, plus the classification. Finally, they must know how to encode it in the computerized format, which is quite complex in itself.

In spite of this, so long as you have certain information, the machine works rather well. Not only can it identify an item you want, it can also help you find related materials that you never knew existed. That the system works has been proven by the existence of masses of materials that have been deeply researched before the age of computer technology and people had no choice except to use our tools. Scholars and legislators, students and the general public used our tools for centuries. It was a lot more work back then, but the tools worked nevertheless.

So, is comparing what we do with the space shuttle "ludicrously pretentious hyperbole"? I think I have demonstrated that the point is at least debatable.

Maybe we consider that somehow, the task of the space shuttle is "more important" than our work. This is mistaken as well since our work is as necessary as any other: I hope bakers wash their hands and don't throw garbage in our food; that mechanics and carpenters care about their work. People need information to be decent citizens and to advance their knowledge and careers. We need to keep all of this very much in our minds.

One of the main problems we are facing however, is that many librarians prefer to deny the power and utility of what we have made, and this is why I feel I must speak out when I encounter such statements. These statements do not help either our profession or the public we are supposed to serve. Organizing information (yes, I keep maintaining that) for later retrieval using all kinds of tools and methods is what gives purpose to libraries and librarians. The public wants and needs the skills of librarians but librarians and the tools need to change in fundamental ways to respond to these new needs of the public.

I don't need to go into that yet again.

Thursday, May 13, 2010

RE: Digital Information Seekers: How Academic Libraries Can Support the Use of Digital Resources; Briefing Paper

Posting to NGC4LIB

Laval Hunsucker wrote:

<snip>
To what extent is the following up of instructions, rules and guidelines to be construed as evidence of, or a manifestation of, ethical judgement or expertise ? To what extent is the *establishment* of such rules and guidelines to be construed as evidence of, or a manifestation of, ethical judgement or expertise ?
</snip>

This is not the place to explain the principles of subject analysis. Suffice it to say that it can be done, but no one should expect perfection. The task is complicated and sometimes mistakes occur, which should cause no surprise. Building the space shuttle is also complicated and it blows up from time to time, but we don't conclude that it is impossible to build a space shuttle. You can build one; it just blows up once in a while.

In my own career, my thinking about subject analysis went from "How could there possibly be any problem at all with figuring out the subject of a book?" to "How can any human being ever hope to do something like this?" until finally, I began to learn how to do it. Just as with learning any other kind of skill, there is a method, there are many standards and manuals to learn how to use, and you acquire an attitude in using that method. The underlying idea, however, is following the rule of "consistency," which means following a whole realm of precedents. I may have a resource on the tilling of land in the Ottoman Empire in the 16th century (made up). I have to parse this topic in my mind, and find out how similar resources have been handled; discover the subjects they have been given, and follow those usages wherever possible. If I have something genuinely new, I am in effect, creating a precedent for others to follow (just like any other new conceptual name or title), and it is up to later catalogers to follow my precedent.

The "attitude" I mentioned is not the same as "lack of bias" but rather a commitment to understanding the problems and solving them as a professional, keeping to a minimum personal concerns, whether moral, political, religious, or pecuniary. This is very difficult to expect from an untrained person--a member of the general public or the authors themselves--as one can witness in the user-assigned tags in tools such as LibraryThing and Amazon. Naturally, this is beyond the ability of Google, *although* it has other powers.

I wrote a short discussion of this several years ago when I was still at another institution. It's been archived in the Internet Archive but the images do not come through (which happens a lot with the Internet Archive)
http://web.archive.org/web/20000819000847/http://www.princeton.edu/~jamesw/mdata/MetadataCreation.html

I don't think the images are all that critical and people can figure out what I mean. The only real change I would make to it is where I mention "standardized terminology" and would try to explain how standardized conceptual URIs can change the situation to an extent.

Most catalogers don't think of the ethical aspects of their work--but that doesn't mean that these aspects do not exist in their work or that they aren't important. Simply approaching the task as professionals succeeds in the goal.

Wednesday, May 12, 2010

RE: Digital Information Seekers: How Academic Libraries Can Support the Use of Digital Resources; Briefing Paper

Posting to NGC4LIB

Alexander Johannesen wrote:

<snip>
First of all, "usefully" is subjective and has been the topic of many discussions here; useful to whom? I think it's fair to say that traditionally, most library organisation of materials have been mostly useful to librarians.
</snip>

There have been some great comments to this thread.

I also have problems with "usefully" and would prefer something more like "reliable". This means standardization; a type of guarantee that *if* you have specific information, you will get a "reliable" result, i.e. you will retrieve the set of "all" records related to the specific concept, within specific parameters and subject to known limitations.

Concerning the consequences of the ethics of information, let me give a concrete example of a free, online, and highly enjoyable video I watched last night: "David Morrison: Surviving 2012 and Other Cosmic Disasters"
http://fora.tv/2010/04/24/David_Morrison_Surviving_2012_and_Other_Cosmic_Disasters, a public lecture given by a NASA scientist, who has been fielding questions about the supposed end of the world in 2012! I had no idea it is such a big thing for so many people, forcing some to even consider suicide! Although no one mentioned it, it seems very similar to Orson Welles's 1938 "War of the Worlds" radio program that caused mass panic and similar reactions. (By the way, you can listen to that great program and many others at http://www.mercurytheatre.info/)

From the scientist's lecture, it turns out that much of the world is just as much in thrall to superstition as it has ever been. These people are obviously getting information (presumably through Google) and it is very one-sided.

When you only use something such as Google and the super-secret algorithm which can be and is manipulated, when you search for "Iraq War" you retrieve many strange things, but in the library catalog, where you can search for "concepts" (as I have explained in previous messages), you can get a useful display: (e.g. LC catalog) http://catalog.loc.gov/cgi-bin/Pwebrecon.cgi?DB=local&Search_Arg=iraq%20war&Search_Code=SUBJ_&CNT=100&hist=1. Is this system perfect? Obviously not. But, if the catalogers are all doing their jobs, also in an ethical manner, the heading "Iraq War, 2003---Atrocities" (which certainly has deep political and moral overtones) will be used for everything about this concept, although many authors--and the *catalogers themselves*--may not believe that atrocities took place, and they didn't use instead "Just war doctrine" or "Retribution" or "Self-defense" even though the catalogers may feel in their heart of hearts that this is true. Nevertheless, in spite of that, they follow the standards and methods as the professionals they are. The cataloger may believe fervently that abortion is murder, but they do not catalog something as "Murder" which is about "Abortion" even though the author of the item may refer to it consistently as "baby murder".

It is this sort of guarantee that I say is missing on the web, and where we can make a very important contribution. As I have gone to great pains to explain in other posts, organizing "text" is not the same as organizing "concepts". Library methods for finding these concepts are passé, but not the task, and not the final product. I still say people want the groupings retrievable under the heading "Iraq War, 2003- " (with far more flexibility) and these groupings should be much easier to find. Naturally, Google-type searching will always be there and will be increasingly useful as well. It is our task to merge these methods somehow to create something genuinely new and more powerful than anything before.

Unfortunately, I don't think such matters are much appreciated by the catalogers, who are busy just doing their jobs correctly and according to standards, which is the very life-blood of the entire system, yet sometimes they lose sight of the underlying purpose of their work. It's too bad that many do not understand the consequences of what they do. Many rarely or never get a chance to work much with the public who uses the tools catalogers make; when you see the troubles that people suffer with information retrieval, it can really open your eyes in many positive ways.

<snip>
I'll push this even further; librarians have a much more important role to play than *any* journalist, simply by being paid by the people to help the people. Where are the super-librarians who enters the political sphere of freedom, rights and democracy? And where are the library organisations in educating the people about all of this?
</snip>

This is very well said and I completely agree.

By the way, the scientist's suggestion was quite interesting: the public should use Wikipedia! I agree (this is a 180-degree turn around from what I thought five or six years ago!), and we should be there as well. How? I'm not sure, but discussing these sorts of matters would take some very productive directions.

Tuesday, May 11, 2010

RE: Digital Information Seekers: How Academic Libraries Can Support the Use of Digital Resources; Briefing Paper

Posting to NGC4LIB

George Wrenn wrote:

<snip>
I've been researching one area of online content, streaming/downloadable academic lectures, believing we should add more records for this kind of material to our catalogs (I have an article coming out in C&CQ in October which looks at how much cataloging of online academic lectures has been done at top U.S. research universities; I've also put together a bibliography: http://humboldt-dspace.calstate.edu/xmlui/handle/2148/564).
</snip>

Thanks so much for the bibliography. It will be immensely valuable to me. The academic lectures have been an interest of my own, and I have struggled over how--not to get "control" over them, since that is impossible for any single institution--but at least to lead people in the right directions. A review of my own attempts: I started with the "tried and not so true" method of making catalog records for each "site as a whole" and of course, this attempt failed since such generalized records are not findable. (How do you catalog the whole of the University Channel in a single record in a way that is really useful for our public? I never figured that out.)

So, I made a separate tool, based on very simple technology, at http://www.galileo.aur.it/opac-tmpl/npl/en/pages/news/latestvideos.php which attempted to at least make it as simple as possible to discover what are the newest lectures available. I later added keyword searching as well (at least where I could). Finally, I figured out how to add it to my Extend Search tool http://www.galileo.aur.it/opac-tmpl/npl/en/extsearch/extsearchall.php?q=economic+crisis, (select Videos --> Educational Videos) where I have attempted to get them all to work together as simply as possible.

<snip>
Yes, we should provide access to more online content through our catalogs, while also making it easy to find those resources in our catalogs!
</snip>

This I agree with, except for one word: "to find those resources *in* our catalogs". Is it necessary that the metadata be in our catalogs, or just findable through our catalogs? Obviously, my attempt with the Extend Search I have created presents a different philosophy and of course, it can be improved in 10,000 ways (one of the best ways would be to have full authority control over each site).

I think a great example of a new direction is the Google Public Data Explorer http://www.google.com/publicdata/home, where you are interacting in all kinds of interesting ways with statistical data through the Google site, but Google is not hosting any of this data, just working with the tables that are maintained by these different agencies.

To give an example in our case, here is this lecture, http://tinyurl.com/3xnddk8, which is a continuation of the argument a Harvard professor makes in a very popular book and related PBS Series (this video series is also available online for free), and on this site, there is an open discussion about his talk as well, some of it very negative. Someone interested in this topic, or in his book, or in his television series (note: author, title, subject), would be *very interested* in all of this. Now, to let our users know about these materials, do the records *have to be* in my catalog?

I don't think so, and you don't even need deep cooperation among the different sites, as Google Public Data has where Google has been given permission to reach into the statistical data held on other sites. But, *if* trained catalogers could go into the metadata attached to the above public lecture, and add our authority controls, the resulting record would not have to be copied tens of thousands of times into separate local catalogs where things have to be updated continually--you could simply point to it in a variety of ways.

This is why I think the very purpose of the local catalog has to be reconsidered in a fully networked environment--how it works and what information it contains. Someone who is interested in the book by this professor needs to be aware of the video that they can watch online immediately, but they also need to know about this public lecture given after the fact, where he may have to field tough questions afterwards, plus the criticism available on the site itself. The catalog record must become a part of this "web".

How can all this work in some kind of *reliable and ethical* manner to be genuinely useful to our public, because yes--I and many others have very strong personal feelings about the economic crisis, but librarians are ethically compelled to not let our personal feelings get in the way of providing access. This vitally important role should not be brushed aside today, since it is something that is becoming more and more important on the web today. Librarians are the experts in this field. I don't know the best ways to do it, but I believe thinking in these directions provides a general guideline for creating a catalog that would become very important for the public.

I'll try to add some of the sites you mention! Thanks a lot!

Monday, May 10, 2010

Digital Information Seekers: How Academic Libraries Can Support the Use of Digital Resources; Briefing Paper

Posting to Autocat and NGC4LIB

Apologies for cross-posting but this report probably is of interest to both lists.

A paper has recently come out at: http://www.jisc.ac.uk/publications/reports/2010/digitalinformationseekers.aspx titled: "Digital Information Seekers: How Academic Libraries Can Support the Use of Digital Resources" and its conclusions seem pertinent to catalogers. There is a two-page summary as well. Among the conclusions, are:

  • Library systems must do better at providing seamless access to resources such as full-text e-journals, online foreign-language materials, e-books, a variety of electronic publishers' platforms and virtual reference desk services
  • Library catalogues need to include more direct links to resources and more online content
  • High-quality metadata is becoming more important for discovery of appropriate resources

While I fully agree with all of these conclusions, I think that they reflect the underlying fact that the definition of the "library's collection" is changing in a very fundamental way. Our public wants to be able to search everything available to them in one, easy interface and they should not have to search the library's catalog separately. This is understandable since it is the same thing librarians always wanted from their integrated library management system; and yet, this "seamless access" inevitably blurs the boundaries of the library, itself, just as the ILMS blurred the boundaries within technical services. In the case of the "seamless access" some of the things that get blurred are: what is held locally as opposed to what requires an ILL? What online books are paid for by my library, vs. those available for free in e.g. the Internet Archive or HathiTrust? How will this change when the really substantial number of books still under copyright in Google go live? When merging the search of not only journal indexes, but full-text article databases with information on books, both full-text and metadata; what journals does the library pay for vs. open access?

Of course, the public doesn't care about these matters: they simply want the stuff they want, but it's clear that allowing for "seamless access"--although I agree with it--will blur the boundaries between the resources the local library supplies and what it is responsible for and may be able to modify, vs. non-library resources that must be taken as is.

This seems similar to how the ILMS blurred the distinctions between different departments of technical services: where the ILMS merged areas of processing and as a result, cataloging and acquisitions departments tended to merge. To those outside technical services departments, they saw no problem, just much better access, but it caused major restructuring to technical services for quite some time with a lot of trauma in some cases. In the "seamless" environment, what will be the purpose of a separate library catalog? While I continue to believe there will be a major purpose, it will have to be reconsidered.

The report notes that "high-quality metadata is becoming more important" but precisely what it is that makes metadata "high-quality" is not discussed, especially what it means to be high-quality in this seamless environment. It seems to me that something has to change somewhere at some very basic levels. In the only part I found (p. 19) where it was discussed in some depth, there was more focus on analysis than anything else, i.e. providing metadata at the chapter level, and: "Catalogues probably would better serve users with better delivery, more links, and more online content. This is indicative that access to resources, not necessarily discovery, is the major issue in the current information-seeking environment." Of course, these are not the areas that I would consider to be the determinants of "high-quality" metadata, but what I think may turn out to be completely irrelevant.

The parts about "power searching" and "power browsing" are especially important, and actually mirror my own observations on how students--and even how I--use these electronic tools. This "power searching" must be studied more deeply and utilized somehow in the new systems.

Anyway, this is an excellent report, and I think gives a strong blow to the FRBR user tasks.

Thursday, May 6, 2010

Making it too easy for patrons actually makes it harder

Reply to a private message. Excerpt:

I find the constant drumbeat about accommodating the born-digital generation in not exposing them to anything not "e-" to be utterly destructive to humanistic education. Why are we buying into this? Why is it not exposed as a means of control? Because no matter how agile a person may be with the technical, the electronic, the digital world -- this is not the same as being able to think analytically, to see what really has or has not been done before, and how and why things worked or failed before, to understand what is really different and what is just dressed in a "modern" shell. While at the same time (yes, it *can* be done <snark>!) learning to be flexible in learning how to navigate, anticipate, manipulate all we have and will have in the future to learn. We are enabling a completely controllable society to be put into place, out of fear of being called irrelevant, old school--or just plain "old" for that matter, out of touch, clinging to the past, blah, blah, blah. The stupid generational divides make me want to pull my hair out with frustration and a very deep boredom. I mean, what are we talking about here, life spans of 50 to 85 years? And whether one is 25, 40, 55, 65 etc. really makes an amazing difference in knowledge, thought processes, world view, mortality, you name it?? It's all variations on the "never trust anyone over 30" idea that one would think information professionals would not buy into. But librarians are among the worst for strata of all sorts, including generational strata.

Thank you so much for your note.

About the rest of your message, I couldn't agree more. For some reason, it seems as if we want to make it "easy" for people today, especially for the students (i.e. the digital generation), but often what it means to make it "easy" just harms them in the long run. These young students aren't stupid, but they don't know how to work. What do I mean by that? Even though they may spend a fabulous amount of time on a paper, and they have exhausted themselves in the process, it often comes to almost nothing. So, I don't think that in many cases, it is lack of effort, (although there is plenty of that!) but somehow, they haven't been given the skills to get tough when they need to and they don't seem to have the skills to work effectively. I think in a lot of cases where students don't even put in the effort, it's that they are realists: they know they don't have the skills, whatever they do will turn out to be a disaster that will make them feel awful anyway, so CARPE DIEM!! It seems as if somebody else, either mommy or daddy or teacher, has always done the hard parts for them, or cut them some slack, or something. (By the way, the ex-soldiers I have worked with are completely different. They know how to deal with it)

For example, I have built some online tools that are incredibly easy to use: I mean, just clicking a mouse button. That's all. And even that proves to be too much for some of these people! If it were just the slobs, OK, but this includes some who are seriously working! When I point this out to them (no, I *don't* tell them that they are worthless, lazy good-for-nothings whose parents should disown them out of shame; I approach them with gentleness and kindness), they very readily (too readily?) admit that they are wrong, that what I have made is incredibly easy to use and the fault lies with them.

As a result of all of this, I am revising my former assumption that it is our job to make library tools easier to use for our public. While that would appear obvious, I don't know if it's correct, because I can't imagine that what I have built can be made that much easier to use. It can be made better certainly, in 10,000 ways, but if it can be made easier, it can't be by that much. So my thinking is that even if "everything" were fully standardized and under full-authority control and perfectly done, and all the user had to do was click a button.... it would still be too much.

In my normal, long-winded way, I am coming around to your point. I agree that something, somewhere is wrong, but I am less sure that I know what is "wrong" and where it actually is. That's one reason why I came out so strongly against changing the abbreviations: it won't make any difference, and I think everyone knows that. But determining what really *will* make a difference is far more difficult to figure out. I certainly do not have the answer, but my experience tells me that reliable search results and following high-quality standards will be a major part of any solution.

It is a question that must be considered very, very deeply, and I don't think RDA is part of the solution!

Anyway, join the Cooperative Cataloging Rules!!!!

Wednesday, May 5, 2010

RE: Writing out what we now abbreviate

Posting to Autocat

Hal Cain wrote:

<snip>
On Tue, 4 May 2010 04:26:38 -0400, James Weinheimer wrote:
>We should get away from thinking that the solutions to these kinds of "problems" are by figuring out what text to type. This is printed card thinking and we are not working with cards anymore.
>

No it's not. Cards have nothing to do with it. It's thinking that thinks it matters how we record the characteristics of the item in hand so that the results are clear and comprehensible to as great a proportion of the end-users as possible.
</snip>

Hal, I'm afraid we have a serious disagreement here. I think it is absolutely vital for librarians and catalogers to stop thinking that the text that is entered into a database or a web page is static and cannot be transformed. That is card thinking. Today there are incredible things that can be done using all kinds of tools from scripting to style-sheets to browser add-ons and who knows what else? Look at Google Translate, and think about how a much simplified tool could reformat abbreviations. And not just for English speakers: properly done, such a tool could work for speakers of all languages, who could look at and work with exactly the same records.

Anything in a webpage can be transformed if you want it to be transformed and there are lots of possible ways of doing it. It can be done on the server, or it can be done on each client's computer. This is a basic change in how people can work with our records (and how I hope they want to work with our records, if we're lucky) that has yet to be thoroughly understood and addressed.
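To make this concrete, here is a minimal sketch of the server-side variety, written in Python. Everything in it is an assumption made up for illustration: the EXPANSIONS table, the expand_for_display function and the sample strings do not come from any existing catalog system. The point is only that the stored record keeps its abbreviations, and each display decides how to show them.

```python
import re

# Hypothetical display-time expansions, one table per interface language.
# The stored record is never changed; only what the user sees is.
EXPANSIONS = {
    "en": {"ca.": "approximately", "p.": "pages", "ill.": "illustrations"},
    "it": {"ca.": "circa", "p.": "pagine", "ill.": "illustrazioni"},
}


def expand_for_display(text, lang="en"):
    """Return a copy of `text` with known abbreviations written out."""
    table = EXPANSIONS[lang]
    # Try longer abbreviations first so a short one cannot shadow a long one.
    alternatives = "|".join(
        re.escape(abbr) for abbr in sorted(table, key=len, reverse=True)
    )
    # Only match an abbreviation that stands alone, not the tail of a word.
    pattern = re.compile(r"(?<!\w)(" + alternatives + r")(?!\w)")
    return pattern.sub(lambda match: table[match.group(1)], text)


if __name__ == "__main__":
    stored = "xii, 300 p. : ill. ; 24 cm."
    print(expand_for_display(stored, "en"))
    print(expand_for_display(stored, "it"))
```

The same table could just as well be shipped to the browser and applied there by a script or an add-on; nothing in the record itself has to be retyped either way.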

<snip>
With respect, James, you're arguing against your own often-stated position, that our data can go anywhere. Taken outside the context of the catalogue (or national bibliography derived from catalogue records, as the British, Australian, Canadian and New Zealand ones have been) those smart tools will no longer work. Therefore we have to do the job in the cataloguing process; I would hope that some smart tools will be incorporated into future interfaces, but I am not optimistic: in part because system designers working for vendors are not good at listening to customers, but also because as a profession we are unable to reach a consensus about what should be in our workstation facilities to streamline our work.
</snip>

But these parts are not where people experience their major problems when they work with catalogs. Our public relates everything to Google, the tool they have the most experience with; they definitely love it, and they really and truly "think" they know how to use it. Trying to explain the difference between Google-tools and a traditional catalog is not easy since they are so conceptually different. I have discovered that simply explaining how to think in hierarchies (and thus in concepts) is not easy in a time of keywords, but trying to get people to do so is... Well, let me put it this way. While I have had some who were very interested and motivated in learning, I have not had anyone I can yet honestly call a "success." Whether that is all my own fault, I don't know, but I have learned that it is not a simple matter to explain and use a library catalog.

People certainly do not understand everything they retrieve in Google, but that doesn't seem to bother them much. What I am worried about when records are exported out of our catalogs and into new tools is ensuring that the records still function, i.e. subject headings and other links won't just stop working. But on a wider level, what will it mean for these links to "work" in that kind of environment? I don't think we know yet.

The question for me is: where do we focus our energies? Do we put them into "fixing" areas where people will really notice, such as increasing productivity to include electronic resources in a comprehensive way so that we don't have to tell our public when they want academic, scholarly webpages, "Well, you'll have to use Google to find those things." (That is a disaster!) Or do we use our resources to type out abbreviations?

I agree with you that "the first objective of cataloguing is to serve the user's convenience," and "If we don't serve our primary users well, we won't be around to massage our data into forms useful to others -- we'll find ourselves out of a job."

And this means not ignoring the zillions of worthwhile materials out there on the web, and not assuming that everybody understands our heading structures, while we focus on abbreviations! Let me tell you: that's what gives catalogers a bad reputation! And when the Google Book Search makes everything available, watch out!

I'm not saying we shouldn't type out abbreviations in full. We can do it, I really don't care since I'll just make some stupid macros or something. I just don't think we should fool ourselves that it's going to make any difference at all to our public.

RE: Preferred access point - some queries

Posting to NGC4LIB

Henry Lam cites RDA:

<snip>
RDA does not use the word 'Main Entry', explained in the FAQ 4.8 in the RDA website (http://www.rda-jsc.org/rdafaq.html#4-8):

"The concept of main entry as used in a card catalogue is no longer applicable in online catalogues, and this term will not be used in RDA. Nevertheless, there is still a need to choose a preferred access point for a work or expression in order to create bibliographic citations, and to collocate works and expressions in the online catalogue. Section 2 of RDA will provide instructions on constructing the preferred access point representing the work or expression."
</snip>

Of course, this means that the functionality of the old main entry continues, and it is simply renamed to "preferred access point." Concerning the need for a *single* preferred access point to create bibliographic citations, I do not know what citation rules they are referring to, since all of the rules I have seen stipulate that when you are citing, you should cite all of the main authors, with various limitations on numbers (from three to seven or so). Sometimes, they mention editors as well.

There are also differing formats for entry. Although the first author is always entered under surname, the rest of the authors may also be entered under surname, or they may be in regular order. (For a very quick summary of these formats, see: http://www.lib.wsc.ma.edu/citation.htm)

So, if one of the purposes of the bibliographic record is to create automatic citations (which I have never seen stated explicitly anywhere, but I agree with it since that means we would really be beginning to understand how our records can be useful to our patrons in the modern information environment), there is still no need for a *single* preferred access point since none of the citation rules require it. It is also important to note that the actual forms of the name entered in these bibliographic citations come from what is found on the item and do not mention anything about using authorized forms (that I have seen at any rate). This would relate more to the 245$c, statement of responsibility.
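As a rough illustration of why a single preferred access point is not needed for this, here is a small Python sketch that builds the author portion of a citation from a list of main authors. It follows no particular published style; the function name format_authors, its parameters, the separators and the sample names are all assumptions invented for the example. It only mirrors the general pattern described above: the first author inverted, the rest either inverted or in regular order, and a cap on how many are listed.

```python
def format_authors(authors, max_listed=3, invert_all=False):
    """Build the author part of a citation.

    `authors` is a list of (given_name, surname) pairs in the order they
    appear on the item. The first author is always entered under surname;
    the others are inverted only when `invert_all` is true.
    """
    if not authors:
        return ""
    names = []
    for position, (given, surname) in enumerate(authors[:max_listed]):
        if position == 0 or invert_all:
            names.append("{}, {}".format(surname, given))
        else:
            names.append("{} {}".format(given, surname))
    # Semicolons separate names because inverted names contain commas.
    if len(authors) > max_listed:
        return "; ".join(names) + "; et al."
    if len(names) == 1:
        return names[0]
    return "; ".join(names[:-1]) + "; and " + names[-1]


if __name__ == "__main__":
    main_authors = [("Jane", "Smith"), ("John", "Doe")]  # made-up names
    print(format_authors(main_authors))
    print(format_authors(main_authors, invert_all=True))
```

Nothing here depends on one author being singled out as "the" entry; the function needs only to know which names are main authors and in what order they appear.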

The other purpose given for a single preferred access point, "to collocate works and expressions" doesn't make a lot of sense either, since all sorts of innovative displays could be created using multiple main authors.

For the process of cataloging, it would be much easier to have to distinguish only between main authors and secondary authors instead of having to follow the detailed rules for determining main entry.

I'm afraid that the continuation of the *single* preferred access point, i.e. the continuation of the policy that the 1xx field cannot be repeated for each main author with the result that in the 7xx, additional main authors are mixed in with editors and other secondary contributors, is just a continuation of MARC format, which in turn perpetuates the limitations of card and printed catalogs, where there was a need for a single main entry.

Tuesday, May 4, 2010

RE: Writing out what we now abbreviate

Posting to Autocat

On Tue, 4 May 2010 00:50:24 -0400, Hal Cain wrote:

>As for customers who work in other languages than English, well, they already require notes in a different language. Once the English forms of the formal phrases I refer to above are in use, can't a simple search-and-replace operation in MarcEdit deal with them?
>
>The age when Latin was the foundation of Western teaching and learning is past, and it ain't coming back.

We should get away from thinking that the solutions to these kinds of "problems" are by figuring out what text to type. This is printed card thinking and we are not working with cards anymore. If there is a problem of comprehension of something such as "ca." there are a hundred solutions today if we utilize the power of the tools at our disposal.

For example, for these especially troublesome abbreviations, we have probably all seen the boxes pop-up when you run your mouse over a word (an onmouseover event). These can be links into explanatory pages, or simply a popup that explains "ca." as "approximately" or "v." as "volume", or whatever we want. Another idea would be to replace these abbreviations with codes so that when 260 $a has a special code for "s.l.", it can display however somebody wants in each catalog, and therefore can be used in many more venues.
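Both ideas can be sketched in a few lines of Python that generate the display HTML. Note the assumptions: instead of a scripted onmouseover handler, this version uses the title attribute of the standard HTML abbr element, which browsers show as a tooltip on hover; and the GLOSSES table, the "{no-place}" code and the function names are hypothetical, invented here only to show the shape of the idea.

```python
import html
import re

# Hypothetical glosses shown when the mouse hovers over an abbreviation.
GLOSSES = {
    "ca.": "approximately",
    "v.": "volume",
    "s.l.": "place of publication not identified",
}

_PATTERN = re.compile(
    r"(?<!\w)("
    + "|".join(re.escape(a) for a in sorted(GLOSSES, key=len, reverse=True))
    + r")(?!\w)",
    re.IGNORECASE,
)


def add_tooltips(text):
    """Return HTML in which each known abbreviation carries a hover gloss."""
    # With one capturing group, re.split alternates: plain text, match, ...
    pieces = _PATTERN.split(text)
    out = []
    for index, piece in enumerate(pieces):
        if index % 2 == 1:
            gloss = GLOSSES[piece.lower()]
            out.append('<abbr title="{}">{}</abbr>'.format(
                html.escape(gloss), html.escape(piece)))
        else:
            out.append(html.escape(piece))
    return "".join(out)


# Hypothetical coded value stored in 260 $a in place of the literal text,
# so that each catalog can decide how to display it.
CODE_DISPLAY = {
    "en": {"{no-place}": "[place of publication not identified]"},
    "la": {"{no-place}": "[S.l.]"},
}


def render_codes(value, lang):
    """Replace stored codes with the wording chosen for one catalog."""
    for code, display in CODE_DISPLAY[lang].items():
        value = value.replace(code, display)
    return value


if __name__ == "__main__":
    print(add_tooltips("[S.l.] : Smith & Co., ca. 1850"))
    print(render_codes("{no-place} : Smith & Co.", "en"))
```

In the first function the record text is untouched and only gains an explanation; in the second the record holds a code and the wording is decided entirely at display time.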

But perhaps it's not a problem for our public at all. From my experience, people don't read cataloging records as thoroughly as we would like to believe, so our public doesn't see any problems. Our records are pointers to items people want and not little bibliographic essays that people read. If you ask people, and point to parts of the record and have them focus on those parts, then there are lots of things they do not understand but most of the time, they simply ignore these abbreviations and other obscure parts and continue on without any problems whatever. For those very rare occasions when it is important that they need to know for some reason, e.g. what "s.l." means, then they should be able to find out.

There are many, much more serious problems we are facing with the public using our catalogs in this new information environment, and I can't imagine that changing all the abbreviations in the catalog will make people want to use our catalogs more. Half of the results I see in Google are incomprehensible and/or misleading in some way until I click on the item and get some clarification. We need to focus our energies on the serious problems facing us and less on aesthetics.

Monday, May 3, 2010

RE: If Academic Libraries Remove Computers, Will Anyone Come?

Posting to NGC4LIB

Laval Hunsucker wrote:

<snip>
The point seems to be that you don't grasp, or perhaps choose not to acknowledge, the difference between information ( the process, or perhaps even result, of being informed -- a phenomenological/cognitive matter, the difference between a prior and a subsequent state of understanding in a given human mind, or in Bateson's terms "the difference that makes a difference" ) and what you're now calling "materials" : what I called "documents". This is in fact a large and ( pragmatically as well as philosophically ) significant distinction.

The former cannot be organized for anybody else by librarians, or by whosoever. The latter can, obviously, be organized for ( access by ) other parties. This is what librarians have been doing, and doing fairly well, since at least the time of Ashurbanipal.

It only, at best, muddies the waters -- while fatuously bolstering our ego's, I suppose, and that seems sadly in fact to be the purpose -- to say, in organizing materials (documents) and/or their surrogates and metadata etc., that we are in the process of "organizing information". Quite absurd. If you, as did Daniel, carefully consider the term "hypostatization", you may better come to see what kind of manoeuvre is involved here. It was no accident that I chose to use exactly that word in my response to your previous post.
</snip>

I hesitate replying to this since the topic is rather arcane, but I simply cannot accept the idea that librarians do not organize information.

I believe I understand what you are saying, it is just that I disagree with it. I don't believe I am committing any fallacy here (I may have committed fallacies elsewhere, but not here!) The basic question is to determine "what is information?" While I agree that on some level there are feelings and ideas that necessarily must remain only inside people's heads, they nevertheless must be shared with others to be considered as *information*. After all, "information" comes from "inform" which strongly implies sharing of some sort from one person to another.

If there is no sharing of this "stuff that necessarily must be inside the heads of others," and you are left entirely with the "stuff inside your own head," that to me, is precisely the definition of superstition and bias. It is the very antithesis of information. Many people may believe that they have their own unshakeable "information" within their own minds about other ethnic groups for example, but of course that is not "information" so much as prejudice and bigotry.

Therefore, this "stuff that is in my head and in the heads of others" must be shared if it is to become information, but how? Since I personally do not believe ESP exists, the only ways of sharing this "stuff inside our heads" are either through language, spoken or written, or through the arts: painting, music, dance and so on. If you don't talk about something, it cannot be information; even if you do talk about these matters with friends, if it is not recorded in some way, it will soon be lost. Look at all the wonderful stories of people's lives that have been lost because they have never been recorded.

Each method has its problems, of course. My cats, for example, can communicate only very basic feelings to me. A human may not be able to write or speak well. I personally cannot sing to save my life, but each person has strengths and weaknesses in how they can share the ideas and feelings they have, and once it is received by others, this is when it becomes information. Different people will react in different ways to this information: it may be "good," "reliable," "poor," "stupid," or whatever. Of course, it's not easy to find the relevant "stuff" out there in the first place.

The methods people have to share information are changing radically. They use virtual means and radically new tools to create and share, and people find this exciting, just as I do.

Mediating a part of this sharing (not all of it) is the world of librarians, and I think we must enter it as completely as possible, much more than previously. We will not have the control over people that we once had, but we can still have many of the same controls over the materials themselves, so that people can find relevant resources, and in this way, complete the "information process" i.e. where people receive the "stuff recorded by others" and thereby it becomes "information." The part that interests me most is: how do people find "stuff recorded by others" that is relevant to their searches? But of course, there is a lot more involved.

We have to understand information and find its limitations and possibilities. I consider this task to be a highly practical one, where theory takes a definite back seat to trial-and-error. We must accept that almost all bets are off about what people supposedly want from their information. For example, FRBR proclaims that it knows what users want, but those user needs are demonstrably obsolete today.

I don't think I need to go on, and I apologize for going on as long as I have, but I still maintain that *librarians* are the experts at this very important task of organizing *information,* which is the "ideas and feelings" that people share. Of course, librarians still have a lot to learn. In any case, it is important to understand and accept our successes just as much as our failures.