Saturday, June 30, 2012

Re: [ACAT] Currency of subject headings (was Tell all your associates ...)

Posting to Autocat

On 27/06/2012 10:09, Jardine, Heather wrote:
We (as a UK PL) have never used subject headings to provide subject access and still maintain a classified catalogue with a (locally-produced) subject index. Both classification and subject index are revised and updated as often as we possibly can. Unfortunately there is little evidence that colleagues and users make much use of either, preferring keyword searching no matter how often we demonstrate its failings. In my view, classification without a subject index is little use - but given user preferences and the high cost of creating and maintaining such a thing, not helped by the lack of support for a subject index within our online catalogue (so it has to stand apart), I don't know for how much longer this will continue to be possible. [My own views, only].</snip>
In his chapter "Library Catalogues", part of the huge "Public Libraries in the United States of America" from 1876, Charles Cutter discusses the advantages and disadvantages of classified vs. dictionary catalogs: how one type is good for some questions, while another type is good for other questions.

The real problem of classified catalogs was how to direct people to specific areas. He gives an example that someone looking for "Cookbooks" would have to look under "Productive Arts", which people would never do, so there must be an alphabetical index to find your way into the classified arrangement. He says that once you are in the correct place in the classified arrangement, it is definitely more useful than a dictionary one. For instance, if someone is looking for "Badgers" and there is no book specifically on badgers in the collection, the searcher is at a loss in a dictionary arrangement, but in a classed arrangement it is much easier to go up or down in the classification to find a more general book that may contain the information you need. The problem is that using a classified arrangement is a two-step process: first you look up your topic in the alphabetical index and then you go to the classified arrangement. So, to find "badger", he notes how you may be directed to look under Science -- Natural History -- Zoology -- Vertebrates -- Mammals, and so on. Not so simple for searcher or librarian.

In the U.S. they decided simply to skip a step and not place the cards themselves in a classed arrangement, which was perhaps all right since there was a trend toward open stacks, which could provide the classified arrangement on the shelves. (see: p. [526]+)

I think his analysis is still sound today, but what can modern technology provide? It seems to me that the actual arrangement of the records is much less important than before and there should be more effort in providing the searcher with the correct query and letting the catalog arrange records in various ways. I find that using the AAT from the Getty is very useful for this. So, let's imagine you are interested in "knuckle dusters" and look it up there, only to discover that the term is "brass knuckles". If this were connected to a catalog or full text, you should be able to get into those records or full text from here, but it is still important for the searcher to see the classed arrangement, or here "Hierarchical position", which shows that there are weapons -- edged weapons -- fist weapons, and that if you click on any of the little triangles, e.g. fist weapons, you find all the different kinds of fist weapons arranged under it.

The same methods should be doable with the LC authority records. Here is "brass knuckles" in the NAF but the default arrangement there is alphabetical (dictionary), which puts the searcher into the middle of a list with "Brass Knuckle Boys (Musical group)" and people surnamed "Brass" which is completely useless for the searcher. But if you get into the record itself, you see some nice references:
Topical subject heading:     Brass knuckles
Variant(s):     Brass knucks
    Knuckles, Brass
    Knucks, Brass
See also:     Nonlethal weapons
This last is a BT that puts you into the classified arrangement but to see it all, the searcher must make separate clicks. Cutter talks about all of this.

To compare, there is dbpedia. Dbpedia has the same limitation (that I see) as the NAF, since it also does not provide a classified arrangement like the AAT; instead you have to click on all the BTs, in this case Blunt weapons -- Weapons.

It seems that keyword searching that leads into a classified arrangement should be very useful for searchers, and certainly is more useful than the dictionary arrangement, as Cutter pointed out so long ago. Keyword avoids the need for the two-step process of looking up your topic in the alphabetical index to find out where your topic is classed, or at least makes it much simpler.
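To make the idea concrete, here is a minimal sketch of a keyword lookup that drops the searcher straight into a classed arrangement. It is only an illustration: the two small tables are toy data assembled from the terms quoted in this post, and the function names (`preferred_term`, `hierarchical_position`, `narrower_terms`, `classed_view`) are hypothetical, not part of the AAT, the NAF, or any catalog system.

```python
# Toy thesaurus built from the examples in this post (an assumption for
# illustration); a real file such as the AAT would be far larger and richer.
BROADER = {
    "brass knuckles": "fist weapons",
    "fist weapons": "edged weapons",
    "edged weapons": "weapons",
}

# Use-for (variant) terms pointing at the preferred term.
USE_FOR = {
    "knuckle dusters": "brass knuckles",
    "brass knucks": "brass knuckles",
    "knuckles, brass": "brass knuckles",
    "knucks, brass": "brass knuckles",
}


def preferred_term(query):
    """Resolve whatever the searcher typed to a preferred term, or None."""
    key = " ".join(query.lower().split())  # normalize case and spacing
    if key in USE_FOR:
        return USE_FOR[key]
    known = set(BROADER) | set(BROADER.values())
    return key if key in known else None


def hierarchical_position(term):
    """Return the path from the broadest term down to the given term."""
    path = [term]
    while path[-1] in BROADER:
        path.append(BROADER[path[-1]])
    return path[::-1]


def narrower_terms(term):
    """List the terms filed directly under the given term."""
    return sorted(t for t, parent in BROADER.items() if parent == term)


def classed_view(query):
    """One-step lookup: keyword in, classed arrangement out."""
    term = preferred_term(query)
    if term is None:
        return None
    return " -- ".join(hierarchical_position(term))
```

Calling `classed_view("knuckle dusters")` resolves the variant and shows the whole chain of broader terms at once, so the searcher never has to consult the alphabetical index as a separate step or click through each BT one at a time.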

This would be worthwhile to research, if people are not doing it already. There is less and less reason for the dictionary arrangements.

Friday, June 29, 2012

Re: [ACAT] Currency of subject headings & Bibliographic conservatives

Posting to Autocat

On 28/06/2012 04:35, Frank Newton wrote:
To me, the word "dump" is kind of provocative. Isn't provocative similar to manipulative?
Although I like to consider that once in a while my writing is not too bad, I don't think I can manipulate people with a single word! Provocative it is, but that is to get people thinking. Besides, several times I have heard and read those attitudes, perhaps--or perhaps not--with that precise word used, but giving voice to the attitude nevertheless.

But continuing with the subject headings: there are two purposes to the subject headings. The first is to serve as a point of collection for records that display similar traits, so this consists of making sets and subsets and sub-subsets. The second purpose concerns what the labels of these subsets should be.

Concerning the labels of the subsets, e.g. "Future, The, in motion pictures", there will never be general agreement. In this sense, I am reminded of a passed-away old friend of mine, retired from the telephone company, with whom I would have deep conversations when I was younger. Politically, we could not have been further apart, but he had quite a mind. One thing he said was that he was convinced you could say anything, and he emphasized ANYTHING, and he believed you could find three people somewhere, who would agree with it!

My own experience has proven him right, but I add my own corollary to this: you can say anything--ANYTHING--and you can find three people somewhere, who will violently disagree with it. I am absolutely convinced that my saying "I love my mother" or "I really like apple pie" will set somebody off somewhere!

Relating this to subject headings, people will never agree on what the label of the heading should be. This is why the USE FORs are so critical for the system to work, and one of the reasons that compels me to say that our catalogs have been broken for a long time. Today there is also an international dimension that catalogs never considered in the old days. People from all over the world can see your catalog records in a variety of ways, and if we enter the "linked data" universe, there will be even more ways. So, there will never, ever be general agreement as to the form of Leo Tolstoy's name. But there will be much more agreement as to the records/items/materials that are collected inside the "set of all materials by Leo Tolstoy, Lev Tolstoi, Lev Tolstoj, [pick your favorite form]".

Today, systems allow for much more flexibility as to the form(s) of the label attached to a set of records that is displayed to the user by tools such as VIAF. So far, the sets themselves can only be made competently by humans, in fact, expert humans known as catalogers. Perhaps someday, automated means can manage it all but I remain suspicious. In the meantime however, I have a feeling that merging the automated methods with our methods would lead to something far more powerful than doing things separately. 

The problem is to get the public to understand the power and advantages of our subject analysis, and sadly, that is very difficult to do.

Thursday, June 28, 2012

Re: [ACAT] Tell all your associates, don't go to library school.

Posting to Autocat

On 27/06/2012 19:02, Brian Briscoe wrote:
While I agree with Jim that our current catalog interfaces have many problems, I don't believe that all of the logic behind them is, to use his term, "broken." The case for controlled vocabularies is very solid and the weaknesses in keyword have been convincingly proven. As a matter of fact, not all library users are keyword-searching literate. There are still many who continue to use the old-fashioned library OPAC interface that they learned in the past. That is why, IMO, it is important to continue to keep access to the library's "old" catalog interface available for users as an alternative search on the landing page.

I have no problem with updating the terms used in LCSH and our other taxonomies/folksonomies/thesauri to more currently-familiar terms, but then those "old" terms must be retained as "See" references.

There are some pretty revolutionary things going on in library information access systems. Unfortunately, the RDA debacle has overshadowed them for attention. But there are good things happening. This is not the time to throw up our hands and cry that all is lost. This is the time to focus our attention and our message on things like controlled vocabularies that allow for real collocation of information. Machine matching and relevancy algorithms have not shown sufficient progress in this area to this point.

Librarians will never provide the fastest information. We never have. But we can provide the most accurate information. And that is the niche that we should be focusing on.
I am not making myself understood. I am 100% in favor of controlled vocabulary, and just as much in favor of the subject headings with subdivisions. What I say is broken is the dictionary part of the catalog, or expecting everyone to look things up by following arrangements based on left-anchored alphabetical order, which just doesn't happen anymore. What are the alternatives? Classed arrangements should certainly be tried, as Mac suggested. I personally haven't seen anything I have liked very much, and in any case, there has been relatively little done with authority files, at least that I am aware of. The best that I know of is the AAT, e.g. the heading for Adirondack chairs, showing the hierarchies, use fors, all kinds of notes, and so on. It's nice. But whatever is selected for the authority file, it all needs to be incorporated into the actual catalogs.

Concerning the very idea of an "authorized form" (i.e. single) to the exclusion of others, that can be reconsidered as well. I have mentioned in an earlier posting Thomas Hyde's catalog at the Bodleian and the unique method he used for "authorized forms", which included the Use Fors. I could see the same thing being tried today, so Dostoyevsky would display something as:
Dostoyevsky, Fyodor, 1821-1881, or, Dostoievski, Fédor Mikhailovitch, 1821-1881, Dostoievski, Fiodor, 1821-1881, Dostojevski, F. M., 1821-1881, Dostojewskij, Fjodor M., 1821-1881, Tʻo-ssu-tʻo-yeh-fu-ssu-chi, 1821-1881 more...

"more..." would link to the entire authority record.

Like I said, there are lots of ideas from the present and from the past.
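The Hyde-style display described above could be generated from an authority record quite simply. What follows is a minimal sketch under stated assumptions: the record is reduced to one preferred form plus a list of variants (the forms quoted in this post), and `hyde_display` and its `limit` parameter are hypothetical names invented for the illustration.

```python
# Forms copied from the Dostoyevsky example in this post.
PREFERRED = "Dostoyevsky, Fyodor, 1821-1881"
VARIANTS = [
    "Dostoievski, Fédor Mikhailovitch, 1821-1881",
    "Dostoievski, Fiodor, 1821-1881",
    "Dostojevski, F. M., 1821-1881",
    "Dostojewskij, Fjodor M., 1821-1881",
    "Tʻo-ssu-tʻo-yeh-fu-ssu-chi, 1821-1881",
]


def hyde_display(preferred, variants, limit=5):
    """Show the preferred form followed by some of its Use Fors.

    Only the first `limit` variants are shown; if the record holds more,
    a trailing "more..." stands in for the link to the full record.
    """
    shown = variants[:limit]
    text = preferred
    if shown:
        text += ", or, " + ", ".join(shown)
    if len(variants) > limit:
        text += " more..."
    return text
```

With `limit=2`, for instance, the heading shows the preferred form, two variants and "more...". In a real interface the "more..." would be a link rather than plain text, and which variants to show first is a design choice this sketch does not address.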

Re: [ACAT] Tell all your associates, don't go to library school.

Posting to Autocat

On 27/06/2012 16:03, Brian Briscoe wrote:
On Wed, Jun 27, 2012 at 8:40 AM, Dawn Grattino wrote:
Not at my library! Even the "advanced" search option gives only keyword. All she'll get is the final hit list of items that her terms call up. Even doing an author/title search yields stuff by other people and/or with different titles! The best one can do for subjects is limit the keyword search to the subject line. Then you get an item list that will include a lot of crap along with what you're looking for.
It sounds to me like whoever set up your ILS or discovery layer did a really poor job of meeting the needs of its users. I wish your case was rare, but I see too many other catalogs that, in chasing the Google model, do indeed throw out the baby with the bathwater and create a tool that combines the worst of both keyword and controlled searching at the expense of the useful facets of either. That is why library catalogs get criticized.</snip>
But even then it's broken. The subjects, along with all headings, must have the cross-references included somehow because nobody will ever know an authorized form. How many people would possibly think of World War, 1939-1945? Or know that if you are going to search for IBM, you have to search for International Business Machines Corporation, while if you want sub-bodies of IBM you may (or may not) have to search under International Business Machines Corporation? The cross-references are absolutely critical.

Without cross-references, even if you know the heading for WWII, how can you know that "World War, 1939-1945 Battles" doesn't work and you need to search under:
World War, 1939-1945--Aerial operations.
World War, 1939-1945--Campaigns.
World War, 1939-1945--Naval operations.
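A small sketch may show why the cross-references matter to a search system. Everything here is an assumption made for illustration: the two reference tables contain only the examples from this post, and `resolve` is a hypothetical function, not how any ILS actually stores or applies references.

```python
# See references from what people type to the authorized heading
# (examples from this post only).
SEE = {
    "wwii": "World War, 1939-1945",
    "ibm": "International Business Machines Corporation",
}

# References from a subdivision people might try to the ones actually used.
SUBDIVISION_SEE = {
    ("World War, 1939-1945", "battles"): [
        "Aerial operations",
        "Campaigns",
        "Naval operations",
    ],
}


def resolve(term, subdivision=None):
    """Turn a searcher's term (and optional subdivision) into headings."""
    cleaned = term.strip()
    heading = SEE.get(cleaned.lower(), cleaned)
    if subdivision is None:
        return [heading]
    targets = SUBDIVISION_SEE.get((heading, subdivision.strip().lower()))
    if targets is None:
        # No reference on file: pass the subdivision through unchanged.
        return [f"{heading}--{subdivision.strip()}"]
    return [f"{heading}--{target}" for target in targets]
```

Without the two tables, `resolve("WWII", "Battles")` could only hand back a string that matches nothing; with them, the searcher is led to the three headings listed above without ever needing to know the authorized form.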

I discussed a lot of this in one of my open replies to Thomas Mann.

The great example of "Future, The, in motion pictures" illustrates long-dead card catalog thinking perfectly, which is gone, forever, whether we like it or not. This ridiculous-looking heading actually made perfect sense in the card catalog. And even if your catalog can display the subject headings in alphabetical order, such as LC's does, such an attempt to recreate the dictionary catalog is retrograde and just too weird for today's public. Even "surname, forename" is considered too much by many, as we see in Worldcat displays and Wikipedia.

This doesn't mean that it's the end of the catalog but it very definitely is the end of the *dictionary* catalog. The dictionary catalog broke the day keyword was introduced and was never fixed. We need to acknowledge that and get on with the task of harnessing the power of the dictionary catalog--and there very definitely was (is) a power there that has been lost--to actually work again today. Unfortunately, since that power hasn't worked in decades and has been mostly forgotten by the public, and it is so alien to how people think today, it is difficult to get people even to agree that it is worthwhile salvaging.

Earlier on this list I mentioned that the subdivisions could possibly display as a word cloud under the main topic. That may be worth a try. Here is an example of what the subdivisions under E.A. Poe could look like: All of these would be clickable. There are a lot of possibilities, once we confess that our current methods have been broken for a long, long time.

Wednesday, June 27, 2012

Reality Check: What is it that the Public Wants today?

[Links to everything are at the end in the References Section]

Hello everyone.

I want to thank the Committee for allowing me the honor to speak at such an important event and to include me with such a prominent group. Also, I want to thank everyone who is attending. I just wish I could be with you in person, but this is almost a miracle!

A major part of my talk will summarize four presentations--all available online--that I suggest everyone should watch and that I have found to be very important because they can help show the way forward for libraries.

First however, a few numbers.
Of course, most people want specific webpages rather than entire sites, just as most want journal articles and not entire journals, so to make those 555 million sites useful, we must increase this number by a large factor to turn it into webPAGES. Other formats are incredible too, such as 72 hours of video uploaded to YouTube every minute! These numbers reveal a very definite trend: that non-traditional resources are going up at a phenomenal rate when compared to traditional resources.

We can expect the same phenomenal increase in the creation of metadata, which, after all, is supposed to mirror more-or-less the resources themselves. Much of this metadata will be made automatically or by people who haven't got the slightest idea what they are doing. If all of this metadata comes together into the same pot, which it probably will eventually, (I call this the Metadata Macrocosm) these trends suggest that the records librarians create will disappear into that huge morass like a drop of water into the ocean.
I think this is reality based on looking honestly at the raw numbers.

With that cheerful note aside, the first talk I would like to discuss is by David Weinberger, “Too big to know” where he mentions some of this. 

In one part of his talk, he says: “metadata is not what it used to be” and when he mentions “what used to be metadata,” he is talking about library cataloging! In his opinion, today EVERYTHING has become metadata because with full text, you can search, e.g. “Call me Ishmael” or any sentence out of the book, and retrieve not only Melville's Moby Dick, you get everything by Melville, about Melville, his friends and whales and so on. As a result, for Weinberger, metadata has become something quite different. For him, metadata is what you know and data is what you are looking for. Therefore, everything can function as metadata. He concludes:

I think he is pretty much correct and he makes a very convincing case. This is the world we are entering, if we are not in it already.

Weinberger is describing a type of “information overload”. How can we control it?

To continue with another talk, web guru Clay Shirky said:
The problem is not information overload. The problem is filter failure.

What he means is that people have always complained about having too much information. Already in the year 1500, the average literate person had access to more books than he could read in a lifetime. That happened very quickly since printing didn't really get going until around 1470, so we are talking about only 30 years! “Information overload” is nothing new.

As a consequence, in Shirky's opinion, too much information is not the problem. The problem is that people have always relied on methods to eliminate the information they didn't want. This has been the job of publishing houses and editors. Just the sheer cost of creating and distributing information resources, not to mention buying and storing them, has kept the amount of information down and thus has served as a type of what he calls a filter. In Shirky's opinion, when people complain of information overload, they are actually saying that the filters they have always used have broken down.

Some have concluded that the way to control information is with the Web2.0 tools such as Facebook, where you “befriend” someone and this person, or one of his friends, or one of their friends, or one of their friends, may mention a good resource that may be of interest to you.

One of the many potential problems with such methods is that you may find yourself trapped in what is now known as a “Filter bubble” that is, where all the information you see comes from sources that pretty much agree with you, your friends and your friends' friends, so you slowly become unaware of other sides of arguments. Remember, the dream of Tim Berners-Lee and his Semantic Web is to have mechanical “intelligent agents” gather our information for us automatically and present us with their results, working much like a thermostat. Each person would have his or her own little personalized “research team” working constantly and diligently to bring us exactly what we want while we go about living our lives.

I have serious problems with this dream of intelligent agents for information, but I will not go into them here because I have discussed this issue in podcasts and postings that can be found on my blog. To sum up my own opinion: his dream is my nightmare.

In any case, the filter bubble makes perfect sense to me. And when added to Weinberger's comments on metadata, plus Shirky's filter failure, librarians may just want to throw up their hands and run away screaming, OR, they could see a perfect opportunity for an information field based on ethics, such as librarianship, to provide help. How could this work?
As a first step on the way toward a solution, I would like to mention another talk: The Paradox of Choice. This theory posits that we are all suffering from an intellectual paradox and this paradox can be boiled down to the following inferential statement:
a) Freedom is Good.
b) Freedom means I have more Choices.
c) Therefore, the more Choices I have means I have more Freedom.

While this may appear logical, it turns out that in reality, when confronted with too many choices, people feel exactly the opposite and in fact, they just shut down. You see all this stuff and you are overwhelmed. When there are too many choices, people worry about making the wrong choice, a stupid choice, falling for a con job, or something else.

From all of these observations, being made in public forums and discussed in the general media, it is clear to me that the public wants help. There is need for something.

I think that Noam Chomsky, who is a highly controversial figure, but also an accomplished scholar, summed up the public's situation quite well when he said:
I think what Chomsky is discussing is a variation of the same topic as everyone else: the filters we have always had are broken, but he points out that even when they worked better, it was still never enough just to walk into a library. You also needed some kind of a guide, such as a librarian, to help you find materials that are meaningful to you.

To summarize all of this:
The most hopeful part is that none of these people are librarians. These voices come from the public who consider these issues genuinely important. I also think that the values and experience of librarianship address all of these concerns. How can libraries respond?

I suggest that individual libraries, and the entire library field, begin to consider themselves primarily in terms of filters, that is, instead of including, tending toward excluding. Google includes; libraries should be doing something different. It's rather strange that current technology makes it easier to include than to exclude, but that's the way it is. Also, in a sense librarians actually have been “filters of filters” since the beginning, through selection, reference, cataloging, which all serve as filters in various ways.

But what kind of filters can libraries provide? I will leave the important, and tremendous issue of selection aside here. What is it that is unique that catalogs do? I do not think we can honestly say that they give better access, since that is based on judgment and is impossible to prove. But catalogs can provide standardized methods of access that are reliable—and reliable in all sorts of meanings of the word. Reliable selection that guarantees you will see all kinds of opinions; reliable cataloging so that you can find something the same way you found it yesterday; reliable access so that if a site you found disappears or changes, you can still access the information. Of course, experts will be best at using these tools, but that goes for any information tool including Google, which is a lot harder to use than many think.

Above all else, we must acknowledge that the traditional library catalog serves the needs of the library managers: selectors, acquisitions and reference staff, and it allows reliable search results for experts. That has always been the library catalog's main purpose and it does a pretty good job.

There is nothing wrong with this. Such a task is absolutely critical because if librarians cannot do their jobs, nobody can use their libraries, but it is wrong then to conclude that this tool, so necessary for librarians, is also the tool that the public needs.

So what is needed?

It has been my experience that catalogers have a tendency to concentrate on individual records, individual fields and subfields, and often lose sight of the entirety of the catalog. The public looks at the catalog completely differently: they spend little time on an individual record because once they find something of interest, they stop looking at the record and off they go to the resource itself, but the public does spend much more time on the catalog as a whole, that is, looking at the result sets. Therefore, I think we should attempt to reimagine how the public could perceive the result of a search.

In a paper I gave recently in Oslo, I tried to reimagine the individual catalog record and how it could possibly look and work in the future. This time, based on what I have mentioned in this talk, I would like to reimagine how a search result could be made more meaningful in some way. How could this be done?

I think that the field of statistics may offer some valuable insights. How has statistical information been portrayed over the years? A lot has been happening.

At first, it was all tabular, and then statisticians would select some information they thought was interesting to make a few graphs or charts. That's how it's been for a long time; new types of graphs have been introduced along with color, but it was all essentially the same. Today with online databases however, some brand new methods can be applied, such as found in Google Public Data Explorer. Let's take just a moment to see this tool.

[Short discussion of how it works]

Google Public Data Explorer allows displays that could never exist before. They are animated, and far more readily understandable and compelling than the older displays. Suddenly, it is easy for anyone to understand what a time series is. Also, individuals can select the information they find interesting to make their own graphs.

What if we compare this experience to library search result displays? What have patrons seen over the years? The result set has always been a listing of individual records, displayed in various ways. Here are some examples of a search for “Stonehenge”.

[added the headings just for demonstration purposes]

In all the library catalogs, although there may be complete or brief displays, although you can sort them in different ways, and although there are now facets, we see that people wind up looking at a listing of individual records, not that much different from what people probably saw even when they were in the Library of Alexandria.

In the Worldcat display, people get 5,691 records. That is quite a “paradox of choice”! Which one does someone choose?

With faceted catalogs however, there really is something different: suddenly, the library catalog provides statistical information! This is where the experience of statistical displays can play a part. While the facets in the library catalog are wonderful, my experience shows that people still have problems understanding them. People may click here and there, but with little understanding. This leads me to suspect that people relate to the facets as they would to any tabular display of statistics.

I can imagine someone looking at this Worldcat result and thinking: "There are 434 ebooks and 76 microforms," and relating to it the same as looking at the statistical table and thinking, "In Barrington, there are no deaf and dumb, 1 blind, 2 insane and 2 idiots, while in Bristol there are 5 deaf and dumb, 8 blind, 2 insane and 1 idiot."

It means little to them and something more is needed. What more can be done?

Since now we are dealing with statistical information, one method would be to try to display the tabular information graphically, such as with the graphs or perhaps even with displays similar to Google public data explorer, and that certainly should be experimented with. I have no idea how that would turn out, but are there other options?

I believe that there are, and I confess I have held something back. There is another method to display statistical data that comes from Narrative Science. This tool takes statistical data and generates a textual interface that is not all that bad. It is used now for box scores for Little League Baseball games, and Forbes Magazine uses it for many business reports. Let's take a moment to examine this generated text.

The first is for a Little League baseball game and the second is from Forbes Magazine.

So, how does Narrative Science work? All it does (ha!) is provide an alternative interface that displays the results not in table or graphic form but in words. Google Public Data Explorer could display the same information but it would be graphical and animated.

Could a textual interface such as what we see at Narrative Science, provide a different understanding of a search result in a library catalog? Here is how I think something could work in the faceted search result for Stonehenge.

Catalogers realize that the library catalog and related library files furnish almost all this information right now, but getting at it is not easy: first, you have to know it exists, second, how to access it, third, and most important: you have to know how to read it. It wouldn't surprise me if, by just perusing the tabular data, expert statisticians could mentally visualize something similar to what we can all see today in Google Public Data Explorer. Why shouldn't there be something similar for library catalogs?

Without any doubt, these summaries would be much better if they were written by experts in the field, but it is clear that can't be done, just as a professional reporter will never write up the results of a Little League baseball game. Technology can provide a practical answer.

I believe that such developments could help many patrons, by providing them with a level of context they have never had except the rare few who have had an experienced librarian sitting next to them explaining what they are seeing. By turning a complex result that untrained people find only semi-comprehensible into something much less threatening, it lessens the paradox of choice, provides the beginning of an intellectual framework, and gives people some new kinds of filters that may actually help them.

If something like this did prove to be popular with the public, there may be much more demand for other tasks such as selection and reference. I could see reference staff playing a very important role in such a system.

This is only one suggestion, but I believe that these are the sorts of efforts that would, even if only partially successful, make much greater differences in the lives of the patrons than RDA and FRBR could ever hope to do. Such a project would take advantage of the powers of modern systems, and would cause little disruption to the library's everyday work. And I think it would actually be a lot easier to do something like this with the catalog than what we saw with the Little League box scores. That was incredible. Give me access to the XSL sheet and I could probably do the first sentence right now. Maybe more.
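As a very rough idea of what generating that "first sentence" might involve, here is a minimal sketch. It assumes the facet counts of a result set are available as a simple mapping from format label to number of records. The function name `summarize` and the wording of the sentence are hypothetical, and this is in no way how Narrative Science works; the numbers in the usage note are the Worldcat figures for "Stonehenge" quoted earlier in this talk.

```python
def summarize(query, total, formats):
    """Turn the format facet of a result set into one plain sentence.

    `formats` maps a facet label (e.g. "ebooks") to a record count.
    """
    # Mention the largest groups first.
    ranked = sorted(formats.items(), key=lambda item: item[1], reverse=True)
    parts = [f"{count:,} {label}" for label, count in ranked]

    sentence = f'Your search for "{query}" found {total:,} records'
    if not parts:
        return sentence + "."
    if len(parts) == 1:
        listing = parts[0]
    elif len(parts) == 2:
        listing = " and ".join(parts)
    else:
        listing = ", ".join(parts[:-1]) + ", and " + parts[-1]
    return f"{sentence}, including {listing}."
```

For example, `summarize("Stonehenge", 5691, {"ebooks": 434, "microforms": 76})` yields: Your search for "Stonehenge" found 5,691 records, including 434 ebooks and 76 microforms. A useful summary would of course need to say something about subjects, dates and languages as well, which is where the real work lies.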

I shall end with this:


Re: [ACAT] Currency of subject headings (was Tell all your associates ...)

On 26/06/2012 23:48, J. McRee Elrod wrote:
Terri said:
I do the original cataloging for Kresge's theses and dissertations. I get perturbed when assigning subject headings because SACO just isn't keeping up-to-date with subject headings.
In library school over six decades ago, we were taught to use the headings from Wilson periodical indexes when more current headings were needed than in Sears or LCSH.

Why not 650 7 $a<Current term>$2wilson? That would create more uniformity amongst us than each of us using our favourite term in 650 4, 650 7 $2local, or 653 (an example of redundancy in MARC).

Perhaps we should also consider 650 7 $a<Current term>$2wiki for Wikipedia.

The codes "wilson" and "wiki" are not in the MARC21 source code list, but there is a *long* list of other codes representing sources which might be more current in their respective fields than LCSH. We find the great number of possible lists daunting, so don't use any of them, apart from LCSH and the new LCGFT.

If we *really* wanted to improve matters, we would adopt classed subject catalogues with indexes having current terms. We are too locked into our Anglo practices.

And they accuse *me* of opposing change!  
One of the strengths (or weaknesses?) of LCSH is that it is based on literary warrant, i.e. on the materials (primarily books) when they are received at the library. See: LCSH: structure & application.
"3.2 Literary warrant
The Library of Congress collections serve as the literary warrant (i.e., the literature on which the controlled vocabulary is based) for the Library of Congress subject headings system. The number and specificity of subject headings included in the Subject Authority File (the machine-readable database containing the master file of Library of Congress subject headings from which the printed list, the microform list, the CDMARC SUBJECTS, etc. are generated), are determined by the nature and scope of the Library of Congress collections."

Of course, this policy now goes beyond LC collections, but it still applies. "Literary warrant" relies on the literature in the collection, which in practice means primarily the terms found in the titles of printed books received and being cataloged. This has certain consequences. Literary warrant is not the only method for generating these terms, e.g. there is "scientific warrant", which uses terms in current use within the field. These are normally generated by the real experts in the field, and the sources include articles and non-published materials--even conversations can be used here. Obviously, terms get included much more quickly since you don't have to wait for the literature to be created. There are other methods too, however, as outlined in this nice article here:

The reliance on literary warrant for physical materials in the library's collection is obviously evidence of how the library is living in another epoch. Back in the days when all information was printed and distributed, the problem was less pressing, but the method is showing its age since it is perceived as too slow. The question is: what is the best way to change? One possibility would be to move from "literary warrant" toward "scientific warrant" but I want to emphasize I do not like that at all. It too easily turns into chaos. Nevertheless, once librarians decide to include web materials, such as scholarly blogs, into the mix somehow (which must happen sooner or later), something will have to change.

Also, Jerri is completely correct with the observation: "Let's get real no one is going to do a search for Future, The, in motion pictures." But, while they will never look for it that way, they do want the materials collected under that topic.

A lot of people look at something like this and conclude: "The subjects clearly do not work. Just dump them." I do not agree, but everybody can see that the current situation is not useful for the public.

What is a practical solution? The first task would be to get the syndetic structures to work in a keyword environment (which should have been done 20 years ago). Then I think we could use logfile analysis, and refer to other thesauri and even folksonomies, to add the terms people actually use, figuring this is a type of "user warrant". Perhaps we could even open it up to the public so that they could add cross-references for these terms.
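To make the idea a little more concrete, here is a minimal sketch of cross-references working behind a keyword box. Everything in it is an assumption for illustration: the tiny "authority file", the variant terms, and the function names are all invented, and a real system would work from the actual authority records rather than a hand-typed dictionary.

```python
# Hypothetical miniature authority file: authorized heading -> variant terms.
# The variants are invented examples of what users might actually type.
AUTHORITY = {
    "Future, The, in motion pictures": [
        "future in movies",
        "future in films",
    ],
    "Cookbooks": ["cookery books", "recipe books"],
}


def build_variant_index(authority):
    """Map every heading and every variant (case-folded) to its heading."""
    index = {}
    for heading, variants in authority.items():
        for term in [heading, *variants]:
            index[term.casefold()] = heading
    return index


def expand_query(query, index):
    """Return the authorized headings whose heading or variant occurs in the query."""
    q = query.casefold()
    return sorted({heading for term, heading in index.items() if term in q})


def add_cross_reference(index, term, heading):
    """Record a new variant, e.g. one found in logfiles or suggested by a user."""
    index[term.casefold()] = heading


index = build_variant_index(AUTHORITY)
print(expand_query("books about the future in movies", index))

# A "user warrant" addition: a term seen in search logs gets pointed at a heading.
add_cross_reference(index, "movies set in the future", "Future, The, in motion pictures")
print(expand_query("movies set in the future", index))
```

The searcher never has to type the inverted heading; the cross-reference does the work of leading the keyword search to the material collected under it.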

Should the public be able to create headings? Perhaps on a preliminary basis, to allow for a type of "scientific warrant" but this could be potentially difficult since it could easily lead to chaos.

A few ideas.

Re: [ACAT] Tell all your associates, don't go to library school.

Posting to Autocat

On 26/06/2012 18:36, Mitchell, Michael wrote:
Oh, I agree. We will need to be there to help sort it all out for our constituents (whatever form "it" becomes) and we will have the special skills necessary to do so if we keep on top of our game. As many have said, including Jim, "bibliographic" organization will only become more important as we go along.
Now if we could only get these many "Google-style worshipping librarians" to understand our unique qualifications and stop dumbing down our services. If I wanted to work in a bookstore or develop indiscriminate keyword Web search engines for profit I would have figured out how to do that instead. As I've said before, our students know the difference in usage and outcome between our catalog and Google. I think it is our obligation to refine and accentuate that difference rather than strive for convergence.
Yes. The need is to demonstrate it; to prove it to a very skeptical audience. In addition, there is the widespread problem, mentioned by Jerri: "I'm not so sure that "our students" care if there is a difference even after having information literacy classes. I've run into several cases of satisficing when working the reference desk. No amount of demonstrating how to search the catalog using our carefully crafted subject headings gets through to these folks. They want it the easy way and they want it NOW!"

Many people say they don't care about this, but I am not so sure. If I asked students (as I have) "Do you want to be a good citizen of your republic?" I would meet with blank stares. So then, I would go on and say, "Well then, I guess you want to be good servants of a king or good slaves of an emperor. In that case, you don't need to learn anything since you don't need any decent information at all." I would normally get their attention for at least a couple of seconds.

Being able to get decent information is not just about citing a few sources for a paper nobody cares about so that you can pass the course and then go on to get your degree so that you can get a job beyond flipping burgers. Students are not stupid--they know nobody really cares about what they write. This is why I think it is important for them to understand that being able to get decent information is critical in a democracy (republic) where the people at least "claim" to have the power. We have all seen many times what happens when people have access to lousy information--they end up powerless against those who do have the information.

I don't believe this is a politically right or left statement--just a statement of fact. The information services such as Yahoo and Google are giant advertising machines and therefore they do not and will not furnish this since they want, above all else, to make the customer happy. Library attitudes are also alien to people in IT services, who are focused on getting their machines to work. It is the task of librarians to try to ensure access to reliable information to the bulk of the population.

But we have to prove our advantages, and we are stuck with these lousy card catalogs in electronic form and focus on the useless intricacies of RDA and FRBR. There is so much librarianship could offer to the populace if given the chance.

Tuesday, June 26, 2012

Re: [ACAT] Tell all your associates, don't go to library school.

Posting to Autocat

On 26/06/2012 14:50, Mitchell, Michael wrote:
I'm pretty sure all most of our students WANT is snacks and a date but in the meantime, if they WANT to pass their course, they HAVE to get 3 scholarly references. As long as we can provide those three scholarly resources we'll be here I think. Without us, access to the scholarly databases is severely limited and, without us, nobody is around to explain to them what a scholarly resource actually is. The really successful academic libraries I've seen (limited I know) provide large "study" areas that serve double duty for snacks and dating. I'm not going to sweat it.
Even that assumes a lot. This assumes that the requirement to write the so-called "research paper" will continue, that open access publication will not become the norm and that publishers will maintain control over scholarly communication. Again, I am not talking about 5 years in the future, but 10 or 20--that is, for a young person considering getting an MLIS.

When I do searches in Google Scholar, already I am seeing a *lot* more available in some fields in that wonderful right column (where the open access materials are supposed to be, although there is some "publisher spam" there), but that of course, is only a single tool out of so many. So, when I search "metadata" there is a lot available for free, while if I search "cicero criticism" there is much less but I see some even there, which would have been almost nothing just a few years ago. Still, the same search in Scirus, limiting it to free materials, opens up an entire world.

Much of the newer scholarship consists of online projects that cannot really be captured in physical form, although there may be articles about the projects. For instance, there are some magnificent 3-D visualization projects of ancient Rome.

The University of Virginia is much in the news lately in a scandalous way, but Siva Vaidhyanathan, a major web advocate and faculty member at UVa, wrote this in the Chronicle, defending her and detailing the projects they have undertaken there. UVa has always been a leader in this regard, and I find the entire discussion rather strange. But nevertheless, scholarly communication is set for some major changes, and librarians must be ready to respond.

I personally think that the trends in "social search" indicate that finding information that is both reliable and not biased will be even more difficult in the future than it is now or has been in the past. I also think people will come to appreciate these skills of ours, but we will need the right skills and some much better tools.

Re: [ACAT] Tell all your associates, don't go to library school.

Posting to Autocat

I don't want everybody to end up crying into their beer(!) over the end of the library field. It is not the end at all, but the field must evolve into something different. From this point in time, it is impossible to foresee how much it will change and in what directions. Some individuals approach the coming changes with fear but others meet them with excitement and exhilaration, along with the fear. Unfortunately, the changes are occurring during a time of economic distress and that makes it more complex to adapt but not impossible.

I think catalogers should take the current situation very seriously and acknowledge that the usefulness of what we make is being questioned in many quarters in many ways. Catalogers should do their utmost to see matters from *those quarters*, that is, from non-librarian, non-cataloger points of view. I don't think anything is inevitable, but people must see that the public (i.e. the people who use our catalogs and our collections) is changing: what they want, what they need, and what they expect. Here is an example. A report came out recently from Sage, and the announcement is called "Providing evidence of value remains an elusive goal for academic libraries". It says: "Findings from three geographic areas, the United States, United Kingdom and Scandinavia, indicated that there is no systematic evidence of the value of academic libraries for teaching and research staff."

This represents a challenge. It is evidence that the current situation needs to change. I don't know if I agree that the suggestions for change offered by the authors of the report represent real change or not, but their recommendations are still good. Librarians need to keep their minds open to all kinds of complaints and suggestions, and take them to heart, even if they wind up being hurt or just don't like what they hear.

Once we discover what the public wants, we then have to try as best we can to supply what they want and need. It will be work, and some may find out some things that are rather unpleasant, but it is absolutely necessary.

But I keep forgetting: the FRBR and RDA initiatives have done decades worth of in-depth research on the public, examining what the public wants, how those wants are changing and how to fit those changing wants into an ever-changing technological landscape. And they have proven that the basic needs of the public are and will be fulfilled by the immutable laws of the FRBR user tasks, which are always and forever. So everybody will be OK!


Saturday, June 23, 2012

Re: [ACAT] Tell all your associates, don't go to library school.

Posting to Autocat

On 22/06/2012 15:09, Sandra DeSio wrote:
Unfortunately the search engines have an advantage over us in that they have access to the full text of every work (i.e., website) they index available to them. That's what patrons want, to be able to do a keyword search in our catalogs and be able to find the information they want contained within the work itself, not just within the title, author, or subject fields. And that is where our problem is. We can't reproduce every work full text within our catalogs due to copyright restrictions, so we can't give them what they want, and no cataloging rule is going to change that.  </snip>
Yes and no. There is a Google Books API that can be included into a library's catalog in all kinds of ways (easier to implement in open source catalogs than in proprietary ones). Another idea, although I know that people will say, "Oh my God!" but--it is a fact that there is nothing stopping the library field from building its own type of Google search engine. Please keep reading: But don't make a competitor. Make a tool that is limited only to sites that have been chosen by a selector. That makes everything *much easier*. We could work with the Internet Archive and that fabulous Wayback Machine to make sure everything is archived. How would the catalog interoperate with a full-text search engine that we had control of, and the Wayback Machine? I don't know, but I think it would be fun to find out! Yes, there would be costs, but if it were not a single library paying for it, but a bunch of them, it would probably be pretty manageable and the public would probably love it. "A website of reliable, selected information". All of the software to create and maintain something like this is open source.
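Just to show how small the first step could be, here is a rough sketch of the "selected sites only" idea. It assumes a hand-maintained list of hosts approved by a selector (the hosts below are placeholders, not real selections) and uses the Internet Archive's public availability lookup address as I understand it; the function names are my own, and nothing here actually crawls or indexes anything.

```python
from urllib.parse import urlencode, urlparse

# Hypothetical list of sites approved by a selector (placeholder hosts).
SELECTED_HOSTS = {"example.edu", "example.org"}


def is_selected(url, selected_hosts=SELECTED_HOSTS):
    """True if the URL's host is a selected host or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == h or host.endswith("." + h) for h in selected_hosts)


def wayback_lookup_url(url):
    """Build the address that asks the Wayback Machine whether a page is archived."""
    return "https://archive.org/wayback/available?" + urlencode({"url": url})


def crawl_plan(candidate_urls):
    """Keep only selected pages, paired with their Wayback lookup address."""
    return [(u, wayback_lookup_url(u)) for u in candidate_urls if is_selected(u)]


candidates = [
    "https://www.example.edu/guides/cicero",
    "https://ads.example.com/buy-now",
]
for page, lookup in crawl_plan(candidates):
    print(page, "->", lookup)
```

Because the engine only ever sees pages a selector has approved, the hard problems of spam and ranking manipulation largely go away, which is why limiting the scope makes everything so much easier.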

Plus, I think libraries have something great, if they would use it: the subject headings and the syndetics of the authority records. Yes, I know much of it is broken today and has been broken ever since keyword was introduced, but they are still there. These represent real power that the Googles do not have, and cannot have because a human brain is required, at least for the foreseeable future.

There is a wealth of possible projects that could make a difference to the public, but we must use our imagination, and as the Romans used to say: "Fortes fortuna adiuvat", or "fortune favors the bold".

Re: [ACAT] Tell all your associates, don't go to library school.

Posting to Autocat

On 22/06/2012 14:27, Roe,Kevin wrote:
I wonder if part of the reason for the neglect, as it were, of the importance of the study of cataloging in schools today is the fact that most people view shared cataloging as the answer. After all, if we can get catalog records from vendors, why do we need someone on staff to essentially redo the work? Catalogers are akin to telephone operators in some ways. We direct dial now, and use the operator for special needs only. And in libraries, we get our records from vendors and simply assume they are fine (and many are...)

Never mind the fact that errors abound in shared cataloging. Records are only as good as the person who created them, and I've seen some really bad ones in my career. That is why we still have a full-time degreed cataloger on staff who looks at every record we add to our catalog, and I would put our records up against anyone's.

I agree with Mike when he states that making cataloging an option is wrong. The fact is that it will continue to become less prevalent in library schools until error-prone records that exist actually start to affect the research being done, both complex and simple. We have, sadly, become "Googleized" and have come to rely on Wikipedia as the go-to encyclopedia of choice for so many. A sad turn, indeed.
One of the real problems is that unfortunately, libraries and catalogers had been more or less resting on their laurels for quite some time and were blindsided by the full-text search engines. Here was something that they had never before encountered: real, genuine competition. It gave cataloging a solid whack to the head and they have never really recovered. Unfortunately, the response was the traditional, knee-jerk one to come up with new cataloging rules(?!), instead of doing the research to find out what the public wanted and to redesign the catalog into something that the research showed people would want. The value of RDA and FRBR has never been proven, but it is the direction that so-called "forward-looking" catalogers are supposed to go in, while we are all supposed to just hold our breath and cross our fingers, hoping everything will turn out OK.

On the other hand, our competitors, the full-text search engines, are not holding their breath. They are doing massive amounts of research to find out what people want. And they will give it to them.

For years, nobody questioned the need for cataloging but now almost everyone is questioning its value and catalogers have yet to demonstrate the advantages. And to demonstrate it to people who don't understand much at all. While I think we can demonstrate its value and much more besides, our current catalogs are just past it, so we need research projects, and research projects entail costs, which means a budget line, and that is difficult to get in today's financial climate....

Thursday, June 21, 2012

Re: [ACAT] Tell all your associates, don't go to library school.

Posting to Autocat

On 20/06/2012 21:13, Myers, John F. wrote:
Ruth Simmons, just before assuming the presidency of Brown University, had an interview with Morley Safer on 60 Minutes. Her thoughts on education were so profound, and contrary to current expectations, that I transcribed them from the VCR and have them posted in my office (and I may have shared them in this forum before):
MS: You've said that you want to see everyone in America go to college.
RS: Yes.
MS: Won't that produce an awful lot of disappointed people?
RS: Why should they be disappointed?
MS: A lot of people who are with college degrees who are flipping burgers or pumping gas.
RS: Education does not exist to provide you with a job. This is, this is where we've gone awry. Education is here to nourish your soul.
MS: But you've got parents who are sitting there spending $40,000 a year, right?
RS: Yes, and that's cheap for what they get, absolutely. Education transforms your life.

And FWIW, this exchange opened with Safer asking, "Why does a child from the wrong side of the tracks decide to study French literature?"

Try to predict that career arc or ROI -- "wrong side of the tracks", to French literature, to President of prestigious liberal arts college, to President of Ivy League University, to Board of Trustees of another Ivy League University.
I actually knew Ruth Simmons just a bit when I was at Princeton. I remember her as an exceptionally nice person, and she was scheduled to speak to the professional staff of the library the day after it was announced that she was to be president of Smith College. She was so happy and proud. Hers was one of the most moving speeches I have ever heard and I will never forget it. I have often thought back on that speech.

But I cannot agree with this statement of hers--it harkens back to the idyllic, collegiate days of the 19th century, or at least pre-WWII, when affluent young people would go off to college for a few years to pick up a bit of higher culture before they began work at daddy's business. If you preferred, and you had enough to live independently, you could become a professor of philosophy or something, and find employment at a university. That didn't hold for the vast majority of people, and it still doesn't today.

If the point is to "nourish your soul", is it really necessary to go into hock for tens of thousands of dollars at a very delicate point in young people's lives? If people want to nourish their souls, they certainly do not need a degree--certification is completely irrelevant. Certification is what college is all about, after all. If it is nourishing your soul, you could just find a syllabus, or ask a faculty member to give you a reading list and to be available to answer a question once in a while. Then go off to the library. That's what people did for years. Today, there are entire courses online and I have taken a few myself. In the past, many would just puzzle things out themselves as many people have, such as Abraham Lincoln. So far as I am concerned, figuring it out for myself has been where my own learning has really taken place; not in a 15 week class with a test at the end.

To return the discussion back to the professional degree such as the MLIS or MBA: *of course* people do it to increase their job prospects. When schools of higher education say that that is not the reason for college, that is simply disingenuous. That's certainly why I did it. It is a completely laudable reason and no one should feel bad about it. But higher education must acknowledge that that is why the 99% is giving them all that money, listening to their lectures and accepting being graded. For the 1%, it is an entirely different matter.

It is only logical that a magazine such as Forbes would let people know whether they can expect to get a good return.

Once again, I will always maintain that a good grounding in library skills is very important, and the example someone gave of becoming a private detective is really interesting. But the question "does traditional librarianship really have a future?" is still vital for many young people who are facing some of the most important decisions of their lives.

Wednesday, June 20, 2012

Re: [ACAT] Tell all your associates, don't go to library school.

Posting to Autocat

On 19/06/2012 21:27, Flynn, Emily wrote concerning the Forbes story that the worst master's degree is an MLIS:
However, on InfoDocket today, the Colorado Library Research Service Blog was featured with their 60-second poll "The Value of an MLIS to You":
And Annoyed Librarian already has had her say on the matter:
Degrees and jobs are what we make of them, as well as what people are able/willing to take.  </snip>
There are more blog posts on this report too. While I may not like what the report says, the question is not so much the current state of librarianship, but this article is for someone considering getting an MLIS, so it assesses future prospects in terms of ROI--return on investment. So, is traditional librarianship a growth field? Will it hold steady? Or does it seem as if it will decrease?

Of course, this is predicting the future so you may as well try to divine it from one of the methods mentioned here: As an aside, in ancient times the Roman augurs would try to predict which future course of affairs would have the blessings of the gods by watching the flights of birds. They really are fascinating and I never tire of watching them. I found an example: If you are close, huge parts of the sky can fill up with birds! Of course, augurs couldn't really predict the future with these birds or with any other methods, they could only make educated guesses, and this is what the Forbes article is doing.

If you are trying to give useful information to someone who wants a career, the question: "does traditional librarianship really have a future?" seems perfectly natural. And does it have such a future that someone should risk making a bet of going into debt for tens of thousands of dollars? I think that already, the library is seen by members of the public more and more as a kind of "community gathering place" and one option to try to get something when it's not online for free, but libraries are seen much less as a place to go for information. Especially current information. That is a change that should not be underestimated.

So long as publishers maintain their attitude of no reasonably priced ebooks, printed books will remain the best option and therefore, libraries will be needed as a place for the physical copies. But this obviously cannot go on forever and sooner or later, publishers will be forced to give what the public wants: ebooks and other materials that are much more reasonably priced. But these providers are, and will be, far more interested in selling to *individuals* than to libraries since they will be able to sell many more copies and for other reasons, such as direct advertising. Therefore, a service such as Amazon Prime (including movies and TV shows, but it could include music and other resources as well) could be much more attractive. Just let people check out more than 1 book a week, especially students. Right now, it's $79 a year, which could even go down as more people joined or even entire communities. With competition from Google Books + Videos, iTunes, whatever Microsoft will do, plus many others I am sure, matters are set to change radically and fundamentally, while prices will, in all probability, not go up radically. The question "does traditional librarianship really have a future?" seems even more pointed.

I think library skills and library ethics do have a future and that future could even be bright, but the field must change in radical ways. What is the most radical proposal from libraries now? There is the Digital Public Library of America that some are trying to build. It is a good project, but I see it as just another version of what the other, *big* information agencies are creating, so I can't see the DPLA as all that radical, just going along with the flow. So what is the most radical proposal? RDA and FRBR. Give me a break!

I can certainly understand how someone, especially outside the field, could say that the prospects for an MLIS graduate are poor. We need new directions.

Friday, June 8, 2012

Re: [RDA-L] "Work manifested" in new RDA examples

Posting to RDA-L

On 07/06/2012 20:42, Brenndorfer, Thomas wrote:
You still don't get it.

Everything you're doing is based upon some data element somewhere that a user must act upon. It doesn't have to be traditional bibliographic data for the FRBR user task to apply. You're still looking for things, finding them because the relevant data was somewhere, still having to make discernments and decisions about what you're looking at, and still having to make some decision about suitability. For example, one of the data elements for an expression entity that satisfies the "Select" user task is "Award." So if you find something, anything that won an award, and that's important to you, then you are "Selecting".

There's nothing 19th century about doing that.

I believe I have demonstrated, as much as anyone has, that I get it. I understand FRBR. It's really not all that complicated. It is just that I don't believe it. It has never been demonstrated that it is what the public wants, not even by Panizzi himself, but the limits of his technology and his environment constrained him to come up with his type of catalog, which we have inherited.

Certainly, we can ascribe some kind of transcendent meanings to find, identify, select, and obtain, along with the entities, and say that these are constants that people have needed and wanted, and will remain so for as long as humans stay human. Therefore, no matter what are the advances in "search" and how those results are presented to humans; no matter how intellectual products are created, how those products are metamorphosed and how we perceive them, someone can always label it all as "variants of FISO WEMI by their ATS". Of course, that is the same as maintaining that astrophysics is actually a subtype of astrology or that biochemistry is really a variation on alchemy. That the periodic table of the elements actually displays various aspects of fire, water, earth and air. Someone could make such statements and probably even make an interesting case for them. Yet, it would be obvious to us that anyone who would maintain such attitudes today would actually be talking about the state of his or her own mind and nothing about the actual materials themselves.

So yes, maintaining the primacy of the FRBR user tasks is evidence of an earlier way of thinking that stems from the 19th century and in many ways from before that. And that's OK but so long as we maintain such an attitude, we voluntarily limit our possibilities when compared to those who are not constrained by such presuppositions.

Re: [RDA-L] "Work manifested" in new RDA examples

Posting to RDA-L

On 07/06/2012 18:49, Brenndorfer, Thomas wrote:
I think part of the problem is that James believes that FISO (find, identify, select, obtain) applies only to "traditional access points of author, title, and subject".

That's incorrect.

Any element, big or small, belonging to any entity can be the target of these elementary tasks.

When people fill out any form, it's to provide bits of data so people can find the form, identify what's on the form, and do things with the data, such as select (which is the basis of limits and filters and other kinds of operations).

When James is saying entities are changing into all kinds of strange new things ("entirely new and never seen before") he's ignoring the main point -- just get the data you need so we can continue to do useful search and retrieval on these strange new things. As anyone who has set up a database will say -- design the system to get the job done, and that means deciding what data are important and what needs to be related to what.
That is the theory, and how it was supposed to work in the old days. While it is true that the entities are changing into strange new things, "find" is not an entity. It is a behavior of the people, and based on the powerful new, and constantly changing capabilities of systems today, people are able to find things in a whole variety of ways that only the wildest science fiction writers could have imagined 30 years ago. As I said before, Google "the russian that killed the old pawnbroker". That information is nowhere in the bib record, nor does it need to be. It works, that is if you are looking for the novel Crime and Punishment, but if it is the name of some musical group, you may have a problem.

I have seen a video with this request: "it's a book about Napoleon but I don't remember the author or title, only that it had a really neat green cover". You Google "napoleon book", go to images, click on the color green (you'll find it) and you get books about Napoleon with green bindings. This kind of search was a librarian's joke not that long ago. Now it can be done!

The modern information agencies are collecting vast amounts of information about each person that even we don't know about ourselves, to use it to find things I am not even aware I am searching for. This is what Tim Berners-Lee's "mechanical agent" is all about. Whether I love it or hate it is irrelevant; it is happening now.

But we must see that it is something profoundly different from what we had before.

Re: [RDA-L] "Work manifested" in new RDA examples

Posting to RDA-L

On 07/06/2012 18:38, Stephen Early wrote:
James Weinheimer wrote:
Find is morphing into something that is really entirely new and never seen before. And with the resources themselves that get more mashed up and vivisected both manually and automatically, it's increasingly difficult to even say what an "item" is, which has major repercussions on what is a work, expression or manifestation, which I still say are all based on physical materials. And finally, focusing on the traditional access points of author, title, and subject is almost forgotten by the public. Certainly they do it, but they do it through natural language searches which goes far beyond ATS in the expectation that the system will sort it all out. And very often, it does.
A couple definite examples of the mashed up resources please, with explanation as to how and why they don’t fit WEMI so that those more deeply involved in this discussion may be able to review them and then clearly agree or disagree with your points.
You may have done this before, but I mostly skim over these posts. However, I’m sensing a lot of repetition of arguments without a lot of progress being made.
They are all over the place. There is Google News, which mashes up all kinds of things. ProgrammableWeb is a good place to find mashups; you can find a lot of them there. To understand what they are, there is a great, short YouTube video by ZDNet that explains it really well. It explains what an API is and how people can put them together to make their very own mashups.

Here is one called Apartable. It combines APIs from Facebook, Amazon, and several from Google to make something brand new that may actually help someone find an apartment. I am sure that all of the information exists on those separate websites and that very little is held on the site itself, which merely brings it all together. If it's good, people may be willing to pay for a service like this.

Here is Google Public Data Explorer, which uses the statistics held at Eurostat, the US Census Bureau, etc. to mash up new views based on Google's graph and map capabilities. All of these are dynamic, i.e. they are generated on the fly, so it is difficult to call anything an "item." These mashups are bits and pieces of all kinds of things brought together to make something very personal, very often just for you. Anybody can make these mashups now, and they do.

It is important to realize that the bibliographic world is headed in precisely this direction, whether it will be through so-called linked data or through other means. Our records will be available through APIs (Worldcat has some now) and webmasters will be able to include our records in whatever they make. Developers cannot do so now because of our lousy formats, but once we can output our records in XML, they will be able to.
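To make the idea concrete, here is a minimal sketch in Python of how a developer might fold one of our records into a mashup once it is available as XML. Everything in it is an assumption for illustration only: the record layout is a made-up, simplified one (not MARCXML or any real schema), the ISBN is a placeholder, and the PRICES dictionary merely stands in for whatever a second site's API might return.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified XML record; not MARCXML or any real schema.
RECORD_XML = """
<record>
  <title>Crime and Punishment</title>
  <author>Dostoyevsky, Fyodor</author>
  <isbn>0000000000</isbn>
</record>
"""

# Hypothetical second source, standing in for another site's API response.
PRICES = {"0000000000": {"seller": "Example Books", "price": 9.99}}


def parse_record(xml_text):
    """Turn the flat XML record into a plain dictionary of tag -> text."""
    root = ET.fromstring(xml_text.strip())
    return {child.tag: (child.text or "").strip() for child in root}


def mash_up(record, prices):
    """Combine the bibliographic record with the other source, keyed on ISBN."""
    merged = dict(record)
    merged["offer"] = prices.get(record.get("isbn"))  # None if no match
    return merged


record = parse_record(RECORD_XML)
combined = mash_up(record, PRICES)
print(combined["title"], "->", combined["offer"])
```

The point is only that, once the record is in a format any developer's tools can read, joining it to something else takes a few lines. Nothing here is stored in a new "item"; the combined view is generated on the fly.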

Some believe that the reason web developers do not use our records now is precisely because of our obsolete format, and that once we do have better formats, developers will want our data. I do not agree, since I believe the challenges faced by libraries are much more profound than a simple problem of formats, but I admit I would like to be wrong. In any case, we need to change our formats.