Friday, March 30, 2012

Re: [ACAT] New rule implementation

Posting to Autocat

On 30/03/2012 08:33, Hal Cain wrote:
<snip>
But no more RDA. The die is cast. Get on with it. Let it rise or fall according to its value as implemented.
</snip>
For many libraries, it is not yet cast since they are left with no options at all. Even if they want to implement RDA, many cannot justify the expense.

What are they, and their catalogers, supposed to do? My own suspicion is many will opt to outsource it and "save" money--on the books anyway. Some places are outsourcing the entire work of libraries which has become quite controversial within communities, but outsourcing cataloging won't cause much of a ripple among the public.

Re: [ACAT] New rule implementation

Posting to Autocat

On 29/03/2012 22:12, Kevin M Randall wrote:
<snip>
James Weinheimer wrote:
That is news to me. From everything I have seen, the data model is based on entities, and the entities are based on URIs. Entities in turn have specific attributes. This is how entity-relationship structures work. I have repeatedly tried to point out that in the traditional catalog, the purpose of the so-called work or expression was *only* for arranging the cards (or unit records, or manifestations, or editions, whatever you want to call it). Today, we would call that a "relationship", or as I prefer to think of it, as a "query". The traditional catalog had meaning due to the *arrangement* of the records that described individual resources. FRBR took a different route and began on completely different assumptions.
To say that the "purpose of the so-called work or expression was *only* for arranging the cards" is begging the question. Why do the cards need to be arranged? They need to be arranged in order to reflect the relationships between the works, expressions, and manifestations. Records in a database lack any specific arrangement, and are arranged on the fly in response to queries against the database. For that reason the explicit identification of WEMI in the metadata are even more critical, since the actual arrangement in the database has no inherent meaning. I don't understand at all how "FRBR took a different route and began on completely different assumptions." Different route from what? Different assumptions from what?
</snip>
The "different assumptions" means the assumptions are that the *users* want to be able to do the FRBR user tasks, which I will restrain myself from enumerating once again. The very action of creating entities according to WEMI assumes that it will produce something of value to someone--otherwise, the entire task is absurd. It still remains to be demonstrated that the majority of the public, in the majority of their searches, wants WEMI so badly. And it also needs to be demonstrated that these same WEMI tasks cannot be done with the records and the systems that we have now. If this cannot be demonstrated, what I am maintaining is that there are many other much more fruitful avenues that we can choose to devote our scarce resources to make our records more valuable to the public.
<snip>
Improving metadata by retyping abbreviations is certainly a highly debatable point and puts the question to the RDA/FRBR mantra that they are not about display.
Please, enough with everyone using the abbreviations issue to argue against RDA. This is getting to be quite tiresome. The improvements to metadata that RDA calls for have to do primarily with identifying discrete elements, their attributes, and their relationships to other elements. Abbreviations, capitalization, punctuation, etc. are relatively minor issues in RDA, and unfortunately no matter how many times that gets stated in these forums, it never sinks in. And in regard to saying typed-out words vs. abbreviations is about display, that is wrong. It's about data content, not data display. Those *are* two different things.
</snip>
I agree: this is a very tiresome discussion about abbreviations. It will be far more tiresome for those poor souls who will have to change everything. But it is only one, only a *single one* of the many very tangible points that will demand significant resources that libraries do not have. It also illustrates how RDA wants to change things manually instead of using automated means and it also ignores the issue of the millions of records we already have. This is why I keep bringing it up.

And *of course,* the issue of abbreviations is about display. It cannot be about anything else. How could it possibly be about searching or access? So, it's about "data". Exactly what does this mean? If we were to change our formats to have special subfields for "ca." or "p." (that is, turning the actual number of pages into data) that might be one thing, (incidentally, to change our formats in this way would take a *very long* time) but it still needs to be demonstrated that this added complexity is actually worthwhile. For instance, we would need special fields for "paging" or "volumes" or "parts" or "columns" or--dare I say it: fascicles(!), including a myriad of other possibilities for different formats, each of which are enormous. While the complexity would rise enormously, what use would that be to the public? Or even to librarians themselves?

Various cataloging agencies handle these issues in all different kinds of ways. I have a certain amount of experience in this, and I know that other agencies do things in ways that ISBD/AACR2/RDA catalogers would consider completely bizarre. This is the reality of the totality of today's metadata universe.

I repeat, this discussion simply emphasizes that there is no valid business case for RDA. It seems as if many wish that we did not already have millions of records in our catalogs that are just as valuable to the public as those we create today. If we were starting from scratch, as in most businesses that can essentially ignore practically everything from 10 years ago, that would be one thing, but library catalogs are different. That is, unless we prefer to ignore the records that we already have. That would truly be the saddest consequence.

A couple of responses to other messages:
To Marc Truitt: I agree but will state that all we can look forward to is split bibliographic records among the library community, with unfortunate consequences for everyone. Realize that in the Semantic Web/Linked data universe, library records will be only the smallest fraction of what is available to the public. And that fraction will decrease in the future.

Unfortunately, this discussion should have taken place ten years ago or more. My own opinion is: better late than never, which is apparently what some would prefer.

And to Allen Mullen: Aaron's post, although sincere, is not a business case. Making an actual business case is a complex task: showing the various options (for there are always options), explaining each in detail from procedural to budgetary issues, and explaining why the one you choose is the best. As an example, the UK government has adopted what is called Prince2, with which I have had some experience, and the very first part is coming up with the business case, so that it can be reviewed and everyone can see the advantages vs. disadvantages. The site is at http://www.prince-officialsite.com/.

In essence, it is the process of defining the problem to be solved, reasons why this particular problem is important to address over other problems, the various options for solution, the different benefits explained, and discussing the risks and costs of each option, and finally why the one you have chosen will have the best results. For instance, one solution may be "the best" in almost all ways, but you determine it will lead to bankruptcy, so you probably shouldn't choose it. How many people and organizations have chosen such an option anyway?

Even though it is called a "business" case, it could also be called something like practical problem solving. There are many ways to do a business case, but in all of them, the business case is one of the very first things that must be done--without it, nothing else can even begin, and this only makes sense.

To still be expecting a business case from RDA after all this time says a lot.

Re: New rule implementation

Posting to Autocat

On 29/03/2012 21:15, Aaron Kuperman wrote:
<snip>
If and when a post-MARC (or thoroughly rewritten MARC) format supports the FRBR WEMI structure, the changes will have meaning. Right now, we have very limited capacities to link online resources with library resources, or we are doing a poor job at describing the diverse formats now in existence (book exists in hard copy, or as an ebook readable only from within the library's firewall, and is connected to a publisher website, and is connected to an author's blog, and is connected to various student/fan websites, not to mention a variety of printed and electronic versions of the text). Addressing 20st century publication patterns with AACR/MARC is hopeless. RDA/MARC is probably no better. RDA combined with a new FRBR post-MARC format is a solution. We do have a problem. RDA is the first step of the solution. And we will need a lot of patience and hope that the rest of the solution is going to arrive. The need for patience and hope is the real issue we need to discuss.
</snip>
Once again, this assumes that FRBR structures provide what the *patrons* want, and this has never, ever been demonstrated. It is simply assumed. I will state, once again, that sometimes, people do want to find different versions of specific works and expressions, but never mind: modern catalogs allow this *right now* without the need for the FRBR structures. They do this by using methods never dreamt of by people probably only 15 or so years ago (when FRBR came out). Technology is changing that fast! What will it be 10 years from now? If we stay focused on the 19th century structures as outlined by FRBR, where will we be in 10 years? As a consequence, FRBR absolutely should be rethought completely, but it is not.

As a result, we remain with vague promises and protestations not to lose the faith--that things will work out in the fullness of time. That we should believe that those in control really do know what they are doing and what will happen in the future. Of course, the simple fact is, *nobody* knows. Nobody. At this point in time, we should not have faith that anybody knows what the future will bring.

I understand that those who believe will say that we should have patience and hope, but this is happening at the same time that libraries and librarians are being cut to the bone.

All of this just seems to me to emphasize the inability to make a rational business case for RDA, which has not yet been done, as was stated in the report from LC/NLA/NLM.

Re: [ACAT] New rule implementation

Posting to Autocat

Before I reply more specifically, I disputed a statement that RDA is a major step into the future. I agree that AACR2 does not deal with many of these issues, but neither does RDA. As a result, RDA is not this wonderful step into the future. Specific replies below.

On 29/03/2012 19:37, Kevin M Randall wrote:
<snip>
There is nothing inherent in the WEMI model that requires separate records for works and expressions. Now, one could very well make the argument that the separate records would allow for much greater processing efficiency. But even having separate records does not necessarily require more work on the part of the cataloger, if the systems are designed for optimal efficiency. Instead of arguing against something that has the potential to greatly increase workflow and user discovery of resources, we should be putting effort into creating better tools. The "sad results" that I see right now are trying to hang on to our existing way of creating metadata using MARC syntax and AACR2 data definitions and expecting that by using those same methods we'll be able to increase efficiency and have more powerful resource discovery.
</snip>

That is news to me. From everything I have seen, the data model is based on entities, and the entities are based on URIs. Entities in turn have specific attributes. This is how entity-relationship structures work. I have repeatedly tried to point out that in the traditional catalog, the purpose of the so-called work or expression was *only* for arranging the cards (or unit records, or manifestations, or editions, whatever you want to call it). Today, we would call that a "relationship", or as I prefer to think of it, as a "query". The traditional catalog had meaning due to the *arrangement* of the records that described individual resources. FRBR took a different route and began on completely different assumptions.

All this is theoretical, but I think it shows that the FRBR structure itself should not be accepted without some kind of discussion. Aside from that though, if everything can be handled without individual URIs (i.e. records) for works and expressions, if there are to be entities, the information in the FRBR-entity records must be somewhere. Everything could be held within the individual manifestation records, that can be done, I have no doubt, but we have essentially the same structure as we have today. Perhaps the work, expression *and* manifestation could be held in the item records, but that seems to be a step beyond.
<snip>
At the same time, RDA pretends that a website, whose title can change constantly and without any notice, can be dealt with in the same way as a physical item, that is: manually. Of course this penchant of virtual resources for change extends to each part of the record. How are catalogers supposed to deal with such incredible maintenance?
And in this AACR2 is better than RDA how?
</snip>
Not at all. This is part of the argument that RDA is not an advance over AACR2.

<snip>
Our authority structures have not worked ever since the computerized catalog was introduced, especially when keyword searching was added. So, our cross-references, along with the subject-subdivision method of browsing that worked fairly well in the card catalog, broke down completely. I think this would possibly be the most promising area for work since our subjects provide something found nowhere else in the world. But I have discussed that elsewhere.
This has always been a problem but has absolutely nothing whatsoever to do with AACR2 vs. RDA. To say that we shouldn't bother improving the metadata because most systems aren't properly using the existing metadata doesn't make any sense. If we wait for the perfect system to use our current metadata, we'll never get anywhere. We need to do *both* things (improve discovery layers *and* improve metadata).
</snip>
Improving metadata by retyping abbreviations is certainly a highly debatable point and puts the question to the RDA/FRBR mantra that they are not about display.

I am saying that if catalog departments are supposed to devote substantial amounts of their resources to solving "problems", we should be solving problems that make a difference to the public instead of to ourselves. This means that we have to demonstrate that our labor and expense is worthwhile to the public. This has *never, ever* been done. I emphasize never, ever. This is part of making a decent business case, which RDA steadfastly refuses to do. RDA prefers to promise a "radiant future" that may come about, but don't blame people if their faith has completely evaporated. I don't think it's so "radiant" anyway.
<snip>
RDA is also silent about how to work with other databases out there, such as the IMDB. In fact, the rules call for us to duplicate the work in the IMDB and possibly, other databases.
RDA helps us define the metadata, with the idea that we should be able to reduce or eliminate duplicate work. RDA is designed to allow for taking in and repurposing existing metadata, or (even better) linking to existing metadata.
</snip>
More promises and vague theories. Why is RDA being implemented with no business case? Has anybody seriously answered that? Without any answer, people can only come up with their own theories.

Re: [ACAT] New rule implementation

Posting to Autocat

On 29/03/2012 16:05, John Hostage wrote:
<snip>
The catalog codes that we used all through the 20th century were based on a stable technology that served us well for a century or more: the card catalog. Many of the criticisms of RDA fail to recognize the fundamental changes that have happened since AACR2 was introduced over 30 years ago: the card catalog has gone away, the electronic catalog is not the only nor even the most significant tool for people to access information, and the World Wide Web happened.
</snip>

My problem with RDA is that it is still mired in the world of card catalogs. Manually typing out abbreviations is only one example. The underlying goals of RDA and FRBR are what the library provided people almost 200 years ago. The only difference is that now, there seems to be the fervent hope and faith that if we throw our records into the Semantic Web mash, that will improve matters.... somehow.

If libraries, and their catalogs, are to make a difference to the world at large, there must be a realistic look at the situation. Expecting catalogers to spend more time typing out abbreviations (yes, it *really does* take significantly more time in the aggregate), and pretending that there are always separate *things* that are works, and expressions that need separate records, and then creating those records, even when the vast majority of resources have only been published once can only lead to very sad results. Plus, we pretend that the public wants and needs to navigate through those works, expressions and manifestations, even though the most popular search engines have never allowed anything like that. Plus, it can be shown that in the correct computer system, even those so-called "user tasks" can be done right now.

At the same time, RDA pretends that a website, whose title can change constantly and without any notice, can be dealt with in the same way as a physical item, that is: manually. Of course this penchant of virtual resources for change extends to each part of the record. How are catalogers supposed to deal with such incredible maintenance? Also, there has been no discussion of workload. What would a catalog department do if a selector said, "I want you to add all the courses in English that are in iTunes U. And I am still considering the ones in French and German."

Our authority structures have not worked ever since the computerized catalog was introduced, especially when keyword searching was added. So, our cross-references, along with the subject-subdivision method of browsing that worked fairly well in the card catalog, broke down completely. I think this would possibly be the most promising area for work since our subjects provide something found nowhere else in the world. But I have discussed that elsewhere.

RDA is also silent about how to work with other databases out there, such as the IMDB. In fact, the rules call for us to duplicate the work in the IMDB and possibly, other databases.

RDA posits the FRBR user tasks, but at least now, I think people pretty much accept that those tasks are not what the vast majority of the public wants the vast majority of the time, but was what the library was designed to provide, mainly for the work of the library.

What do we get from RDA? Retype abbreviations, less access (rule of three to the rule of one), some other cosmetic changes such as changes in transcription practices that will drive librarians crazy but will mean nothing to users. New words, a new "conceptual model".

What I have tried to show here (and not only here) is that the problems faced by libraries are not with the cataloging rules. Therefore, devoting lots of a library's resources to them is like what I would see rather often when I lived in the American Southwest. Some guy would own some lousy old jalopy: it would barely start, all dented up, seats slashed, worth maybe $100. But the guy would have put $500 tires on it, with sidewalls and raised white letters! He would keep those tires clean, too. Sorry buddy: your car is still a jalopy.

That is how many people are going to see this RDA so-called "change". Expensive tires on something that still doesn't work. And it is others who will make that decision, not us.

There is so much that needs to be done, and can be done. It would be really exciting for experienced catalogers to try to solve these problems that will really make a difference to the public. But they'll be too busy typing out abbreviations, sinking under the workloads, and thinking grand thoughts of FRBR so that we can enter the Linked Data Universe.

But we don't need FRBR to go there.

A lot of the discussion hinges on: what is the purpose of the library catalog in today's networked world? If there were at least some bit of clarity on this, perhaps the solutions would become clearer. But in any case, RDA will make no difference to the public.

Thursday, March 29, 2012

Re: English words added to bibliographic records

Posting to Autocat

On 28/03/2012 17:50, J. McRee Elrod wrote:
<snip>
James said:
If we used the power of the systems to change abbreviations automatically based on simple find and replace focused on specific fields/subfields, how much labor and how many costs could everyone avoid!
Would this not be simpler to program if both legacy and RDA records had "ca.", "fl.", "S.l.", "s.n.". etc. as opposed to words and phrases in a variety of languages?
</snip>
Absolutely. The main point in these matters is consistency. If we can say that wherever there is "[beginning of field or a space]ca[period][space] " in 100/400/600/700/800$d, display as "somewhere around this time, give or take a few years" there is no problem. Computers do this all the time; it is what they are designed to do.
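To make the rule concrete, here is a minimal sketch in Python of a display-time substitution limited to the tags and subfield named above. The function name, the (tag, subfield code, value) arguments and the default label are assumptions made for illustration; they do not come from any particular catalog system or MARC library. The stored text is left untouched, and only what is shown changes.

```python
import re

# Tags and subfield taken from the post; the calling convention is a
# simplified stand-in for illustration, not a real MARC library's API.
DATE_TAGS = {"100", "400", "600", "700", "800"}
DATE_SUBFIELD = "d"

# "ca." at the beginning of the subfield or right after a space,
# followed by a space, exactly as the rule is worded in the post.
CA_PATTERN = re.compile(r"(?:^|(?<= ))ca\. ")


def display_value(tag, code, value, label="approximately"):
    """Return the text to display; the stored value is never modified."""
    if tag in DATE_TAGS and code == DATE_SUBFIELD:
        # A function is used as the replacement so the label is inserted literally.
        return CA_PATTERN.sub(lambda match: label + " ", value)
    return value


print(display_value("100", "d", "ca. 1500-1570"))
print(display_value("245", "a", "ca. 1500"))
```

Because the pattern is anchored to the beginning of the subfield or a preceding space, a string such as "Africa. 1900" is not touched. A form such as "1500-ca. 1570", where a hyphen precedes the abbreviation, would need the rule widened.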

And sure, there will be the occasional "s.1." instead of "s.l." or spaces or caps or whatnot, but fixing those errors is trivial compared with changing everything manually(!! Still can't get over that one!). Once we get away from the consistency, e.g. changing to "approximately" with all of the possible misspellings, then it will be much more difficult. Multiply this by all of the abbreviations that are to be changed and programming difficulties increase enormously.
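As a rough illustration of how small that cleanup is, the sketch below folds the variants just mentioned (the digit 1 typed for the letter l, stray spaces, odd capitalisation) into one canonical form. The function name and the choice of "S.l." as the canonical form are assumptions for the example, and it is meant to be run only over the place-of-publication text, not over whole records.

```python
import re

# Matches "s.l." with optional spaces, any capitalisation,
# or the digit 1 typed in place of the letter l.
SINE_LOCO_VARIANT = re.compile(r"\bs\s*\.\s*[l1]\s*\.", re.IGNORECASE)


def normalize_sine_loco(value, canonical="S.l."):
    """Replace every recognised variant with the canonical abbreviation."""
    return SINE_LOCO_VARIANT.sub(lambda match: canonical, value)


for sample in ["[s.1.] : Smith, 1900", "[S. L.]", "Paris : Hachette"]:
    print(normalize_sine_loco(sample))
```

A handful of patterns like this covers a closed set of abbreviations. Doing the same for spelled-out words and their possible misspellings would require a much longer and less reliable list.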

As I said before, RDA should be renamed to "Retype Designated Abbreviations." As if typed-out abbreviations are going to make a difference to the public! Yeah, they'll just come running back! I think I read some research somewhere that they discovered this is why the science, technology, engineering and mathematical sciences (STEM) don't use libraries anymore: our cataloging abbreviations.

Of course, unless there is a huge and brain-consuming retrospective conversion project of typing out the abbreviations in the old records, everybody will still be stuck looking at the abbreviations anyway. Whoops! I guess we lost the STEM people again!

Sorry for my lack of seriousness, but you have to find laughs somewhere. :-)

Why not make the highly valuable and unique syndetic structures of our authority files actually *function* once again in our current information environment and maybe even the environment of the future?

That just might make a difference to the public!

Wednesday, March 28, 2012

Re: approximate dates in personal authority records in RDA

Posting to Autocat concerning changing "ca." to "approximately" in authorized forms of names

On 28/03/2012 02:06, J. McRee Elrod wrote:
<snip>
Brian Brisco posted:
But, Stephen, I think his point is that not everyone speaks English. If we really want to move toward making our description more open to our remote users who reside all over the world, then we should be doing a little more than making our catalog records even more Anglo-centric. This is a major step backward.
I'm so pleased to not be the only one shouting this in the wilderness. So much of RDA puts us in an Anglo silo. Why the Canadians involved went along with this I don't know. With our Quebec and European clients, we simply can't go that route. LAC may can afford duplicate records, but SLC can not. We will stick with "ca." and "fl." in access points. If non English libraries are supposed to change "approximately" etc. to their language, why can not English ones change "ca." to whatever they want, 'approximately' or 'about'? Seems to me the Latin abbreviations could be treated as codes, which would make the display for legacy records the same as for RDA records.
</snip>
And of course, this is all based on the assumption that text in a database is exactly the same as text on a card: that it can only be changed manually, and that some great gurus will decide what will, and what will not, be entered and how it will look, while everyone simply must accept their decisions.

Today, there is no longer a reason for thinking any of that. One person may like "ca.", another prefers "approximately", another likes "more or less" and another likes "around the year". If we used the power of the systems to change abbreviations automatically based on simple find and replace focused on specific fields/subfields, how much labor and how many costs could everyone avoid! [This is poorly worded. What I mean is that the computer can now display text however we want, so long as it is input consistently. Therefore, the computer can be set to look for "ca." it finds in a 100/700/800$d subfield, and display it however someone wants. --JW]
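One way to picture "display it however someone wants" is to treat the stored abbreviation as a code and look up a label by the reader's preference, falling back to the abbreviation itself when no label is defined. The sketch below is only an illustration: the table, the preference keys and the function name are invented for the example, and a real system would apply it only within the relevant subfields.

```python
# Hypothetical label table: stored abbreviation -> reader preference -> display text.
LABELS = {
    "ca.": {"plain": "approximately", "casual": "more or less", "fr": "environ"},
    "fl.": {"plain": "flourished"},
}


def render(value, preference):
    """Swap each stored abbreviation for the reader's preferred label.

    Tokens with no label for this preference are shown exactly as stored.
    """
    shown = []
    for token in value.split(" "):
        shown.append(LABELS.get(token, {}).get(preference, token))
    return " ".join(shown)


print(render("ca. 1500-1570", "plain"))
print(render("ca. 1500-1570", "casual"))
print(render("fl. 1450", "fr"))
```

The point of the fallback is that legacy records and new records display the same way without anyone retyping anything, provided the abbreviation was input consistently.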

People should throw away their feather pens and typewriters and enter the 21st century. *Far* more complex tools are being made now. The idea of spending precious, and ever-decreasing library resources changing abbreviations such as "ca." to "approximately", and doing it manually(!!!!) is just another bit of evidence that there is absolutely no rational business case for RDA.

There are *real* problems with the library catalog and cataloging itself, but these issues keep being ignored for trivialities such as these that will make no difference to anybody. That is, except to the catalogers who have to spend their time on such drudgery.

Friday, March 23, 2012

Comment to: What do library users really want? March 22, 2012

Comment to: What do library users really want? March 22, 2012 Heather Pretty

I am honored that you are taking such an interest in my paper. You bring out one aspect that I have not discussed at much length, and it has probably been a mistake on my part, which is to outline some actual steps people can take in the future. This is actually why I mentioned evolution, but that involves some nuances. I am not an expert in evolution but there are some ideas there that can be of help.

If we look at evolution, we see that it works by taking very small steps. It wasn't that there was a trilobite one day and a couple of years later, we have Socrates and Einstein. Such huge steps in evolution do not take place, so what we interpret as “progress” in evolutionary terms takes place in very small steps, very haltingly, and with lots of errors. Such a process, I believe, could be very useful for almost anyone looking at the changes we are facing in the information environment today.

Also, in the popular mind, evolution is pictured as “the law of the jungle,” each individual for him or herself, dog eat dog. So, the image of evolution is literally one dog who viciously attacks another dog, eats him and forcibly takes his mates, all for himself. Very bloody. While these sorts of things occasionally happen in evolution, experience shows something else can occur just as easily. When an environment is under stress, it turns out that there is a great deal of cooperation among individuals that occurs, and that a group that cooperates and helps one another in various ways actually has a much better chance to survive than any single individual. This only makes sense.

When I consider these matters, and try my best to translate them into what librarians are facing today, I believe that it would be much wiser to do a couple of things:
1) Take small steps when making things that are useful to the public. Concentrate on making small changes (even changes that you may think would help only a single searcher), because we should assume that any method or tool that can help one individual can actually help many more. And then, as these small improvements that can help others add up, those multiple improvements will eventually become one great big improvement. This is what has happened in many situations.

An added advantage here is that with small steps, any mistakes can be more easily admitted and the mistake fixed or discarded. If you have devoted months or years to a project, it becomes too difficult to admit that it went in the wrong direction in the first place.

2) The key to it all is to communicate and share your improvements and failures, so that the small improvements do not stay isolated. By sharing, others can take your small improvement and use it themselves, changing it when necessary and sharing their changes with others who can further improve on those improvements. This is the process that takes place in the animal kingdom and also within humanity. 

It is almost impossible to foresee which one of the changes will survive and which ones will be discarded, but each one on its own is an honest attempt at improvement. Yet, if people hoard their tiny improvements, the process is doomed to oblivion. There will also be steps forward and backward.

RDA and FRBR are attempts, in my own opinion, to make giant leaps forward and these can be compared to the first fish coming out on dry land and saying its problems would be solved if it could slither along the ground like a snake. But that is only how it seems to that first fish because it doesn't yet know that much. As it learns more, the fish may actually want four legs, or maybe two legs and two arms, maybe wings so it can fly. In any case, all of those developments are far in the future, and the fish should concern itself with survival in a new environment.

Now that there seems to be general agreement that the FRBR user tasks are not actually what users want to be able to do, the purpose of FRBR has turned into entering the Semantic Web, which is offered as a solution. Of course, the Semantic Web barely exists today and nobody can possibly know if it is a solution for anything or not. Or even if it is part of a solution. Therefore, it is an act of faith to maintain that it is worth the costs (for it will entail major costs), but we cannot know.

I believe the costs and risks are too high to pay for just a simple act of faith when at the same time, librarians could be introducing all sorts of innovations that the public could use today.

Maybe all this seems a little vague, but perhaps not. We should concentrate on helping people now, today, in whatever ways we can (as we do) but then to communicate what we do so that we help more than the few locals, and others can get ideas of their own.

With the web, this can be done more easily than ever.

Monday, March 19, 2012

Re: [RDA-L] Card catalogue lessons

Posting to RDA-L

On 19/03/2012 17:30, Jonathan Rochkind wrote:
<snip>
Legacy data is always a problem. 

But if we never start doing different, we'll never have any different. If you start adding additional info (like relator codes), there may be a reason to not have the UI expose it until a certain percentage of your data is so 'enhanced'. There can be automated as well as manual cooperative means to enhance.

But if you never start enhancing, you're just making your legacy problem even bigger.

Your argument still amounts to "we've never done it, therefore the catalog is the wrong tool to do it." If it's something important to our users, and we can afford to do it, shouldn't we start doing it?

Arguments against might be that it's not something important to our users, or that we in fact can't afford to do it. But "we've never done it so we can't" is a poor argument, and that's what "but what about the legacy data" amounts to.
</snip>
"We've never done it" is actually a very good and realistic argument because it certainly affects what our patrons expect and whether we can afford to "fix" it. If we enhance without any hope of making anything useful to people until maybe 20 years down the road or so, that would result in a complete waste of our resources, and it would be *very* difficult to convince upper echelons that it is worthwhile.

Staying with the example of relator codes, especially for films, many would say (and I would agree) that our adding them is a complete waste of resources because it duplicates information found elsewhere. Why devote resources to create a product that can only be inferior to what is already there? This would not be making our legacy problems even bigger: we should instead be concerning ourselves with what we can really and truly do that isn't found elsewhere. What is it that is unique that library catalogs provide?

I think there is a lot library catalogs can provide this way, but certainly not relator codes.

Re: [RDA-L] Card catalogue lessons

Posting to RDA-L

On 19/03/2012 15:24, Mike Tribby wrote:
<snip>
On 3/17/2012 6:42 AM, James Weinheimer wrote:
Why is the local catalog definitely not the correct tool here? Because of a few facts: There is LCRI 21.0D where it is stipulated that LC will not put in relator codes. They are also not required in BIBCO.
Jonathan Rochkind responded:
This is awfully circular. You started out saying that it was a mistake for the local catalog to try to do this, it was the 'wrong tool' for this job. When someone asks why, you say, basically, because the way we do things makes the catalog fail at this. Right. So, um, why not do things differently? Your answer looks like simply "because we never have, so we never should"
Unless, Jim's point has something to do with the unlikelihood that enough records will have the relator codes included to be a really good source given LC's heavy output of records. Unless LC and BIBCO change their policies (and amp up enforcement) or there is a concerted effort by other cataloging agencies to add the relator codes to LC's and other BIBCO records, there are other, better places to find the desired information. Doesn't mean the catalog couldn't support this kind of thing, just that as currently constituted it might not be the optimum source.
</snip>
Mike is absolutely right. We can always add search options for information that isn't in the catalog, but it will always retrieve zero.

This is an example of why I keep saying that we have to look at the catalog through the eyes of the *patrons* and not our own eyes. Patrons are not going to know, or understand, that you have begun encoding authors as "producer" but have this coding on only .001% of your records. A search for "producer" will not only be useless, it will be *worse* than useless because people will come away thinking, after getting a zero result for Mary Pickford as a producer, that you don't have those materials, although you very well might. It was just that the relator code was left off of all of the older records.

So, what choices do the catalogers have? Well, a project can be started that will upgrade the records, but as long as you are looking at a record, you may as well recode all the names and not only the producer. Now, you are talking about a lot of work that will take significant resources away from doing new materials. So, these are the sorts of things that people work on "in their spare time" which means, it takes many many years, or more likely, just never gets done.

The only other choice would be to hope the public doesn't notice, but first, they will notice because they are not stupid, and besides, I think this is an unethical attitude toward the catalog, since I think that we should be in the business of telling the truth whenever possible. If somebody searches for Mary Pickford as a producer and retrieves zero, but every one of these films is in your collection http://www.imdb.com/name/nm0681933/#Producer, and her name appears in the records but without the relator code, then the catalog is lying when someone does this search.
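The false zero described above can be sketched in a few lines. This is only an illustration under stated assumptions: the `Record` and `NameEntry` classes, the placeholder titles, and both search functions are hypothetical and do not correspond to any real ILS or MARC API. A `relator` of `None` stands for a legacy heading that carries the name but no relator code.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class NameEntry:
    name: str
    relator: Optional[str] = None  # None = legacy heading, no relator code


@dataclass
class Record:
    title: str
    names: List[NameEntry] = field(default_factory=list)


def search_by_name(records, name):
    """Titles where the name appears in any heading, coded or not."""
    return [r.title for r in records if any(n.name == name for n in r.names)]


def search_by_role(records, name, role):
    """Titles where the name is explicitly coded with the given role."""
    return [
        r.title
        for r in records
        if any(n.name == name and n.relator == role for n in r.names)
    ]


# Hypothetical legacy records: the name is present, the relator code is not.
catalog = [
    Record("Film A", [NameEntry("Pickford, Mary")]),
    Record("Film B", [NameEntry("Pickford, Mary")]),
]

name_hits = search_by_name(catalog, "Pickford, Mary")
role_hits = search_by_role(catalog, "Pickford, Mary", "producer")
print(len(name_hits), len(role_hits))
```

Both films are held and both are found by name, yet the role-limited search comes back empty. Nothing in the empty result tells the patron whether the library lacks the films or merely lacks the coding.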

So, you decide to tell the truth, and try to make the public aware that when they search by relator codes, they are only searching a tiny, tiny fraction of everything that is there. Of course, they won't understand this, they won't understand why, and they will come away with a very poor opinion of the catalog. When they complain, they will complain to public services, who will agree with them, and they will all complain together.

This would be another example of cataloging setting itself up for failure, at the same time alienating the public and the rest of the library. A reference librarian could see such a response in a second, and that is why I say they are so sorely needed in these matters today. What is the *truth* for searching producers of films? Do not use the library catalog because it is the wrong tool. But you are lucky that there are other, free and easy tools available.

In this environment, we have to ask ourselves seriously: what constitutes a "better" or a "worse" record? Throw out wishing and imagining grand things, and consider what these records are, and can be, in reality. Because reality will rear its head sooner or later.

Re: [RDA-L] Card catalogue lessons

On 17/03/2012 15:17, Brenndorfer, Thomas wrote:
<snip>
Why is the local catalog definitely not the correct tool here?
Catalogers go to great lengths to record the very same data as in relationship designators in the form of notes and statements in the record. That's the whole "justify the added entry" concept.

Decisions about what headings can become added entries are decisions based upon identifying the role played by the creating or contributing entity. As a catalog user I find it extremely frustrating to have to scan notes to understand why a search result came up attached to a particular person's name. It's great when I eventually find the note, but this is not a user-friendly design, and looks more like inefficiencies built upon inefficiencies.

A good data entry system has vast impact on the efficiency of cataloging. The issue of time spent on entering relationship designators quickly becomes moot when one factors in:

a) the time spent already on making those very same decisions and entering comparable data,
b) the ease with which some newer systems allow catalogers to enter data, validate data, or harvest data and integrate it, and
c) the overall goal of FRBR and RDA which is to create a universal baseline of understanding of what is going in catalogs.

This last point means that catalogers, systems developers, and other data providers all understand the same model, the same techniques, and the same application possibilities. That's the core business case for FRBR and RDA, and it's the same reason why the rest of the data management world relies heavily on entity-relationship models to design data systems that solve real world problems. Why ignore a tool (entity-relationship modeling) that the entire world of data systems now depends on?

And understanding this includes understanding the possibilities for retrospective conversion. It's far better to start thinking of retrospective conversion when there's a good new home for that data than just to recycle that data from one inferior system into another similar kind of inferior system (having gone through three ILS's in the last 12 years, data migration issues are expected, but it's frustrating when the benefits of a new system only come down to some low-hanging fruit, and the larger scale benefits are still out of reach of many library systems, and so the same weaknesses of AACR/MARC data abound).

AustLit ( http://www.austlit.edu.au/ ) did it correctly when old databases were FRBRized-- much wasn't difficult, and the rest was handled by good tools and trained data specialists. And when it was done, it was done, and everyone was better off because of it.
</snip>
Now we finally get some mention of the business case. And we see that "entity-relationship" models are the justification. "This last point means that catalogers, systems developers, and other data providers all understand the same model, the same techniques, and the same application possibilities. That's the core business case for FRBR and RDA, and it's the same reason why the rest of the data management world relies heavily on entity-relationship models to design data systems that solve real world problems. Why ignore a tool (entity-relationship modeling) that the entire world of data systems now depends on?"

Creating a data model from scratch is quite different from creating one based on models in long use. We have decades worth of data, many times representing records created over 100 years ago. As I have mentioned before, FRBR has turned the relationships of the traditional catalog into entities. In the card catalog, there were not explicit "works" or "expressions". In some cases, you might find a card that would explain in more detail how the cards were arranged, http://imagecat1.princeton.edu/cgi-bin/ECC/cards.pl/disk6/0488/B5379?d=f&p=Bible+...+Texts--Anglo-Norman+%3E&g=51627.500000&n=2&r=1.000000&thisname=0000.0002.tiff but you would not have separate cards giving you information about the "Work" of the Bible and then another card telling you about the Bible in English.

The works and expressions were theoretical constructs that determined how the separate cards were arranged in the catalog. These arrangements in turn, were derived from the arrangements found in some printed book catalogs (but not all) and based on the rules defined by Panizzi and Cutter. With FRBR, these relationships between and among all the separate records were turned into "entities", and these formerly theoretical constructs that guided arrangement suddenly turned into "things" that had to "exist" and therefore every record needed associated work and expression "entities" even when there were relatively few cases of such arrangements in the catalogs (less than 20%) because the vast majority of everything written exists in a single manifestation.
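The difference between arrangement and entities can be sketched roughly as follows. Everything here is an assumption made for illustration: the records are invented, `filing_title` is a hypothetical stand-in for the heading that decided where a card was filed, and work and expression are collapsed into a single grouping level for brevity.

```python
from collections import defaultdict

# Invented flat records, one per manifestation.
records = [
    {"id": "m1", "filing_title": "Bible. English", "published": 1900},
    {"id": "m2", "filing_title": "Bible. English", "published": 1950},
    {"id": "m3", "filing_title": "Bible. Anglo-Norman", "published": 1920},
    {"id": "m4", "filing_title": "A single-edition monograph", "published": 1980},
]


def arrange(records):
    """Card-catalog style: the grouping is computed when someone asks for it."""
    groups = defaultdict(list)
    ordered = sorted(records, key=lambda r: (r["filing_title"], r["published"]))
    for rec in ordered:
        groups[rec["filing_title"]].append(rec["id"])
    return dict(groups)


def to_entities(records):
    """Entity style: every record gets a stored parent, needed or not."""
    works = {}  # filing title -> minted entity id
    links = {}  # manifestation id -> entity id
    for rec in records:
        work_id = works.setdefault(rec["filing_title"], f"w{len(works) + 1}")
        links[rec["id"]] = work_id
    return works, links


arranged = arrange(records)
works, links = to_entities(records)
singletons = sum(1 for ids in arranged.values() if len(ids) == 1)
print(arranged)
print(links)
print(singletons, "of", len(arranged), "groups hold a single record")
```

In the first function nothing extra is stored or maintained; the grouping exists only as the answer to a question. In the second, an entity is minted and must be maintained for every record, including those that will never have a sibling.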

I agree that *if* we were setting up a brand new database today, we may want to consider creating an FRBR-type structure, but this is not the case. We have massive amounts of data that will need converting to this new model, so then the questions must be asked: how much effort will it be to create and maintain these new structures, and will the costs be worth it? The example of adding the "producer" attribute is only an insignificant part of this when considering all of the relator codes for past and present, and relator codes are themselves only an insignificant part of FRBR structures. Yet it serves to accentuate the enormity of the problem. Are there any alternatives that accomplish the same goals? And I think a vital question is: will it make any difference to those who will use the system?

I think I have shown that there are alternatives that accomplish the same goals, and only research can determine whether creating the FRBR structures will make any difference to today's public. Yet it seems as if the utility to the public may *not* be the main purpose now of FRBR, but just to get into the linked data universe. There are many ways to get into the linked data universe, if that is the goal, and if we want database interoperability there are alternatives that have been going on for some time.

It would be useful to discover if the database interoperability that currently exists has made a difference. For instance, the Google Books interface offers "Find in a Library": http://books.google.it/books?id=df6Ug_9CQccC. I wonder if this has made any substantial difference to library circulation? After all, although this is not linked data, this is one demonstration of how it could work in the linked data universe. Does anyone know if it has increased circulation in their libraries?

Saturday, March 17, 2012

Re: [RDA-L] Card catalogue lessons

Posting to RDA-L

On 16/03/2012 23:26, Kevin M Randall wrote:
<snip>
Finally, why is it wrong to expect to search a library catalog for Steven Spielberg as a producer only? Because it's not the right tool, whether anybody likes it or not. You don't use JSTOR for the latest news on Putin's election. Lexis-Nexis is not the best tool for in-depth research on biblical archaeology. You don't use a hammer for a ripsaw or a motorcycle for a pickup truck. People can learn this although they may not like it.
Excuse me, but I find it absolutely preposterous that the library catalog is NOT the correct tool to search for locally-held DVDs for which Steven Spielberg was producer only, not director. Please explain this outrageous assertion.
</snip>
I shall focus on this statement which you claim is so outrageous.

Why is the local catalog definitely not the correct tool here? Because of a few facts: There is LCRI 21.0D where it is stipulated that LC will not put in relator codes. They are also not required in BIBCO. Consequently many, many libraries follow these directives. Whenever a library accepts LC or BIBCO copy, they can, but do not have to, decide to add the relator codes themselves, but doing so demands local work that necessarily takes away from other tasks catalogers could be doing and detracts from productivity. Therefore, this is the kind of decision that should only be taken by an institution, not by individual catalogers.

There is also the little problem that relators are not in the records that have been created in the past. If a catalog is not designed to provide reliable results, e.g. the producer code is added to only the newest records so that perhaps only 5% or 10% of all producers of movies are coded, then allowing such a search produces a false result, and the searcher must be made aware that the search encompasses only a relatively small part of all of the movie producers. Perhaps the result does not have to be 100%, but it certainly has to be 80% to 90% to provide some sort of result that the searcher can rely upon. So, if any particular local catalog has decided to devote the resources to add the producer code where it is supposed to go, that is fine, but that goes beyond the normal bibliographic standards and other local catalogs can decide otherwise.
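The reliability threshold suggested above could be checked before a role-limited search is ever shown to the public. The sketch below is only one way to express that idea: the function names and the policy labels are hypothetical, and the 0.8 default simply takes the lower end of the 80% to 90% range mentioned above.

```python
def relator_coverage(coded_flags):
    """Fraction of name headings that carry a relator code.

    `coded_flags` is an iterable of booleans, one per heading.
    """
    flags = list(coded_flags)
    if not flags:
        return 0.0
    return sum(flags) / len(flags)


def role_search_policy(coverage, threshold=0.8):
    """Decide how a role-limited search should be presented.

    The threshold is an assumption, not a published standard.
    """
    if coverage >= threshold:
        return "offer"
    return "hide-or-warn"


# Hypothetical catalog: 7 coded headings out of 100.
flags = [True] * 7 + [False] * 93
coverage = relator_coverage(flags)
policy = role_search_policy(coverage)
print(f"{coverage:.0%} coded -> {policy}")
```

A check of this kind settles nothing about whether the coding is worth doing; it only makes the decision to expose the search an explicit, institutional one rather than an accident of the interface.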

There are many points that someone can consider to be "outrageous". For instance, ending series authority control, that RDA requires only a single author to be traced, that 245$b is optional, even that RDA is supposed to be accepted without any concern for the serious practical consequences, that is, without supplying a business case. Also in my own opinion, to expect catalogers to do even more with dwindling resources is rather outrageous and can lead to nothing good.

Non-librarians may find the fact that they have to search multiple separate databases and indexes for journal articles to be outrageous. Another point is that for non-roman languages, searchers must refer to some transliteration tables that they often find incorrect or semi-comprehensible. Non-catalog librarians are often shocked that a variant ISBN does not automatically require a new record, or when people discover that authors with pseudonyms have to be searched separately under each pseudonym (separate bibliographic identities) except for pre-20th century authors (sometimes!) they find that outrageous. My own pet peeve has been that translators have been very rarely traced. (At the same time, I did like that it was less work!)

I could go on, but I won't. I will only say that in the future, as people will be able to do more and more with full-text, people will doubtlessly find more and more "outrageous" problems with library catalogs.

These are some of the ways the tool we are creating works. As with every tool, it has its limitations. At least with producers of movies, there is the IMDB.

Re: [RDA-L] Card catalogue lessons

Posting to RDA-L

On 16/03/2012 16:40, Kevin M Randall wrote:
<snip>
And that kind of cooperation [e.g. with IMDB -- JW] is *exactly* the kind of thing that linked data, and data definitions such as the RDA elements, are intended to make possible! Without having precise data definitions, it will never come about. And our current data *can* be transformed. It won't happen overnight, but nothing does. For an example of data transformation: most libraries have been able to completely get rid of their card catalogs, haven't they?
</snip>

I am aware of that. The question is: do we need RDA and the FRBR structures to implement it or could it be done with what we currently have? These are the questions we should be asking.
<snip>
Strange, in a lot of your past messages I thought you were arguing that we needed more integration of information so people wouldn't have to have so many different places to search. I must have misunderstood you. And I must have misunderstood you when you said above "It would be much better to try to cooperate with projects such as IMDB in some way". I guess by cooperate you don't really mean to be able to have the data connected? And if not, what *do* you mean? (And for the life of me, I cannot figure out *why* it would be wrong to expect to be able to search a library catalog for Steven Spielberg as a producer only, not a director. If I want to find Spielberg-produced DVDs in my library, shouldn't the catalog make it easy to do that?)
</snip>
Well, to be honest, I am trying to figure out what the local, library catalog should and shouldn't be. Some apparently want it to be all things to all people. Others want much less for it.

I am trying to figure out why someone, as more and more digital content emerges and our acquisitions budgets go down, would actually open up the local catalog instead of staying with Google, or Facebook, or whatever the big site will be then. There must be a reason, and I hope it won't be just to find out if the library has a copy of the book they found on Google or Amazon so they won't have to pay for it.

If the library catalog is not to be all things to all people (which is what the Facebooks and Googles seem to want to be and entering a race like that would ensure that libraries would lose) then why would somebody use the library catalog? Why *should* they use it? The public must find definite, tangible advantages there that they will not find in the Googles. Therefore, while the definition of the "library's collection" must change to include the materials available on the web, it doesn't mean *everything* on the web. What does that mean precisely? I don't know, but it means limits of some kind. Does it mean "connected data"? Not necessarily, but it may include various types of federated searching.

Finally, why is it wrong to expect to search a library catalog for Steven Spielberg as a producer only? Because it's not the right tool, whether anybody likes it or not. You don't use JSTOR for the latest news on Putin's election. Lexis-Nexis is not the best tool for in-depth research on biblical archaeology. You don't use a hammer for a ripsaw or a motorcycle for a pickup truck. People can learn this although they may not like it.

Re: [RDA-L] Card catalogue lessons

Posting to RDA-L

On 16/03/2012 15:42, Kevin M Randall wrote:
<snip>
James Weinheimer wrote:
Of course, I don't agree with this reasoning. I don't think it is essential. Adding the relator information is additional labor for no tangible gains. While I agree that the public has terrible problems with our catalog records, this would be ranked near the very bottom. Working on this distracts our efforts from the real problems with our catalogs.
Sorry, but contrary to being "near the very bottom", I believe that the lack of relators is one of the major problems with our catalog data. Maybe some users are happy to have to slog through statements of responsibility and notes in order to find out how a particular person relates to the resource being described, but why should they have to? Why should we expect the user, wanting to find things where Person X is acting as writer, or as editor, or as illustrator, or as publisher, or as performer, or as producer, or as director, or as composer, or as librettist, to have to get a result list that includes many things where Person X is *not* involved in the role being searched? How happy would we be with Internet Movie Database if we weren't able to have searches limited to a person's particular role? I know that *I* wouldn't be.
</snip>
That's fine. You can have your beliefs and I can have mine. But if relators are to be added, it will result in diverting resources from other things catalogers could be doing, such as raising productivity, adding more headings with the cap off of the rule of three, doing better subject analysis...

I only hope that nobody ever searches our catalogs for someone as an editor because they *will never* and *can never* get results they can rely upon. How much would it cost to add all of those relators for all of those millions of records? What an incredibly ironic waste that would be!

It would be much better to try to cooperate with projects such as IMDB in some way, or better yet: let people know that you don't search a library catalog for this kind of information, just as you don't search catalogs for lots of things, like journal articles, datasets, or most websites. Nothing wrong with that. It's just the wrong tool. I think people could understand, just like they seem to know that you don't search IMDB for the latest information on gall stones. Still, I have had people who want the latest news on some current political issue say that they should search JSTOR!

But if somebody can show that lack of relators is really important, and so important that it rises above all of these other possibilities, AND that our current records can be updated in some kind of way that will allow for at least semi-reliable results for the users, OK. It seems to me that sooner or later somebody should ask the patrons what they would prefer.

But we shouldn't just accept all these things without question as RDA would have us do, especially in the climate we have today. They still cannot show that it makes good business sense. It seems as if they don't care about the consequences to the people involved.

Re: [RDA-L] RDA/FRBR and the Business Case; Was:RDA as the collaboratively created way forward[?]

Posting to RDA-L

On 16/03/2012 14:47, Brenndorfer, Thomas wrote:
<snip>
The world is moving on and leaving FISO behind. For instance, "find" is turning into "search" which means creating an "intelligent agent" for our information needs. That is what Tim Berners-Lee wants and is one of the primary goals of the Semantic Web, and supposedly, one of the main reasons for RDA and FRBR in the first place.
No, that's flat out wrong-- the Semantic Web is about bringing back "find" because "search" is not enough. Quote from Tim Berners-Lee himself: "Does this mean that they [search engines] will start to absorb the whole RDF data model? If they do, then they will be able to start pulling all of the linked data cloud in. Will they know what to do with it? Because when it's data in a very organized form, I think some people have been misunderstanding the Semantic Web as being something that tries to make a better search engine - i.e. when you type something into a little box. But of course the great thing about the Semantic Web is that you can query it, you can ask a complicated query of the Semantic Web, like a SQL query (we call it a SPARQL query), and that's such a different thing to be able to do. It really doesn't compare to a search engine." http://www.readwriteweb.com/archives/readwriteweb_interview_with_tim_berners-lee_part_2.php Querying that kind of structure requires an entity-relationship model. That's why RDA was written to support the entity-relationship framework.
</snip>
Sorry, I disagree that I am flat-out wrong, but it's not a search engine that TBL envisions. It's an intelligent agent that does all the searching and sifting for you based on the totality of what is on the Semantic Web, including whatever is there about you. You might have to reset it once in a while but otherwise, it runs for you constantly. Search engines don't have much if anything to do with it. The intelligent agents do it all for you. Lots of people absolutely love this idea, but as I mentioned in the podcast, in a lot of ways, it gives me the creeps.

We don't know if this is what people will want. Nobody does. Perhaps after Google rolls out their Semantic Web tools, we may all get a better idea if it actually works or not. In the meantime, put our records in RDF so that they can be shared--that's fine. We don't need FRBR structures to do it. We should have done it long ago. Give it a try and see what happens. It may be useful and we'll learn a lot. Or it may not make any difference at all. As I said, nobody knows, so let's not bet too much on it.
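To make the "query versus search box" distinction from the quoted interview concrete, here is a small sketch. It is not SPARQL and not RDF: it is a plain-Python stand-in that stores subject-predicate-object triples in a list and matches patterns against them. The identifiers, the predicate names, and the data are all invented for illustration, and the statements are deliberately flat, with no work or expression entities.

```python
# Invented statements in subject-predicate-object form.
triples = [
    ("film:1", "title", "Example Film One"),
    ("film:1", "producer", "person:A"),
    ("film:1", "director", "person:B"),
    ("film:2", "title", "Example Film Two"),
    ("film:2", "director", "person:A"),
    ("person:A", "name", "Person A"),
    ("person:B", "name", "Person B"),
]


def match(pattern, triples):
    """Yield triples matching (s, p, o); None acts as a wildcard."""
    for triple in triples:
        if all(want is None or want == got for want, got in zip(pattern, triple)):
            yield triple


def titles_of(films, triples):
    return [o for film in films for _, _, o in match((film, "title", None), triples)]


def films_mentioning(name, triples):
    """Search-box style: any film connected to the name, role ignored."""
    people = [s for s, _, _ in match((None, "name", name), triples)]
    films = sorted({s for p in people for s, _, _ in match((None, None, p), triples)})
    return titles_of(films, triples)


def films_with_role(name, role, triples):
    """Query style: only films where the name holds the stated role."""
    people = [s for s, _, _ in match((None, "name", name), triples)]
    films = [s for p in people for s, _, _ in match((None, role, p), triples)]
    return titles_of(films, triples)


any_role = films_mentioning("Person A", triples)
as_producer = films_with_role("Person A", "producer", triples)
print(any_role)
print(as_producer)
```

The role-limited query works only because the role was recorded as a statement in the first place. Whether flat statements of this kind would be enough for library data in practice is exactly the sort of thing that can only be found out by trying it.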

Re: [NGC4LIB] Discovery and science's "new era"

Posting to NGC4LIB

On 12/03/2012 22:12, Laval Hunsucker wrote:
<snip>
In _Nature_ 478.7369 (on p.321), Chris Lintott concludes his review of Michael Nielsen's _Reinventing discovery : the new era of networked science_ (Princeton University Press, 2011) with the comments that the author "convinces us that radical change is a real possibility", and that this book "will frame serious discussion and inspire wild, disruptive ideas for the next decade."

Nielsen foresees a new scenario for creative scientific work, and for determining scientific success and recognition -- one in which, for example, the traditional system of scientific ( journal ) publishing does not, to put it mildly, play a decisive role. Scientific communication, and the course of scientific progress, are going to become a whole 'nuther ballgame, so to speak.

I was just wondering whether anyone on this list who has read Nielsen's book might have any comments on what he or she believes such a scenario may entail for how library and information services will ( have to ) adapt, and for the way in which they will ( have to ) function differently from the current situation. [ If, indeed, there will even still be a place for such services, if the scientific enterprise becomes so fluid, and only active scientists will be aware of what is actually going on. ] Can we look forward to "wild, disruptive ideas" for adapting research librarianship and information services for a radically new environment ?  Is that a kind of imminent "next generation" down the road ?

Or is Nielsen ( himself a physicist / computer scientist ) just a daydreamer, and Lintott ( an astrophysicist ) too naïvely credulous ?
</snip>

Thanks for pointing this out. I haven't read the book, but I found a talk of his and watched it: http://www.ustream.tv/recorded/2685625. Quite enlightening.

You ask a very good and provocative question: will there even be a place for librarianship in that kind of scenario? I would like to think so, but librarianship must be radically reconsidered, with a question something like this at its center: how do we facilitate the incredible collaborations that Nielsen mentions and that we are only beginning to see?

A few thoughts:
  • I can see that many for-profit companies would want to get involved since this could be monetized for them in all kinds of ways. Librarianship, with its ethics and values, could play a trusted role in that the scientists could be assured that the librarians would not be in it just to see how much money they can get out of it.
  • I do not believe that information can organize itself. Therefore, from these collaborations there will be information, documents, datasets, and so on that will need to be searched, navigated and referred to. All of this information will probably have associated information with it: e.g. the members of the project found a specific document to be: valid, interesting but not immediately useful, wrong, stupid, etc. This evaluative information would be good to capture as well.
  • Of course, there is the assumption that everyone involved will be able to find any information they need themselves. That may not be true--even if they know how, perhaps they will just be too busy to do it themselves--and they may have need to turn to an "information specialist".

I'm sure there are other possibilities as well.

Re: [ACAT] RDA Implementation Date Set

Posting to Autocat

On 15/03/2012 22:44, Brenndorfer, Thomas wrote:
<snip>
Yes, the RDA MARC fields now exist, and are being populated in authority records. The German National Library has been using RDA-like elements for some time now. And a major point about the FRBR elements is that they derive from the desire for international co-operation -- the new elements weren't invented whole cloth, but reflect realities on the ground in different countries and are there to facilitate co-operation and integration of bibliographic data (and that's DATA-- independent of the different record formats that house that data, which may not interoperate in different systems).

And March 2013 is not that far away for us to get started as soon as possible on the rest.

Apart from finding a workaround for the GMD, most of the immediate RDA changes resemble just a bundled upgrade of routine MARC code changes (which most ILS's have had time to incorporate at a basic level, at a minimum) and some long overdue AACR2 changes (which have been waiting ready to go for some time, and so catalogers have had ample time to learn them). Everything beyond that will be incremental and iterative, but the goal for co-operation and finding greater use for the data is still there.

As the Dr. Peter Chen video showed, there is a hunger in system developers to go to the next level. One part of the discussion was about the importance of good, robust, abstract data models to push the envelope in designing new tools to solve business problems. Many of the tools are there -- what's needed is the data built on a good model, and it sounds like there will be takers. http://channel9.msdn.com/Shows/Going+Deep/Dr-Peter-Chen-Entity-Relationship-Model-Past-Present-and-Future
</snip>
Well, I don't want to go overboard on this small point of agreement we have reached. For instance, "populating" authority records should not mean that our catalogers will do anything more than they are now. Expecting more is a sure way to eventual oblivion. And, I still see zero reasons for adopting RDA. The changes should be achieved by cooperating through our *systems*, and not through changing cataloging rules that will have no impact on our public.

Certainly our rules will need changing, but as I have tried to point out many times, we don't yet know how our rules need to change. Change our workflows. Change our formats. Change the definition of the "library's collection." Admit that our subject headings as they are now are a disaster and do something to get them to work again. Find a decent way to let people see the powers of the syndetic structures in our authority files. Fine. I am for all of those attempts. But absolutely no one--and I repeat *absolutely no one*--knows what the public needs or wants today, and what they will need and want in the future. So get into the Semantic Web. Great. Do it quick and dirty and cheaply. But don't expect too much from it: no matter who claims it is any kind of a solution, they are completely wrong because they *do not know*. They cannot, unless we believe in crystal balls. If somebody turns out to be right, it will just be luck. And because everything is changing so quickly, "the future" includes only five years from now.

The rules we have now definitely work for the purposes of librarians, just as they always have. Making our records useful for the public will take a lot of research, and that goal does not justify the expense of changing our records to those forms now. The FRBR structures are based on a 19th-century view of the information universe, and it will take quite a bit of work to show that those same structures are needed in the 21st century. We may find out that all of that is true, or not. The proof remains to be seen.

Still, that doesn't mean we need to stand still. See what can be done, and find out what the opportunities are through cooperation using the tools at our disposal today. Those possibilities are almost endless. Then, we can see what will be worthwhile to change. Certainly the changes proposed by RDA remain completely dubious, as so many of the official reports maintain.

Friday, March 16, 2012

Re: [RDA-L] RDA/FRBR and the Business Case; Was:RDA as the collaboratively created way forward[?]

Posting to RDA-L

On 15/03/2012 21:33, Casey A Mullin wrote:
<snip>
What I'm reading in Mr. Weinheimer's criticisms is not a rejection of FISO itself. (I personally find FISO so intuitive as to be axiomatic.) What he often addresses are not these tasks themselves, but the **methods** used to fulfill said tasks. To be sure, the left-anchored browse environment of the card catalog, where title, author and subject were the only methods of entry, is a far cry from the methods users have at their disposal today. Today's discovery environments offer a dizzying array of methods for users to encounter and interact with content. That much is certain. But these are simply innovative methods by which to fulfill more basic tasks, which FISO encapsulates pretty well. Such methods, however innovative, do not supplant those tasks or render them irrelevant; they in fact facilitate them.
</snip>
I am not rejecting FISO. What I am saying is that FISO is becoming like stone tools when there are all kinds of power tools available. The world is moving on and leaving FISO behind. For instance, "find" is turning into "search" which means creating an "intelligent agent" for our information needs. That is what Tim Berners-Lee wants and is one of the primary goals of the Semantic Web, and supposedly, one of the main reasons for RDA and FRBR in the first place.

I have tried to elaborate on this in some of my podcasts. "Search" will use all kinds of incredibly detailed information about you, and your friends, and their friends, and your browsing habits, and it will analyze unbelievably deeply everything you look at--the documents you read and write, your email, the webpages you look at, things that you, yourself, don't even know--and all of it will be used to provide you automatically with the information these algorithms determine that you "need". These are some of the facts of information today. They are happening right now and have been happening for quite a while, and a huge amount of wealth is at stake. At the same time, I believe that the vast majority of people will like these new tools, just as much as they like Google today, and these companies will make absolutely sure they are attractive and extremely simple to use. They will continually improve.

While I am personally very suspicious of all of this, many more prefer it and say we must embrace it. But as I mention in my podcasts and papers: it doesn't matter at all what I think. This is the world we are entering and I can't stop it. No one can, especially not librarians! Therefore, the choice is simple: we must find ways to adapt or not survive.

Once "find" has metamorphosed into something that is almost incomprehensible, the ISO part obviously becomes confused. Even today, when searching in Google, the only point where you can identify and select an item is after you have obtained it, which turns everything topsy-turvy. Again, this is a simple statement of fact and a few seconds of working with Google will show how true it is. This has been the case for well over a decade and is not going away. We begin to understand how the traditional FISO may actually be predicated on physical materials that are not immediately available, and may have very little to do with full-text materials that are available at the click of a button.

In my little cartoon of the conversation between the patron and the library catalog, I also tried to show that even in the past, ISO was overblown and people did that part at the shelves because the information in the catalog record distinguishing "manifestations" was essentially meaningless to them.

Therefore, FISO has been an ideal that has existed primarily in the minds of catalogers and has never corresponded all that closely with reality. Certainly, with full-text online materials, it must be rethought. Too bad perhaps, but absolutely necessary.
<snip> 
Now is not the time to question the basic premise through which our profession persists. Now is the time to double down on the distinct value we provide to the information universe: structure, validity, and intellectual rigor. This is what we do well. This is what lamentably few others in the information universe are providing. This is what is needed, more now than ever, for the continued advance of our civilization.
</snip>
I think that now is precisely the time to question the basic premises. If not now, when? While I have no doubt that our records do provide added-value that is found nowhere else, we must reconsider what that added-value really and truly is. Do we really think we can compete with "search"? If so, how? What do our records provide that "search" does not and will not? Where does our value-added really lie? I have tried to address some of this in my papers and podcasts, but they are only suggestions and may be totally wrong.

It seems to me that if we want to find out where our value-added is, then we must first face facts: admit that we don't know what the user tasks or the user needs are, and then try to discover what our patrons genuinely want.

Re: [RDA-L] Card catalogue lessons

Posting to RDA-L

On 16/03/2012 00:33, Brenndorfer, Thomas wrote:
<snip>
RDA's relationship designator is "defendant" to specify the relationship of the person to the work. Unlike anything prior, RDA tells it exactly like it is.
</snip>

Catalogers did this before. Here is an example of "defendant" in the Sacco and Vanzetti case: http://imagecat1.princeton.edu/cgi-bin/ECC/cards.pl/disk20/4810/K3568?d=f&p=VanWyck&g=29631.500000&n=89&r=1.000000&thisname=0000.0090.tiff

and there were lots of others. Here is "joint author" in the case of three authors: http://imagecat1.princeton.edu/cgi-bin/ECC/cards.pl/disk4/4230/J3081?d=f&p=Smith,+Arthur&g=26350.500000&n=8&r=1.000000&thisname=0000.0008.tiff, but there were lots more.

Our predecessors did all that work but decided that this information was of very marginal use for the public and therefore was not worth the effort. (Graphic arts retained some for their records.) In addition, there was the ISBD statement of responsibility (SR), plus guidelines to create notes when the SR was inadequate, and this provided much more specific information than the relator information.

It has not been demonstrated that the public needs this information in their searching any more than before. The only reason I can come up with (aside from purely theoretical, academic ones) is that in some of the possible futures of the Semantic Web, our metadata records will actually decompose and melt into the general "alphabet/semantic soup" and therefore, the link between the heading and the SR will be lost. This will happen especially in a WEMI data structure. Therefore, it is essential, when the SR is not so readily seen as in today's records, that the public will get a better understanding of the heading and what it really means in relation to this work/expression/manifestation (i.e. semantics).

Of course, I don't agree with this reasoning. I don't think it is essential. Adding the relator information is additional labor for no tangible gains. While I agree that the public has terrible problems with our catalog records, this would be ranked near the very bottom. Working on this distracts our efforts from the real problems with our catalogs.

Once again, if there were evidence that it does make such a major difference to the public, that would be one thing, but there has been nothing. We are all just supposed to simply believe it. Yet, I can't believe this will make a difference to anyone--especially when we will not be going back and adding relator codes to the millions of records we have now.

Re: [ACAT] RDA Implementation Date Set

Posting to Autocat

On 15/03/2012 21:06, Brenndorfer, Thomas wrote:
<snip>
Great! So you're on board in creating the elements that can receive and link to this data. That took a while, but it's heartening to see you start to see the light.

The only thing I would drop would be the "should learn" for people using the catalog, as that's inconsistent with your entire argument that people don't want to learn. People shouldn't have to learn too much, but go about their tasks with alacrity!
</snip>

I believe I have been consistently in favor of true cooperation (Cooperative Cataloging Rules?). My problem is: why not cooperate now, instead of first expecting catalogers to change all of their records, their cataloging rules, systems, etc.? What we saw with the Elvis example can be done right now. Why wait? Still, it is good that after all this time, we may have found some point of agreement.

I will also agree that my argument about people not learning what you can find in a catalog is rather inconsistent, and that is why I prefaced it with "for the sake of argument". Still, I am so crotchety that I believe people *should* know that there are different tools for different purposes and what they are for. That's why in my information literacy classes, I would compare learning the tools for research (catalogs, indexes, full-text journals, web search engines etc.) to learning the tools of a trade--specifically, how to be a butcher. (My father was a butcher.) The very first thing you learn is your tools and what each is for. And you do not use a boning knife for a cleaver, or vice versa! If you don't use a ground meat machine or a band saw just right, you may not go home, or you may make it home but minus a body part or two!

I don't know if it worked or not...

Re: [ACAT] RDA Implementation Date Set

Posting to Autocat

On 15/03/2012 20:00, Brenndorfer, Thomas wrote:
<snip>
Typical RDA authority record being created today:

046    ‡f19350108‡g19770816
053  0 ‡aML420.P96‡cBiography
100 1  ‡aPresley, Elvis,‡d1935-1977
370    ‡aTupelo, Miss.‡bMemphis, Tenn.
371    ‡aGraceland, 3764 Elvis Presley Boulevard (Highway 51 South)‡bMemphis‡cTenn.
374    ‡aAmerican rock and roll singer‡aguitarist‡aactor‡s1955‡t1977
375    ‡amale
400 1  ‡aPresley, Elvis Aron,‡d1935-1977
400 1  ‡aCrow, John,‡d1935-1977

Many of these new RDA elements (which are now pouring in with new and updated authority records-- RDA has already been implemented for many fields in authority records if not bibliographic records) are the types of elements used in disambiguating services as seen in Wikipedia or IMDB. One has to have the data, and data in the right form (see field 046 for dates) to make the new applications, and retrospectively adding data is quite feasible in many cases, as seen in the just posted PCC documents on RDA and authority records: http://files.library.northwestern.edu/public/pccahitg/index.html
</snip>
This is a great example of what I think is a truly missed opportunity. Does anybody really and truly believe that it is the best use of diminishing cataloger resources to rework this kind of information into the authority records we already have, or should we spend our time working on new resources? What would our patrons prefer?

Personally, I believe that if people want this kind of information, they should learn that a library catalog is definitely not the tool to use and never has been, much as you shouldn't use a screwdriver as a chisel, but for the sake of argument, let's say that we will include this information. We can either do it in the 19th-century way, updating everything manually line by line, record by record, although the same information exists in different databases online, or we could do it in a smarter, 21st-century way, using the power of our systems, not by duplicating labor already done, but by bringing together (among other sites) dbpedia.org http://dbpedia.org/page/Elvis_Presley and MusicBrainz http://musicbrainz.org/artist/01809552-4f87-45b0-afff-2c6f0730a3be/relationships and other things. They are both supposed to be in Linked Data. (angel choir!) So, if catalogers are manually updating the records that are "pouring in", it is truly a tremendous waste of cataloging resources!
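To make the "21st-century way" a little more concrete, here is a minimal sketch in Python of what I mean by letting the systems do the work. It is only an illustration: the function name, the field names and the hand-typed sample data are my own assumptions, and they are not the real DBpedia or MusicBrainz schemas or APIs. The point is only that the cataloger's heading stays untouched while the gaps are filled from data somebody else has already created.

```python
# Sketch only: the field names and sample values are illustrative
# assumptions, not the actual DBpedia or MusicBrainz schemas.

def enrich_authority(local, *external_sources):
    """Return a copy of `local` with gaps filled from external sources.

    Values already present in the cataloger's record are never overwritten;
    each borrowed value is tagged with the source it came from.
    """
    merged = dict(local)
    provenance = {}
    for source_name, data in external_sources:
        for field, value in data.items():
            if field not in merged or merged[field] in (None, "", []):
                merged[field] = value
                provenance[field] = source_name
    return merged, provenance


# The heading and cross-reference a cataloger has already established.
local_record = {
    "heading": "Presley, Elvis, 1935-1977",
    "variants": ["Presley, Elvis Aron, 1935-1977"],
}

# Hypothetical, hand-typed stand-ins for data fetched from the two sites.
dbpedia = ("dbpedia", {"birth_place": "Tupelo, Miss.", "birth_date": "1935-01-08"})
musicbrainz = ("musicbrainz", {"birth_date": "1935-01-08", "gender": "male"})

merged, provenance = enrich_authority(local_record, dbpedia, musicbrainz)
print(merged["heading"])  # the cataloger's heading is unchanged
print(provenance)         # which outside source supplied each added field
```

Recording where each borrowed value came from matters here, because it lets a system show or drop outside data without anyone editing the authority record by hand.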

So, how can we get all of this information to work together, possibly to build a brand new, cooperative tool?

I must say that it has relatively little to do with catalogers, and much more to do with systems people, and all that is needed is new formats, absolutely not new rules.

By the way, I looked at the MusicBrainz page and that page claims that Elvis had "relationships" with Tuesday Weld and Ann-Margret. That is impressive, but I believe there were probably more! Maybe that will be one of the future jobs for a cataloger: to complete those kinds of lists. :-)

Thursday, March 15, 2012

Re: [ACAT] RDA Implementation Date Set

Posting to Autocat

On 15/03/2012 18:08, Jennifer B Young wrote:
<snip>
Absolutely! Making our data more portable to the next iterations is what RDA is trying to start.
</snip>
I don't believe anyone is disputing the need to make our data more portable. The dispute is whether RDA or FRBR are needed to do it.

That has never been demonstrated. And it's been years of waiting. If implementing RDA and FRBR cost little or nothing, that *may* be one thing, but it will be very expensive, and beyond the abilities of many libraries. This is not a fake statement, as I think many believe, but is *really true*. What are these catalogers and their managers supposed to do?

At the very least, there should be some tangible advantages that those who will be responsible for asking for the training, subscriptions etc. can point to, to justify the outlays to their superiors, especially at this time. But nobody has done it. All anybody can talk about are some vague promises of the tremendous advantages that lie far off in the future. That is why I keep mentioning a business case which deals with practical concerns.

If there are no tangible justifications, and administrators are faced with these decisions of how to spend restricted budgets, I agree with Ann that many head librarians may decide to just outsource everything and have done with it. There need to be some very clear reasons for these administrators *not* to do that. Otherwise, it will have serious personal consequences for many people.

This is not being alarmist--just stating some facts.

Re: [ACAT] RDA Implementation Date Set

Posting to Autocat

On 15/03/2012 15:36, Brenndorfer, Thomas wrote:
<snip>
So, how does the following search not fulfill these criteria? http://www.worldcat.org/search?q=au%3Ahomer+ti%3Ailiad 
Several issues: The nature of the relationships between all the results in the search is not readily apparent. The facets function as the "shared characteristics" type of relationship, but there are many more types of relationships that are coded for, but lost in the mass of records (what about series, as just one example?). WorldCat's "Editions and formats" expansion under each record helps somewhat, but is very flakey. I select the "Alexander Pope" translation, but the subsequent results show "Samuel Butler" results. Next to useless.
</snip>
It works fine for me. When I click on Alexander Pope, I get books with Alexander Pope in the record. http://www.worldcat.org/search?q=au%3Ahomer+ti%3Ailiad&dblist=638&fq=ap%3A%22pope%2C+alexander%22&qt=facet_ap%3A. The indexing also takes into account the two forms "Alexander Pope" and "Pope Alexander." I am sure that adding a series would be a simple matter for the programmer but as we all learned, LC determined that users don't need series and that is why they eliminated series authority.
<snip>
And what goes into the title TI search? Variant titles found in authorities? What about when uniform titles aren't used when the title proper is supposed to be sufficient (as it was in the card catalog)? The text string "Iliad" could be part of a huge number of titles unrelated to the requested work. And so on.
</snip>
No, not so long as "Homer" is included. When there is no uniform title, it should be searching the normal title fields: 245, 246, etc. Naturally, I don't know exactly what it is doing in Worldcat, but I know that in Koha, it searched the Zebra index (Lucene-type indexes are flat files by the way) for instances of the title and correlated those instances with related instances of any other terms you search.
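For those who want to see why adding "Homer" takes care of the stray "Iliad" titles, here is a toy version of that kind of index in Python. It is a sketch under my own assumptions: the three records are made up, and a real Zebra or Lucene index is far more elaborate than a dictionary of sets. Each (field, word) pair points to the records that contain it, and a search with several terms keeps only the records found under all of them.

```python
from collections import defaultdict

# Made-up miniature records, for illustration only.
records = {
    1: {"author": "Homer", "title": "The Iliad"},
    2: {"author": "Homer", "title": "The Odyssey"},
    3: {"author": "Smith, John", "title": "An Iliad of our own"},
}

def build_index(records):
    """Map (field, word) pairs to the set of record ids containing them."""
    index = defaultdict(set)
    for rec_id, fields in records.items():
        for field, text in fields.items():
            for word in text.lower().replace(",", " ").split():
                index[(field, word)].add(rec_id)
    return index

def search(index, **terms):
    """Return ids of records matching every field=word term given."""
    hits = [index.get((field, word.lower()), set()) for field, word in terms.items()]
    return set.intersection(*hits) if hits else set()

index = build_index(records)
print(search(index, title="iliad"))                  # {1, 3}
print(search(index, author="homer", title="iliad"))  # {1}
```

The unrelated title drops out as soon as the author term is added, which is all that the "au:" plus "ti:" search is doing.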

Now, if you search "Iliad" without "Homer", you then have to click on "Homer" to get the same result, but that is pretty easy, too. http://www.worldcat.org/search?q=ti%3Ailiad (Once again, I believe "Format" takes up far too much space on the user interface, but that would be a very simple problem to fix. We discover that there are four forms of Homer's name in Worldcat, apparently.) This is wonderfully simple and completely logical. Very little or even no training at all is needed for the public to do this.
<snip>
So no, not "everything works". There's lots of room for improvement.
</snip>
Of course it works, that is, if you want to FISO/WEMI by their AT. Subjects in Worldcat could be vastly improved and are, as I have seen in other, similar sites. Still it can all be done right now. Isn't that interesting?

But yes, there is always room for improvement. Still, so long as catalogers create records correctly and consistently, such as adding uniform titles when required, things actually do work.

What do I think people would *really* like? For those who can deal with ebooks, whose numbers will doubtlessly increase, I think people would love to know, when seeing a record like this: http://www.worldcat.org/oclc/669713496 to be aware of materials like these: http://www.archive.org/search.php?query=iliad%20homer%20alexander%20pope where they can get it all for free. I can assure you that on a tablet, those scans look pretty good and are easy to read.

Yes, there are problems with the metadata in the Internet Archive, for instance I found one publication not by Pope, but that is another matter. Something like this could actually make a real difference in someone's life. I would have been overjoyed to know about all of these wonderful materials.

Re: [ACAT] RDA Implementation Date Set

On 15/03/2012 14:05, Brenndorfer, Thomas wrote:
<snip>
RDA Chapter 17, which covers the primary relationships between work, expression, manifestation, and item, is quite problematic in MARC. This is because MARC is a carrier for a collocated flat file construct.

Library of Congress in its policy statements has abandoned use of Chapter 17: "Do not apply chapter 17 in the current implementation scenario."

What does Chapter 17 allow:
"The data recorded to reflect primary relationships should enable the user to:
a) find all resources that embody a particular work or a particular expression
b) find all items that exemplify a particular manifestation."

Regardless of whether or not a user "wants" all the resources embodying a work, the system should at least allow the user to find them.

Similarly, a user may not want every single copy exemplifying a manifestation. They will want the copy that has these attributes: "checked in", "at this location", "non-reference". They are identifying and selecting copies when they do this. It's not a big deal, but the system should be designed to support that user task. As for the broader FRBR entities --- NO!! --- many current implementations __DO NOT__ do FRBR well, even though the objectives are plainly obviously in their usefulness.
</snip>
So, how does the following search not fulfill these criteria? http://www.worldcat.org/search?q=au%3Ahomer+ti%3Ailiad

If catalogers have done their work correctly and added the proper uniform title somewhere in the record, this search will retrieve the entire "work" of Homer's Iliad, and it will do even more since it will pull out, as some call it, the "superwork." It will include all related works, such as the movie Troy with Brad Pitt. You can also limit by any expressions. If I want Pope's translation, I just do a simple click. True, getting actual manifestations, except by date, can't be done in this interface but I have no doubt that could be done pretty easily by adding "publisher", if it was desired.
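There is nothing mysterious in that search URL, either. As a small sketch in Python (the function name is my own, and the only thing taken from Worldcat is the "q" parameter with its "au:" and "ti:" prefixes as they appear in the URL quoted above), this is how such a fielded search can be put together and taken apart again:

```python
from urllib.parse import urlencode, parse_qs, urlsplit

def worldcat_search_url(author, title):
    """Build a fielded search URL of the form quoted in this post.

    Only the `q` parameter with the `au:` and `ti:` prefixes is taken from
    that URL; nothing else about Worldcat is assumed.
    """
    query = f"au:{author} ti:{title}"
    return "http://www.worldcat.org/search?" + urlencode({"q": query})

url = worldcat_search_url("homer", "iliad")
print(url)  # http://www.worldcat.org/search?q=au%3Ahomer+ti%3Ailiad

# Decoding the URL again recovers the fielded query a person would type.
print(parse_qs(urlsplit(url).query)["q"][0])  # au:homer ti:iliad
```

Any system that can build a link like this from an author heading and a uniform title can send people to the whole "work" with one click.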

It can be improved, perhaps by somehow adding "Selections" as a refinement of the uniform title. This search even picks up "Homerus", so perhaps some level of fuzzy matching is included now.

I think these capabilities are absolutely great--it is hard to imagine anything much simpler and easier, it needs practically no training and everything works with the records as they are now. These are the sorts of capabilities that demonstrate the way systems development should work. Certainly it can all be improved, but there is nothing wrong with that.

It seems that anything that FRBR envisions would only copy these capabilities in some way. FRBR/RDA may eventually create something as good and as simple... as what we see today. Of course, the kind of indexing we see in Worldcat will continue to develop as well.

Re: [ACAT] RDA Implementation Date Set

On 14/03/2012 22:09, Kevin M Randall wrote:
<snip>
FRBR implies no such thing at all. The only way to find such an implication is by misreading FRBR. Because FRBR does. not. deal. with. display. Period. How much more clearly can that be said?
</snip>

Well, that kind of argument certainly convinces me! I guess that ends this topic for everyone forever ....  :-)
<snip>
FRBR will help by telling the catalogers, metadata technicians, catalog/discovery layer designers, etc. what kinds of things are contained in bibliographic metadata and how these things might be related to each other. RDA applies the FRBR model of bibliographic data in defining specific elements and specific relationships. It is up to the catalog/discovery layer designers to take those metadata elements and figure out the best ways to make the metadata function so the resources are findable by the users; FRBR doesn't say anything about how to do this, and it was never meant to.
What will people be able to discover that they are not able to find today? What will be so incredibly different from what we have now?
Our current metadata, as currently coded in MARC records, are a hopeless jumble in comparison to the elements defined in RDA. People will be able to discover *so* many more things than they can now in our current catalogs. (Well, I suppose they *could* discover them now, if they wanted to spend half their lives trying to make the connections between things that are only implicit in many cases, if they are there at all.)
Yes, we can go into the linked data universe. Big deal, as my podcast discusses. We can do that now, anyway. 
We can't get very far with the current state of the metadata.
</snip>
All of this, even though it can be demonstrated that our MARC records are not a hopeless jumble. Sure, I have nothing against getting rid of MARC format--that's fine. Changing the format should have been the first step, but some have complained about my suggestion that we merely abandon ISO2709!

Still, please demonstrate how current systems do not allow all of what you are stating. Nobody is proposing any new "access points"--there will be the same authors, titles and subjects as always. Yes, some of the relationships may become more explicit, but it remains to be demonstrated that additional detail in bibliographic relationships will be of any genuine use to the public, while the millions of records we currently have will never be upgraded. On the other hand, I believe that improving subject analysis, perhaps by analyzing down to 10% instead of the current 20%, may be useful for the public--but first we would have to make our subject headings themselves actually function in the web environment, because they don't work today. Subjects are a different area, however.

Once again, I see a faith in some sort of wonderful future that is extremely vague; that FRBR and Linked Data are genuine solutions to the problems we face. At least I do not share that faith and prefer to be shown.

Re: [ACAT] RDA Implementation Date Set

On 14/03/2012 22:58, Tim Skeers wrote:
<snip>
I have long dreamed of producing a Bollywood-style Ranganathan biopic, perhaps starring Shah Rukh Khan in the subject role. The centerpiece would be a spectacular song and production number based on the Five Laws. Of course I would need, at minimum, a Hindi-speaking collaborator and a good choreographer.  The RDA/FRBR opera might best be realized through another joint effort by Robert Wilson and Philip Glass, although its length would probably greatly exceed the five hours of Einstein on the Beach. 
</snip>

I can't do that, but how about this in the meantime?

RDA Implementation: Day One Celebration
http://www.grapheine.com/bombaytv/illustrateur-en-045334482cadac9afce5e6510f25d515.html