Wednesday, June 6, 2012

Re: [RDA-L] Mini Tutorial: Keeping order in RDF and ISO Common Logic/IKL (was Re: [RDA-L] "Work manifested" in new RDA examples)

Posting to RDA-L

On 06/06/2012 02:43, Simon Spero wrote:
In situations where only some authors are given numeric rank, and the rest are ordered by some other principle (e.g. lexicographic order, or no order specified), we can just state what the constraints on authorship are, and leave the ordering to be determined by the computer. We could then indicate that JohnSmith was principal investigator, that no one goes behind Golgo 13, and the relative contributions of all authors, then calculate appropriately ordered lists of authors based on context (which might be that of the query, or that of the work, or some other set of rules).
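
A minimal sketch of what "leaving the ordering to the computer" might look like, assuming the constraints are reduced to simple "A precedes B" pairs and that ties fall back to lexicographic order. The function name `order_authors`, the constraint format, and the two extra author names are illustrative assumptions, not anything taken from RDF, Common Logic, or IKL.

```python
import heapq

def order_authors(authors, before):
    """Order authors so that every (a, b) pair in `before` has a ahead of b.

    Authors not separated by any constraint are placed in lexicographic
    order. This is a plain topological sort (Kahn's algorithm) with a heap
    as the tie-breaker; it is only one possible reading of the constraints.
    """
    successors = {a: set() for a in authors}
    indegree = {a: 0 for a in authors}
    for a, b in before:
        if b not in successors[a]:  # ignore duplicate constraints
            successors[a].add(b)
            indegree[b] += 1

    ready = [a for a in authors if indegree[a] == 0]
    heapq.heapify(ready)
    ordered = []
    while ready:
        a = heapq.heappop(ready)  # smallest name among unconstrained authors
        ordered.append(a)
        for b in successors[a]:
            indegree[b] -= 1
            if indegree[b] == 0:
                heapq.heappush(ready, b)

    if len(ordered) != len(authors):
        raise ValueError("constraints contradict each other")
    return ordered

# Hypothetical author set: only JohnSmith and Golgo13 come from the message.
authors = ["Golgo13", "JohnSmith", "AliceJones", "BobBrown"]
# JohnSmith (principal investigator) precedes everyone else.
before = [("JohnSmith", a) for a in authors if a != "JohnSmith"]
# No one goes behind Golgo13.
before += [(a, "Golgo13") for a in authors if a != "Golgo13"]

print(order_authors(authors, before))
```

Only the constraints are stored; the list itself is derived on demand, so a different context (a query, a citation style) could supply a different tie-breaking rule without touching the recorded statements.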

This is where the advantages of representing data as logical propositions, rather than as strings, should become immediately obvious to anyone who has ever done work on scientometrics. Also, many people may be disappointed to learn that their college courses in philosophy might turn out to be of practical use.

It should be clear why no one should reasonably expect catalogers to enter this sort of information directly. It should also be clear that the Rules for a Knowledge Base need to be developed with direct input from Subject Matter Experts who understand the theory behind the practice. Most important of all, it ought to be obvious that any new Bibliographic Framework needs to consider all the changes to work flows and practice that can be helped or hindered by different choices, and which cost/benefit tradeoffs need to be made.

A couple of points here. First, if there is an order imposed, it should possibly be based on the manifestation instead of on the work. I have seen author order moved around on different manifestations, and that change should probably not constitute a new work.

But second, the question should not be "It should be clear why no one should reasonably expect catalogers to enter this sort of information directly" but rather what the catalog can actually provide. Since there are literally millions of records that do not have the t.p. order in the encoding (it is only in the statement of responsibility), any search that limits to "order on t.p." (or whatever) will necessarily return only the set of records that have that information, i.e. a tiny, tiny percentage. This is similar to the earlier thread on "Card catalogue lessons", where there are unavoidable (and probably insurmountable) practical issues with adding the relator codes. Sure, you can do it, but it doesn't solve anything for the *user.* Here is one of my postings.

If we were building a catalog from scratch, I would agree that almost anything can be done. Or if we were dealing with a corporate database, or almost any other type of database except a library catalog, we could perhaps get away with just archiving all the old records and starting a brand new database. But the fact is, the greatest value that catalogers have now is precisely this huge database that has been built up over many, many years by our predecessors. Libraries do not have the same options as businesses, which often consider anything over 5 or 10 years old to be semi-obsolete information and less valuable. Libraries are different in this way.

So, the first thing that someone who wanted to do scientometric research using library catalog data would have to understand is: it *cannot* work. Why? Because that information has never been input. *Any* results they got would be fatally tainted. As I mentioned in the previous thread, searching for Mary Pickford *as a film producer* will retrieve zero, which would be a false result because you can find her without limiting to film producers. How can you possibly explain that away?
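
The effect can be shown with a toy example. This is a sketch under stated assumptions: the record layout, the `search` function, and the two records are all hypothetical stand-ins for legacy records that carry a name heading but no relator term.

```python
def search(records, name, relator=None):
    """Return ids of records with `name` as a heading.

    If `relator` is given, only headings carrying that exact relator term
    match, so headings with no relator recorded are silently excluded.
    """
    hits = []
    for rec in records:
        for heading, role in rec["names"]:
            if heading == name and (relator is None or role == relator):
                hits.append(rec["id"])
                break
    return hits

# Hypothetical legacy records: the heading is present, the relator is not.
records = [
    {"id": "legacy-1", "names": [("Pickford, Mary", None)]},
    {"id": "legacy-2", "names": [("Pickford, Mary", None)]},
]

print(search(records, "Pickford, Mary"))                   # both records
print(search(records, "Pickford, Mary", "film producer"))  # nothing
```

The second search returns nothing even though both records concern her, and the user has no way to tell "no such records" apart from "the role was never recorded".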

If there were the necessary funding in place to pay people to update the information in the records that already exist, that might be another matter, but I have heard of nothing like this. It would be a tremendous waste of money to do so anyway when so much else needs to be done.

I believe we are in a very delicate time right now. Libraries should be very careful to avoid setting themselves up for failure.

