Posting to RDA-L
On 05/03/2011 04:18 PM, John Hostage wrote:
Although any string could conceivably be used as a “code”, Mac demonstrates the difficulty of using such strings as codes. If it depends on entering punctuation and capitalization correctly, it is unlikely to be “data that has been entered consistently” over many years. (OCLC has 4483 records with “S.1.”, i.e. the numeral one, in the place of publication.)
Part of the problem is that RDA can’t decide if it’s about machine-actionable metadata or catalog records. If it said that if the place of publication is not identified, enter a code like “xx” or “00” or whatever, that would be more likely to be entered consistently and simpler for exchange of data, and it could be used to generate whatever was wanted for display, whether “[S.l.]” or “Place of publication not identified” or something in another language.
Sadly, I don’t know that anyone has systems that are capable of such transformations.
Text strings are actually extremely easy for a computer to manipulate: a program can handle whether the text is capitalized or not, whether there is or is not a space, and so on. This is child’s play, especially when you are talking about the very small number of abbreviations used by catalogers in catalog records. The concern is that the text is entered consistently, and everybody understands that “consistently” also allows a certain amount of leeway. Almost all the variations on the abbreviations can be predicted, and for the few that we can’t predict, e.g. typos in an abbreviation, we are no worse off than we are now.
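To make this concrete, here is a minimal sketch in Python of the kind of matching I mean. It is only an illustration under my own assumptions: the names EXPANSIONS, PATTERN and expand_abbreviation are made up for this example, the display wording is just one possibility, and a real conversion would need to be tested against actual records. It accepts the predictable variants (brackets or not, upper or lower case, a space after the first period, a missing final period, and the numeral one typed for the letter “l”) and leaves everything else untouched.

```python
import re

# Hypothetical lookup table: canonical key -> text to display.
EXPANSIONS = {
    "sl": "Place of publication not identified",
    "sn": "Publisher not identified",
}

# Optional brackets, "S" in either case, a period, optional space,
# then l / n (either case) or the numeral 1, and an optional final period.
PATTERN = re.compile(r"\[?\s*[Ss]\.\s*([LlNn1])\.?\s*\]?")


def expand_abbreviation(value):
    """Return the display text for an S.l./s.n. variant, else the value unchanged."""
    # Drop surrounding spaces and trailing ISBD punctuation such as " :".
    core = value.strip().rstrip(":;,").strip()
    match = PATTERN.fullmatch(core)
    if match is None:
        # Not one of the predicted variants: leave the data exactly as entered.
        return value
    letter = match.group(1).lower()
    # Treat the numeral 1 as a mistyped letter "l".
    key = "sl" if letter in ("l", "1") else "sn"
    # Note: on a match, the trailing punctuation is dropped from the result.
    return EXPANSIONS[key]


if __name__ == "__main__":
    for sample in ["[S.l.]", "[s.l.] :", "S. l.", "[S.1.]", "[s.n.]", "London :"]:
        print(repr(sample), "->", repr(expand_abbreviation(sample)))
```

Because the whole field value has to match, a real place name such as “London :” passes through as entered, and an unpredicted typo simply stays as it is, which is the “no worse off than we are now” case.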
I think it’s important to concentrate on what can be done now, today, with the data and tools that we have, and not put our hopes on something that we *might* have 5, 10 or 20 years from now. In any case, as I have mentioned several times before, changing the cataloging rule for spelling out abbreviations *will not* prevent our patrons from having to know what our cataloging abbreviations mean: since we will not be doing a huge retrospective conversion of them, our patrons will still see the abbreviations in *every single search* they do until the end of time.
If we want to try to “solve” the abbreviations problem, it absolutely must be done through automated means. Either that, or we have to hire flotillas of “data entry” people to manually retype all of the abbreviations. If we are going to spend that kind of money on people, I think we could all figure out better places to put them to work.
This is simple reality.