On Fri, 28 Jan 2011 13:21:44 +1100, Bill Constantine wrote:
I am trying to reconcile data from different data sources to search and display and sort consistently.
One problem I am having is that some data sources, e.g. MARC, can handle initial articles nicely. Other data sources, I have been told, should be able to do it easily, but tying down a programmer to go back into what they thought was a perfectly good program is difficult. And yet other data sources don't seem to have the capability to do anything in regard to initial articles.
Should I persist with wrestling with initial articles or are these an artefact of the old days?
My own opinion is that *if* we want to cooperate with other bibliographic/metadata-type organizations, it will mean change. When people search the "new and cool" information tools, e.g. Amazon, Google, the Internet Archive, there are many sort options, but "sort by title" does not seem to be one of them. I can't remember (or quickly find) such an option in those databases, although there are options for "relevance," date, ratings, etc.
Programming it to work automatically is one of those tasks that seems simple enough to begin with, but can make your heart sink once you start to get into it. *If* all you have in your catalog are, e.g., English-language records, then it may be pretty simple, but if you have a more expansive catalog, matters get much more complex. For example, here is Princeton's list of initial articles: http://library.princeton.edu/departments/tsd/katmandu/catcopy/article.html. The same word can be an article in one language and an ordinary word in another, so I don't think such a task can be fully automated.
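To illustrate where it gets complex, here is a rough Python sketch, not anything a real system does. The article lists are a tiny made-up sample (nowhere near Princeton's list), and the function name and three-letter language codes are my own assumptions. It strips a leading article only when the record's language says the word is one, which is why "Die Zeit" files under Z while "Die Hard" stays under D.

```python
# Hypothetical, deliberately tiny article lists keyed by a language code.
# A real list would be far longer and would still have ambiguous cases.
ARTICLES = {
    "eng": ("the", "a", "an"),
    "ger": ("der", "die", "das", "ein", "eine"),
    "fre": ("le", "la", "les", "un", "une"),
}

def sort_key(title, lang):
    """Return a lowercase filing key, dropping a leading article
    only if the first word is in the list for the given language."""
    words = title.split(None, 1)  # first word and the rest
    if len(words) == 2 and words[0].lower() in ARTICLES.get(lang, ()):
        return words[1].lower()
    return title.lower()

titles = [("Die Zeit", "ger"), ("Die Hard", "eng"), ("The Hobbit", "eng")]
sorted_titles = sorted(titles, key=lambda t: sort_key(t[0], t[1]))
```

Note what this depends on: every record has to carry a reliable language code for the title, and the sketch ignores elided forms such as the French "L'", titles in a language other than that of the text, and proper names that begin with an article-like word. Those are exactly the cases that make me doubt a fully automatic solution.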
So, to maintain consistency, everything would have to be coded manually, which seriously limits the possibilities for cooperation. By this I mean that we could not accept a record from another database that ignored initial articles, even though it might conform to some kind of future standard for linked data, because somebody would have to code the initial articles.
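For comparison, this is roughly what the manual coding buys you in MARC: the second indicator of field 245 records the number of nonfiling characters (0-9), so a program only has to skip that many characters and needs no article list at all. A minimal Python sketch, assuming the title string and the indicator have already been pulled out of the record (the function name is hypothetical):

```python
def filing_title(title, second_indicator):
    """Skip the number of nonfiling characters given by the
    245 second indicator; treat a blank or non-digit as 0."""
    skip = int(second_indicator) if second_indicator.isdigit() else 0
    return title[skip:]

examples = [
    filing_title("The Hobbit", "4"),  # cataloger counted "The " as nonfiling
    filing_title("Die Zeit", "4"),    # German article, coded by hand
    filing_title("Die Hard", "0"),    # not an article here, so nothing is skipped
]
```

The program is trivial precisely because a person made the decision for each record. A record from a source that never set the indicator gives the program nothing to work with.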
I think this is one of those decisions that demands research and analysis. The practice may be an anachronism, a holdover from printed and card catalogs, or perhaps not. Still, it is my opinion that something like this should not get in the way of efficient cooperation with other metadata communities. If an automated solution could be found, OK, but cooperation and efficiency are far more important. (By the way, a title-*added* entry was rarely made in the old, old days. When the title was the main entry, it was normally traced, but sometimes not even then.)