Monday, November 15, 2010

RE: All our eggs in one basket?

Posting to RDA-L

Karen Coyle wrote:
We do not have a single source of data today. We have publisher web sites, Books in Print, publisher ONIX data, online booksellers, Wikipedia, LC's catalog, WorldCat, thousands of library databases, and millions of citations in documents.
There is the question of "is this data authoritative?"...
Also, if the informational world were amenable, a lot of this information *could* come from the item itself. For example, metadata could be harvested from the <meta> fields of a web page. See, as an example, the metadata in the Slavic Cataloging Manual, now at Indiana University. Look at the "Page Source" (usually found under "View" in most browsers) and you will see some metadata for this item. Spiders could be configured to harvest this data.
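As a rough illustration of what such a spider might do with one page, here is a minimal Python sketch using only the standard library. The sample page and its field values are invented for the example (they are not the actual metadata of the Slavic Cataloging Manual), and the names MetaHarvester and harvest_meta are hypothetical.

```python
from html.parser import HTMLParser


class MetaHarvester(HTMLParser):
    """Collect name/content pairs from the <meta> elements of a page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # Some pages label the field with "property" instead of "name".
        name = attr.get("name") or attr.get("property")
        content = attr.get("content")
        if name and content is not None:
            # Lower-case the name so "DC.title" and "dc.title" match;
            # keep a list because a field may legitimately repeat.
            self.fields.setdefault(name.lower(), []).append(content)


def harvest_meta(html_text):
    """Return a dict mapping each meta field name to its list of values."""
    parser = MetaHarvester()
    parser.feed(html_text)
    return parser.fields


# Invented sample page, standing in for a real "Page Source".
SAMPLE_PAGE = """<html><head>
<title>Example manual</title>
<meta name="DC.title" content="Example Cataloging Manual">
<meta name="DC.creator" content="Example, Author">
<meta name="keywords" content="cataloging, metadata">
</head><body></body></html>"""

fields = harvest_meta(SAMPLE_PAGE)
for name, values in fields.items():
    print(name, "=", "; ".join(values))
```

A real spider would also have to fetch the pages, respect robots.txt, and decide which of the harvested fields it trusts, which is the "authoritative" question again.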

Or, in an XML document, a lot of this could come from the information itself, e.g. a title of a book could be encoded as "245a" or "dc.title" (although I would like some way to distinguish a title proper). The ISBD principle of exact transcription would fit in perfectly. Also, as information is updated, the updates could be reflected everywhere immediately.
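To make the "245a" versus "dc.title" point concrete, here is a small Python sketch that pulls a title out of each kind of encoding. The two sample records are invented; the namespace URIs are the standard ones for MARCXML and for the Dublin Core element set, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {
    "marc": "http://www.loc.gov/MARC21/slim",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Invented MARCXML record: the title proper sits alone in 245 $a.
MARC_SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">An example title :</subfield>
    <subfield code="b">with a subtitle /</subfield>
    <subfield code="c">by An Author.</subfield>
  </datafield>
</record>"""

# Invented Dublin Core record: one undifferentiated title string.
DC_SAMPLE = """<metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>An example title : with a subtitle</dc:title>
</metadata>"""


def title_from_marcxml(xml_text):
    """Return the text of 245 $a, or None if the record has none."""
    root = ET.fromstring(xml_text)
    node = root.find(
        ".//marc:datafield[@tag='245']/marc:subfield[@code='a']", NS
    )
    return node.text if node is not None else None


def title_from_dc(xml_text):
    """Return the text of the first dc:title, or None if absent."""
    root = ET.fromstring(xml_text)
    node = root.find(".//dc:title", NS)
    return node.text if node is not None else None


marc_title = title_from_marcxml(MARC_SAMPLE)
dc_title = title_from_dc(DC_SAMPLE)
print(marc_title)
print(dc_title)
```

In this sample the MARC encoding isolates the title proper in its own subfield, while the Dublin Core element carries title and subtitle as a single string, so the title proper could only be recovered by guessing at the punctuation.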

The mechanics for much of this exist right now. The main problem is that there is very little agreement over coding or over how data is entered. For example, see almost any NY Times article, and look at the <meta> fields there. This can give an idea of the possibilities, as well as the challenges in getting control of all of this.
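The lack of agreement over coding can be sketched as a crosswalk problem. In the Python example below, the mapping table is purely illustrative: the field names are ones commonly seen in web pages, not taken from any NY Times article, and the two "sites" and their values are invented.

```python
# Illustrative crosswalk from site-specific <meta> names to common fields.
# Every real harvesting project would need its own, much longer table.
CROSSWALK = {
    "dc.title": "title",
    "og:title": "title",
    "citation_title": "title",
    "dc.creator": "creator",
    "author": "creator",
    "citation_author": "creator",
}


def normalize(harvested):
    """Split harvested fields into (mapped record, unmapped leftovers)."""
    record = {}
    unmapped = {}
    for name, value in harvested.items():
        key = CROSSWALK.get(name.strip().lower())
        if key is None:
            # No agreed meaning for this field: keep it aside for review.
            unmapped[name] = value
        else:
            record.setdefault(key, []).append(value)
    return record, unmapped


# Two invented sites describing the same kind of thing differently.
site_a = {"DC.title": "An example article", "DC.creator": "Example, Author"}
site_b = {
    "og:title": "An example article",
    "author": "Author Example",
    "section": "Books",
}

record_a, leftover_a = normalize(site_a)
record_b, leftover_b = normalize(site_b)
print(record_a, leftover_a)
print(record_b, leftover_b)
```

Even after the names are mapped, the values still disagree in form ("Example, Author" versus "Author Example"), which is the input problem, as distinct from the coding problem.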

