Saturday, January 28, 2012

Considerations on Linked Data (Was: Showing birth and death dates)

Posting to RDA-L

On 27/01/2012 22:47, Tillett, Barbara wrote:
ISBD came out of the card catalog environment and was a tremendous tool when we were exchanging bibliographic records. We are no longer exchanging catalog cards. "Exchange" is being replaced by "re-use" of data in environments that can access a shared database (think of the way many of us use OCLC or SkyRiver). We are moving on to accessing records in shared datastores or through web services in the cloud, hopefully saving a lot of time and effort of catalogers by sharing the workload to create descriptions that can be augmented over time and maybe eventually eliminate the need for our multiple, redundant, local databases.

Even better will be when we can move beyond MARC and use linked data with URLs to identify entities and then display whatever language/script the user wants. We have seen the proof of that concept with VIAF, the Virtual International Authority File.

This is a nice view of one possible future, but I do not see what bearing it has on the choice between "1978-" and "born 1978". I agree with an earlier post that called the difference more or less pointless. So if it's pointless, why change our current practices? The change only adds complexity, since people will be seeing "1978-" for a long, long time, just as they will be seeing [s.l.], [s.n.], [et al.] and so on forever, because the abbreviations in the old records will never, ever change. At least I hope they won't be changed, since projects to change those abbreviations would be the biggest waste of cataloging resources I could imagine, even after the economic environment improves. (I have mentioned before that it is a rather simple task to program the computer to render these abbreviations however we want automatically--so long as they are input consistently. Once the consistency goes away, it becomes much harder.)
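To make the parenthetical point concrete, here is a minimal sketch in Python of rendering consistently input abbreviations at display time, leaving the stored records untouched. The function name, the mapping, and the expanded wordings are all hypothetical choices made for illustration; they are not taken from any cataloging system or prescribed by any rule.

```python
import re

# Hypothetical display mapping; the expanded wordings are illustrative only.
EXPANSIONS = {
    "[s.l.]": "[place of publication not identified]",
    "[s.n.]": "[publisher not identified]",
    "[et al.]": "[and others]",
}

# Matches an open-ended date such as "1978-" at the very end of a heading.
OPEN_BIRTH_DATE = re.compile(r"\b(\d{4})-$")


def render_for_display(field: str) -> str:
    """Return a display form of a field; the stored text is not modified."""
    # Exact string replacement: this only works when the input is consistent.
    for abbreviation, expansion in EXPANSIONS.items():
        field = field.replace(abbreviation, expansion)
    # Turn a trailing "1978-" into "born 1978"; closed ranges are left alone.
    return OPEN_BIRTH_DATE.sub(r"born \1", field)


if __name__ == "__main__":
    print(render_for_display("Smith, John, 1978-"))
    print(render_for_display("[s.l.] : [s.n.], 1950"))
    # An inconsistently keyed form such as "[sn]" passes through unchanged.
    print(render_for_display("[sn], 1950"))
```

The last call shows why consistency matters: a simple lookup cannot recognize a variant it was never told about, so every inconsistent form would need its own special handling.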

Abbreviations are some of the simplest parts of our catalogs. If people really do have such problems with abbreviations (and I have never seen any research demonstrating it), how are these same people handling the hard parts of our catalogs, such as subject access? Perhaps improving the harder parts of the catalog would have a greater impact on the public.

But concerning linked data:
Accessing bits and pieces of bibliographic records in the cloud using URIs may be a good idea, or maybe not. Eliminating the need for multiple, redundant local databases may also be a good idea, or maybe not. There are many questions that would need to be decided before entering into such an arrangement. One of the most critical involves intellectual property. I think we all know that struggles over intellectual property are becoming more complicated and more intense as the internet grows and becomes more important in each person's life.

In a linked data universe, intellectual property rights become garbled, so it seems to me that if you rely on another agency for critical parts of your records, you may not "own" those parts. For example, in an FRBR universe, what if your work and expression parts come from another agency, and all that is local are your manifestation and item records? That other agency then has tremendous power over you. The relationship would therefore have to be made very, very clear, so that the agency you rely on could not suddenly decide to shut you down, demand a bunch of money from you, or start telling you what you can and cannot do.

Whenever I look at the famous diagram, with DBpedia in the center of the linked data universe, I wonder: what if DBpedia disappeared or started demanding money to continue operations?

And we shouldn't reply that nothing like that could ever happen, because we all know that it can. Many libraries (and librarians) have already been seriously burned by losing rights to scanned images of materials in their own collections--losing the rights to their own metadata would just be too ironic!

This is, or at least should be, such an important consideration that I personally do not know whether the linked data concept, although very nice and convenient in theory, will prove all that great once it is put into practice. I remain highly skeptical until this is resolved.

There are many other practical issues with linked data as well, though perhaps none quite so vital as this one.
