Friday, January 14, 2011

RE: Browse and search RDA test data

Posting to RDA-L

Jonathan Rochkind wrote:
<snip>
Many ILS use the MARC _schema_ (aka "vocabulary", aka "list of fields and subfields") as their internal data model, if not the serialized transmission format. The MARC 'schema' is kind of implicit, defined as a byproduct of the transmission format, which is in part what makes it so cumbersome to deal with.
And, unfortunately, it's actually the schema, NOT the transmission format, that is a problem with MARC. It is, as everyone keeps saying, easy enough to change the serialized transmission format to something else (MarcXML, a tab-delimited spreadsheet, even RDF (based on marc tags!) if you want, no problem) -- which is exactly why it's not a barrier. The barrier is the lack of power in the actual 'vocabulary' -- a flat list of numeric tags, each of which has a single flat list of no more than ~35 single-character subfields. And that is somewhat harder to change across an ecosystem developed assuming it.
</snip>
I completely agree. It's just that I consider this step #2. If we switch our focus to providing MARCXML as the primary transmission format for our records, we will still be stuck with a completely flat everything--which is bad--but it could probably be done without much pain, and the records would at least be in XML, where we, and hopefully others in the world, can gain a bit of flexibility and begin to play with them in all kinds of different ways, especially compared with what we have now.
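
To make the "flat everything" point concrete, here is a rough sketch in Python. The record is made up for illustration, the helper names (to_marcxml, to_tsv) are my own, and the leader and control fields are left out, so please treat it as a toy rather than as how any real system does it. It writes the same tag/indicator/subfield list once as MARCXML and once as tab-delimited text:

```python
import xml.etree.ElementTree as ET

# Namespace used by the Library of Congress MARCXML ("slim") schema.
MARC_NS = "http://www.loc.gov/MARC21/slim"

# A made-up record in the flat MARC shape:
# (tag, indicator 1, indicator 2, [(subfield code, value), ...])
record = [
    ("245", "1", "0", [("a", "Example title :"), ("b", "a subtitle /"), ("c", "Jane Doe.")]),
    ("260", " ", " ", [("a", "London :"), ("b", "Example Press,"), ("c", "2011.")]),
]


def to_marcxml(fields):
    """Serialize the flat field list as a MARCXML <record> (leader omitted)."""
    ET.register_namespace("", MARC_NS)  # write MARC_NS as the default namespace
    rec = ET.Element(f"{{{MARC_NS}}}record")
    for tag, ind1, ind2, subfields in fields:
        datafield = ET.SubElement(
            rec, f"{{{MARC_NS}}}datafield", {"tag": tag, "ind1": ind1, "ind2": ind2}
        )
        for code, value in subfields:
            subfield = ET.SubElement(datafield, f"{{{MARC_NS}}}subfield", {"code": code})
            subfield.text = value
    return ET.tostring(rec, encoding="unicode")


def to_tsv(fields):
    """Serialize the same list as one tab-delimited row per subfield."""
    rows = []
    for tag, ind1, ind2, subfields in fields:
        for code, value in subfields:
            rows.append("\t".join([tag, ind1, ind2, code, value]))
    return "\n".join(rows)


if __name__ == "__main__":
    print(to_marcxml(record))
    print(to_tsv(record))
```

Both outputs carry exactly the same information: a list of tags, each with a list of coded subfields. Changing the wrapper is the easy part; the vocabulary underneath is untouched, which is why I see it only as a first step.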

To wait even longer to find agreement on anything more is tough. I think we are running out of time. Look at the debate just over capitalization!

One baby-step at a time....

By the way, the Koha open source catalog stores the MARCXML records and uses them through Zebra indexing (exactly how, I'm not sure), plus there are various MySQL relational tables.
