FW: [NGC4LIB] The CERN Library publishes its book catalog as Open Data

Concerning The CERN Library publishing its book catalog as Open Data
The whole dataset can be downloaded from http://cern.ch/bookdata

This is really great and I hope that other libraries will follow.

*But* the question will be how to incorporate all of this together in a coherent way. The standards of CERN are quite different from Anglo-American standards. Below is a record taken at random, with the record of the same item in LC. After a quick look, I see that the CERN record gives no size; in the LC record the place of publication reflects the AACR2 practice of adding a place within the country of the cataloging agency; the two records differ on date of publication vs. date of copyright; the CERN record has no statement of responsibility and no edition statement; and the paging itself is different. These last are important for AACR2’s determination of copy vs. new edition. CERN’s subjects reflect their narrower collecting focus vs. LCSH’s broader focus, e.g. “Python” vs. “Python (Computer program language).” Noel Rappin’s name does not have the date of birth, as it does in the NAF. There are several other differences, including some that stem from differing cataloging philosophies.
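To make the comparison concrete, here is a minimal sketch in Python that lists which MARC tags appear in only one of the two records and which shared tags carry different values. The dictionaries are a hand-transcribed, simplified view of a few fields from the records quoted in this message; the layout (tag mapped to a list of strings) and the function name `compare_tags` are assumptions for illustration, not anything CERN or LC provides.

```python
# Simplified, hand-transcribed views of a few fields from the two records.
# The "tag -> list of values" layout is an assumption made for this sketch.
cern = {
    "020": ["0596002475", "9780596002475"],
    "245": ["Jython Essentials"],
    "260": ["Beijing O'Reilly 2002"],
    "300": ["277 p"],
    "653": ["Jython", "Java", "Python"],
    "700": ["Rappin, Noel"],
}
lc = {
    "020": ["0596002475"],
    "245": ["Jython essentials / Samuele Pedroni and Noel Rappin ; "
            "foreword by Jim Hugunin."],
    "250": ["1st ed."],
    "260": ["Beijing ; Sebastopol, CA : O'Reilly, c2002."],
    "300": ["xx, 277 p. : ill. ; 23 cm."],
    "650": ["Java (Computer program language)",
            "Jython (Computer program language)",
            "Python (Computer program language)"],
    "700": ["Rappin, Noel, 1971-"],
}


def compare_tags(a, b):
    """Return (tags only in a, tags only in b, shared tags whose values differ)."""
    only_a = sorted(set(a) - set(b))
    only_b = sorted(set(b) - set(a))
    differing = sorted(t for t in set(a) & set(b) if a[t] != b[t])
    return only_a, only_b, differing


only_cern, only_lc, differing = compare_tags(cern, lc)
print("Only in CERN:", only_cern)
print("Only in LC:  ", only_lc)
print("Differ:      ", differing)
```

A plain string comparison like this only shows where the records disagree; it says nothing about which differences are cosmetic (punctuation, capitalization) and which reflect different cataloging rules.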

None of this is to find fault; rather, while the sharing is great, it is only a first step. How can we use these records in the best, most efficient way for our own purposes and for our users? Of course, some of these problems can be solved with URIs, but I don’t believe everything can. Do we just settle for a mashup or can we do something else?
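One possible first step toward bringing such records together is a shared match key. The sketch below, in Python, derives a 13-digit ISBN from a 10-digit one using the standard ISBN-13 checksum and checks whether a given 13-digit value is valid. The function names are hypothetical, and matching on ISBN alone is an assumption made for illustration; it would not settle the copy vs. new edition question.

```python
def isbn10_to_isbn13(isbn10):
    """Convert a 10-digit ISBN to 13 digits: '978' prefix plus a new check digit."""
    core = "978" + isbn10.replace("-", "")[:9]
    # ISBN-13 weights alternate 1, 3, 1, 3, ... from the left.
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(core))
    return core + str((10 - total % 10) % 10)


def is_valid_isbn13(isbn13):
    """Check length, digits, and the weighted checksum of a 13-digit ISBN."""
    digits = isbn13.replace("-", "")
    if len(digits) != 13 or not digits.isdigit():
        return False
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(digits))
    return total % 10 == 0


key = isbn10_to_isbn13("0596002475")   # the 020 value present in both records
print(key, is_valid_isbn13(key))
print(is_valid_isbn13("9780596002475"))  # the 13-digit value in the CERN record
```

Under this arithmetic the 13-digit value in the CERN record's second 020 field does not pass the checksum, so a matching routine might do better to derive its key from the 10-digit ISBN that both records share.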

Jim Weinheimer

<controlfield tag="001">984645</controlfield>
<controlfield tag="005">20071109101316.0</controlfield>
<datafield tag="020" ind1=" " ind2=" ">
  <subfield code="a">0596002475</subfield>
  <subfield code="u">print version, paperback</subfield>
</datafield>
<datafield tag="020" ind1=" " ind2=" ">
  <subfield code="a">9780596002475</subfield>
  <subfield code="u">print version, paperback</subfield>
</datafield>
<datafield tag="041" ind1=" " ind2=" ">
  <subfield code="a">eng</subfield>
</datafield>
<datafield tag="080" ind1=" " ind2=" ">
  <subfield code="a">004.438.Jython</subfield>
</datafield>
<datafield tag="100" ind1=" " ind2=" ">
  <subfield code="a">Pedroni, Samuele</subfield>
</datafield>
<datafield tag="245" ind1=" " ind2=" ">
  <subfield code="a">Jython</subfield>
  <subfield code="b">Essentials</subfield>
</datafield>
<datafield tag="246" ind1=" " ind2=" ">
  <subfield code="a">Rapid Scripting in Java</subfield>
  <subfield code="i">Cover title</subfield>
</datafield>
<datafield tag="260" ind1=" " ind2=" ">
  <subfield code="a">Beijing</subfield>
  <subfield code="b">O’Reilly</subfield>
  <subfield code="c">2002</subfield>
</datafield>
<datafield tag="300" ind1=" " ind2=" ">
  <subfield code="a">277 p</subfield>
</datafield>
<datafield tag="490" ind1=" " ind2=" ">
  <subfield code="a">O’Reilly &amp; Asociates books</subfield>
</datafield>
<datafield tag="650" ind1="1" ind2="7">
  <subfield code="2">SzGeCERN</subfield>
  <subfield code="a">Computing and Computers</subfield>
</datafield>
<datafield tag="653" ind1="1" ind2=" ">
  <subfield code="9">CERN</subfield>
  <subfield code="a">Jython</subfield>
</datafield>
<datafield tag="653" ind1="1" ind2=" ">
  <subfield code="9">CERN</subfield>
  <subfield code="a">Java</subfield>
</datafield>
<datafield tag="653" ind1="1" ind2=" ">
  <subfield code="9">CERN</subfield>
  <subfield code="a">Python</subfield>
</datafield>
<datafield tag="690" ind1="C" ind2=" ">
  <subfield code="a">BOOK</subfield>
</datafield>
<datafield tag="700" ind1=" " ind2=" ">
  <subfield code="a">Rappin, Noel</subfield>
</datafield>
<datafield tag="916" ind1=" " ind2=" ">
  <subfield code="d">200609</subfield>
  <subfield code="s">h</subfield>
  <subfield code="w">200638</subfield>
</datafield>
<datafield tag="960" ind1=" " ind2=" ">
  <subfield code="a">21</subfield>
</datafield>
<datafield tag="961" ind1=" " ind2=" ">
  <subfield code="c">20080407</subfield>
  <subfield code="h">2044</subfield>
  <subfield code="l">CER01</subfield>
  <subfield code="x">20060920</subfield>
</datafield>
<datafield tag="963" ind1=" " ind2=" ">
  <subfield code="a">PUBLIC</subfield>
</datafield>
<datafield tag="970" ind1=" " ind2=" ">
  <subfield code="a">002647668CER</subfield>
</datafield>
<datafield tag="980" ind1=" " ind2=" ">
  <subfield code="a">BOOK</subfield>
</datafield>
<datafield tag="964" ind1=" " ind2=" ">
  <subfield code="a">0001</subfield>
</datafield>
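For anyone who wants to work with a fragment like the CERN one above programmatically, here is a minimal Python sketch using the standard library's `xml.etree.ElementTree`. It assumes the fragment has straight quotes, properly closed `datafield` elements, no XML namespace, and a wrapping `<record>` element; that wrapper and the helper name `read_fields` are assumptions for this example, and the sample holds only three of the fields.

```python
import xml.etree.ElementTree as ET

# A cut-down, well-formed sample. The <record> wrapper is an assumption
# added so that the fragment has a single root element.
SAMPLE = """<record>
  <controlfield tag="001">984645</controlfield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">Jython</subfield>
    <subfield code="b">Essentials</subfield>
  </datafield>
  <datafield tag="653" ind1="1" ind2=" ">
    <subfield code="9">CERN</subfield>
    <subfield code="a">Jython</subfield>
  </datafield>
</record>"""


def read_fields(xml_text):
    """Return a list of (tag, [(subfield code, value), ...]) for each datafield."""
    root = ET.fromstring(xml_text)
    fields = []
    for df in root.findall("datafield"):
        subs = [(sf.get("code"), sf.text or "") for sf in df.findall("subfield")]
        fields.append((df.get("tag"), subs))
    return fields


for tag, subs in read_fields(SAMPLE):
    print(tag, subs)
```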

LC Control No.: 2003266066
LCCN Permalink: http://lccn.loc.gov/2003266066
000 01438cam a22003494a 450
001 13108602
005 20090729142230.0
008 030303s2002 ch a b 001 0 eng
010 __ |a 2003266066
015 __ |a GBA2-Y6751
020 __ |a 0596002475
035 __ |a (OCoLC)ocm49044531
040 __ |a UKM |c UKM |d CUS |d TXA |d CUY |d DAY |d DLC
042 __ |a pcc
050 00 |a QA76.73.J38 |b P43 2002
082 04 |a 005.133 |2 21
100 1_ |a Pedroni, Samuele.
245 10 |a Jython essentials / |c Samuele Pedroni and Noel Rappin ; foreword by Jim Hugunin.
250 __ |a 1st ed.
260 __ |a Beijing ; |a Sebastopol, CA : |b O’Reilly, |c c2002.
300 __ |a xx, 277 p. : |b ill. ; |c 23 cm.
500 __ |a “Rapid scripting in Java”--Cover.
504 __ |a Includes bibliographical references (p. xvi-xvii) and index.
650 _0 |a Java (Computer program language)
650 _0 |a Jython (Computer program language)
650 _0 |a Python (Computer program language)
700 1_ |a Rappin, Noel, |d 1971-
856 42 |3 Publisher description |u http://www.loc.gov/catdir/enhancements/fy0715/2003266066-d.html
856 42 |3 Contributor biographical information |u http://www.loc.gov/catdir/enhancements/fy0912/2003266066-b.html
906 __ |a 7 |b cbc |c pccadap |d 2 |e ncip |f 20 |g y-gencatlg
925 0_ |a acquire |b 2 shelf copy |x policy default
955 __ |a ps05 2003-03-03 to ASCD |c jf05 2003-03-11 to subj. |d jf09 2003-03-11 to sl |e jf12 2003-03-12 to Dewey |a jf16 2003-07-11 copy2 to BCCD
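
The LC display above can be read the same way. Here is a minimal Python sketch, assuming each line has the shape "tag, indicators, then subfields introduced by |" as pasted here; lines without a "|" are treated as control or header fields. The function name `parse_line` is hypothetical.

```python
def parse_line(line):
    """Split one line of the tagged display into (tag, indicators, subfields).

    Lines without '|' (e.g. 001, 005, 008) come back as (tag, None, raw text).
    """
    tag, _, rest = line.partition(" ")
    if "|" not in rest:
        return tag, None, rest.strip()
    head, *parts = rest.split("|")
    indicators = head.strip()
    # Each part starts with the one-character subfield code.
    subfields = [(p[0], p[1:].strip()) for p in parts if p]
    return tag, indicators, subfields


print(parse_line("300 __ |a xx, 277 p. : |b ill. ; |c 23 cm."))
print(parse_line("001 13108602"))
```

The ISBD punctuation (" : ", " ; ", trailing periods) stays attached to the subfield values, which is one reason a field-by-field comparison with the CERN record cannot be a simple string match.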