Re: [ACAT] New data and visualizations at “MARC Usage in WorldCat” site

On 12/2/2013 4:40 PM, Tennant,Roy wrote:

The “MARC Usage in WorldCat” web site has been updated with October 2013 reports and with two new data visualizations[1]. One visualization depicts every field by format and then by tag in a zoomable starburst diagram [2] — the other depicts every field first by the tag and then by format in a zoomable starburst [3]. Actual numbers can be seen while hovering over the tag. As always, if you wish to see a report on a particular subfield, please let me know. 

Roy Tennant OCLC Research 

Thanks so much for these. They are always so interesting. After this has been done for a while, we can put it all into a time series, which may become very useful. In the meantime, I have played with the numbers a little bit.

For a record to represent a bibliographic version, I assume that it should contain an instance of one of these subfields:
240$a, 250$a, 700$t, 710$t (other 7xx$t are too marginal)
(In a follow-up post, I added 130$a/730$a. Those corrections are included below in brackets.)

Total instances of 245$a (that is, all records): 305,519,264
240$a: 9,137,082 (or 2.99%)
250$a: 11.79% (didn’t get the number of instances)
700$t: 6,617,885 (or 2.166%)
710$t: 705,005 (or 0.23%)

Anyway, adding up the percentages for all of the subfields above (bearing in mind that more than one of them could occasionally appear in the same record), we get 17.176% of the total.
[When adding the 130/730, the totals were
130$a: 2,768,779 (0.9%)
730$a: 4,877,321 (1.6%)

Adding these to the 17.176% gives 19.676%, again roughly equalling the 20% figure discussed below.]
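
For anyone who wants to redo the arithmetic, here is a minimal Python sketch of the calculation above. The counts are copied from this message; the variable and function names are my own and purely illustrative. Since 250$a was noted only as a percentage, it is carried over as-is rather than recomputed. Note that this simply adds up subfield instances, so a record containing two of these subfields is counted twice.

```python
# Record counts as quoted in this message (October 2013 reports).
TOTAL_RECORDS = 305_519_264  # instances of 245$a, i.e. all records

record_counts = {
    "240$a": 9_137_082,
    "700$t": 6_617_885,
    "710$t": 705_005,
    "130$a": 2_768_779,
    "730$a": 4_877_321,
}

# 250$a was recorded only as a percentage, so it is used directly.
PCT_250A = 11.79


def pct(count, total):
    """Return count as a percentage of total."""
    return 100.0 * count / total


record_shares = {f: pct(n, TOTAL_RECORDS) for f, n in record_counts.items()}
record_shares["250$a"] = PCT_250A

# Original four subfields, then the follow-up addition of 130/730.
core_share = sum(record_shares[f] for f in ("240$a", "250$a", "700$t", "710$t"))
extended_share = core_share + record_shares["130$a"] + record_shares["730$a"]

for field in sorted(record_shares):
    print(f"{field}: {record_shares[field]:.3f}%")
print(f"240/250/700$t/710$t combined: {core_share:.2f}%")
print(f"with 130/730 added:           {extended_share:.2f}%")
```

Working from unrounded percentages, the combined figures come out a hair higher than the ones above (which add up already-rounded values), but they agree to within a hundredth of a percentage point or so.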

Therefore, I conclude that in the WorldCat database, versions make up 17.176% of the total and single works (with no versions) make up 82.824%. This is probably the basis of the 20% figure for versions that I have seen cited in the past. Of course, this is not necessarily accurate, since not every new version of a resource will get one of these fields, and not every cataloger does his or her job correctly and adds one of these fields when it is warranted. I have seen that often enough. I also noted that the percentage of variations among recordings is significantly greater than among books, which I always suspected, but unfortunately I deleted those numbers and am too lazy to crunch them again. If somebody wants to do it, they can do it themselves!

Comparing this with total holdings is also interesting:
Total: 2,063,751,707
240$a: 86,902,550
250$a: 393,758,799
700$t: 91,309,572
710$t: 4,616,325

The holdings for 240$a, 250$a, 700$t, and 710$t together = 27.94% of the total
[Adding the 130/730:
130$a: 33,200,705 (or 1.6%)
730$a: 35,605,963 (or 1.7%)]

So, while the instances of versions equal 17.176% of the total records, versions account for 27.94% of total library holdings.
[Adding the 130/730 to the 27.94%, we get: 31.24%]
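
The same comparison can be sketched in Python. Again, the holdings counts are copied from this message and the names are only illustrative. The last step divides the holdings share by the record share (using the 17.176% and 19.676% figures from above), which is my own rough way of expressing how much more heavily held these records are than their share of the database would suggest. It inherits the same caveat about subfields overlapping within a record.

```python
# Holdings counts as quoted in this message (October 2013 reports).
TOTAL_HOLDINGS = 2_063_751_707

holdings_counts = {
    "240$a": 86_902_550,
    "250$a": 393_758_799,
    "700$t": 91_309_572,
    "710$t": 4_616_325,
    "130$a": 33_200_705,
    "730$a": 35_605_963,
}

CORE_FIELDS = ("240$a", "250$a", "700$t", "710$t")

# Share of all holdings attached to records carrying each set of subfields.
core_holdings_share = (
    100.0 * sum(holdings_counts[f] for f in CORE_FIELDS) / TOTAL_HOLDINGS
)
extended_holdings_share = 100.0 * sum(holdings_counts.values()) / TOTAL_HOLDINGS

# Record shares as stated earlier in this message.
CORE_RECORD_SHARE = 17.176
EXTENDED_RECORD_SHARE = 19.676

# Ratio above 1 means these records hold more than their "fair share".
core_ratio = core_holdings_share / CORE_RECORD_SHARE
extended_ratio = extended_holdings_share / EXTENDED_RECORD_SHARE

print(f"core holdings share:     {core_holdings_share:.2f}%")
print(f"extended holdings share: {extended_holdings_share:.2f}%")
print(f"holdings/record ratio (core):     {core_ratio:.2f}")
print(f"holdings/record ratio (extended): {extended_ratio:.2f}")
```

Both ratios come out at roughly 1.6, so by this crude measure the holdings share of these records is about one and a half times their share of the database.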

This would suggest that libraries tend to buy significantly more versions, as opposed to single items. This has been mentioned before and only makes sense. If something is popular enough to have been translated, published as separate “Selections”, or issued in a new edition, it is logical to assume that more people will want it. Still, this probably includes scads of multiple editions of obsolete textbooks and other marginal materials. So these are just preliminary analyses.