Wednesday, March 31, 2010

FW: [NGC4LIB] User interface design and trust

Posting to NGC4LIB

In this same vein, I just finished reading an article in the latest College & Research Libraries (Mar 2010, no. 2), "What are they learning? Pre- and post-assessment for LIBR 1100, Introduction to Library Research" by Jon Hufford. The study tested students before and after an information literacy course.
What I found interesting were the results concerning users' understanding of the library catalog, e.g.:

"What is the least likely resource to use to find citations to articles?
A - library's online catalog
B - electronic databases
C - internet
D - search engines
E - periodical indexes"

Pre: 6% chose A. Post: 28% chose A.

Another:
"It is possible to find full-text magazine articles in the library's online catalog." True-False

Pre: 18% chose False. Post: 22% chose False.

And most telling:
"Which of the following kinds of information can be found in the Library's online catalog?
A - journal articles
B - journals the library owns
C - books on a certain topic
D - book reviews
E - sound recordings the library owns"

Pre: 4% chose B, C, E. Post: 5% chose B, C, E. (I would have emphasized "catalog records" for journals, books, etc. the library owns, not the books themselves.)

In fact, from this entire report, the students appear to show a much greater understanding of the digital materials and tools than of the traditional library materials and tools. This mirrors my own experience. Young people (most people?) don't know what they are searching when they search Google vs. Google Books vs. Lexis-Nexis vs. an OPAC ... [et al.]. When I try to explain the differences in what they are searching in each database when they enter text into the search box, I see they are struggling, and I find it extremely complex to explain. I still believe that the traditional library catalog is becoming increasingly strange to our patron community.

I have tried to solve some of this with the Extend Search in my catalog to at least make it simpler to search all of these different places, but you are right: there needs to be a lot of serious work in interface design.

Tuesday, March 30, 2010

RE: [NGC4LIB] Next next generation catalogs, some reality check

Posting to NGC4LIB

I have done some thinking about your findings that a small percentage of the German books were available in Google Books. It seems that there are two completely different conclusions one could draw from this:

a) physical libraries need to be nurtured, protected, and expanded;
b) it's a national emergency that cannot be tolerated and must be fixed as soon as possible.

It all depends on how people continue to understand "access." Will it mean simple availability of the text after traveling to a library and/or waiting for ILL? Or will the "laziness factor" become just too strong, so that the traditional idea of libraries as a place to go for information and knowledge simply becomes an intolerable hindrance?

My own guess is that once the Google Books agreement is implemented in the English-speaking countries, we will hear glowing testimonies from people about all the resources they would never have been aware of before, because of the power of "modern searching" and the new possibilities of working with texts that were unknown before. Different nations will react in different ways, but most will probably react with option b), although I would guess some may react with option a). My own opinion is that not having immediate, online access to library materials will be seen as a serious defect that must be corrected. Google's incredible example has shown that scanning massive numbers of books can be done in just a few years, and relatively cheaply. Hundreds of millions of dollars doesn't sound so bad today after these incredible bailouts and the huge personal fortunes of some of the world's richest people. To put it in perspective, the price of a single B-2 Bomber is between one and two billion dollars!

As I wrote in my previous, poorly written posting (sorry about that. I clarified it on my blog. That's why I entitled it "First Thus"!), the essence of the catalog was always based on *inclusion*--that adding a resource meant that I was giving the world access to it, and otherwise this resource would have been ignored. In the new world, I think the catalog must follow a completely different idea of *exclusion*--that is, people will go to it for well-selected, worthwhile materials. In other words, for a "seal of approval."

I realize that this would have huge consequences for librarianship as a whole and the very nature of the catalog, but it would provide a service that people really want, and it would distinguish us from "being an inferior subset of Google Books" which I think will be inevitable if we do not change our focus.

And oh yes! We should keep cataloging, and doing it better than we are now! (I have been doing some cataloging lately, and the quality leaves a lot to be desired. If we expect people to want our metadata, it had better be pretty good!)

RE: Date on rev. ed., with no copyright date

Posting to Autocat

On Mon, 29 Mar 2010 10:39:13 -0600, john g marr wrote:

>> We would use the record for the first printing of the revised edition, adding 250 $aRev. ed. if lacking, and correcting the number of pages.
>
> Presuming the number of pages cited is wrong?
>
>> We would only indicate the printing in the bibliographic record if for a rare book, and remove that if in the record being used;
>
> In this case, the change of pagination from 178 to 180 is not sufficiently great to require a new record, but if it had been a large change we would "see" 3 diff. eds. and put the printing date of the 1st appearance of the change ["XX print."] in the 250 to distinguish the change from previous eds., if no other information was present (it happens!). Otherwise, there is no reason to create new records for subsequent printings in the same binding.

This is a case where there are differing rules. LCRI 1.0, section "Edition or Copy of Monograph," is quite explicit (see it at the Cooperative Cataloging Rules:
http://sites.google.com/site/opencatalogingrules/aacr2-chapter-1/1-0--decisions-before-cataloging---rev#TOC-Edition-or-Copy-of-Monograph)
where it states:

"Also, consider that a new edition is involved whenever
1) there is an explicit indication of changes (including corrections) of content; or,
2) *anything* [my emphasis--JW] in the following areas or elements of areas differs from one bibliographic record to another: title and statement of responsibility area, edition area, the extent statement of the physical description area, and series area."

But these guidelines are different from "Differences Between, Changes Within" at:
http://tpot.ucsd.edu/msd/catpolicies/cattoolsresources/docs/Differences07.pdf
which states:

"A5. PHYSICAL DESCRIPTION AREA
A5a. Extent of item (including specific material designation).
A different extent of item, including the specific material designation, indicating a significant difference in extent or in the nature of the resource is MAJOR. Minor variations due to bracketed or estimated information are MINOR. Variation or presence vs. absence of preliminary paging is MINOR. Use of an equivalent conventional term vs. a specific material designation is MINOR. For example:
* 351 p. vs. 353 p. is MINOR
* 452 p. vs. x, 452 p. is MINOR
* [211] p. vs. 212 p. is MINOR
* 356 p. vs. 492 p. is MAJOR"

I personally like the LCRI rules since they are clearer and more precise, while the ALA guidelines will mix up different textual variants, and this mixing will be random because there are no set guidelines such as "a difference of x number of pages, or x% of extent, signifies a new edition/copy." Instead, they leave the very important determination of MINOR vs. MAJOR to "cataloger judgment," or in my opinion, how the cataloger feels that day.
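To make concrete what a set guideline of that kind could look like, here is a minimal sketch in Python. The thresholds (10 pages, 5% of extent) and the function name are my own inventions for illustration only; neither the LCRI nor the ALA guidelines specify any such numbers.

```python
def classify_extent_change(pages_a, pages_b, max_pages=10, max_fraction=0.05):
    """Classify the difference between two page counts as "MAJOR" or "MINOR".

    Hypothetical rule: the difference is MAJOR if it exceeds max_pages,
    or if it exceeds max_fraction of the larger extent.
    Both thresholds are assumptions, not taken from any cataloging rule.
    """
    diff = abs(pages_a - pages_b)
    larger = max(pages_a, pages_b)
    if diff > max_pages or (larger > 0 and diff / larger > max_fraction):
        return "MAJOR"
    return "MINOR"


# The examples quoted from the guidelines above, plus the 178 vs. 180 case.
for a, b in [(351, 353), (211, 212), (178, 180), (356, 492)]:
    print(a, "p. vs.", b, "p. is", classify_extent_change(a, b))
```

With these made-up thresholds the sketch happens to agree with the quoted examples, but the point is only that a fixed rule gives the same answer every day, whatever the numbers are set to.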

Again, I believe that this type of issue is an atavistic remnant of card production and today should be considered under the rubric of "how to display multiple manifestations and items." In the brave new future of cataloging, these matters can be handled in automated fashion by directing the catalog to display only the information that differs from one record to another. Then each catalog could decide on its own what it wants to display. At the same time, cataloging could become more exact while training becomes easier because everybody would just transcribe *exactly what they see.*
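As a rough illustration of "display only the information that differs," here is a small Python sketch. The records are plain dictionaries with made-up field names and values (not MARC), and the function name is hypothetical; a real catalog would work on its own record format.

```python
def differing_fields(records):
    """For each record, keep only the fields whose values are not
    identical across all of the records being compared."""
    all_fields = set()
    for record in records:
        all_fields.update(record)
    # A field "varies" if the records do not all share a single value for it.
    varying = sorted(
        field for field in all_fields
        if len({record.get(field) for record in records}) > 1
    )
    return [{field: record.get(field) for field in varying} for record in records]


# Two invented records that differ only in their extent statement.
records = [
    {"title": "Example title", "edition": "Rev. ed.", "extent": "178 p."},
    {"title": "Example title", "edition": "Rev. ed.", "extent": "180 p."},
]
print(differing_fields(records))
# prints [{'extent': '178 p.'}, {'extent': '180 p.'}]
```

Each catalog could then decide for itself which of the varying fields are worth showing, while the records themselves simply transcribe what is on the item.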

Sunday, March 28, 2010

FW: [NGC4LIB] Next next generation catalogs

Tim Spalding wrote:

<snip>
* What good ideas have yet to become mainstream?
* What idea trends—mobile? ebooks?—should cause us to rethink things?
* Is it time to decide that the next catalog is no catalog at all?
* Is it Google? A kiosk? A cell phone? A WorldCat metastasis? Dying because the library is dying?
</snip>

There are many questions here, but I would like to focus on the provocative: "Is it time to decide that the next catalog is no catalog at all?"

Before discussing this however, there are other issues that should at least be addressed before the larger question can be answered. I have found that there are three points that have not really been addressed anywhere, or at least not that I have seen.

#1 Selection.
The almost universal complaint I have heard from my patrons, and that I read from the public at large (i.e. when I go beyond any specific library's patrons), is that while it is easy to find information, it is difficult to find materials that are reliable. You can hear this in the arguments about newspapers: the blogger vs. the journalist argument, where journalists say that people can't rely on what they read in blogs; it is found in information literacy courses, scholars discuss it, and you find it in the popular press as well. Whether this assertion is true or false (e.g. whether information from professional journalists is really better than what you read in a blog) I am bypassing right now. Still, it seems an obvious area where the library's task of selection can become an important part of the solution.

Traditional library selection is almost completely different from what selection needs to be today (in my opinion), and what selection will necessarily become after the Google-Publisher agreement is implemented, and when open archives begin to really take off. Traditional library selection has always been based on the principle of *inclusion,* i.e. the decision to *include* a resource into the local collection because it will be found useful to my local patrons, and I also determine that it will be useful enough that it will be worth spending the institution's money to purchase the resource, process it, catalog it, perhaps preserve it, and so on. Even when people try to donate books for free, they are very often surprised that their books are not selected because those books will not be found sufficiently useful for the patrons, e.g. all duplicates, or on a topic nobody else cares about, or any of a thousand other reasons.

Selecting something for a library's collection has *never meant* that it is necessarily a *better* resource than others out there, or that the actual selector agrees with two words of it, or even that the resource is free of absolute lies. We must recognize that there are plenty of lies in libraries now: in older materials where the information has become obsolete (e.g. books on surgery from 1880), or in materials that, even though they are full of lies, we believe are still necessary for the patrons. In this regard, I think of the works of the Marquis de Sade, or the Protocols of the Elders of Zion. The purpose of a collection is to serve the needs of its patrons, and not to reflect the morals and opinions of the librarian. That would be censorship. (Although in some countries where librarians follow other ethics, something like this would be considered more like "social protection.")

Library selection is not understood at all well by non-librarians, and we see this all the time when localities want to remove, e.g. the Joy of Sex or Huckleberry Finn from a library. I won't go into this here, but the point is: in the new information universe, selection necessarily becomes a process of *exclusion* instead of *inclusion.* What I mean by this is people can very easily find and access all kinds of materials today. What they desperately want and need are information resources that are "reliable" (whatever that may mean) and they realize they are not competent to determine the reliability of something since they are only learning about it. There is too much and they need help from experts in limiting their results.

The upshot of this line of reasoning is: I believe that first of all, library selection is going to have to change to being a "trusted area" where people can go to find information that has at least a level of credibility. (Ross Atkinson wrote a certain amount about this.) We would gain many followers and provide a vital service if this were accomplished. It would be a tremendous undertaking that would go far beyond the library community (as the task of library selection does today), but the problems of selection must be faced before the idea of a catalog, which theoretically controls and provides access to what the library has selected, can even begin to be discussed.

#2 Find.
The meaning of "find" has changed in a fundamental way in the last 20 or so years. It certainly means something quite different from when I went to library school and learned to use a card catalog. I don't know what it really means today, but I do know that "finding" on Google is completely different from "finding" in an OPAC, which was different from "finding" in a card catalog. I have the feeling that people who consider Google searching to be "normal" (I confess I still find Google results quite bizarre) expect something quite different from what a library catalog can provide them. So, even if we for a moment consider the FRBR user tasks ("find, identify, select, obtain") to be correct, the very first of them has changed in the popular culture. How has it changed? I don't know, but if it has changed, it must change the rest of the user tasks, that is, if they are still valid.

#3 Where the catalog will be accessed
This concerns the laziness factor again. Just as I pointed out that I can't imagine that a student, or almost anybody who has a virtual copy of a book on display in their computer, would actually stand up and go through the process of getting a physical copy of that book--even if they are in the same building--I think the same analogy is valid for library catalog sites on the web. Right now, people use the library catalogs to find the *books on the shelves*, but when that goes away (perhaps frighteningly quickly!), why would anyone ever go to a separate library catalog? I think people will be "too lazy" to do that. Therefore, if we want people to use our records (which is what I want), we must create APIs that can be incorporated very easily into sites where our users choose to be, so that they can search our databases in the background. In this way, our records will, in essence, go to them. This has additional problems because when you take a catalog record out of a catalog, it begins to look very strange and it will function in bizarre ways. (This happens for various reasons, but primarily because catalog records are designed to function within a specific type of catalog. When they are taken out, things tend to break.)

The solution here is to ensure that the records retain their function even when they are in different environments. By this, I mean an AACR2 record working with a DC record with an ONIX record and so on. Doing this is another huge undertaking, but I am sure it can be done, and in the meantime, it will take the catalog in entirely new directions.

Before we discuss what the library catalog is to become, or even if it should continue to exist, I think these issues need to be discussed. And of course, the experiences and skills of librarians become crucial.

Friday, March 26, 2010

RE: [NGC4LIB] Observations of a Librarian on Ebook Readers

Posting to NGC4LIB
replying to:
https://listserv.nd.edu/cgi-bin/wa?A2=ind1003&L=NGC4LIB&D=1&T=0&O=D&P=96802

Bernhard Eversberg wrote:

<snip>
So, oblivion doesn't appear to be round the next corner for libraries, but of course they are no longer the only show in town for people seeking information, and neither, or less and less so, do they contain all the most relevant information on most questions people have.
</snip>

While I agree with what you say, and I would love to see the new library and it's great that it's being used so well, it sort of misses my point. To be fair, perhaps I also did not completely answer Laval's very correct assertion: "And let's be very clear : it's *their* points of view that really matter."

The focus of many of my posts is that librarian skills are vitally needed in today's climate. Laval replies (I believe, but correct me if I'm wrong) that while this may be true, what is important is whether *our patrons* believe that our skills are vitally needed. We can pontificate as much as we like, but it will make no difference if the patrons decide to ignore us and our work.

So, how do people view us? OCLC's study still makes the most sense to me: "College Students' Perceptions of Libraries and Information Resources" (conclusions at) http://www.oclc.org/reports/pdfs/studentperceptions_conclusion.pdf

Some of their findings were that students link librarians with physical books, and students think the library is a nice place to do their homework and to study. These are my experiences as well. I also want to mention something else: "How a College Library is Used" from Maxim: http://www.maxim.com/humor/stupid-fun/84085/how-a-college-library-is-used.html which may be even more true!

And I keep pointing out: what will happen when the Google-Publisher agreement is eventually approved? The decision could even come today or next week! But those millions of books will be made available eventually and if not now probably in the next few years. While students may still want a comfortable place to study (and sleep!), based on my posting about the faculty member finding it too uncomfortable to get an article he desperately needed, I think people will become lazier and lazier, so that even for those who are already sitting in the library, if they can access the books they want immediately on their computers, they will still find it too much trouble to stand up and get the physical copy.

How many of those students we see working in those pictures of that fabulous library you mentioned, if they were looking at a scanned copy of a book they needed on their laptop, would decide to stand up, go into the stacks, hunt around and find the book, etc., or in European libraries, would they order it from a desk and wait? Or would they just use what they had? I'm pretty sure I know what I would do. What if they could do that from home? Or from an Internet Café? Or from anywhere?

What does this mean for us? We have to change in fundamental ways if we want to become a part of their universe so that they do not bypass us completely. I think that in the long run, there is little hope that the physical collections can remain very relevant to people, and this makes me sad, but I am concerned about the librarians themselves. It does not follow that librarians' skills also become irrelevant. And we will have to market ourselves somehow so that people realize that they do need us. That may be the most difficult task of all.

Unfortunately, the current library answer that we need to build a tool so that people can "find-identify-select-obtain --> works-expressions-manifestations-items by their authors-titles-subjects" seems to be totally irrelevant to any solution and is certainly not aimed at providing our users what they want. Even if RDA were completely finished today, all the librarians were trained perfectly and the systems installed and functioning, I can't imagine how it would change anything for our patrons at all.

We need to find solutions that people want. Otherwise we are building improved typewriters when people are using word processors and laser printers.

Thursday, March 25, 2010

RE: [NGC4LIB] Observations of a Librarian on Ebook Readers

Posting to NGC4LIB

Laval Hunsucker wrote:
[Concerning my statement that librarians are definitely needed in this new world--JW]

<snip>
It is, rather, our little illusion, kind of a last straw in these latter days. And one that's been vociferously clung onto even by some of our best "LIS" thinkers.

What I *did* once find kind of fun was trying to work out what sort of arrangements / logistics / formal structures would be required to make a viable operational reality of this particular "future for librarians", once the rest was gone -- for a lecture I gave at a meeting of library subject specialists in this country back around twelve or thirteen years ago. Since that time, though, I've lost the feeling that it's really even worth that effort.

You use the term "needed" ( "definitely needed now" ; "definitely be needed in the future" ), but I'd say that's again just looking at the world from our point of view, not from that of this professor you mention and all the others we call "patrons". And let's be very clear : it's *their* points of view that really matter.
</snip>

This is a point that I did not make clear enough in my original message, where I wrote: "... it must be made very clear to the "experts" that finding relevant information is a different task from specializing in a subject." To illustrate this, I'll compare it to someone driving a car. You may be just a regular person driving the car, or you may be Michael Schumacher, but whoever you are, before you can do anything, you need a car maker, a mechanic for the car, and a map to know where you are going. Certainly, you can get into any old car and drive around randomly, but while that may be exciting for a time and you discover some things you've never seen before, it's hard to say that it represents progress if you want to get from, let's say, Oslo to Rome. You can't do it all alone.

Comparing this to the library-information situation, we are the mechanics and the map makers. But as mechanics, we should not concentrate our skills on making buggies for horses, and this is where the car makers come in, to design new means of transportation for people. But first, we need to know where people are going and living. I don't think the mechanics should determine this; it is much more the job of the map makers.

I don't want to push this analogy too far, but basically, it shows where I think we should be going. We are building tools based on FRBR, but to me, the structure of FRBR is the same as that "Horsey Horseless Carriage" http://www.time.com/time/specials/2007/article/0,28804,1658545_1657686,00.html. FRBR/RDA is like a car designed by the mechanic (or in this case, the buggy maker), and we don't even know if it is what people want.

We need to move on. It is only in this way that we can demonstrate our importance to our users.

<snip>
Everything points, for me, to the conclusion that what's going to *work* (as opposed to what, in our discours, will "be needed") is not librarians but better ( expert ) end-user systems. My experience is that the persons who come up with such systems that are really effective are mostly not librarians -- and they do that, I believe, precisely because they don't think as librarians have for the last fifty years or so been socialized, and are still being socialized, to think. ( As I implicitly argued in a published article [ in Dutch ] some five years ago. )
</snip>

I agree to a point, but why can't the librarian be an integral part of the expert end-user system? Your experience that people who come up with effective systems are not librarians is different from my experience. I have been to several meetings where the "information managers" were demonstrating their systems, and they put in a query to the database, e.g. "milk" and all they got was "cheese." They concluded that this was a good, useful result because cheese is made from milk(?!). They just don't understand that when someone searches for "milk" they want "milk" and not cheese and butter and yogurt and all other dairy goods.

I can state definitively that extremely few people understand a Google result: i.e. why something comes up as #1 and how results can be--and are being--manipulated for all kinds of purposes. In this regard, I remember a story from a couple of years ago about how a major U.S. medical database put the word "abortion" on their stop word list. http://www.wired.com/threatlevel/2008/04/a-government-fu/

Who discovered that? A librarian. And it was fixed only because the librarian had the energy and courage to speak up. Read the article for the reasoning of the database manager. It's quite enlightening concerning the differences in values and the entire mental construct of the "information manager" as opposed to the librarian.

I know that it is difficult to stay optimistic in these times. Perhaps I live in a fairy land, but I still believe that the skills and ethics of librarians are, and will be, of tremendous benefit. We may be fated to fade away, but that would be a definite harm to society, since the knowledge, skills and ethics of librarians exist nowhere else. So, if we are indeed doomed to oblivion, it doesn't mean that we have to go meekly; we can decide to go down only after putting up the very best fight we possibly can.

Because it is only when you struggle and decide to put up a fight that you have even the slightest chance of winning.

It's too soon to give up.

Wednesday, March 24, 2010

Observations of a Librarian on Ebook Readers

Posting to NGC4LIB

I thought I would share some more thoughts on digital materials based on a real-life incident:

Something interesting happened yesterday concerning the “laziness factor” I wrote about in my original posting on ebooks. (See: Observations of a Bookman on his Initial Encounters with an Ebook Reader) This is a real-life example.

A professor at my university is writing a paper for a conference that will take place very soon. He has put everything off until the last moment, and discovered that he desperately needs an article from 1996. He told me that he and his wife searched frantically for hours and hours trying to get a hold of this article that is so crucial to his paper. They had given up and were completely out of ideas.

It turned out that it wasn’t that difficult. I discovered that the journal has not yet been digitized that far back (only to 1998) and a print version doesn’t exist in Italy, so I continued looking, and found that the author had made a personal webpage. One year after the article, the author published a book on exactly the same topic, which is also not in Italy. Looking around a bit more, I discovered that the article the professor wanted had been published later in a book from 2003, which we do not have, but as it turned out, is in another library here in Rome that he can go to. I also found several articles published later that referred to his article, a few with some rather deep analyses of the article he was interested in.

I found that scans of both these books are in Google Books, but while you can preview them, you can’t see either the book or the article in its entirety. But I did manage to get a copy of the author’s dissertation, which is undoubtedly what the article and book are based upon, since scholars normally get as much out of their dissertations as they possibly can.

Well, he was very happy, etc. but it turns out that he is just too busy to go to the library in Rome (about ½ hour on the bus) and there is no ILL between libraries in Rome (people are supposed to go there instead), plus a regular ILL would never get here in time, so it’s up to me to try to get a scan in time. (It turns out that the library does not do scans, but will only send photocopies) While he was in my office, we talked a bit and I showed him my ebook reader. He was interested but said that he prefers physical books.

So, I find the entire incident curious: he told me how he and his wife each spent “hours and hours” in a fruitless attempt to find a digital copy of the article, and I am sure that they did, but they are suddenly too busy (i.e. it’s too much trouble) to get on the bus to go see a physical copy of it. It seems that the one doesn’t follow logically from the other! Also, he says he wants physical books, but I just don’t believe it. He wants the physical book right here, right now, which doesn’t happen in the real universe. It’s like sometimes I may want to fly like a bird to work instead of getting on a crowded bus with a lot of stinky people, but… I go on the bus. It has happened that the bus has arrived immediately and was empty, but that happens very rarely. The alternative is to get a car and drive, and that can be just as bad or worse.

Realize that I honestly am not finding fault with any of this; I am only giving it as an illustration of what I think is a normal, human trait (or human failing, if we want to get judgmental about it): we are all illogical beings. To expect logic and consistency from human beings simply makes no sense because we are much more complicated than that. In this case, what brings it all together and makes sense is the very human trait we all share: what I am calling “the laziness factor.” Once we accept that such traits are normal, everything makes far more sense.

Relating this to the Google-Publishers agreement that will go through eventually, the professor would then be able to get it all online immediately and although he will probably complain that “it’s just not the same as holding a real book” in his hands, so what? It will be there and he will take it.

So, I think this little vignette may point toward one path leading into the future for librarians. I think it shows that we are definitely needed now and we will definitely be needed in the future (he couldn’t find any of this online in Google Books or the other digital projects), but we won’t be needed to keep the physical books in order on the shelves and check them in and out. We will be needed for other tasks; we will need special tools built and a hundred other things, but it must be made very clear to the “experts” that finding relevant information is a different task from specializing in a subject. And if the “experts” are having serious problems, what does this mean for the rank and file?

Tuesday, March 23, 2010

FW: Well I'm amused

Posting to Autocat

On Mon, 22 Mar 2010 14:15:09 -0400, Aaron Kuperman wrote:

>There's one publisher who frequently republishes government documents (though unless you want them online, they may not be convenient to acquire otherwise). I try to include a note connecting the reprints to the originals, but other catalogers focus on just what is on the title page (which pretends that they aren't reprints).
>
>On Mon, 22 Mar 2010, Vogel, John L wrote:
[The originals are online, thereby saving almost $200]

In a print culture, services such as these may have had at least some kind of reason for existing: although they may have made far too much money, they at least helped make public domain documents more accessible. Today, there is less and less purpose to this type of function, and it must now be recognized as almost useless and practically a scam.

Of course, this situation is not limited to government documents, but extends to all materials in the public domain and now to those released through Creative Commons, which is a huge and growing area. So long as digital documents are too uncomfortable to read as they are, they will be printed out, and adding a binding makes them more useful, which keeps these sorts of businesses going, but the moment ebook readers become popular, all that will change. (If it hasn't completely changed already because of the costs involved in this economic climate.)

The publishers will want to continue as before: to charge $200 or so for what really will be nothing, and I think it will become part of the job of catalogers to show what is really available to our users. This will conflict with the publishers' intentions though, since their bread and butter will be to make sure people are not aware of these other versions. Retailers such as Amazon won't be interested in steering people to these versions either, since they will want a piece of the action as well. I can't predict how a company such as Google will play into this.

We need tools that will make the task of finding textual copies much easier and more efficient: perhaps a type of "Turnitin.com" for catalogers that tries to find different versions, or web search engines, Google Books, and other innovative tools used to seek out textual copies easily and quickly.

Monday, March 22, 2010

Observations on ebook readers (continued)

Posting to NGC4LIB

For those who are interested in the continuation of the posting I made on my initial experiences with an ebook reader (see https://listserv.nd.edu/cgi-bin/wa?A2=ind1003&L=NGC4LIB&T=0&F=&S=&P=42853, or on my blog at http://catalogingmatters.blogspot.com/2010/03/observations-of-bookman-on-his-initial.html), I suggest the following:

"The future of publishing: Why ebooks failed in 2000, and what it means for 2010" by Michael Mace (a former executive with Palm, Apple, etc.)
http://rubiconconsulting.com/insight/winmarkets/michael_mace/2010/03/the-future-of-publishing-why-e.html

A "news report of the future" about Blockbuster Video stores from the Onion:
http://www.theonion.com/video/historic-blockbuster-store-offers-glimpse-of-how-m,14233/

The article by Michael Mace is quite insightful and he comes to a different conclusion than I do: that ebooks will not find much of a following any time soon. At the same time, he asks some very telling questions about the functions of a publisher and what kind of future they should prepare for. I agree with much of what he writes, but still believe the near future is better for ebooks than he predicts. My basic points of disagreement are:

- "There were not enough books in 2000." Mr. Mace goes on to claim that there hasn't been that much of an advance onwards to 2010, but I think he is ignoring the ebook projects of Google, the Internet Archive and dozens of others, including those of lots of libraries. He mentions "older" books and gives as an example only Robert Heinlein, who died in 1988 and whose works are therefore completely under copyright.

Of course, concentrating only on the newest materials ignores much of what is available when people actually go to a library and consequently, what is available through library digitization projects. When you add in these materials, there is enough to keep people busy for several lifetimes. Although it may be that people *really* are not interested in older materials and will be willing to pay a premium for the newest books, I reply that this remains to be seen. For example, if someone has an ebook reader, why would anyone pay for a copy of anything by Mark Twain or Charles Dickens? For modern commentary, OK, but how many really want that except for students?

And once the Google-Publisher agreement is accepted, at least many of these newer materials will become available. This is when, as I mentioned in my original essay, the laziness factor may very well kick in. (See the video from the Onion about this) In addition, I want to emphasize the "free" factor.

As a concrete example, when I have shown this page to people: http://www.digitalbookindex.org/_search/search002a.asp?AUTHOR=Twain,%20Mark I point out the various editions of "A Connecticut Yankee In King Arthur's Court" and show six versions, five of which are free and one of which is for pay. I ask, "Which one would you choose?" and the invariable answer is: laughter. No one will choose the one for pay, *once they know* there are the free versions.

I still maintain that if people knew what is really available to them--right now, today--they might be as overwhelmed as I am. This may change the entire dynamics of reading and "information gathering" in fundamental ways.

- Mr. Mace discusses the marketing of the Sony Ebook Reader, whose advertising states that it can hold 350 books, and asks: "Unfortunately, how many people do you know who want to carry 350 books at one time?" calling this "phantom value" since in his opinion, there are very few people who would want so many books. My reply is: there are many more of these sorts of people than he would think. For those who are not interested in books, it will not matter if a reader can hold one or 100,000, but for those who are interested in books, it makes a tremendous difference. I believe that an ebook reader should not be compared to a single book, but to a library.

To continue, I really like the "Horsey Horseless Carriage" that he points out! Plus, one commenter on his blog mentioned that even though he loves printed books, he has discovered that he prefers reading ebooks to regular books. This post has made me reconsider, because although I feel guilty about it, I confess I have discovered this as well. But I want to understand this a bit better and I suspect it may have more to do with disliking certain book formats, especially concerning the hardbound vs. paperback.

A *well-made* hardbound book will open flat and will be easy to hold, turn the pages, etc. I think of "The Library of America" series in this regard, based on the beautiful Pleiade series, which is excellently produced and a joy to use. Most paperbacks however, are glued and therefore do not open nearly as easily and mostly cannot open flat, especially the fat paperbacks that are so popular today. (Dover paperback reprints are sewn, so those I love!) Handling one of those fat paperbacks can be very awkward and becomes tiring rather quickly, especially for bedtime reading. A quarto of almost any kind is practically impossible to handle and must eventually be used on a table. Of course, there are no troubles of this kind with an ebook reader, but there are other troubles.

All in all, this is a very interesting essay.

The video from the Onion is both funny--and frightening--for a librarian. I don't need to discuss it since its message is clear enough, but I don't want that to be the future of libraries.




Wednesday, March 17, 2010

RE: Browsing (was Re: [RDA-L] Contents of Manifestations as Entities)

Posting to RDA-L from Chris Beer:
http://www.mail-archive.com/rda-l@listserv.lac-bac.gc.ca/msg03371.html

Chris,

Thanks so much for the information, but I am looking at it another way. In my experience, when you are working with any tool, there are different ways of working with it: the way it was designed to work, or not. People who are experienced will use the tool as it was designed to be used; those who are not experienced use that same tool in different ways.

So, if you are using a power saw, someone with experience will use it in one way, while someone without experience will use that same power saw completely differently, probably incorrectly, and may saw off a body part or two. The experienced person may figure out some innovative ways of using that power saw, but there are still certain ways of using it that cannot be transgressed.

This is the way I view the library catalog: it is a tool, and its basic design occurred in an evolutionary manner over a long period of time. But, it has not changed in its particulars since it was codified primarily in the mid-19th century. The primary means of control are through the use of "consistency" in the creation of bibliographical description, and by the use of "organization" in the creation of what we always termed "access points," i.e. the formal places in the catalog where searchers could find a card (or an entry in a printed catalog). Therefore, if you decided not to make a card for a title with author main entry (which happened), that means that it was *impossible* to find e.g. Twain's Huck Finn under the title, unless a cataloger made a mistake.

It is only the organization of the cards in the catalog that makes a heading coherent. For example, only the organization of the cards makes a heading such as "Agnelli family--Homes and haunts--Italy--Villar Perosa--Pictorial works" less ludicrous than it seems when taken out of the catalog. When it is removed from the catalog as I did here, people will say, and I agree, that *nobody in the world* would ever think of anything in such terms. "Homes and haunts"???? "Pictorial works"???? But, when seen in relation to the other "cards" around it, the heading begins to make sense. See: http://catalog.loc.gov/cgi-bin/Pwebrecon.cgi?DB=local&Search_Arg=agnelli%20family&Search_Code=SUBJ_&CNT=100&hist=1 and do some browsing. You will see how the headings work together, with various subheadings for the family, then comes each family member, often with their own subdivisions, and so on and so on.
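To make the point about neighboring "cards" a little more concrete, here is a minimal sketch in Python of what a browse does: it files the headings in order and shows the ones that sit around the place where your search would fall. The function name browse and its parameters are my own invention for illustration, the sample headings other than the Agnelli one quoted above are made up, and real library filing rules are a good deal more involved than a plain alphabetical sort.

```python
import bisect

def browse(headings, query, before=2, after=3):
    """Return the headings filed around the point where `query` would fall.

    Assumption: a simple case-insensitive alphabetical sort stands in for
    real library filing rules, which are considerably more elaborate.
    """
    ordered = sorted(headings, key=str.casefold)
    keys = [h.casefold() for h in ordered]
    # Find the position in the filed list where the query would be inserted.
    pos = bisect.bisect_left(keys, query.casefold())
    start = max(0, pos - before)
    return ordered[start:pos + after]

# Hypothetical sample headings; only the long Agnelli heading comes from the text.
sample = [
    "Agriculture",
    "Agnelli, Giovanni, 1921-2003",
    "Agnelli family--Homes and haunts--Italy--Villar Perosa--Pictorial works",
    "Agnelli family--Biography",
    "Agnelli family",
    "Aged",
]

for heading in browse(sample, "Agnelli family", before=1, after=3):
    print(heading)
```

Seen this way, the long heading is no longer an isolated string: it appears directly under the shorter headings for the same family, which is the context a keyword result list throws away.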

There is a power shown here in the catalog that is not found in search engines. The experienced librarian understands how this tool is designed to work, but fewer and fewer patrons today know this; many fewer understand it, and their numbers are diminishing all the time.

I apologize for this lengthy discussion, but I fear that this sort of knowledge is being lost. For centuries, the premise of how our tool worked was based on browsing, because that was all anybody could do. The introduction of keyword eliminated this. The modern view is that the library catalog is a database, and while I agree with this I add the proviso: it is a card catalog stuck into a database, where a lot of things that worked fairly well before don't work very well at all now. These basic premises were never seriously rethought and catalogers continue to make headings that are designed to be browsed.

But we certainly agree about opening up the data for experimentation. This is the only solution. I think that the headings we make, even the crazy-looking ones, are still very useful, but something must change somehow. Making APIs is definitely the way to go, but that essentially means that the records will "be taken out" of the catalog into a new environment. As I tried to show, the result will be that the headings will look ludicrous, and this is not even mentioning how people can go about finding them. We must find solutions.

These are some of the tremendous changes of the possibly not-too-distant future that we need to be dealing with. And why I sort of think that discussing what are manifestations, items, expressions and works is a bit like sweeping the porch when the tsunami is coming.

Tuesday, March 16, 2010

RE: [NGC4LIB] OCLC and Michigan State at Impasse Over SkyRiver Cataloging, Resource Sharing Costs

Post to NGC4LIB

On the other hand, what percentage of the Worldcat visitors are librarians trolling for cataloging copy? Based on conversations with colleagues, this percentage could be rather high.

Although the absolute numbers on Alexa and Compete (didn't know that one) may not be trustworthy, it's the relative impact that impresses me.

The original point of this part of the thread was when someone wrote:
"Two reasons we want our holdings to display in worldcat.org: when someone is at an auction and considering a book appropriate for our collection, they would be able to tell in one search if we already had a copy, and if it was widely held; and we want our rarer holdings to be visible to researchers."

I replied that this was good, but the statistics show that very few people actually go to WorldCat, so therefore, if you want to be visible to researchers, it is no longer enough to put a record in your local catalog and send a copy to OCLC. This seems like a logical conclusion and I see no reason to revise it. If people on this list really want to believe that scads and scads of people we want to reach are going to Worldcat, that's fine with me, but it is certainly not my experience at all. I've even built tools that make searching Worldcat easier than most other catalogs, but very few of my patrons use it. Certainly very few understand it and when somebody asks me about Worldcat, I am surprised.

Again, I am not finding fault with anyone or anything. Worldcat only went live a few years ago so it is readily understandable that few people know about it, but since it is such a latecomer to the web, it may never catch up. People who are on Wikipedia (and I shall add Google Scholar) are definitely looking for information, so there seems to be some kind of relationship between them and the people on Worldcat.

Sitemaps in Google are great. I am just concerned that we change our traditional attitude of: put a catalog record into my own local catalog, throw a copy onto Worldcat, and it's done. No, it's not. Continue to do that, but realize that now it's only the beginning. We have to find where our patrons are and go to them. This is not an easy task, but absolutely imperative, in my opinion.

FW: Infrastructure for RDA

Posting to Autocat

On Mon, 15 Mar 2010 11:30:59 -0500, Suzanne Stauffer <stauffer@LSU.EDU> wrote:

>Speaking as one user, if the catalog is restricted to items owned by the library, that is EXACTLY what I want to see when searching for a work of fiction for recreational reading.
...

>I'm interested in knowing what you think our users want to see?

People should be able to see what is in the local collection, and that should be no problem at all. But, let's imagine it is 1 or 2 years from now and the ebook reader has taken off. They are becoming useful, and there is a great deal of interest among the business community (at last!) in creating the equivalent of the iPod for ebooks. (Not to toot my own horn, but I recently bought one and wrote a piece about my experiences, sent it to another list, and put it on my blog:
http://catalogingmatters.blogspot.com/2010/03/observations-of-bookman-on-his-initial.html
This may interest people)

When (not if) the ebook reader takes off, I will want to know (and many want to know right now) what is available for download, and at that moment, people will be faced with a staggering amount: hundreds of thousands (at least) of freely downloadable books in Google Books and the Internet Archive, plus lots from other projects, such as Gallica and Europeana, plus more ebooks and edocuments from think tanks, international organizations, and many, many other places. The numbers will probably skyrocket when the ebook reader becomes popular. There are big problems with this, as I try to outline in my blog post.

For another idea, see my previous post in this thread, where I experiment with a record to show how people could use the catalog record with dbpedia, which seems as if it would be much more interesting than what we have today. It makes me think that there should be a project that links our headings with dbpedia. That could be done now using id.loc.gov. If librarians and catalogers worked with dbpedia to make useful links to materials all over the place, then we would be creating something new and useful for society.

FW: [RDA-L] Contents of Manifestations as Entities

Posting to RDA-L

BEER,Chris wrote:
<snip>
Of course - browse is simply a single view of data, using a single type of abstraction layer (human readable in this case) to generate that view.
</snip>

I do think that browse is a bit more than that: it is the way people are *supposed* to search the system. It is the way the catalog was designed to function correctly. 99% of the control that librarians provide is based on headings. Browse searches make these headings much more comprehensible than simple keyword. For example, subject headings, with their many subdivisions, make sense only in the aggregate, and are designed to be browsed alphabetically (mostly). Uniform titles are the same, along with corporate bodies. Personal names, less so, but with personal names, the variants (4xx, 5xx) are critical to browse.

The problem is, the moment keyword became the dominant way for people to search (which was about 2 minutes after it was implemented), the traditional browse became stranger and stranger. Catalogers and other librarians caught on to this change very slowly, and some never at all. The undergraduates I work with now think browses are very, very weird. As a result, our catalogs, traditionally based on browsing cards, based in turn on printed catalogs, are becoming more and more distant from our patrons. Librarians never really reconsidered the function of the catalog--they just tacked on keyword and thought they were done.

The task is not to expect everyone to use the browse search again and teach/force them to do it, since this is impossible and retrograde, but to adapt the power of our records to the new environment where traditional browsing does not occur and never will again. We must accept that those days are gone forever. At the same time, browsing the headings is very powerful and something you *cannot* do in a search engine. Tools such as Aquabrowser have tried some new methods to a point, but I don't know if any has succeeded.

I like to think of these things in a different way: there were always big problems with browsing. It was never the greatest thing to do and it was always very complicated. How can we make it better?

Monday, March 15, 2010

RE: Infrastructure for RDA

Posting to Autocat

Following the same general idea as my previous post, the question of whether the FRBR/RDA displays are what our users *really* want, I decided to do an experiment using a record I just made, trying to interoperate with dbpedia, which is an "authority file" of sorts in use at Wikipedia.
http://www.galileo.aur.it/opac-tmpl/npl/en/pages/dbpediatest.html

The catalog record is only semi-functional, but its purpose lies in the headings. You will see the little icon that denotes "open another window" to the right of each subject. Using this, I made links into parts of dbpedia that seemed the most relevant to each subject.

The display of dbpedia is really overwhelming--there are a lot of variant languages that could disappear and the entire display could be made friendlier, but that is not the point of it since you can download the entire thing--yet there is some really useful information related to each topic, although the subjects are not really 1:1. Note especially the redirects, which function as UFs.

I couldn't find anything specific for the Abu-Ghraib prison.

People may also be interested in the RDF Book Mashup that works with Dbpedia http://www4.wiwiss.fu-berlin.de/bizer/bookmashup/. Its author has some examples, some of which work.

RE: [NGC4LIB] OCLC and Michigan State at Impasse Over SkyRiver Cataloging, Resource Sharing Costs

Posting to NGC4LIB
Laval Hunsucker wrote:
<snip>
I have no 'proof' immediately available one way or the other ( and would like to know what methodology, what valid and reliable research results, and what apt inferential statistics lie behind the declaration that "it can be proven that very, very few people use Worldcat or even know about it" ; maybe you have a quick reference at hand ), but I've for some years been constantly hearing disciplinary scholars saying in an off-hand fashion that their search in WorldCat yielded such-and-such a result. Isn't at least some of this dependent upon how visible the resource has locally been made, and how well it has been locally marketed ?
</snip>

If you look at the Alexa site, you can see the statistics:
Worldcat: http://www.alexa.com/siteinfo/worldcat.org
Yesterday 0.022 +10% (they had a good day yesterday)
7 day 0.0177 -3%
1 month 0.0164 +1%
3 month 0.0164 -7%

Therefore, about 0.016% of searches go to Worldcat.

Compare to LibraryThing:
http://www.alexa.com/siteinfo/librarything.com
Yesterday 0.018 -30%
7 day 0.0254 -0.7%
1 month 0.0259 -8%
3 month 0.0277 -1%

Therefore, LibraryThing has a much higher percentage of use than Worldcat. (0.027%)

Compare to Wikipedia:
http://www.alexa.com/siteinfo/wikipedia.org
Yesterday 12.51 -5.3%
7 day 12.97 +0.2%
1 month 12.975 -0.09%
3 month 12.748 +13.77%

Wikipedia gets a major number of hits.

Based on these statistics (which have remained pretty constant) it would make sense to conclude that a record placed into WorldCat will be available to the smallest number of people. Placing it in LibraryThing or Wikipedia would increase its use. The conclusion I make from this is that while we can go ahead and put our records into WorldCat, we shouldn't expect too much. We need other routes as well.
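Since it is the relative impact rather than the absolute numbers that matters here, the comparison can be worked out directly from the three-month figures quoted above. The following is only a small Python sketch: the dictionary and function names are mine, and it makes no assumption about what exactly Alexa's percentage measures, only that the three figures are comparable with one another.

```python
# Three-month figures exactly as quoted from Alexa above (percentages).
ALEXA_3_MONTH = {
    "worldcat.org": 0.0164,
    "librarything.com": 0.0277,
    "wikipedia.org": 12.748,
}

def relative_to(baseline, figures):
    """Return each site's figure as a multiple of the baseline site's figure."""
    base = figures[baseline]
    return {site: value / base for site, value in figures.items()}

ratios = relative_to("worldcat.org", ALEXA_3_MONTH)
for site, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{site}: {ratio:.1f} times the Worldcat figure")
```

On these figures LibraryThing comes out at roughly 1.7 times Worldcat, and Wikipedia at several hundred times.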

Perhaps this is unfortunate or sad, but it is a fact nevertheless. What about when we add Google Scholar and Google Books into the equation? A friend of mine at FAO of the UN mentioned that they had recently placed the AGRIS database (an agricultural database) into Google Scholar and the hits went up exponentially. This only makes sense.

<snip>
May well, but library-centric focus in itself needn't -- witness my experience, but also in principle -- disqualify an instrument as enduser-appropriate and enduser-used, at least in a full-blown academic environment. Other factors can play determinant roles.
</snip>

Agreed, but products that are objectively better die every day if you can't get people to use them. Libraries have their collections and these collections need to be used, so I don't care how people find out about my materials. I can suggest certain ways, but if people want to use their own methods, that's fine with me. This is the world that is changing in fundamental, and as yet, very unclear ways.

What is important is to save libraries (in whatever form they take) and the values of librarianship. We should not confuse this with maintaining "OCLC" and/or "WorldCat." Either one of these entities could disappear and libraries should be able to continue.

<snip>
Perhaps -- but what will then be left of librarianship after they have been really solved ? Of course, once *all* of the librarianship problems have been solved ( i.e. from the perspective of those outside parties who alone ( can ) lend the field its legitimacy ), librarianship will not exist, or at least need to exist. Shouldn't your consolation ( as an apologist/advocate ) be predicated on the thought that not all of those problems will ever be solved ?
</snip>

Interesting question. I think there will be many solutions, but no "ultimate solution." As soon as one "solution" is implemented, a dozen more will arise. When you solve these dozen, your original solution will have to be rethought.

Some may consider this to be the very essence of futility, but to me, it represents the idea of progress. As new ideas and capabilities arise, you must adapt yourself to them. Just as the card catalog solved a myriad of problems, and created a host of others, we are in a similar situation today. The various ways of producing, and even using, cards evolved, and so will the methods we devise.


RE: [NGC4LIB] Observations on ebook readers

Posting to NGC4LIB

Harvey Hahn wrote:
<snip>
The concern I wish to share is that of altering free public domain text without any obvious or external clues to that effect other than comparing the text with other sources.

A blatant example I recently became aware of is a work by the famous market trader W.D. Gann entitled "The Magic Word". The 2008 "Revised Edition" published by The Richest Man In Babylon Publishing CHANGED(!) the "magic word", substituted a different translation for the selected Bible verses contained therein, and omitted significant amounts of original text (and probably altered other text, too). Yet the title and author attribution remain the same (the author died in 1955, so he could NOT have "revised" this work). Someone who did not know of these major alterations might think this is essentially the original work with a few updates here and there, not a whole new work! (This publisher has done similar "editing" and reformatting of other Gann works, to the detriment of the originals, which depend on the exactness of language, pagination, etc.)

</snip>

You are absolutely right. But it would be a mistake to think that this does not happen with printed materials. It has since day 1. Normal library catalogers deal with these issues much less often than rare book catalogers, and especially, catalogers who work for antiquarian book dealers.

The very concept of "edition" is very different for each group. Actually, I think the discussion on Wikipedia describes it as well as anything I have seen http://en.wikipedia.org/wiki/Edition_%28book%29#Collectors.27_definition. I also suggest the excellent page at http://www.bookthink.com/0003/03beid.htm for a much more in-depth treatment of the terminology of "edition" in antiquarian terms.

The determination of "who really wrote what" was some of the work the scholars did at the Library of Alexandria, because often, they would have a text that would purport to be by, e.g. Aristotle and they would determine that it was not Aristotle, make up a persona and call it "Pseudo-Aristotle." Then they had to detail the textual variants and so on, which was all absolutely necessary to determine what Aristotle *really* wrote in a manuscript world.

But of course, it all continued into the world of printing. Early printed books show lots and lots of major variants, but even modern ones do as well. I'll bet that if you looked at the items in a library's collection that claim to be "copies," many would look quite different. The layout on the t.p. can be different; the array of dates on the t.p. verso can be quite different, so it would be logical to assume that at least some of the text is different too. But for us, so long as it fits into LCRI 1.0, that is, it has the same 245 abc, 250, 260 abc, 300 a, 4xx, it is by definition a copy, even though the text itself may be quite different, because we do not compare the text. Rare book dealers' bread and butter goes much deeper than this, though. Take a look at the points for a Harry Potter book at: http://www.fedpo.com/BookDetail.php/Harry-Potter-Phil-Stone to discover how they look for a true 1st edition. And while I may not care, a lot of others do and the difference can be amazing. Here is an example of how much money you can get if you find that real 1st edition! http://www.abebooks.com/servlet/BookDetailsPL?bi=1284941522. Wow! Maybe I would care after all! Signed editions go for much, much more.

Regular catalogers do not and cannot go into such detail, but we will have to find solutions to the problem you mention in this networked world. Still, there is this basic tension between, in FRBR terms, the work/expression (abstract) and the manifestation/item (physical). Normal catalogers assume that what they see in the 245, 250, 260, 300, 4xx describes the text (i.e. work/expression), and rare book catalogers do not. For example, I may catalog a book that says it is the text of "The Old Man and the Sea" but it starts out with "Call me Ishmael." As a cataloger, I would not bet very much money at all that the text from one "item" of a book is 100% the same as another "item," but yet they are considered to be the same "manifestation." Still, I would bet the house that the 245, 250, 260, etc. information is the same.

In the digital world, we have different possibilities available to describe the text by using automatic word counts, file compares, and other tools which could allow for a much higher degree of exactitude than ever before. Perhaps people will be able to get lists of differences between one text and another, a kind of automatic manuscript collation. Or maybe we'll say that a difference under 5% means it's a copy. Maybe we'll even bring back the idea of the autograph and archetypal copy.
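As a rough idea of what such a tool might look like, here is a minimal Python sketch using the standard library's difflib module. The function names, the word-level comparison, and the 5% cutoff are assumptions of mine taken only from the musing above; this is not an established cataloging rule or an existing product.

```python
import difflib
import re

def tokenize(text):
    """Split a text into lowercase words, ignoring punctuation and layout."""
    return re.findall(r"\w+", text.lower())

def difference_ratio(text_a, text_b):
    """Return a figure from 0.0 (identical word sequences) to 1.0 (nothing shared)."""
    matcher = difflib.SequenceMatcher(None, tokenize(text_a), tokenize(text_b),
                                      autojunk=False)
    return 1.0 - matcher.ratio()

def is_copy(text_a, text_b, threshold=0.05):
    """Hypothetical rule: treat two texts as copies if they differ by under 5%."""
    return difference_ratio(text_a, text_b) < threshold

def list_differences(text_a, text_b):
    """A crude automatic collation: (operation, words in A, words in B)."""
    a, b = tokenize(text_a), tokenize(text_b)
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    return [(tag, " ".join(a[i1:i2]), " ".join(b[j1:j2]))
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"]
```

Comparing whole books this way would be slow, and scanned texts bring OCR errors that would count as "differences," so a real tool would need something more refined; the sketch only shows that the basic comparison is within easy reach.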

It's really an interesting time!

RE: Infrastructure for RDA

Posting to Autocat

On Sun, 14 Mar 2010 14:48:23 -0800, J. McRee Elrod wrote:

>You are also right, I fear, that ILS's will do no better job of
>displaying RDA than AACR2, particularly since RDA gives *no* guidance
>concerning display. We are advising clients to stick as close to ISBD
>as possible for display.

I don't see how the RDA *display of the individual bibliographic record* will change at all from what we have now. (Pardons for using the obsolete "bibliographic record") Behind the scenes, most of the information will be imported from various other "records" (work/expression) and from the name and subject authority files, but the public display of an individual bibliographic record/instance/whatever will be substantially the same.

Where it will be different will be in the display of the various "manifestations." Instead of having separate and successive bibliographic records in a multiple display, similar to the card catalog, they will all be merged. The practical example I think that shows FRBR displays the best is FictionFinder http://fictionfinder.oclc.org/

If you look at anything in there (I am looking at the record for Stevenson's "Kidnapped") you will see the basic information, plus 468 editions in different languages and formats. You can arrange and limit these editions in all kinds of ways. They have all done a really great job and we can see more or less clearly the work, expressions, manifestations and items.

But we have to ask ourselves: is this what our users want? It seems to me this is what we are aiming for with FRBR and RDA, and if we achieve this generally for all of our materials, we can confidently claim success. Yet, would anyone outside of the library community consider such displays a success? If we achieve these kinds of displays, will we find our users coming back to use them? Ask yourselves: is this what you want when you do a search? That is, not as a cataloger, but as a user or researcher?

I have to say for myself: absolutely not. Such a display would never have helped me in my researches, and as a non-cataloger, such a display would probably be both obscure and frightening. In fact, I think these displays actually harken back to a much earlier time and are based on Panizzi's catalog. (I have been doing a lot of thinking about this and think it would be fascinating to reopen the discussion he began, now that we have some working prototypes based on his ideas)

To continue, I believe that users will find such displays very strange and of little utility. To me, the FictionFinder project shows very convincingly that the FRBR declaration that users want to "find, identify, select and obtain: works, expressions, manifestations, and items" is definitively wrong.

What do people really want? I don't know but there is a lot of research going on right now. Obviously, the way people use the web is changing all the time. But FRBR seems extremely suspect.

Friday, March 12, 2010

Observations of a Bookman on his Initial Encounters with an Ebook Reader

Post to NGC4LIB
Observations of a Bookman
on his Initial Encounters
with an Ebook Reader


A few weeks ago, I bought an ebook reader and I thought it would be interesting to record how I feel about it. In the interests of full disclosure, I will state that I am a hopeless lover of books, with several thousand of them in our small apartment. We buy at least a couple every weekend and lose very few to attrition, so in my more lucid moments, I realize that disaster looms ahead. Perhaps I should also mention that I very rarely write in a book since that is how I was brought up.

At the same time, I am currently the director of a small library in Rome, and most Italian libraries do not let you borrow books for home; as a consequence, since I left the US and in particular the magnificent collection at Princeton University, I have felt rather disadvantaged in relation to books. As my interest has always leaned toward history, I tend to emphasize the older publications more than most people. This is one reason why I have been so interested in the mass scanning projects of Google, Microsoft (now in the Internet Archive), Gallica, and the many other projects on the web.

While I have been overjoyed to see these wonderful books available online, it has also been terribly frustrating for me since I can see, and even download, volumes I find online--for free!--but I have not been able to read them on a regular computer screen because it is simply too hard on my eyes. Therefore, I have been intensely interested in ebook technology, and now, after several years of reading quite literally every review I could find, I decided to buy an ebook reader: not a Kindle, but a Sony. I've used it for a few weeks and can now begin to inquire: How do I feel about it?

In short, I absolutely love it. I have discovered that for the first time I can read--and enjoy--a digital book that I have downloaded. It turns out that I use the Internet Archive much more than ever, more than Google Books, but I have downloaded some beautiful publications from Gallica and other projects as well. I realize that the Google Books agreement with the publishers will be ratified eventually, and I want it sooner rather than later. Additionally, I am happy that I can finally stop wasting paper by printing off long web pages that I want to read but find too uncomfortable to read online. I simply make a pdf file and put it on my reader where I can read it anywhere, make notes, look things up in the dictionary, and so on.

Naturally, there are problems. First, I would prefer better contrast control, and since it is only gray on gray, some pdfs are better than others. For example, if you download an electronic book from the Internet Archive, you should choose the black and white pdf, and not the full-colored one. Also, the screen sizes of some pdfs make them more difficult to read and the zoom feature can be very awkward. As an example, I downloaded a beautiful 1600s edition of Burton's Anatomy of melancholy. The image is just slightly too small on my ebook reader to read comfortably and the zoom function needs improvement. There are a few other minor problems as well, but they pale in comparison with what I can actually do. I can even write in the books without any guilt whatsoever, and it turns out that as you highlight text and make notes, you create your own "common place book" by default!

Yet, the purpose of this essay is to discover how all of this has affected me personally.

The biggest surprise, and a very pleasant one, is that I am rediscovering the excitement I experienced when I walked into a large library the very first time. When I stood alone in the stacks of that first large library, I suddenly understood that I was surrounded by hundreds of thousands of the greatest books ever written and any of them were now available to me. (I have never felt the urge that I "absolutely had to get" the latest diet book or latest novel, so I have not found this to be a serious issue) Also, as I carry my ebook reader around, it is slowly dawning on me that wherever I go I can take dozens of books with me easily, and in just a few years, it could be the equivalent of my entire book collection. And in just a few more years, it could be the equivalent of the Library of Congress.

There has been another revelation about myself that has not been quite so positive: I have discovered how lazy I can immediately become. Already, I confess to a small twinge in the very back of my mind that finds it too "cumbersome" to go through the motions of downloading the book I want, taking out the wire and plugging things in, and transferring the file onto my reader, even though it takes only a few minutes altogether. All the while I am fully aware that getting the physical item would demand far more time and effort, that is, if I could get the item at all. In spite of this personal failing, however, I have found it to be a liberating experience.

An important sidelight, though, is that I am an "expert" in information retrieval and most people are not. Therefore, I know about the existence of the Internet Archive, Making of America, Gallica, Scirus, the Digital Book Index, and a host of other sites on the web where I can download ebooks and edocuments. I know a lot of the problems to be encountered when searching these sites: their advantages and disadvantages. I understand the different formats; I have some experience to help me know where I can probably find--or not find--a specific publication, and what is probably available, and what is probably not. Therefore, I have a sense pretty quickly that, e.g. Bury's Idea of Progress, or a novel of H.G. Wells is probably available for download somewhere on the web and where I would most likely find them. Perhaps most importantly, I know I will almost never be able to find what I want with a single search, but when looking for certain materials, with diligence I should succeed. Without that knowledge, I don't believe that I would feel quite so liberated.

In fact, when I have talked about this with some of my patrons, I fancy myself in a situation similar to that of Ainsworth Spofford, who worked at the Library of Congress back in the later 1800s. He just knew where all the books were in the Library of Congress!

"Spofford freely admitted that his approach to subject collocation had made both the 1869 printed subject catalog and the shelf classification scheme into "subjective" systems. But he claimed that the subjective nature of the shelf system did not matter as long as the speedy retrieval of items was accomplished. The latter was possible because both he and his "intelligent assistants" were so familiar with the idiosyncrasies of the system that they simply knew where things were."
See: Miksa, Francis L. The development of classification at the Library of Congress. Occasional papers (University of Illinois at Urbana-Champaign. Graduate School of Library Science) ; no. 164. p. 14. http://hdl.handle.net/2142/3957

Did Spofford and his people really know where all the books were? I seriously doubt it, but it's irrelevant, because although very few want to contemplate such matters at length, each person, no matter how dedicated or how intelligent, eventually retires and/or dies; today they will often move on to other jobs, but eventually all their knowledge simply disappears. This is why we build such things as library catalogs, so that others are not overly reliant on what exists in the head of only one or two people.

Thankfully, I am not fool enough to declare that I know where all the materials are on the Internet, but I do have some skills and knowledge that have made it much easier and I want to share those skills and knowledge with other people. Of course, others out there know many things I do not know and if they are willing to share everyone gains.

The next logical question is: Has a traditional library catalog helped me find any of these wonderful books on the web? No... and yes. Let me explain.

It has happened that I want something in the Internet Archive, but it is in a multi-volume set. Very rarely are there any contents notes telling me that a certain story or novel is in volume 16 of someone's 25-volume set of collected works. This is where I can go into WorldCat or the LC catalog where I hope to be lucky enough to find a contents note for the set, or perhaps the volumes will be analysed in such a way that I can know which volume contains the work I want. This way I don't have to waste my time, and the bandwidth, downloading lots of materials that I do not need.

To summarize: while I still want some improvements in my ebook reader, it has already made a difference in my life since I can now read materials that were, for all intents and purposes, frustratingly unavailable to me; I discovered that I am lazy (my mother would say that she already knew that long ago) and these materials are easier to obtain than ever; finally I have a suspicion that one of the main reasons I find the ebook reader so useful is that finding interesting materials to place on it is relatively simple for me.

So, what can I conclude from this? If the ebook readers improve and come down in price as they have in the case of mp3 players, then it is reasonable to assume that ebook readers will become as common and as varied as audio players are today. This could occur very quickly if the price comes down enough and people realize the wealth of materials that are available. From a personal point of view, I think something like this could be extremely good for society, as people discover that great materials are available for free. Therefore some of the greatest writings from the past that are the equal of anything written today could be read widely again by the general populace. I even fantasize that a "New Enlightenment" could take place as these great writings are "rediscovered" and they regain what I believe to be their rightful place in our society.

Is it so fantastical to imagine that someone looking for books to download will see that the latest Anne Rice novel can be had only for a high price, but discover that Laurence Sterne's works, or Bram Stoker's works, are all available for free? Which would you choose? Although I may be wrong, I think many, many people would opt for the free works, or at least give them a real try. And of those people willing to give them a try, some may even prefer the older works to the newer works. People may once again experience the power of the older works. Wouldn't it be amazing if Thomas Paine's incendiary Common Sense caught on anew in the popular imagination and modern governments again tried to ban it? Or what if someone "updated" it? In any case, I think it may be possible that older works, out of copyright, may play a more vital role in our society than before.

Of course, this is premised on people being able to find materials quickly and easily. It looks to me like there is a need for some kind of bibliographic tool somewhere.

But something frightens me in such a scenario. Once the scans of the millions of books in Google Books become generally available, when it eventually becomes possible to read any book on any ebook reader, and as more and more texts are made available, both free and for pay, the laziness factor may kick in. I don't believe I am all that much lazier than the next fellow (my mother may have disagreed) and I try to picture what I would do when I found an electronic version of a book on my computer at home. I also imagine that I am the greatest curmudgeon who ever lived and I have gone into ecstasies over how much I hate digitized books--that they could never, ever replace printed books and I would not allow one into my home, but here I am looking at a digital book that I want.

Even though I may dislike the electronic book to the depths of my soul, would I hate it enough to get up, find my keys, get into the car, drive to the library, park, go in, search the catalog, enter the stacks, hopefully find the book where it was supposed to be, check it out, then get back into the car and drive home? How many people will do all that just so they can hold the book in their hands for a couple of weeks? I confess that I would most probably be lazy enough to just take what I could get easily and have done with it.

Plus, I doubt very seriously if I would ever tell anyone what I had done.

If two major events such as these occurred at more or less the same time (the Google Books agreement is approved, and people like ebook readers as much as I do), the two together could prove to be the double whammy that makes the physical library a very lonely place. Although publishers have their concerns about selling digital books, sooner or later they will have no choice except to give the public what it wants. Almost everything published in the last twenty years or so has been created on a computer, and all the publishers would have to do is decide to make it available. If my scenario above holds, so that few people go through the motions to get a physical copy of a book that they can read online and usage of the physical collections declines, there seem to be few choices for physical libraries. Perhaps librarians would decide to try to keep their physical collections relevant and initiate new plans based on services such as Netflix. The library would deliver the book(s) directly to your home or office and you would be able to keep them as long as you want. Nevertheless, it seems to me that if this sort of thing happens, it will represent the death spasms of the physical library.

It should be obvious from the preceding, however, that I think this may happen in any case. While I may be wrong, I think that we can assume--based on previous experience--that ebook reader technology will continue to improve significantly and become much cheaper. Companies must and will fill that demand. Libraries must do the same thing.

Simple economic necessity will dictate a lot of it. Ebook readers are still between $200 and $500, but compare that to buying a regular book. For example, physical versions of Lord Byron's poetical works are available now for $25.99 on Amazon.com; electronic versions are available for $5.00. But there are other versions that can be downloaded for free. Does Amazon.com want its patrons to know that? Will any ebook provider help you to know that Byron's works are available for free when they are trying to sell them at a profit? Very few, I think. It must make book publishers shudder to think that the moment an author's works go into the public domain, there will no longer be a reason for anyone to pay for them. Ever. Also, what will happen when the Google Books agreement goes through?

No matter what happens, I think that the public will still need the help of librarians for a long time to come, but librarians must begin to broaden their focus from their local collections, beloved as they are, to the greater world that must become increasingly useful as these materials become easier to access and more pleasant to use.

To achieve this, librarians must begin to realize exactly what their skills are. The skills of managing a local physical collection may soon become similar to those of the traditional village blacksmith: skills absolutely vital to the workings of their communities--until the advent of the automobile. On the other hand, it is my fervent belief that information does not organize itself, no matter how hair-raising the "information retrieval algorithms" may become. People will always need help and human intervention. Today's amazing technology should not render humans into passive receivers of whatever is given to them, but humans should use the technology to advance their own needs and interests.

Conclusions
My initial observations lead me to assume that ebook readers will become very popular rather quickly. While this is an undoubtedly positive step, simply placing ebook readers in the hands of untrained persons could render them only more helpless prey to companies that will point them only to those items they will supply for a price, although the same works may be available for free elsewhere. These companies and organizations may have other political or moral agendas. This is not finding fault, only stating a fact about the realities of modern corporations. In such cases, it will be a paramount interest for these agencies to keep the choices of the general public limited to those that will benefit the agencies' own profit and interests.

I believe that this is exactly the area where both the ethics and the tools of librarians can play a pivotal role not only for our own patrons, but for all of society: that is, to help people know what is really and truly available to them, no matter where it happens to be--not just what is in our physical collections, not just what is in Google Books, not just what is in our expensive online databases, but what is really available. Of course, this is a huge amount of material and people will continue to require a lot of help to navigate it and use it wisely. We also need to let people know what is available to them even though it may conflict with our own personal moral values or political opinions; to let people know what is available to them with a sense of privacy, and without any monetary profit to ourselves. This is something people will not find from profit-making corporations. In short, librarians need to be the trusted information professionals that we are now and have always been, and we need to transfer these values into the ever-widening information universe.

What a great time to be a librarian!

To see further comments on this, click on the ebooks tag.

Thursday, March 11, 2010

RE: [RDA-L] expressions and manifestations

Posting to RDA-L
Karen Coyle wrote:
<snip>
Quoting Laurence Creider

> Is there a technical reason for your statement MARC is "not up to providing" the appropriate subfields? MARC21 certainly allows for indication of the thesaurus from which subject terms are taken, and presumably that could be extended to other fields as well.

There are a number of reasons. Here are a few:

1) there are only 36 possible subfields in every field. In many fields, there are none or at most one left to use
</snip>


This assumes we are stuck forever with ISO2709 records transferred using Z39.50. The moment we change to almost any other format, we have an infinite number of fields and subfields. For example, here is part of a MARCXML record (totally made up):
<datafield tag="700" ind1="1" ind2=" ">
  <subfield code="a">Jones, John</subfield>
  <subfield code="t">The tree frogs of Texas</subfield>

We can add a subfield:
  <subfield code="relation">b</subfield>
(b is a code defined as: "Has part" or "earlier version" or "based on" or whatever you want. If we want natural language text, we can do that too.)
  <subfield code="relation">HasPart</subfield>
</datafield>


We can't do this in our current MARC format since we are stuck with single-character subfield codes because of the limitations of ISO2709:

700 1\ $aJones, John$tThe tree frogs of Texas$relationHasPart

[theoretically, today we could add the entire UNICODE character set, but I doubt if a lot of people would want to add a subfield lambda λ or shin ש! In any case, there is little sense to expand an obsolete format]
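
To make the contrast concrete, here is a minimal Python sketch, offered as an illustration only and not taken from any MARC tool. It assumes the MARC 21 convention that a subfield code is a single lowercase letter or digit, which is where the figure of 36 comes from, and it uses the made-up multi-character code "relation" from the example above. The function names are hypothetical. As far as I know, the published MARCXML schema itself still limits codes to one character, so a record like this would need a locally extended schema.

```python
import string
import xml.etree.ElementTree as ET

# Assumption: MARC 21 subfield codes are one lowercase letter or one digit.
MARC21_CODES = string.ascii_lowercase + string.digits  # 26 + 10 characters


def build_datafield(tag, ind1, ind2, subfields):
    """Build a MARCXML-style <datafield> element from (code, value) pairs."""
    field = ET.Element("datafield", {"tag": tag, "ind1": ind1, "ind2": ind2})
    for code, value in subfields:
        sub = ET.SubElement(field, "subfield", {"code": code})
        sub.text = value
    return field


def fits_single_character_codes(subfields):
    """Return True only if every code is one of the 36 single-character codes."""
    return all(len(code) == 1 and code in MARC21_CODES for code, _ in subfields)


subfields = [
    ("a", "Jones, John"),
    ("t", "The tree frogs of Texas"),
    ("relation", "HasPart"),  # hypothetical multi-character code
]

field = build_datafield("700", "1", " ", subfields)
xml_text = ET.tostring(field, encoding="unicode")
print(xml_text)
print("Fits single-character codes:", fits_single_character_codes(subfields))
```

The XML serializer has no difficulty with the longer code; only the check for single-character codes rejects it, which is the whole point: the limit comes from the transmission format, not from the data.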

In fact, once we move beyond ISO2709, we could even do things that can interoperate with other formats, e.g. Dublin Core (for an analytic):
<DC.Relation.hasPart>
  <datafield tag="100" ind1="1" ind2=" ">
    <subfield code="a">Jones, John</subfield>
  </datafield>
  <datafield tag="245" ind1="1" ind2="4">
    <subfield code="a">The tree frogs of Texas</subfield>
    <subfield code="c">John Jones</subfield>
  </datafield>
  <datafield tag="300" ind1=" " ind2=" ">
    <subfield code="a">p. 34-85</subfield>
    <subfield code="b">ill.</subfield>
  </datafield>
  ...
</DC.Relation.hasPart>


This is just as easy with RDF or almost any other modern format. The number of codes and relationships will be endless and we can gain a lot of freedom once we dump that outmoded, obsolete ISO2709 format, which has fulfilled its function but is now holding us back. This does *not* mean that we must abandon MARC. Each bibliographic agency can add on its own sets of fields and subfields, so long as the XML is correctly defined.

Whether we need an endless number of codes, fields and subfields I do not want to discuss here. But I think people can understand why non-librarians see ISO2709 as a kind of straitjacket in today's world. A lot of those same non-librarians also conclude that MARC format is just as obsolete, but I disagree and believe that MARC can survive so long as we rethink it.

Wednesday, March 10, 2010

FW: [RDA-L] RDA requirements in LMS

An errant, off-topic post to RDA-L that should have been private. This is in answer to a question about Koha and open source.

Hello Su Nee (I hope I got your name correct!),

Koha 3.0 works with MARCXML now. You can see it in action at the John C. Fremont Library District http://jcfld.us.to/.

Again, open source is "free" but this does not mean there are no associated costs. For example, someone could say that they will give you a "free" house, and you may be happy but if they are only giving you all the wood, bricks, mortar, and so on, it still needs to be built. Some open source projects are like this; others are more advanced.

Koha has advanced significantly, to the point where you will have relatively few maintenance problems. Customizing it is actually the fun part and if you know basic web programming (HTML, JavaScript, style sheets) you can do a lot. If you don't have those skills, there is still a lot you can do, but these skills are easily and cheaply available everywhere now.

Suffice it to say that if you want to change something in Koha, it can be done without asking anyone's permission. With proprietary software, you must ask and wait, sometimes forever. As an example of what you can do, look at my catalog (based on Koha 2.2.7) http://www.galileo.aur.it/cgi-bin/koha/opac-main.pl which I have modified a lot. I made my own display and it works in different ways from other catalogs. For instance, I have managed to embed tutorials. One I suggest you look at is an overview of my catalog: http://issuu.com/j.weinheimer/docs/aurcatalog?mode=embed&viewMode=presentation&layout=http%3A%2F%2Fskin.issuu.com%2Fv%2Fcolor%2Flayout.xml&backgroundColor=61A900&showFlipBtn=true and then look especially at the Extend Search, which is used only in my catalog: http://issuu.com/j.weinheimer/docs/extendingthesearch?mode=embed&viewMode=presentation&layout=http%3A%2F%2Fskin.issuu.com%2Fv%2Fcolor%2Flayout.xml&backgroundColor=61A900&showFlipBtn=true

Another example: I managed to work with the Worldcat API to provide automatic citations, e.g. see http://www.galileo.aur.it/cgi-bin/koha/opac-detail.pl?bib=25256 and click on "Get a Citation"

It is only with open source that you can experiment in these ways. Otherwise, you can only wait and receive what the owners decide to give you. Try my Extend Search and let me know what you think.

Hosting your own web server (on a local machine) can be quite an experience. I host mine locally, and sometimes you get hit with spammers and so on and you have to deal with it yourself. These are matters beyond my capabilities, but there is a professor here who enjoys playing with Perl and Linux, so between the two of us, we have been able to deal with it.

But if you don't want to deal with these things, you can find someone else to host your site, for pay. I don't know how much something like that would cost, but probably not very much. There are some hosts that specialize in Koha, also.

I want to convert to Koha 3.0 but I have run into conversion problems and can't do it yet. If I could, I wouldn't waste a second!

The Extensible Catalog also looks very, very nice but I have no experience with it. http://www.extensiblecatalog.org/ It can work with Drupal, but there are lots of possibilities using plug-ins and add-ons with browsers like Firefox (also open source).

I hope this helps you.

RE: [RDA-L] expressions and manifestations

Posting to RDA-L
Concerning http://www.mail-archive.com/rda-l@listserv.lac-bac.gc.ca/msg03284.html


Karen has delineated the problem very well, but we should all just admit that *any solution* on these analytic-type records will definitely *not* be followed by everyone. I don't think that lots of libraries outside the Anglo-American bibliographic world would ever agree to use a 505 (although I personally like them!). The best we can do is to decide to help one another as much as possible. This is why I think the solution lies much more in terms of "open data." Someone on one of the lists suggested the TED talk of Berners-Lee (thank you, whoever you are!). I finally saw it last night, available at: http://www.ted.com/talks/tim_berners_lee_the_year_open_data_went_worldwide.html and I suggest that everyone watch this. (TED talks are very short. This one is less than 6 minutes, so it shouldn't take too much time.) What he demonstrates is something absolutely amazing, and it happened only because some agencies put their data in a place for others to take and share in different ways! I found it quite inspiring. How could this work with our data?

If there were an open way of sharing data, I can imagine that, e.g. Mac in Canada makes a record with a 505 note. It is placed into something like the Internet Archive. Bernhard in Germany is working, finds the record with the 505 and runs a very clever macro that he and his friends have made and turns the record into something more suitable for his purposes. Maybe it's not 100%, but even 70% will save a lot of manual editing. He places his version somewhere, so now there are two versions. We can probably see that there could be multiple versions rather quickly.
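
As a rough illustration of the kind of macro imagined here (not anything Bernhard or anyone else has actually written), the following Python sketch splits a basic 505 contents note into separate author/title entries. It assumes the titles in the note are separated by " -- " and that the note ends with a period, which real records often do not follow cleanly. The function names and the sample note are made up.

```python
def split_contents_note(note):
    """Split a basic 505 contents note into individual titles.

    Assumes titles are separated by ' -- ' and the note ends with a period.
    """
    text = note.strip()
    if text.endswith("."):
        text = text[:-1]
    return [part.strip() for part in text.split(" -- ") if part.strip()]


def to_analytic_entries(author, note):
    """Turn each title in the note into a hypothetical author/title entry."""
    return [{"author": author, "title": title} for title in split_contents_note(note)]


# Made-up sample data, in the spirit of the invented records above.
sample_note = "The tree frogs of Texas -- The toads of Texas -- Index."
entries = to_analytic_entries("Jones, John", sample_note)

for entry in entries:
    print(entry["author"], "/", entry["title"])
```

A real macro would have to cope with statements of responsibility, volume numbering and inconsistent punctuation, which is why getting even 70% of the way automatically would already save a lot of manual editing.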

Some other person, perhaps a non-librarian, wants to take all of these versions and merge them in another incredibly clever way and this person adds his/her own information. What would this be? Right off, I can think of a public, cooperative effort to input tables of contents, with links if possible. This would definitely be appreciated by everyone in the world. Now we are getting something absolutely new. At this stage, there will be a real desire for genuine cooperation since everyone can see how they can all benefit if they work together. Plus, it all happens while everyone is still helping one another in very concrete ways that everyone can point to.

Is this pie-in-the-sky? Definitely not. It is happening *right now* in other information communities, as Berners-Lee shows. And it has happened very, very quickly. The problem is deciding to take the leap and let our information--now seen in proprietary terms--into the world.

RE: [NGC4LIB] OCLC and Michigan State at Impasse Over SkyRiver Cataloging, Resource Sharing Costs

Posting to NGC4LIB

Lois Reibach wrote:
<snip>
Two reasons we want our holdings to display in worldcat.org: when someone is at an auction and considering a book appropriate for our collection, they would be able to tell in one search if we already had a copy, and if it was widely held; and we want our rarer holdings to be visible to researchers

> They're promoting it? How's that going, exactly?
>
There have been some webinars recently pushing the advantages of having a library's holdings visible in this setting; seems like I've also gotten some targeted emails, although I don't have an example at hand
</snip>

But this is exactly Tim's point: it can be proven that very, very few people use Worldcat or even know about it. So, if someone wanted to know whether "we already had a copy, and if it was widely held," they would *not* find out, because they would have to go to Worldcat in the first place, and Worldcat is virtually unknown outside the library community. The same holds for wanting "our rarer holdings to be visible to researchers." Therefore, when we place materials in Worldcat, we cannot logically conclude that we are making them visible to researchers.

Again, there is no need to fault anyone on this: this was never the mandate of OCLC, which was originally very library-centric and existed mainly to provide *cataloging copy* for *libraries* and some other services. The need to open it to the public arose only later. According to the Wikipedia entry http://en.wikipedia.org/wiki/WorldCat:
"In 2003, OCLC began the "Open WorldCat" pilot program, making abbreviated records from a subset of WorldCat available to partner Web sites and booksellers, to increase the accessibility of its member libraries' collections. In 2006, it became possible to search WorldCat directly at its website."

So, Worldcat is definitely a latecomer to the newer world of information, and it must catch up, if it can at all. It was never realistic to expect the world to come flocking to its door once it came online. Will it become much more important in the future? I doubt it very seriously because it retains its library-centric focus and will probably continue to do so.

Aside from this point, however, there is a deeper question about the fundamental purpose of a catalog in today's environment. If even Worldcat is having trouble making a dent in the information world, what does this portend for our own local catalogs? Especially when (not if) the Google Books agreement is eventually ratified? Perhaps even this month?! But if the judge says no, it's still only a matter of time.

I agree that holdings of a library should be "visible to researchers," but this is becoming far more complex a task than it used to be. Just making a record in the local catalog and throwing it into Worldcat is definitely not enough today. To compensate, there are many more avenues available to us than ever before.

I console myself with the thought that finding solutions to these problems could turn out to be one of the most fascinating eras in the history of librarianship!