Thursday, April 24, 2014

Re: [ACAT] Metadata 101 - Post-graduate training?

Posting to Autocat

On 4/17/2014 8:43 PM, J. Adalia wrote:
<snip>
... I'd love to get recommendations from you all about how to structure a self-taught program from beginner to where you think I ought to be and possibly some ways to obtain experience. Does anyone know of how a novice could gain experience even if it's not within a library?
</snip>

Metadata is a huge area, but traditional library cataloging is only one part of that and is undergoing serious changes right now. If you are interested in metadata as a bigger area, one of the hot places is in SEO, or Search Engine Optimization. What is that?

If a company or other organization has built a website and it only comes up no. 500 in the search engines, it may as well not even exist. This can mean the difference between life and death for a company. Your IT and advertising people will probably not have the slightest idea what to do. SEO experts do what they can to raise specific websites in a search engine's rankings. It is a field almost without any rules and can change by the day, so in many ways, it is the "Wild West" of the web.

While I have never worked specifically in an SEO job, I follow developments very, very closely because I think that it may represent much of the future of library cataloging (for better or worse). Still, I think that traditional library knowledge could be very important for an SEO practitioner, and it will be especially so if and when--as I think--traditional library cataloging begins to borrow some tools from SEO.

A couple of sites:
http://www.stateofdigital.com/start-career-seo/
a little older, but still pretty good:
http://seogadget.com/how-to-get-a-job-in-seo/

That's my opinion, anyway.

Monday, April 21, 2014

Re: [ACAT] Advantages of RDA - get rid of disbelievers

Posting to Autocat

On 4/18/2014 11:10 PM, MULLEN Allen wrote:
<snip>
It is unfortunate that I, a largely ignorant cataloger on the sidelines in a small city public library, feel a need to explain the goals and possibilities of RDA/Bibframe. It is likely I'm off-base in one or several respects, but this is my best take on it, and I really have not seen a clear, easy overview of this for catalogers to digest anywhere else, so I offer it. It is partial, but my time of this public service desk is up and I need to return to cataloging.
RDA needs improvement and RDA needs further development. But it's a pretty decent foray if one believes in the utility of library metadata. Whether it is sufficiently useful to stand up in a world of sophisticated algorithms, predictive results, user profiles, data mining, etc. is open to the future to answer. Those aren't our strengths though.
</snip>
The problem is not that I do not understand the goals and possibilities of RDA/Bibframe. I have gone to great lengths to show that I do understand them. I am guilty of an even more heinous crime: understanding and disbelieving--or at least not allowing myself to put my faith where there is no evidence, and my experience tells me that the goals of RDA/Bibframe do not deal with the problems I have seen--with my own eyes--that people have when they use the catalog, and with the problems I experience myself. I have worked with scholars and researchers and students from around the world, and done so in international institutions. I have even built tools using different technologies, some completely on my own that I realize were not very good, and better ones where I have played a supporting role.

Therefore, my skepticism does not come from lack of understanding, but from seeing that RDA/FRBR/Bibframe do not address the very real problems that the public is facing when they use the tools we make. And yet, I still believe in the utility of library (more specifically: cataloging) metadata. RDA concentrates on changing individual records, e.g. including an author's degrees and affiliations in the SR, spelling out abbreviations, adding the relationship info, without any concern for the trade-offs involved, the costs, or even finding out if this will make any real difference to the public. It is my belief, based on my experience, reading some of the relevant literature (not only from the library world), and discussions with others, that the main problems people have when using a catalog are not with individual records, but with how the catalog itself works and how it is structured. Some of this is not new at all, and we can find evidence in very early writings on the catalog. Other problems are new.

The public is being bombarded with outrageous amounts of information every single day (I keep referring to that talk about the IT person who is adding 8000+ records to his catalog every day, but this represents only a fraction of what the public experiences), and libraries feel under pressure to provide people with a single search box to search "everything" in one search (whatever the word "everything" means to a particular library). The single search can be achieved in different ways: by adding records to the catalog, like the IT person in the talk is doing, or through federated searching, but what is important is that for the user, the result will ultimately be the same. They will see all kinds of records thrown into the same pot and the very concept of authority control begins to disintegrate since 99% of everything the user sees will be made by non-library organizations.

These organizations will--at best--follow their own internal rules but not by any means AACR2 or RDA or even ISBD. In such a metadata mash, how can a well-experienced and knowledgeable librarian tell someone how best to search the writings by Mark Twain? Adding the relationship information becomes completely random as well. In that metadata soup with all kinds of records, standards and non-standards, what can you tell someone who wants Vladimir Nabokov only as a translator? It will all be hit and miss.

In such a situation--which is the situation our users face now every day--discussing the minutiae of RDA, e.g. relator codes, is as useful as washing your windows when a tsunami is approaching. This is only one problem that people experience with our catalogs. I have gone into long discussions of some of the other problems, but what I am trying to say is that the problems the public experience with the catalog are not with the cataloging rules. We can change them in any ways we want and they will still experience the same problems.

These problems will also not be solved by changes in format, since that is also not the problem. Changing from ISO2709 record transfer should have been done long ago, but there is no "miracle format" that will change anything for the user. With a more web-friendly format (i.e. a variant of XML), other entities will be able to take our records more easily, and while that is primarily a good thing, the underlying problems still remain. Strange headings that would never enter a user's head, such as "Characters and characteristics in literature", will be even stranger outside the library catalog environment.

Changing to an FRBR-type format will also not change the user's experience of the catalog. As I keep pointing out, anybody can do the FRBR user tasks now, and nobody cares. The XML format right now is so powerful that we can make these search results look and operate any way we would want to. Why can another format make any difference? In the card catalog, people had no choice except to do the FRBR user tasks, and when other tools arrived, they discarded the old ways immediately and never complained.

Implementing linked data may very well change the user's experience, but first we have to figure out what to link to (id.loc.gov? VIAF? dbpedia? Something new?) and by doing so we may be able to add information from Wikipedia or whatever, and that may or may not be useful to the searcher. In any case, it remains an article of faith that linked data will be the solution, and in spite of the hype, from what I have seen of linked data so far, I have seen nothing so incredible. Anyway, RDA and FRBR are irrelevant to linked data. All you need for linked data is a change of format and Bibframe should allow it, but the change in format should have been the first step and done long ago, by the early 1990s. I feel that linked data will be a part of the solution, but only a part.

I have explained it all yet again to show that I do have an understanding of matters, and why I cannot hold the faith that RDA and Bibframe solve the problems that people experience with the catalog. And yet, I say again that catalog information can be very useful for the public, but we must create tools that focus on their problems. We can only do that by learning more clearly what those problems are, and build tools that serve that purpose. That can be done only through studies of user behavior, and building tools whose success will be based on trial and error.

Unfortunately, I don't think that is going to happen. Instead, we are experiencing "RDA, FRBR, Bibframe! Damn the torpedoes! Full speed ahead!"

Re: [ACAT] Advantages of RDA - get rid of disbelievers

Posting to Autocat

On 4/18/2014 6:25 PM, MULLEN Allen wrote:
<snip>
These twin strategies - incorporating library metadata into the web and defining relationships so that users can follow their interests using structured metadata - build on the legacy and work of the cataloging community. Whether they are adequate or will be successful remains to be seen. I really don't care whether RDA developers made their business case to you, or conducted the user studies that you require, James.
</snip>

Allen,

I am not the one who demands a business case. These things happen whether we want them or not. Business cases always work themselves out, either before implementation or afterwards. This is a fact of life. It is best if it is worked out before, so that obvious errors and potential disasters can be avoided, and the fewest number of people are hurt. RDA hasn't cared about the costs of its implementation and has never done basic "customer research". That is just a fact and that report from the national libraries actually said it. I have only pointed out that this would never be allowed in a business environment, but for some reason it is allowed with libraries. Even with a solid business case, nothing is certain, but without one, ... well, we just cross our fingers and hope for the best. I am glad you have such faith, but I have seen too many ideas that people love go down the tubes.

It is also not you and I, or even librarians, who will ultimately decide success or failure: it will be those who are in control of the budgets. I have said time and time again that I believe libraries and catalogs can be very important in the new environment, but it still remains a chancy business.

Also,
<snip>
On last thing - whether you see it as a platitude or not, the work I do is for the convenience user, both as a cataloger and as reference staff. If you didn't see your work that way when you were employed as a librarian, that's unfortunate.
</snip>

I am employed as a librarian.

I understand that you intend your work to be for the convenience of the user; I think we all want that, but this is where we must look at matters from the user's side, not ours, and we have to look at it honestly. Is it true? The public has complained for years and years and years (if we don't know that, then we should), so should it be such a shock that when they find new tools that are easier, they leave our tools? This is what is happening. As only an elementary example, any cataloger should understand the absolute need for cross references, so that when someone searches "wwi" they will find the cross-reference to something they would never think of: "See: World War, 1914-1918". Simple enough and I should not have to explain to any cataloger how vital cross-references are for anyone who wants to search a catalog at least half-way decently.

But cross-references have never worked in an online catalog. While it can be claimed that they "work" today, they work only with left-anchored text searches, which is an unnatural search today. Also, subject arrangement is absolutely nuts in our catalogs. I have mentioned this plenty of times, where e.g. "United States--History--Queen Anne's War, 1702-1713" comes long after "United States--History--Civil War, 1861-1865". There are tons of problems, and I have written extensively about them. People don't search that way any longer. It's been this way for a long time now--at least a couple of decades, and yet, this is supposed to be for the convenience of the user? Pardon my skepticism.
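To make the subject-arrangement point concrete, here is a minimal sketch in Python. It assumes nothing about any real catalog system: the three headings are just sample strings, and start_year is a hypothetical helper that pulls the first four-digit year out of a heading. It only shows how a plain alphabetical file puts the Civil War before Queen Anne's War, while a sort on the dates gives the order a reader would probably expect.

```python
import re

# Sample subject headings, used only for illustration.
headings = [
    "United States--History--Civil War, 1861-1865",
    "United States--History--Queen Anne's War, 1702-1713",
    "United States--History--Revolution, 1775-1783",
]


def start_year(heading):
    """Return the first four-digit year found in the heading, or None."""
    match = re.search(r"\b(\d{4})\b", heading)
    return int(match.group(1)) if match else None


# Plain character-by-character filing, as in a left-anchored browse list.
alphabetical = sorted(headings)

# Filing by starting date; headings without a date go to the end.
chronological = sorted(
    headings,
    key=lambda h: (start_year(h) is None, start_year(h) or 0, h),
)

print("Alphabetical:")
for h in alphabetical:
    print("  " + h)

print("Chronological:")
for h in chronological:
    print("  " + h)
```

In the alphabetical list the Civil War files first; in the chronological list Queen Anne's War does. Which arrangement people actually want is exactly the kind of thing that would have to be found out by asking them.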

As catalogers, we should know that an informed user should be besieging the reference staff with questions about what the authorized forms are. Or do we think that 99% of the public is searching left-anchored text browses in our catalogs, or checking out the LC Authorities online? And yet, reference questions continue to go down drastically. Why? I have never seen an answer, but one reason seems logical enough: the public just doesn't understand these things anymore and goes elsewhere because their searches in library catalogs must be inferior (there are no cross-references) and we have never made our tools work for them. And when our tools fail, they have many other places to go.

As I said, if we really were so interested in the convenience of the users, we would be busting our b***s finding out what they want. But we don't.

Friday, April 18, 2014

Re: [ACAT] Advantages of RDA - get rid of disbelievers

Posting to Autocat

On 17/04/2014 23.51, MULLEN Allen wrote:
<snip>
RDA, nor any cataloging code, is about the convenience to the cataloger. Our work remains the convenience of the user.
</snip>
This is one of those platitudes we are taught in library school and is repeated over and over again, much as a mantra. The fact is, people have complained loudly about library catalogs from the beginning. They complained about card catalogs from their inception; they have complained about OPACs, and remember that catalog of Panizzi, that we idolize today? People hated it so much back in the 1840s that they managed to get nothing less than a Royal Investigation into it. Just because Panizzi happened to convince the members of the committee that he shouldn't be fired doesn't mean that people suddenly loved his catalog.

The library catalog was created to do the work that the library needed. That is why it exists. If the catalog did not do the work the library needs, it would have been discarded long ago. The only real difference today is that if people want to find information, they have multitudes of other options which did not exist in the antediluvian era, before the internet, when they had no choices at all.

If our work really is for "the convenience of the user" doesn't that mean that we should at least try to find out what the users want? And then design something that fulfills what we discover they want? RDA didn't do that research before implementation and doesn't do that now. I think it's a safe bet that if they did that research now, they would discover that people don't want RDA and would find the FRBR philosophical structure bizarre and unnatural--and unnecessary.

It is a fact that, in spite of the platitudes, library catalogs are for the convenience of the library. They always have been and they always will be. To think otherwise is to ignore what has been the public's reaction for at least more than a century.

And there is nothing at all wrong with that so long as we realize it. The real task should be to make the catalog into something that can fulfill the needs of the users, but we first need to find out what those needs are instead of mindlessly repeating the FRBR user tasks (which are actually the FRBR librarian tasks), or crossing our fingers and hoping against hope that just entering the world of linked data (heavenly chorus) will be the solution.

Re: [ACAT] Advantages of RDA - get rid of disbelievers

Posting to Autocat

Of course, being against certain changes does not mean that someone is against all changes. That is a dead argument. Still, there needs to be at least some agreement on what are the challenges facing us before any kind of changes should be considered, and I don't think there is much agreement at all.

As only one example, I think there is general agreement that we need to be able to create more records--at least if the 8000+ records being added per day (that no one seems to want to discuss) can be dealt with in any realistic way. This is nothing really new and was also the focus of the LC report on MARC record creation from five years ago "Study of the North American MARC Records Marketplace" http://www.loc.gov/bibliographic-future/news/MARC_Record_Marketplace_2009-10.pdf

Someone in one of these threads mentioned that RDA will make new record creation faster and easier. I would agree that this would be a positive development. I have seen this assertion several times but have seen no evidence for it at all and in fact, quite the opposite. With RDA, just the physical act of typing goes up substantially with typing out abbreviations in full and adding the author's degrees and home institutions in the statement of responsibility. Therefore, that cannot be the reason because it just adds up to more work.

Perhaps we are supposed to think that if we have a variant edition (expression or manifestation) of something already in the catalog, we can take the work or expression information that will exist. Yet, I don't see how that will make the slightest difference in the actual work of the cataloger from what we do today with copy cataloging. We can now take a record and derive a new one, changing/adding/deleting the information from the variant copy we find. How can this change with RDA/FRBR/Bibframe? This avoids the question of adding and--more importantly--figuring out the multiple relationships that are now supposed to be manually added, whose complexity sometimes seems to demand calling committee meetings!

Can someone please give me an example of how RDA/FRBR structures will make cataloging a variant edition easier for the cataloger than what we do today with copy cataloging?

I will agree that in a relational database structure, it could be more efficient for the SQL queries in the database, but that is almost irrelevant today. Computing speed is still proceeding at an exponential rate and anyway, relational databases are certainly not the latest technology today, and faceted indexes are far more efficient and effective, and those work on flat files.

I do not see how RDA, FRBR or Bibframe can ease the work of the cataloger.

Wednesday, April 16, 2014

Re: [ACAT] Advantages of RDA

Posting to Autocat

On 4/16/2014 7:47 AM, Hal Cain <hegcain@gmail.com> wrote:
<snip>
The alternative view is the pragmatic one: it's happened, we'll just do the best we can with it.
</snip>
I've been out for a few days, and just saw this thread.

The advantages of RDA are primarily theoretical: it is based on entities and attributes, so that the public can navigate the works, expressions, manifestations and items (WEMI) by their authors, titles and subjects (ATS). This is now being supplemented by adding more explicit relationships, so that the searcher knows exactly how two entities are related: e.g. that the person attached to a specific work is not simply an author but a thesis advisor.

To believe this, you must ignore a lot of the reality that is right in front of our very faces: right now if someone wants to, anybody in the world can navigate the WEMI by their ATS by using the new technologies that allow facets. All you need is a uniform title and everything works from that. If implemented correctly, you don't even have to know that much to be able to use it, and anything can display however you want it to. As an additional plus: all of the technology is not owned by some monopolistic company that can charge any outrageous fees it wants and you are locked in forever, but because of the essential goodness of some very talented people, it is all open source and is downloadable for free. Those are simple facts and anyone can convince themselves of their correctness in a couple of minutes.

Another fact is that if the public is to search for the specific relationships, e.g. find people as thesis advisors (but there are far more relationships among all of the entities), then it is a fact that none of it can possibly work until those relationships are added to the records--including those that already exist--otherwise people will be searching only the tiniest fraction of what is really available to them. To add those relationships to all the entities (the relationships among the WEMI and ATS are truly complex) will demand vast resources and will have costs, but no one has even suggested any kind of ultimate figure. What we have seen so far in RDA/FRBR implementation represents only the barest costs, and already beyond the reach of many libraries in this economic crisis we are going through.

I shall ignore the obvious question of whether people actually want to find, identify, select and obtain WEMI by their ATS, or if they actually want something else, but will just point out that it is folly to assume that it is what they want without evidence.

To get more of a grasp upon the reality, we should also try to see things from the point of view of the users. I think that a very good way to do this is to watch the talk by a library IT person "E-Books Do Not Exist (and Other Conundrums of Digital Asset Management)" https://www.youtube.com/watch?v=BJ5sSUHJagg but to consider this talk not so much in terms of libraries dealing with this information, but of users dealing with this information. In short, he says that the number of records he is putting into the catalog is 8000+ per day. I may be wrong, but would suspect that the average large library adds anywhere from 1000 to 2000 records per week. In a year's time, this would be the difference between

8000 x 5 (days per week) x 50 (weeks per year) = 2,000,000 per year
1000/2000 x 50 (weeks per year) = 50,000 to 100,000 per year
(or no more than 5% of what is being added to the collection)
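For anyone who wants to redo the sums with their own numbers, the same arithmetic can be sketched in a few lines of Python. The figures are the ones given above (8000 per day from the talk, and my own guess of 1000 to 2000 per week); the variable names are only for illustration.

```python
# Figures from the text: the talk's bulk load and a guessed cataloging rate.
records_per_day = 8000
days_per_week = 5
weeks_per_year = 50

bulk_per_year = records_per_day * days_per_week * weeks_per_year

cataloged_low = 1000 * weeks_per_year
cataloged_high = 2000 * weeks_per_year

# Cataloger-made records as a share of the bulk-loaded records.
share_low = cataloged_low / bulk_per_year
share_high = cataloged_high / bulk_per_year

print(f"{bulk_per_year:,} bulk-loaded records per year")
print(f"{cataloged_low:,} to {cataloged_high:,} cataloged records per year")
print(f"{share_low:.1%} to {share_high:.1%} of the bulk load")
```

Changing any of the three rates at the top shows how quickly the proportion moves.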

The speaker of this video mentions that since there is so much, he cannot know what records are being put in there--he cannot ensure any kind of authority control, cannot figure out what is and is not duplicated, which links work and which don't, etc. etc. etc. While this is obviously a serious matter for the library, I want to focus on the users. The speaker describes the situation he is in as "being squashed like a bug." I am sure he is right, but if he feels this way, what does this mean for the users? Also, we have to remember that these numbers do not in any way represent everything relevant that is available to the public. As an example, I am personally very interested in following the latest Ukrainian news, and the best place for me to find that information is on the web, not in a library, and certainly not in JSTOR, Proquest, Ebscohost, or anything libraries pay for. On almost any topic, there are wonderful materials on the web that are just as vital, interesting and useful for the public as many of the materials on our shelves--if not more so.

These comments are merely to introduce some of the facts that users and other non-catalogers have to deal with every single day. While Hal is right in one sense about RDA, "it's happened, we'll just do the best we can with it," it still doesn't mean that RDA is dealing with any of the very real problems people, and libraries, are facing. It was introduced through executive fiat, without a business case, and we are seeing the consequences of such a decision.

Sooner or later, the cataloging community is going to have to deal with it; otherwise, at a yearly rate of 5% of the whole, the catalog records that we make will constitute tinier and tinier proportions of the growing mass, eventually winking out of realistic existence in the information universe.

I am personally not so pessimistic: I believe there are many, many things catalogers and catalogs and libraries can do--and very important things at that--but first of all, they have to face some very obvious facts, and accept that any solutions must be both practical and sustainable. Concerning RDA and FRBR: do they offer real solutions to the very real problems that people are facing, and are those solutions practical and sustainable? What relationship do these records have with the far greater numbers of items available to the public through the entirety of what the library pays for, plus the even greater numbers of resources available through the web?

I and others have asked these questions repeatedly, but neither RDA nor FRBR has been very concerned with the stark realities that catalogers, librarians and the public all face. There are serious questions that need to be asked, and answered, not "which relator term do I use?" or "Does this author go with the work or expression?"

Friday, April 4, 2014

Re: Transforming non-MARC metadata to MARC to the library catalog

This topic, some threads on various lists, and a talk on youtube (that I suggest all librarians should watch): "E-Books Do Not Exist (and Other Conundrums of Digital Asset Management)" https://www.youtube.com/watch?v=BJ5sSUHJagg, have all really made me think.

Posting to various lists

The task of adding non-MARC metadata to the library catalog is absolutely huge and fabulously important--there is no doubt about that. It also doesn't seem to be discussed much, especially when compared to RDA, WEMI or which relator codes to use. But the challenge is exponentially larger and has the potential to sink the catalog, and some apparently think it already has.

Still, I question whether adding non-MARC metadata (or in other words, non-standard records) is really a new problem or not. Libraries have always had files to all kinds of materials that did not make it into the library's catalog: finding lists for archives, journal and newspaper indexes; lots of analytics, all kinds of subject bibliographies... this list can go on for a long time. It was always too much work for individual libraries to catalog everything that came into the library, so that isn't new. What is really different now is that these things can go into the catalog--that is, if you receive some kind of delimited format, it is possible to convert it into a type of MARC where you can load it into the catalog without the catalog noticeably blowing up. It can be done, but I question: should it be done? And if so, how should it be done?

Zillions of problems arise when adding them to the catalog. The problems of incorrect headings, as discussed by Julie, plus the numerous problems laid out in the youtube talk make me think that adding those records causes as many problems as it proposes to solve. Still, the public wants to be able to search "everything" in one search, and I accept that. Does that also mean we have to destroy the consistency in our catalogs that our predecessors (and me!) have worked so hard to maintain through the decades, and even longer in some cases? I don't think so. The power of today's systems could offer a solution. For instance, with federated searching, searching "everything" does not mean that it all must be in one database. The information can be almost anywhere, in any format.

There are several open source tools now, such as MasterKey (demo at http://mk2.indexdata.com/). In this demo, you are searching library catalogs, OpenCourseWare, Wikipedia, and PLOS. There is also Wheke http://wheke.org, made by an acquaintance, based on Drupal. Most of the documentation is in French, however. There are probably other similar tools as well. Anyway, these will search MARC and non-MARC databases all at once, then sort and even merge the records they find. The MasterKey demo is very impressive. Of course, when you install it yourself, you can decide what you want to search and how to do it.

What is the advantage? Well, in Julie's case with the "free" Dublin Core records, she would not have to put them into the catalog but she could put the records into another local database, perhaps a very simple MySQL one or something similar. It could be searched along with anything else she wanted using the federated searching tool, and the users wouldn't even know the difference. But the real advantage is: the records would be in their own database, you would have additional tools not available in a MARC database, and you could continue to update, edit, or completely overwrite them separately without worrying about how it affects your own catalog records. The result would be that a lot of the terrible headaches mentioned in the youtube talk would disappear.
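As a rough illustration of the idea, and not of how MasterKey or Wheke actually work internally, here is a small Python sketch. It uses two in-memory SQLite databases as stand-ins for the library catalog and a separate database of Dublin Core records. The table layout, the function names make_source and federated_search, and the sample titles are all assumptions made up for the example.

```python
import sqlite3


def make_source(rows):
    """Build an in-memory SQLite database holding (title, creator) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE records (title TEXT, creator TEXT)")
    conn.executemany("INSERT INTO records VALUES (?, ?)", rows)
    return conn


def federated_search(term, sources):
    """Query every source for the term, then merge identical hits."""
    merged = {}
    for name, conn in sources.items():
        cursor = conn.execute(
            "SELECT title, creator FROM records WHERE title LIKE ?",
            (f"%{term}%",),
        )
        for title, creator in cursor:
            # Records with the same title and creator are treated as one hit.
            key = (title.casefold(), creator.casefold())
            hit = merged.setdefault(
                key, {"title": title, "creator": creator, "found_in": []}
            )
            hit["found_in"].append(name)
    return sorted(merged.values(), key=lambda r: r["title"].casefold())


# Two separate databases; the records are invented sample data.
sources = {
    "catalog": make_source([
        ("Adventures of Huckleberry Finn", "Twain, Mark"),
        ("Life on the Mississippi", "Twain, Mark"),
    ]),
    "dublin_core": make_source([
        ("Life on the Mississippi", "Twain, Mark"),
        ("Mississippi River Folklore", "Example, Author"),
    ]),
}

results = federated_search("mississippi", sources)
for record in results:
    print(record["title"], "/", record["creator"], "/", record["found_in"])
```

The user sees one merged list, while each set of records stays in its own database and can be edited or replaced without touching the other. Real tools have to deal with far messier matching than an exact title and creator comparison, of course.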

And, in the spirit of sharing, if Julie were kind enough to let other libraries and catalogers use her MySQL database of those records, the entire workload could be shared out, for instance, updating headings or using URIs. Everyone would benefit.

Records for electronic documents are fundamentally different from records for physical items because they are all pointing to exactly the same files in exactly the same places. Although you may need specific permissions to access the files, that is a separate issue, and fully solvable as well. So, I don't see why each library needs separate records in its catalog that all point to exactly the same things--why not share it if you can, and do it efficiently?

This does raise other questions however, but I'll talk about those in other posts.

Friday, February 28, 2014

Re: [ACAT] Code s in character position 28 of 001

Posting to Autocat concerning the use of the "government publications" code in the MARC format

On 26/02/2014 1.48, john g marr wrote:
<snip>
First, let's start from the bottom instead of with generalities. Which ["very few"] publications by state university presses ARE "government publications" and why, and which state university presses are NOT affiliated with state universities?
With that, we can then establish some guidelines based on fact instead of mumbling about vagaries.
PS: I see that line "Treat an item published by an academic institution as a government publication if the government created or controls the institution" as being intended to distinguish such items from items published by private academic institutions (e.g. Bob Jones University), which brings up the question of whether such private institutions actually have MORE control (and censorship) over what their presses publish and whether that problem should be addressed.
</snip>
I think it's a little late in the day to start trying to figure out how to treat the government publications code. I wanted to check in "The MARC II Format. A Communications Format for Bibliographic Data" from 1968 to discover if it existed in the original format but unfortunately, neither Google Books nor Hathitrust allows me to see the text of the clearly public domain document. (As an aside, I have noticed that Google appears to be withdrawing several materials that used to be open. I have used proxies to check if this is only because I am in Italy, but it appears not to be the case) In any case, I am sure there has been that code for several decades.

As I mentioned in my previous message, I suspect that the code was introduced so that people could limit by "government documents". That would be good and useful. But catalogers immediately had the very legitimate question of: "What is a government document?", a vague and nebulous idea. As happens so many times, to answer this legitimate question, nobody ever went back to the public to ask them what they wanted, and instead, catalogers decided to do it "theoretically": a government document is something that comes from a government body. So, what is a government body? Again, the catalogers fell back on theory and decided:
"Academic publications- In the U.S., items published by academic institutions are considered government publications if the institutions are created or controlled by a government.
University presses- In the U.S., items published by university presses are considered government publications if the presses are created or controlled by a government (e.g., state university presses in the United States)."

Obviously, this makes the code useless, and it is certainly not what someone who wants to work with government documents has in mind. It is also quite a political statement to declare that anything that comes off a state university press, or from any other government-controlled academic institution, is a government document. Again, I don't think any cataloger ever researches their publishers to find out their relationship to the government. I never have and I won't do it.

This is one of those little points in the format that started with good intentions but became useless since there has been such wide variation in its implementation. We should either fix it, which would demand a huge amount of resources for no purpose, or consider eliminating it.

Unless there is no concern for consistency any longer. Then, we can just keep putting in a useless code.

Re: [ACAT] Online Encyclopedia

Posting to Autocat
This is about trying to catalog a multi-volume encyclopedia that is in the Internet Archive.

On 2/27/2014 4:33 PM, Lisa Romano wrote:
<snip>
Unfortunately, these are Internet Archive records.
</snip>

The IA arranges its materials in a way that is not all that library-cataloger-friendly. In library-cataloging terms, it mixes manifestations. Still, it contains some fabulous resources, e.g. Migne's Patrologiae cursus completus in both Latin and Greek. A set that would be outrageously expensive for a library to buy exists in the IA for free:
https://archive.org/search.php?query=Patrologiae%20cursus%20completus%20migne

Unfortunately, there is no overall page, and finding a specific volume can be difficult if not practically impossible. There is no way to find out what is there and what is missing. This isn't the case just for the Migne: there are zillions of booksets in the IA, some very large such as this one, but many (most?) lack any volume numbering, and I have discovered that even when a record says something is volume 13, there is absolutely no guarantee that the item is volume 13. Therefore, each volume must be examined.

Finally, the individual records in IA are structurally different from traditional library records. For instance, this one
https://archive.org/details/patrologiaecursu00hammuoft
mixes together different formats: 2 types of pdf, epub, kindle, DjVu, etc. In library cataloging terms, each format would be considered a separate manifestation and therefore each would get a separate record. So, the DjVu format would get one record, while the epub gets another and so on. What we see in the IA is a type of Expression record.

Kind of....

To make matters even more complex, in the future, additional formats can be made automatically from what exists now, so if someone at IA wanted to provide .html or Apple's .ibook format, or some new format in the future, that format could be generated automatically and there would be a completely different format to manage. And generated in the blink of an eye! Of course, in the IA record, there would just be an additional option in the left column.

Whew! But the materials themselves are nevertheless highly valuable to our patrons. What is the best way to control it? That's tough, but I think these sorts of resources should make us rethink our normal methods. To do it correctly would mean to catalog these materials so that all manifestations are together, that is: all of the epubs, all of the pdfs, the DjVus and so on. That is a complete rearrangement of what exists now and would take many many hours, perhaps months of horrible work and besides--is that really what the public wants?

In practical terms, if someone wants to add the IA collection of Migne to their library catalog--thereby saving their libraries probably tens of thousands of dollars--and have a record for the entire set so you can see what you have and don't have, should there be one bookset record for all of the pdfs, another for all of the epubs, another for all of the DjVus and so on?

While that would be "correct" in cataloging terms, it is also a ton of work, and anyway: does the public want materials brought together by format in this way? Clearly, the IA does not think so and believes that the public is far more interested in the contents of a resource than its format.

After all of this, my suggestion is: don't even think about cataloging the bookset "correctly". Do it in the most economical way that will help your users find the information they need, rather than the format they need. Create a separate html page that describes your encyclopedia, with links to the individual volumes that go to the IA mixed-format records, and then catalog the page you made. Our normal methods fall apart here and lead to too much needless work.
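For anyone who would rather generate such a page than type it by hand, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the function name, the volume labels and the page layout are all hypothetical, and the volume list still has to be compiled by examining each IA item yourself, since the volume numbers in the IA records cannot be trusted. The only thing taken from above is the IA identifier, and its volume label here is a placeholder, not a verified number.

```python
from html import escape

# Base of the IA "details" pages, as seen in the record URL cited above.
IA_DETAILS = "https://archive.org/details/"


def build_index_page(title, volumes):
    """Return an HTML page listing each volume with a link to its IA record.

    `volumes` maps a hand-checked volume label to an IA identifier,
    or to None when that volume has not been found in the IA.
    """
    rows = []
    for label, identifier in volumes.items():
        if identifier is None:
            # Show the gaps, so users can see what is missing from the set.
            rows.append(f"<li>{escape(label)}: not found in the IA</li>")
        else:
            url = IA_DETAILS + escape(identifier, quote=True)
            rows.append(f'<li><a href="{url}">{escape(label)}</a></li>')
    items = "\n".join(rows)
    return (
        "<!DOCTYPE html>\n"
        f'<html><head><meta charset="utf-8"><title>{escape(title)}</title></head>\n'
        f"<body><h1>{escape(title)}</h1>\n<ul>\n{items}\n</ul>\n</body></html>\n"
    )


# Hypothetical data: the labels are placeholders, not verified volume numbers.
volumes = {
    "Volume 1": "patrologiaecursu00hammuoft",
    "Volume 2": None,
}
page = build_index_page("Patrologiae cursus completus (Migne)", volumes)
```

The resulting string can be saved as an .html file on the library's own web server, and that page is then the single thing that gets cataloged.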

You could also involve others for design of the webpage you make, and anyway, it may be a good idea to involve other library staff. It is an interesting situation and they can see some of the problems we face.

Thursday, February 6, 2014

Re: [RDA-L] Re: Future of WEMI

Posting to RDA-L

On 2/5/2014 7:43 PM, J. McRee Elrod wrote:
<snip>
Most of us are doing manifestation records in MARC. There are no work records (unless you consider 100$a$t and 130 authority records to be such), Bibframe has no expression records (calling them works instead), It seems to me that in MARC we have MI, and in Bibfame WII (work/instance and I assume item). WEMI does not seem to be happening.
</snip>
There is still an entire host of questions concerning WEMI that need to be addressed. First, it seems that with Bibframe, the works and expressions will be merged. Also, who will "own" the W/E instances? Will they--along with all the W/E instances that will be made in the future--be in the public domain, or will libraries be expected to pay? Has that issue been solved or even discussed? I cannot imagine any organization agreeing to, in essence, give away all of its headings (i.e. W/E information) if there are not some kind of iron-clad guarantees somewhere. Too many libraries have already been burned by losing rights to their own materials in the digital age.

Finally, it remains to be seen whether any of this will have the slightest impact on the users, especially given that anyone is able to find WEMI in Worldcat now (as I have demonstrated several times), and this has had precisely zero impact on the public, on libraries, or on the world of cataloging itself. After we spend outrageous amounts of money converting and retooling each and every database and catalog in the world (and we are only at the very beginning of the costs), what exactly is the major change that people will be able to experience that they cannot experience right now: changes that will make the catalog more relevant to their needs, and will get them to appreciate library catalogs once again, other than merely for inventory control?

But these questions seem to be among those that nobody wants to discuss.