Thursday, December 16, 2010

Cataloging Matters Podcast no. 7: Search

Cataloging Matters #7:
“Search”


http://www.archive.org/details/CatalogingMattersPodcastNo.7Search

Hello everyone. My name is Jim Weinheimer and welcome to Cataloging Matters, a series of podcasts about the future of libraries and cataloging, coming to you from the most beautiful, and the most romantic city in the world, Rome, Italy. The topic of this installment? Search!

What is Search? Is it really so different from what everybody has always done, or is it just another example of serving up new wine in old bottles?

Before I begin, I would like to spend just a moment on a couple of grammatical peculiarities I have noted. If you do some research on this topic, you will soon discover that the term “search” is used without an article: not “a search”, not “the search”, just “search”. Also, authors rarely use the gerund form (i.e. “search-ing”) for this concept.

So, once again: what is search?

From the library point of view, there would seem to be clear parallels between this newer concept of “Search” and the traditional library/FRBR user task of “Find” from Find/Identify/Select/Obtain; nevertheless, it is Search that is getting an increasing amount of attention in our society. Yet, it is vital--for librarians especially--to understand that the two are quite different in their methods and in their goals. A lot of this difference has to do with user expectations and how they are changing, and it may give us some insight into how these expectations will change in the future. Personally, I believe that search, if it becomes widespread, as I think it will, may very well become an important political and even moral issue.

So what is search and what makes it so different from what people have always done?

Modern computer technology has made child’s play of some tasks that had been incredibly complex not so long ago. As only one example, Bing Travel http://www.bing.com/travel/ allows someone to search for airline tickets in multiple databases at once, and will even give you a prediction for the price you are paying, whether it will most probably go up or down in the future. [There is a link to this, plus everything else I discuss, from the transcript] At one time, this would have demanded a highly-experienced and well-trained travel agent but today, all of this can be done in just a couple of seconds by a layperson, who has had absolutely no training in how to do any of it. The obvious question is: How good of a job does Bing Travel do? Only an expert travel agent could make an accurate determination, but from what I have read, Bing Travel appears to be not all that bad.

Another example that I find simply amazing is the Google Public Data site. http://www.google.com/publicdata/home Google has partnered with various agencies such as the World Bank, the OECD, Eurostat, and others, to use the power of Google’s tools to create something genuinely new using data that remains on each agency’s site. Today, anybody in the world with an internet connection can do their own statistical analyses in vital areas of concern, using some of the most powerful computers that exist. Of course here, the obvious question is: do people know how to interpret this information? That is another issue, but the fact remains that everyone can actually work with the same data.

Even though these kinds of projects take advantage of some of the power of modern computing, they do not deal with search, and many see options that are even more subtle and far more intrusive. Depending on who you are, such options can be viewed in either a positive or a negative light. In essence, this newer concept of search foresees a time when the computer will automatically look for things that even you yourself are not looking for consciously. In other words, search will do all of the work. Isn’t that bizarre? How could something like that function in reality?

Let’s consider an example based on something we can all understand: a library catalog. Someone uses a catalog to find books on how money is divided among the population of the U.S. This person knows how to use a library catalog, goes to Worldcat, finds the subject heading “Income distribution -- United States” and is led by this subject to the record for Lisa Dodson’s book The Moral Underground: How Ordinary Americans Subvert an Unfair Economy. New York: The New Press, 2009. http://www.worldcat.org/oclc/320803437

Of course, using the traditional tools, this book can be found by other subjects that the cataloger has added, by searching the author’s name, by the title of the book, and if it is part of a series, by that title as well. These “access points” represent the FRBR user tasks of finding by author, title, and subject. This is also where FRBR pretty much stops, and if searchers want to continue, they are expected to repeat those same FRBR user tasks over and over again. But the alternate concept of search works quite differently. We can see one, very minor part of this new idea of search in the Worldcat record mentioned earlier, where, if we scroll to the bottom, we can see various “User lists” that have been created by individuals, and we can click on them. For example, this book is part of a list called “New Economics Books” created by someone named Joyline http://www.worldcat.org/profiles/Joyline/lists/636954, and this list includes several other books on the same topic that may be of interest to the searcher.

It is important to note that finding these books in this way definitely falls outside the FRBR user tasks, since these materials most probably have different authors, titles, and subjects than what the searcher originally utilized. But we need to admit that even this represents nothing fundamentally new since people from time immemorial have been recommending books and articles to one another. Normally however, people have known at least something of the person recommending a book: they may be a friend, a relative, a teacher, a journalist of a newspaper or magazine, or maybe even a person talking on Oprah Winfrey that you can see and hear.

In the case we are examining, we know nothing of Joyline since this person’s profile is private. Joyline may be an economics professor, a high school teacher, a librarian, a truck driver, a dentist, or even a teenage girl from Japan. Even if this profile were public, it could be completely falsified. The anonymity of Joyline as a book recommender is something rather new. This anonymity may or may not matter much when a searcher decides whether to read a book on one of these lists, but this remains to be seen.

I can’t resist a bit of self-advertisement at this point and I’ll mention the Extend Search that I have instituted in my own catalog at the AUR Library. In some of my postings, I have mentioned that I believe the information universe is composed of separate “intellectual microcosms”. These microcosms are defined when you choose a resource and then become aware of other resources related to your resource: perhaps books on similar topics, but there may be reviews, critical blog postings, public lectures and all kinds of resources surrounding this item you are looking at. My Extend Search is an attempt to make it easier for the public to find and enter those “intellectual microcosms”. An advantage of this can be seen with the book by Lisa Dodson mentioned earlier: that book is not in my library, but my searchers can nevertheless get into the “intellectual microcosm” and find all kinds of other resources related to it; in this specific case, these resources include a 75-minute public lecture the author gave on her book. The methods I employ differ somewhat from the traditional library searching methods, but nevertheless, I want to make clear that the Extend Search methods I employ are also not a part of this new concept of search.

OK, end of advertisement.

Although these tools are useful and allow some new options, none have much to do with the new concept of search. Search is built on metadata, but not necessarily the library-bibliographic type of metadata that librarians think of: titles, authors, series statements and so on; it is built on metadata about you, and your interests such as what websites you go to, what kind of documents you download, what you buy, what you spend time reading and all kinds of extremely subtle bits of information about you. It is also built on similar metadata about your friends, as well as about me and my friends, about everyone else and their friends, and relating it all together. Search attempts to figure out what you want by indexing your documents, following your movements on your computer, and doing a semantic analysis to determine your interests. Not only that, but it links all of your metadata to similar metadata taken from your friends on Facebook, Twitter, and other projects to build an overall profile of you, your needs and your wants.

Based on this deep and profound profile of you that the computer now has at its command, when you look for something in a tool that utilizes this information, e.g. on Google, the result can be tailored much more finely to what you actually want. So, if I am logged in to my Google account and look for “cat” in Google, it would know immediately that I am looking for the animal, whereas if a construction worker were looking, it would know he or she wanted heavy machinery. Or if I enter “chess”, it would know that I am primarily interested in the board game, whereas another person might be interested in the record label.
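For readers of the transcript who like to see such things spelled out, here is a minimal sketch in Python of the “cat” and “chess” idea. Everything in it is an assumption made for illustration: the `SENSES` table, the keyword lists, the two sample profiles and the `rank_senses` function are all hypothetical, and a real service would surely work with far richer data than a handful of weighted keywords.

```python
from collections import Counter

# Hypothetical senses for two ambiguous queries, each described by a few keywords.
SENSES = {
    "cat": {
        "animal": {"pets", "kitten", "veterinarian", "animals"},
        "heavy machinery": {"construction", "excavator", "bulldozer", "machinery"},
    },
    "chess": {
        "board game": {"games", "openings", "grandmaster", "tournament"},
        "record label": {"blues", "records", "vinyl", "music"},
    },
}


def rank_senses(query, profile):
    """Order the senses of `query` by their overlap with `profile`.

    `profile` is a Counter mapping interest keywords to weights;
    keywords missing from the profile count as zero.
    """
    senses = SENSES.get(query, {})
    scores = {
        name: sum(profile[keyword] for keyword in keywords)
        for name, keywords in senses.items()
    }
    return sorted(scores, key=lambda name: scores[name], reverse=True)


# Two hypothetical profiles built from past behavior.
librarian = Counter({"pets": 3, "games": 2, "openings": 1})
builder = Counter({"construction": 5, "excavator": 2, "blues": 4})

print(rank_senses("cat", librarian))
print(rank_senses("cat", builder))
```

The same query gives a different first result for each profile, which is the whole point: the search terms alone no longer decide the answer.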

Once again, these computers will have an extensive profile not only of me, but of everyone who is linked to me because it is building a web of everyone. But this is far from the end.

There is a site called Aardvark [http://vark.com], (why people insist on using these ridiculous names is beyond me!) and I still do not fully understand this site, but it apparently relies completely on the power of these personal profiles, so that when you type in a question, it will link you to a specific expert who can provide an answer. It does this by doing a semantic analysis of your search terms, also looking for your “friends” in tools such as Facebook or LinkedIn, using that information to find their friends, to find the friends of those friends, and so on, to finally link to profiles of those who can answer your question. The site claims that you will get an answer in a few minutes, although I have never tried it.
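The “friends of friends” part of this can be sketched in a few lines of Python. I stress that this is only a guess at the general shape of the idea and not a description of how Aardvark is built: the `FRIENDS` graph, the `EXPERTISE` profiles and the `find_expert` function are hypothetical, and the semantic analysis of the question is reduced here to matching a single topic word.

```python
from collections import deque

# Hypothetical social graph: who is "friends" with whom.
FRIENDS = {
    "you": ["ana", "ben"],
    "ana": ["you", "carla"],
    "ben": ["you", "dmitri"],
    "carla": ["ana", "elena"],
    "dmitri": ["ben"],
    "elena": ["carla"],
}

# Hypothetical expertise profiles, one set of topics per person.
EXPERTISE = {
    "ana": {"cooking"},
    "ben": {"cycling"},
    "carla": {"cataloging"},
    "dmitri": {"economics", "statistics"},
    "elena": {"economics"},
}


def find_expert(asker, topic, max_hops=3):
    """Breadth-first search outward from `asker` through the friend graph.

    Returns (person, hops) for the closest person whose profile lists
    `topic`, or None if nobody within `max_hops` qualifies.
    """
    seen = {asker}
    queue = deque([(asker, 0)])
    while queue:
        person, hops = queue.popleft()
        if person != asker and topic in EXPERTISE.get(person, set()):
            return person, hops
        if hops < max_hops:
            for friend in FRIENDS.get(person, []):
                if friend not in seen:
                    seen.add(friend)
                    queue.append((friend, hops + 1))
    return None


print(find_expert("you", "economics"))
```

Because the search moves outward one ring of friends at a time, the first expert it finds is also the one closest to you in the network.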

But wait! Did I say the word “search”? What I have outlined so far is only the palest vision of what many want. The latest ideas are to get rid of search-ing (not search) completely! Well, not completely; of course, it’s more subtle than that. It’s just that you won’t have to do any search-ing anymore.

One example of this is the popularity of the new applications, such as what you can find at the Apple App Store. I do not own a phone that allows applications, so I only know this through reading articles, but if you own an iPhone for example, you can download special applications that will do all kinds of things, from keeping up with the latest news on topics of your choice such as movies or sports scores, to maps and directions, to social interaction, and on and on. This way, you can keep up with everything of your choice with practically no effort and without searching anything. If you decide you need some kind of information and are lacking it, you will just download a new app. How would this work? I could imagine someone could download a Fine Arts app, which would bring you the information on fine arts, or a Music app, which would bring you the information on music, or a French Renaissance music app, each more or less specific, and which could allow you to configure them to your needs, as expert, undergraduate, high school, interested layperson, or whatever.

But even this is not the end. There is also the idea of persistent, implicit search. This means that in the background, the computer will be running, searching, and analyzing constantly, using your profile, which is being constantly updated and refined, and in this way, the computer can actually interpret your needs. Let’s imagine that you have spent the last few days searching for a new refrigerator. The computer has logged everything you have done, analysed the kind of refrigerators you happened to like by noting how long you looked at each one, and compared the similarities of the majority you looked at, or something like this. Then, it continues to seek out information for you even though you may have never asked it to, but that is how your profile works. The goal of persistent, implicit search is that when you are walking or driving down a street, using a GPS system through your automobile or cell phone, this entire system would alert you that a few blocks away, a refrigerator you would like is on sale at the best price within a 500-mile radius, and here are the directions.
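The refrigerator scenario can also be sketched in Python, under some loud assumptions. The viewing log, the offers, the feature names, the 100-second threshold and the functions `preferred_features`, `distance_miles` and `best_nearby_offer` are all invented for illustration. The only part that is not invented is the haversine formula, a standard way of computing great-circle distance.

```python
import math
from collections import defaultdict


def preferred_features(views):
    """Weight each feature by the total seconds spent viewing products with it."""
    weights = defaultdict(float)
    for features, seconds in views:
        for feature in features:
            weights[feature] += seconds
    return dict(weights)


def distance_miles(a, b):
    """Great-circle distance between two (lat, lon) points (haversine formula)."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 3958.8 * math.asin(math.sqrt(h))  # 3958.8 = Earth's mean radius in miles


def best_nearby_offer(position, offers, weights, radius_miles=500.0, min_score=100.0):
    """Return the cheapest offer within the radius that fits the profile, or None."""
    candidates = []
    for offer in offers:
        score = sum(weights.get(feature, 0.0) for feature in offer["features"])
        near = distance_miles(position, offer["location"]) <= radius_miles
        if score >= min_score and near:
            candidates.append(offer)
    return min(candidates, key=lambda offer: offer["price"], default=None)


# Hypothetical log: (features of the product viewed, seconds spent looking at it).
views = [({"french-door", "steel"}, 120), ({"french-door"}, 90), ({"mini"}, 5)]
weights = preferred_features(views)

# Hypothetical offers; A, B and D are in Rome, C is in New York.
offers = [
    {"name": "A", "features": {"french-door"}, "price": 900, "location": (41.90, 12.50)},
    {"name": "B", "features": {"french-door", "steel"}, "price": 850, "location": (41.95, 12.55)},
    {"name": "C", "features": {"french-door"}, "price": 500, "location": (40.71, -74.01)},
    {"name": "D", "features": {"mini"}, "price": 100, "location": (41.90, 12.49)},
]

here = (41.9028, 12.4964)  # somewhere in the center of Rome
offer = best_nearby_offer(here, offers, weights)
print(offer["name"], offer["price"])
```

Note that nobody typed a query here. The only inputs are the log of what you looked at and where you happen to be standing, which is exactly what makes the search both persistent and implicit.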

Although many of the examples are presented in a marketing or shopping sense, it is pretty easy to imagine uses in more educational and informational settings. In fact, the future is here today, right now, and even, for free! Today, everybody can download their very own copy of Mendeley http://www.mendeley.com/, which purports to do exactly what I outlined before except it is in the field of scholarship. After installing Mendeley, you add your documents to it and Mendeley does the work of semantic analysis to figure out what your interests are. Of course, in the process it will dig out the citations from your documents, and if it can’t find the citations within the documents, Mendeley will go out on the web and find them for you. You can then go online to share by joining groups of similarly-minded people, but none of this is all that new.

The new part is that while you are searching for information you want, Mendeley is learning all the while and will search all kinds of databases for you automatically, using the profile it has created and is constantly updating, to find resources it considers relevant to your needs, and it will even show you the latest trends in research! It does this by analysing you, creating your profile and comparing it to other researchers’ profiles, to better figure out what you want. Of course, Mendeley is as yet very new and still has a long way to go.

So in this way, the goal is that you won’t even have to search anymore because the computer will do it all for you automatically, silently, persistently and implicitly. I mean, people can and will continue to do searching, but the idea is that they won’t have to anymore because the computer will have done it for them already, and will have done an even better job. You will do a search only when you think the computer hasn’t worked well enough.

For those who have read some of my postings where I have discussed “find” in the FRBR user tasks, and have mentioned that I am not sure if “find/identify/select/obtain” is what people are doing now, and what they will be doing, this is primarily what I have had in mind. With search, a tremendous amount of “search-ing” will still be going on behind the scenes; in fact, there is so much “search-ing” that I believe it really does turn into something new and consequently justifies the separate term “search” without the article “a” or “the”, as I pointed out at the beginning of this podcast. But it remains to be seen how much similarity there will be with “find” in the traditional library/FRBR sense; that is, if there will remain any similarity at all. I am not sure how to answer this, but in any case, it is clear that the future of “search” is only very remotely connected with the library ideas of “find/identify/select/obtain” or with “authors/titles/subjects”. Search goes far beyond these traditions.

Will “search” become predominant as the new information environment develops? Of course, it is impossible to tell since even newer capabilities may become available, but search is one of the only really serious attempts I have read about that tries to deal with information overload, which is a serious problem already and can only get much worse in the future. Its promise of incredible simplicity and ease is also a point in favor of its adoption by the general populace.

It is difficult to imagine that anyone, least of all any librarian, who comes into contact with a profound change such as this will not have at least some opinion about it, and I am no exception. Personally, I do like the idea of getting away from a lot of the drudgery of research. I also like the idea of having more chances to come into contact with other scholars who share interests and communicate with them because this can often be very difficult.

Yet, the idea of a machine that silently collects all of this information about me, collates it, summarizes me to extract my “needs” that perhaps even I myself may not realize consciously, and then to search--persistently and implicitly--in a whole variety of places, makes me very uncomfortable. Perhaps this is because of the way I was raised; or perhaps I am just of another generation and those who are more comfortable with the Facebook-type “let it all hang out” mentality will have fewer qualms about it, I don’t know. Machines are storing vast amounts of information now, no matter how we may feel about it. For those who have Google accounts, take a look at your Web history if you haven’t yet. If you have never seen it, you might find it quite enlightening and perhaps even highly alarming. “They” (whoever “they” are) have a lot of information on you!

But another problem I have is that making everything so incredibly easy, with all of this information simply falling into your lap with little or no effort at all, would seem to be terribly numbing, and it reminds me of the story of the Land of the Lotus Eaters. For those who do not remember, this event takes place in Homer’s Odyssey, where Odysseus and his men are coming home to Ithaca. Just before he meets the Cyclops, he discovers a land where people eat flowers. I quote from Samuel Butler’s translation:
“...on the tenth day we reached the land of the Lotus-eaters, who live on a food that comes from a kind of flower. Here we landed to take in fresh water, and our crews got their mid-day meal on the shore near the ships. When they had eaten and drunk I sent two of my company to see what manner of men the people of the place might be, and they had a third man under them. They started at once, and went about among the Lotus-eaters, who did them no hurt, but gave them to eat of the lotus, which was so delicious that those who ate of it left off caring about home, and did not even want to go back and say what had happened to them, but were for staying and munching lotus with the Lotus-eaters without thinking further of their return; nevertheless, though they wept bitterly I forced them back to the ships and made them fast under the benches. Then I told the rest to go on board at once, lest any of them should taste of the lotus and leave off wanting to get home, so they took their places and smote the grey sea with their oars.”

What was actually a death sentence seemed, to those sailors bedazzled by the lotus, to be everything they could ever want, and all they had to do was reach out and eat the lotus. But those under its influence couldn’t even be aware of the dangers it held.

It seems to me that relying on such a “computerized-lotus”, which searches for what it determines you need before you even know it and presents the results to you before you have even thought about any of it, very possibly in overwhelming quantities and complexity, would be the very prescription for how to kill curiosity. Of course, it is easy to expand such a scenario and imagine the spectre of some silent, ruling cabal behind everything, leading everyone to the resources they want people to see, and we can picture ghoulish visions from the film “The Matrix”. Although such images are exaggerated, I believe dangers are looming even without them, and in any case, it demonstrates that the management of information really does have the potential to become a powerful political tool, especially in a world such as ours is becoming.

So to be frank, I am personally horrified by many of these possibilities and remain profoundly skeptical. For instance, although I have installed Mendeley on my computer, I still haven’t added any of my documents to it! Perhaps that is silly, yet I realize that skepticism, fear and even repugnance are natural reactions when someone confronts profound change. During the early days of printing, good Catholic folks were deeply shocked by some of the publications coming off the new-fangled presses. Although we may laugh and mock them today, such reactions are natural and easy to understand. Today we are witnessing similar reactions on the part of individuals, organizations, and even governments, to what they are seeing on the Internet.

I just hope that I can learn from the struggles of those people from the early days of printing as they tried to come to terms with what they saw, and let their experiences help me discover where the real problems lie. One point I am trying to learn: it is useless to get angry and try to clamp down on the changes we fear and perhaps even abhor. Stopping the changes doesn’t work and history takes a very dim view of you and your reactions.

At the same time, I admit I may be completely wrong and it may turn out that innovations such as search may actually free the human mind from millennia of unnecessary mindless toil and allow humanity to experience a new Renaissance and Enlightenment all at once. Let us hope so.

It seems to me that search may very well represent a Darwinian challenge in the information environment that will force such deep and lasting change that it will prove to be evolutionary. I don’t think there is the slightest possibility of rolling things back to a former time, which I think most will agree was not really “better” at all, so there remains little to do but adapt to the new circumstances or die. Will a change toward a universal acceptance of search have an impact on libraries? Of course it will! I believe libraries are feeling the lightest initial impacts of search already, but of course, search itself is still in its infancy. Libraries will have to adapt to this in some way as well, or I think they will be fated to be discarded as anachronisms.

I wish I could offer some useful suggestions, but with new capabilities such as search--which will continue to develop in ways that are unpredictable--it seems everyone is entering virgin territory, whether they want to or not. All I can possibly suggest for librarians is to keep in mind the ALA Code of Ethics, especially those parts about trying to uphold “intellectual freedom” and not advancing “private interests at the expense of library users.” http://www.ala.org/ala/issuesadvocacy/proethics/codeofethics/codeethics.cfm

For those who are interested in some philosophical reflections, I suggest a thought-provoking talk by the journalist Frank Schirrmacher from Germany available on the Edge website, where, among other things, he discusses these concerns in relation to free will or determinism and possible political ramifications. http://www.edge.org/3rd_culture/schirrmacher09/schirrmacher09_index.html

I would like to close with a wonderful piece by Andrea Falconieri, his Ciaccona, performed by the group Sonatori de la Gioiosa Marca from Treviso, a town in northern Italy. Falconieri worked around Italy and Spain in the early 1600s before dying from the plague. One episode from his life I found curious was that he lost his job at the Santa Brigida convent in Genoa, because the Mother Superior decided that his music was too unsettling for the nuns! Perhaps you’ll understand from this piece. http://www.youtube.com/watch?v=V3SrHzY0d8Q

That’s it for now. Thank you for listening to Cataloging Matters with Jim Weinheimer, coming to you from Rome, Italy, the most beautiful, and the most romantic city in the world.

3 comments:

  1. Discovered your blog on Autocat. Enjoyed this posting very much (gave me a lot to think about) and the beautiful music was an added bonus.

  2. Great! I am glad that I could listen.

    1. Excelent, very interesting podcast.
