Tuesday, December 7, 2010

FW: [NGC4LIB] Cablegate from Wikileaks: a case study

Posting to NGC4LIB

Jonathan Rochkind wrote:
<snip>
Ranking in search results anywhere, as with all software, is based on defining certain rules that will be followed to order the results.
Even in Google, citation (link) analysis is actually just one small part of the rules; it's the one they are most famous for, because it's the big innovation that let their search results satisfy users so much better than the competition when they first started. Even then it wasn't the only component in the rules, and since then Google has added all sorts of other tweaks too. The details are carefully guarded by Google, but if you search around (on Google, heh) you can find various articles (from Google and from third-party observers) explaining some of the basic concepts. Commercial online businesses are of course especially interested in reverse-engineering Google's rules to get their pages to the top.
...
</snip>
Again, quite interesting comments. As you point out, there are many ways that Google ranks pages, based more and more on the nebulous concepts of "trust" and "authority" (for a quick overview, see "SEO TIP: Goodbye PageRank, Hello Trust and Authority" http://www.abraxasweb.com/blog/2010/05/29/seo-tip-goodbye-pagerank-hello-trust-and-authority/), and there is a lot of discussion among webmasters about how to raise their "trust" and "authority". (Of course, this is all according to Google's own definitions of "trust", "authority", "relevance" and so on, which are *quite different* from the popular understanding.) For example, which sites get penalized? See http://www.seoco.co.uk/blog/pr-penalised-authority-sites/
 
Of course, this is just the tip of the iceberg, and it is all done in secret. In any case, it is clear that Google tweaks its results for political reasons (Matt Cutts himself, in his exchange with John Battelle, admitted changing the rankings in response to the NY Times article: http://battellemedia.com/archives/2010/12/in_googles_opinion), and individual webmasters are trying to raise their rankings all the time. As Google tries to "correct" for this, there are consequences in all kinds of other areas. The same thing has happened in Google Maps, as I discussed in another post: http://comments.gmane.org/gmane.education.libraries.autocat/34453

Do libraries redo their arrangements and so on for political purposes? After all, there is a code of ethics that we are supposed to follow, but it is a difficult one. Yet some librarians do refuse to bend to such pressures, even at personal risk. What is our role in all of this?

I really like Daniel Lovins' observation that "another major difference is that Google's search algorithm is a carefully-guarded secret. Library search algorithms, by contrast, especially when harnessed to open source search engines like Lucene/Solr, can be verified by others for accuracy and objectivity".
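
That difference can be demonstrated quite literally. As a minimal sketch (in Python, against a purely hypothetical local Solr core named "catalog", with an invented query and field; a real catalogue would have its own host, core, and schema), one can ask a Solr-based catalogue to explain exactly how it arrived at each score:

    # Minimal sketch of inspecting a Solr-based catalogue's ranking.
    # Host, core name and field are placeholders for this example only.
    import json
    import urllib.parse
    import urllib.request

    SOLR_SELECT = "http://localhost:8983/solr/catalog/select"  # hypothetical core

    params = urllib.parse.urlencode({
        "q": "title:cablegate",   # invented query against an invented field
        "rows": 5,
        "wt": "json",
        "debugQuery": "true",     # ask Solr to explain every score it assigns
    })

    with urllib.request.urlopen(SOLR_SELECT + "?" + params) as response:
        data = json.load(response)

    # With debugQuery on, the response carries an "explain" section that breaks
    # each document's score down into the term weights and boosts that produced it.
    for doc_id, explanation in data["debug"]["explain"].items():
        print(doc_id)
        print(explanation)

There is no comparable way to ask Google why a particular page came first; outsiders can only guess and reverse-engineer.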

"Trust" should not be to simply have faith in Google's benevolence and wisdom. Rather, openness may represent the key to genuine trust (a traditional goal of librarianship anyway, and is what underlies our code of ethics). Still, it seems to me that this also implies some kind of control that Google and so on do not, and cannot, have.

Once again, I find myself returning to Ross Atkinson's "Controlled Zone". I am not finding fault with Google (in fact, Google is so beloved by many students that when I discuss these matters in my information literacy workshops, the response is real anger); I am simply pointing out that Google is only a tool, like a hammer or a drill, and like any tool it has strengths and weaknesses. Since Google is such a major tool, and it changes constantly, it is important to keep reconsidering its continually changing strengths along with its continually changing weaknesses.
