Google Penguin and SEO

Posting to NGC4LIB

The blogosphere has been discussing the latest updates to the Google Search algorithm, called “Google Panda 3.5” and “Google Penguin,” announced on April 24 of this year. Google Penguin has proven especially controversial. In essence, it is a step against some of the methods used in SEO (search engine optimization) that Google has deemed negative, or to use its own term, “black hat webspam.” What does this mean? The official announcement (above) discusses this in some detail, including terms like “keyword stuffing” and “link schemes,” while Google cites its own “quality guidelines.” Google punishes the websites it deems guilty by pushing them farther down the list of results, and this can have devastating consequences for those involved.

Google Penguin may already have had human costs. Here is a post from one SEO practitioner who claims that he will be impoverished, and another claiming that Penguin has led to unemployment in parts of the developing world, since so much SEO work takes place there. These reports have not been verified, but they seem like plausible consequences. A significant drop in the web results definitely has negative consequences for the businesses affected, along with their employees.

This shows how much power Google has. I wrote about this in another post, “Google and Link Spam” about a little store that sells flour in Vermont. Also, since everything on the web has to do with money, I personally suspect that these updates have some relation to the looming Facebook IPO that so many are talking about.

In Google’s terms, “quality” in a website is quite a different concept from what a normal person would consider quality, and it is certainly radically different from the idea of “quality” in a library’s collection, its catalog, or its public services. Google provides basic guidelines for its idea of quality:

  • Make pages primarily for users, not for search engines. Don’t deceive your users or present different content to search engines than you display to users, which is commonly referred to as “cloaking.”
  • Avoid tricks intended to improve search engine rankings. A good rule of thumb is whether you’d feel comfortable explaining what you’ve done to a website that competes with you. Another useful test is to ask, “Does this help my users? Would I do this if search engines didn’t exist?”
  • Don’t participate in link schemes designed to increase your site’s ranking or PageRank. In particular, avoid links to web spammers or “bad neighborhoods” on the web, as your own ranking may be affected adversely by those links.
  • Don’t use unauthorized computer programs to submit pages, check rankings, etc. Such programs consume computing resources and violate our Terms of Service. Google does not recommend the use of products such as WebPosition Gold™ that send automatic or programmatic queries to Google.

Of course, none of this has anything to do with the general notion of “quality,” that is, the actual quality of the information contained in a webpage: whether the information is factual, obsolete, or biased; whether it is based on sound reasoning or superstition. Google’s “quality” is related to a type of authenticity, although of a strange type. It exists primarily to protect people from wasting too much of their time on pure advertising, or so I think.

Still, many have been left confused. For instance, the guideline “Avoid tricks intended to improve search engine rankings” seems to rule out SEO altogether. What one person would consider a trick, another would consider a flash of brilliant insight. SEO is vital, but like everything, it can be abused. Without SEO, the biggest sites could only continue to rise while the lesser ones would be fated to disappear into the morass.

One case where Google Penguin will demote your site in the search results is when too many links to your site are created too quickly. This runs counter to the concept of “natural links,” which are supposed to build up gradually over time. “Natural links”? This all seems strange to me. What if you come up with the “killer information” that everyone wants, or something you have created suddenly becomes the newest internet meme? Why should you be punished? How long will the punishment last?
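To make the “link velocity” idea concrete, here is a toy sketch in Python. The function name and the threshold are entirely hypothetical, invented for illustration; Google’s real signals are secret and certainly far more sophisticated. The sketch only shows the shape of the logic: a sudden burst of inbound links looks different from links that accumulate gradually.

```python
# Toy illustration only -- NOT Google's actual algorithm.
# Flags a site whose inbound links appear much faster than an
# assumed "natural" baseline rate.
from datetime import date, timedelta


def looks_unnatural(link_dates, max_links_per_day=5):
    """Count new inbound links per day; flag the site if any single
    day exceeds the (made-up) 'natural' threshold."""
    counts = {}
    for d in link_dates:
        counts[d] = counts.get(d, 0) + 1
    return any(n > max_links_per_day for n in counts.values())


# 20 links landing on one day trips this toy rule...
burst = [date(2012, 4, 25)] * 20
# ...while 20 links spread over 20 days does not.
gradual = [date(2012, 1, 1) + timedelta(days=i) for i in range(20)]

print(looks_unnatural(burst))    # True
print(looks_unnatural(gradual))  # False
```

Note that this toy rule cannot tell a link scheme apart from a genuine viral hit, which is exactly the worry raised above: the same burst pattern could be either.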

Also, it turns out that Google Penguin may actually make it easier for competing websites to harm one another. How can this happen? By employing the “black hat webspam” techniques mentioned above, but aiming them at a competitor’s site instead of one’s own. Clever, but fairly obvious once you get into that kind of mindset.

Another interesting demotion involves what are called “exact match domains,” which are domain names based on popular keywords. For instance, if you search “credit cards,” the first hit is not Mastercard or a similar company. Everyone is saying that these domains have been demoted, but I haven’t seen it yet: “Britney Spears,” “Barack Obama,” and “credit cards” all come up on top. A search for “ebooks” has come up second to Project Gutenberg.

Much of the impact is still being researched, but it must be understood that “relevance,” when speaking of search engines, is quite different from what the same term means in everyday speech.

While I still believe that SEO will eventually become an important issue for catalogers, the example of Google Penguin shows its dangers: what could be found easily yesterday may be much more difficult to find today. The patrons of libraries (not to mention the librarians themselves) would find this outrageously complicated, if not bordering on the insane.

The traditional task of the library catalog, to provide “reliable results,” remains just as crucial as ever, in my opinion. If SEO is to be worked into the library’s tools in some way, it must allow for these additional needs. Reward and punishment should not be part of the library’s tools.