Google has, despite residual competition from Yahoo and Microsoft, become the world’s default search engine. Indeed, such is Google’s omnipresence that the word itself has entered contemporary language (“to Google”, “to have Googled” and so on). Having, or not having, a high page ranking in Google can mean the difference between success and failure for many businesses.
But how exactly does Google Google?
Bust the myths first
One of the first things to get out in the open before any discussion of how Google Googles: no one outside the company really knows for sure! (Well, apart from the people who actually work at Google, that is.)
This is worth bearing in mind, especially given the huge industry that has grown up around search engine optimization (SEO) and search marketing in general (optimizing web pages and marketing strategies to gain high rankings in search engines). Whilst the basics are well known because they have been proven to work in the real world, Google has kept relatively quiet about the actual specifics of how it ranks web pages.
Why? Quite simply, to stop unscrupulous web designers and the like from abusing the system. In the good old days (sepia-tinted images come to mind) when the web was young, search engines used meta tags to ascertain what a website was about, and used those tags to answer search queries. However, meta tags were so widely abused by certain websites that they became almost useless as a signal of what a webpage was actually about. Hence Google (and other search engine providers) do not generally publicize how they rank pages, relying on fiendishly complex algorithms instead…
The simple mechanics
Anyway, I digress, back to the question in hand.
The mechanics of how Google Googles (try saying that quickly) are relatively simple: it is a three-stage process.
Stage 1 – the Googlebot!
The Googlebot is Google’s web spider. A web spider is an automated program that trawls the internet, visiting the pages it can find and harvesting data from them. The Googlebot is the foundation of how Google works.
On visiting a site, the Googlebot (like other web spiders) downloads information about that site to Google’s servers: the meta data, the content (abridged), keywords, and where that content and those keywords appear on the page. The Googlebot will also “deep-crawl” a website, harvesting links to be “crawled” in the future.
The Googlebot will usually “crawl” a webpage about once a month; for more popular websites, and for sites whose content changes regularly (such as news media), it will visit more often.
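The crawl-and-harvest idea above can be sketched in a few lines of Python. This is a toy illustration of what any web spider does – pull the links and visible text out of a fetched page – not Google’s actual implementation; the sample page is invented for the example, and a real spider would of course fetch pages over the network.

```python
# Toy web-spider step: harvest links (for future crawls) and visible
# text (for the indexer) from one already-fetched HTML page.
from html.parser import HTMLParser

class LinkHarvester(HTMLParser):
    """Collects href targets and visible text from one page."""
    def __init__(self):
        super().__init__()
        self.links, self.text = [], []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        if data.strip():
            self.text.append(data.strip())

# Simulate one fetched page (no network needed for the sketch)
page_html = '<html><body><h1>Hello</h1><a href="/about">About us</a></body></html>'
harvester = LinkHarvester()
harvester.feed(page_html)
print(harvester.links)   # → ['/about']  (queued for a future "deep crawl")
print(harvester.text)    # → ['Hello', 'About us']  (sent on to the indexer)
```

A real crawler would then push each harvested link onto a queue and repeat, which is exactly the “deep-crawl” behaviour described above.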
Stage 2 – the Indexer!
Once the Googlebot has crawled a webpage, the results of that crawl are downloaded to Google’s servers and indexed (using the indexer). The content of a website is indexed alphabetically by search term (the indexer generally ignores common words such as “is”, “on”, “in”, “or” and “of”), together with the location in the document where each term appears (this is why you need to give your search terms prominence – but that’s a different discussion).
The output of the indexer is a database recording which words appear on every website crawled by the Googlebot, and where on each page they appear. This index is what Google’s query processor uses to generate search results.
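The structure the indexer builds is what’s usually called an inverted index, and a miniature version is easy to sketch. The stop-word list and sample pages below are invented for illustration; Google’s real index is vastly larger and more sophisticated.

```python
# Miniature indexer: map each word to the documents (and word
# positions) where it appears, skipping common "stop words".
STOP_WORDS = {"is", "on", "in", "or", "of", "the", "a", "and"}

def build_index(pages):
    index = {}  # word -> {url: [positions]}
    for url, text in pages.items():
        for pos, word in enumerate(text.lower().split()):
            if word in STOP_WORDS:
                continue
            index.setdefault(word, {}).setdefault(url, []).append(pos)
    return index

pages = {
    "example.com/tea":    "the history of tea in england",
    "example.com/coffee": "coffee is popular in england",
}
index = build_index(pages)
print(index["england"])  # → {'example.com/tea': [5], 'example.com/coffee': [4]}
```

Note that the positions are kept, not just the counts – that’s what lets the next stage weigh up the prominence and proximity of terms on a page.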
Stage 3 – Google query processor
The final stage is Google’s query processor. This takes a user’s query (perhaps a single word, perhaps a more complex search phrase), looks it up in the index and returns results to the user, ranked by factors such as the prominence of those search terms on a particular page, their proximity to each other, etc. Sounds simple?
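Given an inverted index like the one the previous stage produces, the lookup step can be sketched like this. The “earlier on the page scores higher” rule is a deliberately crude stand-in for prominence, invented for the example – it is not Google’s actual scoring.

```python
# Toy query processor: look each query term up in an inverted index
# (word -> {url: [positions]}) and rank pages by a crude "prominence"
# score: the earlier a term first appears, the more it counts.
def search(index, query):
    scores = {}
    for term in query.lower().split():
        for url, positions in index.get(term, {}).items():
            scores[url] = scores.get(url, 0) + 1.0 / (1 + min(positions))
    return sorted(scores, key=scores.get, reverse=True)

index = {
    "tea":     {"example.com/tea": [0, 7], "example.com/coffee": [12]},
    "england": {"example.com/tea": [5],    "example.com/coffee": [4]},
}
print(search(index, "tea england"))
# → ['example.com/tea', 'example.com/coffee']
```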
However, considering that there are approximately 25 billion web pages in existence (no one really knows for sure, and Google itself admits it has only trawled a fraction of them – several billion), the above methodology on its own would be somewhat haphazard and unlikely to give the user relevant results.
The Google difference – relevance
So what makes Google different, and what makes it so successful? In a word – relevance!
Google strives to produce relevant results for every single search query it handles, and its dominance of the search engine market reflects how well it succeeds. But how does Google do this time and time again?
The search algorithm and Page Rank
Google spends a lot of time and effort in refining its search engine in order to deliver relevance to each and every search query it handles.
Firstly, its spidering and indexing of web pages is highly effective – this is the backbone of producing relevant results. Google has produced algorithms and processes which can, with a high degree of accuracy, determine what a particular web page is about by analyzing the text, layout, headings, images and links on a specific web page.
Web designers – well, the good ones at least – therefore design their web pages to be as friendly as possible to Google (and other search engines). This is often called “internal search engine optimization”, and is classified by some agencies as a separate service from designing the website itself.
Internal SEO aims to make a website more easily read and understood by the Googlebot (and equivalent web spiders): making sure certain standards are met, that links and images are properly labelled, and that the text (content) itself is appropriate and written in a way that helps spiders “understand” the page’s subject (choice of keywords, and their proximity and position, etc.). Most search engines will deliver broadly similar results here.
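To make the keyword-position idea concrete, here is one rough on-page signal a spider might compute: how often a keyword appears and how early it first shows up. The weighting below is invented purely for illustration – real engines combine many signals in ways they don’t disclose.

```python
# Hypothetical on-page signal: keyword density weighted by how early
# (prominently) the keyword first appears. Illustrative only.
def keyword_signal(text, keyword):
    words = text.lower().split()
    hits = [i for i, w in enumerate(words) if w == keyword]
    if not hits:
        return 0.0
    frequency = len(hits) / len(words)   # how often the keyword occurs
    prominence = 1.0 / (1 + hits[0])     # earlier first use = stronger
    return frequency * prominence

page = "organic tea our organic tea is picked by hand"
# "organic" is frequent and leads the page; "hand" is a one-off at the end
print(keyword_signal(page, "organic") > keyword_signal(page, "hand"))  # → True
```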
Vive la différence – Page Rank – external
However, what makes Google’s results more relevant for the majority of users is its use of the Page Rank algorithm (named after Larry Page – one of the co-founders of Google). Once a website is Page Ranked, it is assigned a ranking between 0 and 10, with 10 being the best. Higher Page Ranked websites will be listed above lower-ranked ones in Google’s search results (page rankings are logarithmic, meaning there are far more low-ranked pages than high-ranked ones).
Put simply, Page Rank is an algorithm that assesses a page’s relevance for a particular subject (search term) by the number and quality of inbound links (links pointing to that particular web page) that are relevant to that search term. The theory is that good websites with good content will already be linked to by other pages. Quality matters too: links from highly ranked websites count for more than other links (significantly more in certain cases).
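The core of that idea can be shown in a short sketch of the originally published PageRank iteration: each page repeatedly passes a share of its rank along its outbound links. This is the textbook formulation, not what Google runs today, and the four-page link graph is invented for the example.

```python
# Minimal PageRank iteration: a page's rank is a baseline share plus
# a damped sum of the rank passed on by every page linking to it.
def pagerank(links, damping=0.85, iterations=50):
    pages = list(links)
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        new = {}
        for p in pages:
            # rank flowing into p: each linker q splits its rank
            # evenly among its outbound links
            incoming = sum(rank[q] / len(links[q])
                           for q in pages if p in links[q])
            new[p] = (1 - damping) / len(pages) + damping * incoming
        rank = new
    return rank

links = {
    "a": ["b", "c"],   # page "a" links out to "b" and "c"
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],        # "c" collects the most inbound links...
}
ranks = pagerank(links)
print(max(ranks, key=ranks.get))  # → c  (...so it ends up ranked highest)
```

Notice that a link from a high-ranked page carries more weight than one from a low-ranked page, which is exactly the “quality matters too” point above.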
Optimizing a website for links is often called “external search engine optimization”, and can be done in a number of ways, from link exchanges with similar websites, to posting articles on article boards, to blogs (a quick tip for getting your site listed on Google faster is to start a blog on Blogpedia.com linking back to your site).
Oh, and one last thing. The web has always been open to abuse by unscrupulous web designers trying to beat the search engines. First it was meta tags, then keyword stuffing in the text of the document, and now link farms (websites full of links that serve no purpose other than to point at each other and manipulate search engine rankings).
Google and other search engines have highly complex algorithms and processes in place to identify manipulation of links, keywords and meta tags. If a search engine decides a particular website is guilty of such activity, that website could find itself blacklisted.
Not so simple after all
Search engine optimization, whether internal or external, is relatively simple in generic terms but can get complicated – hence the SEO industry. So whilst this blog entry is simplistic, don’t be fooled into thinking that getting a high-ranking website is easy!
Marketing director at Hot Lemon – http://www.hot-lemon.com
Network Marketing Professional