Along came a Spider

Arachnophobic or not, search engine spiders, crawlers or bots are exactly what you want crawling all over your web site, as many and as often as possible. Without them the Internet would seem a very empty place.

To state the obvious, the term spider is derived from the concept of the World Wide Web, from which the abbreviation www is also derived. The World Wide Web itself is the collection of interlinked pages hosted on servers connected to the Internet; the Internet also carries email and a number of other protocols.

The analogy of a spider moving across a spider web pretty much describes the manner in which a search engine spider navigates its way across the Internet, often referred to as spidering or crawling the web.

A search engine spider is actually a specialized computer program running on a server connected to the Internet. For the most part, this program is designed to follow the links it finds on web pages. As it moves from page to page it analyzes the content of each page, and the search engine to which the spider belongs uses the results in determining the relevance of the page to the search criteria.
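The link-following step described above can be sketched in a few lines of Python. This is only an illustration using the standard library, not how any particular search engine does it; the names LinkExtractor and extract_links, the page content and the example.com addresses are all made up for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the absolute URL of every <a href="..."> on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page's own address.
                self.links.append(urljoin(self.base_url, value))


def extract_links(base_url, html):
    """Return the links found in one page, in the order they appear."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links


# Hypothetical page content, for illustration only.
page = (
    '<p>See <a href="/about.html">about us</a> and '
    '<a href="http://example.org/">a friend</a>.</p>'
)
links = extract_links("http://example.com/index.html", page)
print(links)
```

A real spider would download each page over the network and would also record the page text for analysis; here the page is a plain string so the sketch runs without an Internet connection.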


With billions of web pages on the Internet, even if each page were visited only once, this process would occupy a single search spider's time from here to infinity and beyond, because the Internet is forever expanding as new pages are added continuously.

For this reason search engines obviously have more than one spider running at any given time. In effect, each time a spider visits a page with multiple links on it, new instances of the spider are spawned. Each of these spiders then follows a link, and they in turn spawn their own new instances, and so on. With the ability to run multiple threads the spiders manage to get the job done. They manage so well, in fact, that they usually revisit pages on a regular basis to check whether new links have been added and to report any changes in content back to the search engine.
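To give a feel for how several spiders can share the work, here is a rough Python sketch that uses a pool of threads. It is a simplification under stated assumptions: TOY_WEB is a made-up, in-memory stand-in for the web, fetch_links is a hypothetical placeholder for downloading a page, and real search engines are certainly organized differently.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "web": each page maps to the links found on it.
TOY_WEB = {
    "a.html": ["b.html", "c.html"],
    "b.html": ["c.html", "d.html"],
    "c.html": ["a.html"],
    "d.html": [],
}


def fetch_links(url):
    """Stand-in for downloading a page and extracting its links."""
    return TOY_WEB.get(url, [])


def crawl(seed, max_workers=4):
    """Visit every page reachable from seed, one wave of links at a time."""
    visited = {seed}
    frontier = [seed]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while frontier:
            # Every page in the current wave is fetched by a worker thread.
            results = pool.map(fetch_links, frontier)
            next_frontier = []
            for links in results:
                for link in links:
                    # The visited set stops spiders re-crawling the same page.
                    if link not in visited:
                        visited.add(link)
                        next_frontier.append(link)
            frontier = next_frontier
    return visited


found = crawl("a.html")
print(sorted(found))
```

The visited set is the important design choice: without it, the link from c.html back to a.html would send the spiders around in circles forever.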

Without search spiders crawling the web, search engines would only be able to list sites that had been explicitly submitted to them. Vast tracts of the Internet would be unmapped, and the Internet would indeed seem a much emptier and less interesting place.

If you have not submitted your site to the search engine, it will not explicitly send its spider to your web site.

If you have no incoming links from other web pages that the spider visits, the spider will not be able to find your site.

If your site has no outgoing links, the spider will not have anywhere to go and the thread will end at your site. Although the search engine will know the location of your site, it will have less information to use in establishing the relationship between your site and other sites. Consequently, the search engine's overall understanding of your site will be less clear.
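The three situations above can be tried out on a small, invented set of pages. In this Python sketch the page names and the LINKS table are hypothetical, and the starting list passed to discoverable stands in for the sites that have been submitted to the search engine.

```python
from collections import deque

# Hypothetical link table: page -> pages it links to.
LINKS = {
    "portal.html": ["shop.html", "blog.html"],
    "shop.html": ["portal.html"],
    "blog.html": [],                 # dead end: no outgoing links
    "orphan.html": ["portal.html"],  # links out, but nothing links in
}


def discoverable(seeds):
    """Return every page a spider can reach starting from the seed pages."""
    seen = set(seeds)
    queue = deque(seeds)
    while queue:
        page = queue.popleft()
        for target in LINKS.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


found = discoverable(["portal.html"])
missed = set(LINKS) - found                             # never found
dead_ends = {page for page in found if not LINKS[page]}  # crawl stops here

print("found:", sorted(found))
print("missed:", sorted(missed))
print("dead ends:", sorted(dead_ends))
```

Starting from portal.html alone, orphan.html is never found because no visited page links to it. Adding it to the starting list, which plays the role of submitting the site, makes it discoverable.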

Search engines such as Google use the quantity and quality of a web site's incoming and outgoing links as part of the algorithm used in determining page relevance.
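The quantity half of that idea is easy to picture with a short Python sketch that simply counts links in and out of each page. This is not Google's algorithm or anyone else's; judging link quality is a far harder problem and is left out. The LINKS table and the link_counts function are invented for the example.

```python
from collections import Counter

# Hypothetical link table: page -> pages it links to.
LINKS = {
    "a.html": ["b.html", "c.html"],
    "b.html": ["c.html"],
    "c.html": ["a.html"],
    "d.html": ["c.html"],
}


def link_counts(links):
    """Return {page: (incoming, outgoing)} for every page in the table."""
    outgoing = {page: len(targets) for page, targets in links.items()}
    incoming = Counter()
    for targets in links.values():
        incoming.update(targets)
    # Counter gives 0 for pages that nothing links to.
    return {page: (incoming[page], outgoing[page]) for page in links}


counts = link_counts(LINKS)
for page, (links_in, links_out) in sorted(counts.items()):
    print(page, "incoming:", links_in, "outgoing:", links_out)
```

In this toy table c.html has the most incoming links and d.html has none, so on quantity alone c.html would look like the best-connected page.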

