Understanding Search Engines

How Search Engines Work

Search engines use software robots to survey the Web and build their databases: the robots retrieve Web documents, and the search engine indexes them. When you enter a query at a search engine website, your input is checked against the search engine's keyword indices, and the best matches are returned to you as hits.
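The retrieve, index and query cycle described above can be illustrated with a small inverted index. This is only a minimal sketch, not how any particular search engine is built: the page names, the sample text and the functions build_index and search are all made up for illustration, and "best match" is assumed here to mean "contains the most query words."

```python
from collections import defaultdict

# Hypothetical mini-corpus standing in for pages a robot has retrieved.
PAGES = {
    "page1": "search engines index web documents",
    "page2": "robots survey the web",
    "page3": "keyword search engines return hits",
}

def build_index(pages):
    """Map each word to the set of page ids that contain it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in text.lower().split():
            index[word].add(page_id)
    return index

def search(index, query):
    """Rank pages by how many distinct query words they contain (an assumed rule)."""
    scores = defaultdict(int)
    for word in set(query.lower().split()):
        for page_id in index.get(word, ()):
            scores[page_id] += 1
    # Highest score first; ties are broken by page id so the order is stable.
    return sorted(scores, key=lambda p: (-scores[p], p))

INDEX = build_index(PAGES)
print(search(INDEX, "search engines web"))
```

The query is never compared against the pages themselves, only against the index built ahead of time, which is why a search engine can answer quickly even when its database is large.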

There are two primary methods of text searching: keyword and concept.

Keyword Searching

Keyword searching is the most common form of text search on the Web. Most search engines perform their text queries and retrieval using keywords.

Unless the author of a Web document specifies the keywords for her document (this is possible by using meta tags in the latest version of HTML), it's up to the search engine to determine them. Essentially, this means that search engines pull out and index the words they judge to be significant. Words that appear toward the top of a document and words that are repeated several times throughout the document are more likely to be deemed important.
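One way to picture the two signals mentioned above, position near the top and repetition, is a simple scoring function. This is a rough sketch under invented assumptions: the function score_keywords, the early_words cutoff and the early_bonus weight are all hypothetical, and real search engines do not publish their exact rules.

```python
from collections import Counter

def score_keywords(text, top_n=3, early_words=10, early_bonus=2.0):
    """Rank words by how often they occur, plus a bonus for appearing early.

    early_words and early_bonus are illustrative values, not known settings.
    """
    words = [w.strip('.,;:!?"()').lower() for w in text.split()]
    words = [w for w in words if w]
    counts = Counter(words)
    scores = {}
    for word, count in counts.items():
        first_pos = words.index(word)  # position of the first occurrence
        bonus = early_bonus if first_pos < early_words else 0.0
        scores[word] = count + bonus
    # Highest score first; ties are broken alphabetically.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:top_n]

sample = "Surfing guide: surfing spots, surfing gear, wave reports, wave safety"
print(score_keywords(sample, early_words=2))
```

In this sample, "surfing" scores highest because it is both repeated and first, "guide" is lifted by its early position alone, and "wave" earns its place through repetition alone.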

Some search engines index every word on every page. Others index only part of the document. For example, Lycos indexes the title, headings, subheadings and the hyperlinks to other sites, along with the first 20 lines of text.

Full-text indexing systems generally pick up every word in the text except commonly occurring stop words such as "a," "an," "the," "is," "and," "or," and "www." AltaVista claims to index all words, even the articles "a," "an," and "the." Some search engines distinguish upper case from lower case; others store all words without regard to capitalization.
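The effect of stop words and capitalization on what gets indexed can be seen in a short sketch. The stop list below is just the set of examples quoted above, and the function index_terms with its two switches is hypothetical, meant only to contrast the two policies.

```python
# Stop words taken from the examples in the text above; real lists are longer.
STOP_WORDS = {"a", "an", "the", "is", "and", "or", "www"}

def index_terms(text, drop_stop_words=True, case_sensitive=False):
    """Return the words from text that would be stored in the index."""
    terms = []
    for raw in text.split():
        word = raw.strip('.,;:!?"()')  # trim surrounding punctuation
        if not word:
            continue
        if not case_sensitive:
            word = word.lower()  # store without reference to capitalization
        if drop_stop_words and word.lower() in STOP_WORDS:
            continue  # skip commonly occurring words
        terms.append(word)
    return terms

sentence = "The Web is a web of documents."
print(index_terms(sentence))
print(index_terms(sentence, drop_stop_words=False, case_sensitive=True))
```

With the defaults, "Web" and "web" collapse into one term and the stop words disappear. With both switches reversed, every word is kept exactly as written, which is closer to the index-everything approach attributed to AltaVista above.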