
Research on the Free Web Tutorial: B. What do Search Engines actually search?

The hidden mysteries of search engines

Search engines do NOT, in fact, “search the web” each time you submit a search. Instead, each search engine searches its own database of words drawn from the computer code behind individual web pages. Search companies like Google send "spider" programs out to "crawl" through the internet looking for new or changed web files. The spiders collect the words and bring them back to add to that company's database.
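To make the idea concrete, here is a minimal sketch in Python of a "spider" that builds a word database, and a search that looks only at that database. Everything in it is made up for illustration: the tiny `WEB` dictionary stands in for real web pages, and the `crawl` and `search` functions are hypothetical names, not how any actual search company does it.

```python
import re
from collections import defaultdict

# Hypothetical miniature "web": address -> page text (stand-in for real pages).
WEB = {
    "a.example/home": "Library hours and research help",
    "b.example/news": "Research news from the library",
}

def crawl(web):
    """Visit every page once and record which pages each word appears on."""
    index = defaultdict(set)
    for url, text in web.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
    return index

def search(index, query):
    """Return pages containing every query word, using ONLY the stored index."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

index = crawl(WEB)

# A page published after the crawl is invisible until the spider returns.
WEB["c.example/new"] = "Brand new research page"
print(sorted(search(index, "research")))
# prints ['a.example/home', 'b.example/news']
```

The new page never shows up in the results because the search consults the database built during the last crawl, not the live pages.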

 

That means that Search Engines:

  • Do NOT all have the same data (each has only what its own spiders find).
  • Do not include everything on the web.
  • Are always just a bit behind the times.
  • Will sometimes return “dead” (outdated) links.
  • Are susceptible to manipulation depending upon what is in the page code. Hint here: this is HUGELY IMPORTANT! (Do a search for Google Bombs to see how this plays out in real situations.)

So after the search engine finds your words, how does it list the millions of results?  Why is the first one first? How did it choose the first page of ten? How does a computer figure out "relevancy?" Well, when the words from the pages are collected into the search company's database, other information about them is also collected, such as:

  • Location: where the word appears on the page (in Web address/Title/Body/Link on page);
  • Viewable: whether it appears to the user or is hidden behind the scenes in the code;
  • Rate: how many times the word is used;
  • Un/attached: whether it appears singly or more commonly with other words (United and States are actually two individual words that are commonly used together as a phrase);
  • Categories: whether there are useful categories of information such as images, phone numbers, addresses, definitions, UPS tracking codes, etc.;
  • Result Hits: how many times that page gets chosen from that search engine’s results list;
  • Link backs: how many times that page is linked to from other pages on the web (note that this is easily manipulated).
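A few of the items in the list above can be sketched in Python. This is only an illustration under simple assumptions: a page is reduced to a title and a body, and the names `WordRecord`, `collect`, and `count_link_backs` are invented for this example rather than taken from any real search engine.

```python
import re
from collections import Counter
from dataclasses import dataclass, field

@dataclass
class WordRecord:
    """What the database remembers about one word on one page."""
    locations: set = field(default_factory=set)  # Location: "title" and/or "body"
    count: int = 0                               # Rate: how many times it is used

def collect(title, body):
    """Record where each word appears on a page and how often."""
    records = {}
    for location, text in (("title", title), ("body", body)):
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            record = records.setdefault(word, WordRecord())
            record.locations.add(location)
            record.count += 1
    return records

def count_link_backs(links):
    """links maps each page to the pages it links to; returns inbound link counts.

    Repeated links from the same page are counted once.
    """
    return Counter(target for targets in links.values() for target in set(targets))

records = collect("Library Hours", "The library is open. Library staff can help.")
print(records["library"].count, sorted(records["library"].locations))
# prints 3 ['body', 'title']

print(count_link_backs({"a": ["c", "c"], "b": ["c", "a"]})["c"])
# prints 2
```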

 
Anything that can be counted or given a value is fed into a formula that determines the order of the results. (Note that not everything that is potentially useful can be counted by a search engine. Many people think search engines include how many times a site is visited in their formula, but the search engines don't have that information.)
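As a rough sketch of what "fed into a formula" could mean, the Python below adds up a few counted values, each multiplied by a weight, and sorts pages by the total. The weights and signal names are invented for this example; real formulas are secret and far more complicated.

```python
# Made-up weights: how much each counted value matters in this toy formula.
WEIGHTS = {"in_title": 5.0, "count": 1.0, "link_backs": 0.5}

def score(signals):
    """Weighted sum of whatever was counted for one page."""
    return sum(WEIGHTS[name] * signals.get(name, 0) for name in WEIGHTS)

def rank(pages):
    """Order page addresses from highest score to lowest."""
    return sorted(pages, key=lambda url: score(pages[url]), reverse=True)

# Hypothetical pages with hypothetical counts.
pages = {
    "a.example": {"in_title": 1, "count": 2, "link_backs": 4},   # 5 + 2 + 2   = 9
    "b.example": {"in_title": 0, "count": 6, "link_backs": 1},   # 0 + 6 + 0.5 = 6.5
    "c.example": {"in_title": 0, "count": 3, "link_backs": 20},  # 0 + 3 + 10  = 13
}
print(rank(pages))
# prints ['c.example', 'a.example', 'b.example']
```

Notice that c.example comes first mostly because of its link backs. In this toy formula, anyone who can pile up links to a page can push it up the list, which is the kind of manipulation mentioned earlier.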

The exact formula is proprietary to each database – Google does this differently than Bing does, and they ain’t tellin’. Why do you need to think about this? Because things like paid advertisements can influence the rankings. (Advertising will come up several times in this tutorial - it's important.) Makes it kind of difficult to know just what you’re getting, which is why you have to ultimately judge for yourself what to click on.

 
