Tuesday, September 30, 2008

Search Engines

Search Engines on the internet. (Based on a text by Hartmut Winkler: Search Engines)

It is often hard to find things on the internet. We type words into a search engine and hope that the right results are presented, but this is often not the case. In addition, once you have found a useful site, it is hard to find it again unless you remember exactly which words you used or what the site is called. This has to do with the types of search engines in use. It takes minimal training to find specific websites, but it takes years of experience to use a search engine really well. As a student, this is one of my biggest problems.

The first type of search engine is based on classification of words. Yahoo, for example, has many employees who spend their time classifying words that are used on the internet, so that words with the same classification are grouped together. New websites are put into several groups, and they pop up when the right words are typed into the search engine. For example, pollution falls under Society and Culture / Environment and Nature / Pollution. Yahoo has about 320 million websites categorized.
The problem with this kind of search engine is that people sometimes look for sites that fall outside these categories, or think of a topic more broadly than Yahoo, for example, classifies it. So with this kind of search engine you have to search by category, not by word.
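
To make the category idea concrete, here is a minimal sketch of a directory lookup. The category tree, the example.org addresses and the function names are all made up for illustration; this is not how Yahoo actually stores its directory.

```python
# Hypothetical, hand-made directory: category path -> sites filed there.
DIRECTORY = {
    ("Society and Culture", "Environment and Nature", "Pollution"): [
        "example.org/air-quality",
        "example.org/river-cleanup",
    ],
    ("Society and Culture", "Environment and Nature", "Conservation"): [
        "example.org/wetlands",
    ],
}


def browse(*path):
    """Return the sites filed under exactly this category path."""
    return DIRECTORY.get(tuple(path), [])


def find_categories(word):
    """Return the category paths whose last label equals the word."""
    word = word.lower()
    return [path for path in DIRECTORY if path[-1].lower() == word]


print(browse("Society and Culture", "Environment and Nature", "Pollution"))
print(find_categories("pollution"))  # one matching category
print(find_categories("smog"))       # [] because no category has this label
```

A word like "smog" finds nothing here, even though the Pollution category probably contains relevant sites. That is the sense in which you search by category and not by word.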

The second type of search engine is based on words. Search engines like AltaVista read documents and websites and remember every word that comes up, so if people search for certain words, then websites that include those words will come up. For example, if you type your name into such a search engine, then sites that include your name come up. AltaVista has about 125 million texts or websites. I believe Google is also this type of search engine.
The problem with this kind of search engine is that if you enter slightly different words the next time, or change the order of the words, then totally different results will come up. That makes it harder to find things again. In addition, search engines like AltaVista do not recognize synonyms or words in the same category, so you have to formulate your query very carefully; otherwise you will get useless results.
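
A word-based search engine can be sketched as a small word index: for every word, remember which pages contain it. The pages and function names below are invented for illustration, and this is only a rough sketch, not AltaVista's real method. It treats a query as an unordered set of words, so it shows the synonym problem but not the effect of word order.

```python
import re
from collections import defaultdict

# Hypothetical mini "web" of three pages.
DOCS = {
    "page1": "Air pollution in large cities",
    "page2": "Smog levels in large cities",
    "page3": "River pollution and fishing",
}


def tokenize(text):
    """Split text into lower-case words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map every word to the set of pages that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the pages that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


INDEX = build_index(DOCS)
print(sorted(search(INDEX, "pollution cities")))  # ['page1']
print(sorted(search(INDEX, "smog cities")))       # ['page2']
```

Both queries are about the same topic, but because the index only knows literal words, "pollution" and "smog" lead to completely different pages.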

The third type of search engine is a combination of these two. Excite, for example, reads every word in texts and websites, but it also clusters: it groups together those websites and texts that share many of the same keywords.
The problem with this kind of search engine at the moment is that it is less known and less used. Excite, for example, has only 50 million websites to search, which means that it can suggest only certain kinds of websites. It is less broad and will therefore not always return the websites you are looking for.
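
Clustering by shared keywords can be illustrated with a very simple grouping rule: put a page into an existing group if it shares enough keywords with a page already in that group. The pages, the keyword sets and the threshold of two shared keywords are assumptions for this sketch; Excite's actual clustering is certainly more sophisticated.

```python
# Hypothetical pages, each reduced to a set of keywords.
PAGES = {
    "page1": {"pollution", "air", "cities", "smog"},
    "page2": {"smog", "cities", "air", "health"},
    "page3": {"recipes", "pasta", "sauce"},
    "page4": {"pasta", "sauce", "tomato"},
}


def cluster(pages, min_shared=2):
    """Greedily group pages that share at least min_shared keywords."""
    clusters = []  # each cluster is a list of page ids
    for page_id, keywords in pages.items():
        for group in clusters:
            # Join the first group containing a sufficiently similar page.
            if any(len(keywords & pages[other]) >= min_shared for other in group):
                group.append(page_id)
                break
        else:
            # No similar group found: start a new one.
            clusters.append([page_id])
    return clusters


print(cluster(PAGES))  # [['page1', 'page2'], ['page3', 'page4']]
```

Here page1 and page2 end up together even though only one of them uses the word "pollution", which is the advantage of clustering over a pure word search.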

Some old tips (most already known):
-Use AND to find websites with all the words.
-Use OR to find websites with any of the words.
-Use AND NOT to find websites with certain words excluded.
-Use “” to find websites with the words next to each other, as an exact phrase.
-Use lower case letters to match both lower case and capitalized words.
-Use capital letters to match only words written with those capitals, not words in lower case.
-Use title: to find websites with the words in the title area (certain areas of a website are called the title area; if you put title: in front of a word, the search engine looks only at websites where that word is in the title area, not in the whole text).
-The same works for Host:, Domain:, URL: or Link:.
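
The operators in the tips above behave like operations on sets of pages. The following sketch shows the idea for AND, OR, exclusion (AND NOT), quoted phrases and title: on three invented pages. The page data and helper names are hypothetical, and the sketch ignores upper and lower case, so it does not cover the two tips about capitals.

```python
import re

# Hypothetical pages with a title area and a body text.
PAGES = {
    "page1": {"title": "Air Pollution", "body": "smog and air pollution in large cities"},
    "page2": {"title": "City Guide", "body": "large cities and their parks"},
    "page3": {"title": "River Report", "body": "water pollution in rivers"},
}


def words(text):
    """Split text into lower-case words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def pages_with(word, field="body"):
    """Pages whose chosen field contains the word."""
    word = word.lower()
    return {pid for pid, page in PAGES.items() if word in words(page[field])}


def pages_with_phrase(phrase, field="body"):
    """Pages whose chosen field contains the words next to each other."""
    target = words(phrase)
    n = len(target)
    if n == 0:
        return set()
    hits = set()
    for pid, page in PAGES.items():
        tokens = words(page[field])
        if any(tokens[i:i + n] == target for i in range(len(tokens) - n + 1)):
            hits.add(pid)
    return hits


both = pages_with("pollution") & pages_with("cities")     # pollution AND cities
either = pages_with("pollution") | pages_with("cities")   # pollution OR cities
without = pages_with("pollution") - pages_with("cities")  # pollution AND NOT cities
phrase = pages_with_phrase("large cities")                # "large cities"
in_title = pages_with("pollution", field="title")         # title:pollution

print(sorted(both), sorted(either), sorted(without))
print(sorted(phrase), sorted(in_title))
```

AND narrows the results, OR widens them, and title: simply runs the same word test against a smaller part of each page.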

By
Merel van Helden
