Thursday, April 8, 2010
Algorithms for search engines
History of search engines
Initially, access to information resources on the Internet was organized through site directories, which usually grouped links by topic. The pioneer in this area was Yahoo, which appeared in April 1994. Over time, the number of sites in the directory grew, and its developers added a search function for the directory. Such a system, of course, could not be called a true search engine, because the search was strictly limited to the resources listed in that directory.
Directories were widely used, but the Internet develops dynamically, and search methods developed along with it. Today it is difficult to find a system based on directories. The explanation is simple: even a directory containing a huge number of resources can provide access to only a small part of the information available on the network. The largest directory on the network to date, the Open Directory Project (DMOZ), includes information on about 5 million resources, which is not much. For comparison, the index of a world-famous search engine such as Google holds about 8 billion documents.
The first full-fledged search engine on the Internet, WebCrawler, appeared only in 1994.
A year later, in 1995, the search engines AltaVista and Lycos were launched. AltaVista remained at the forefront of online information search for many years.
In 1997, Larry Page and Sergey Brin, students at Stanford University, began a research project that produced the Google search engine, which today is the worldwide leader in search.
Also in 1997, on September 23, the creation of the Russian search engine Yandex was officially announced; it is still the market leader in search services in the Russian segment of the network.
Today there are only three international search engines with their own databases and search algorithms: MSN Search, Yahoo and Google. Most other search engines build on the results of another engine. For example, Mail.ru is based on the Yandex search engine and search.aol.com on Google, while Lycos, AltaVista and AllTheWeb use the Yahoo database.
The leading search engine on the Runet today is Yandex; Rambler is in second place, followed by Google, Mail.ru, Aport and KM.ru.
Thursday, April 1, 2010
Keywords and queries
Keywords and queries
Search engines use the words entered by users to determine results, which are then processed by their algorithms, ranked and returned to the user. However, search engines do not just recognize and retrieve exact matches of the requested words; they use knowledge of semantics (the study of meaning in language) to build a more intelligent notion of relevance. For example, for the query "loan providers" a search engine may display results that do not contain this phrase but do contain the word "lenders".
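The "loan providers" example can be illustrated with a toy sketch of query expansion. This is only an illustration under stated assumptions, not how any particular search engine works: the `RELATED_TERMS` table, the function names and the sample pages are all hypothetical, and a real engine would learn such relations from data rather than hard-code them.

```python
# Hypothetical table of related terms (an assumption for illustration only).
RELATED_TERMS = {
    "loan providers": {"lenders"},
}


def expand_query(query):
    """Return the normalized query plus any related terms known for it."""
    normalized = query.lower().strip()
    return {normalized} | RELATED_TERMS.get(normalized, set())


def matching_pages(query, pages):
    """Return pages that contain the query phrase or any related term."""
    terms = expand_query(query)
    return [
        page for page in pages
        if any(term in page.lower() for term in terms)
    ]


# Made-up sample pages.
pages = [
    "Compare mortgage lenders in your area",
    "Top loan providers reviewed",
    "Gardening tips for spring",
]

results = matching_pages("loan providers", pages)
for page in results:
    print(page)
```

Here the first page is returned even though it never contains the phrase "loan providers", because it contains the related word "lenders".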
Search engines collect data on how often words are used and how they occur together across the network. If certain words and phrases are often found together on pages or sites, search engines can build informed theories about how they are related.
This extensive knowledge of language and its usage allows search engines to determine which pages are thematically related, what the subject of a page or site is, how the link structure of the network divides into thematic communities, and much more.
The development of artificial intelligence for language in search engines means that query results are becoming more intelligent and keep evolving. Huge investments in natural language processing will help achieve a better understanding of the meaning and intent behind a user's query. In the long term, users can expect more relevant SERPs, as well as more accurate guesses by search engines about what users expect.
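One simple way to turn such co-occurrence data into a measure of relatedness is pointwise mutual information (PMI). The sketch below is an assumption chosen for illustration: the post does not say which statistic search engines actually use, and the function names and sample pages are made up.

```python
import math
from collections import Counter
from itertools import combinations


def cooccurrence_stats(pages):
    """Count in how many pages each word, and each unordered word pair, appears."""
    word_counts = Counter()
    pair_counts = Counter()
    for text in pages:
        # Each word is counted at most once per page; sorting makes pair keys canonical.
        words = sorted(set(text.lower().split()))
        word_counts.update(words)
        pair_counts.update(combinations(words, 2))
    return word_counts, pair_counts


def pmi(word_a, word_b, word_counts, pair_counts, n_pages):
    """Page-level pointwise mutual information; positive means the words
    appear together more often than chance would suggest."""
    joint = pair_counts[tuple(sorted((word_a, word_b)))]
    if joint == 0:
        return float("-inf")  # the words never share a page
    p_joint = joint / n_pages
    p_a = word_counts[word_a] / n_pages
    p_b = word_counts[word_b] / n_pages
    return math.log2(p_joint / (p_a * p_b))


# Made-up sample pages.
pages = [
    "loan providers offer low rates",
    "lenders offer loan products",
    "compare lenders and loan rates",
    "garden tools and plants",
]

word_counts, pair_counts = cooccurrence_stats(pages)
score_related = pmi("loan", "lenders", word_counts, pair_counts, len(pages))
score_unrelated = pmi("loan", "garden", word_counts, pair_counts, len(pages))
print(score_related, score_unrelated)
```

In this tiny sample, "loan" and "lenders" get a positive score because they share pages more often than their individual frequencies would predict, while "loan" and "garden" never appear together.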