A search engine basically performs the following tasks:
Data Acquisition
A search engine runs automated software, called “crawlers”, “bots” or “spiders”, that constantly roams the Internet looking for new websites. Once a website is found, these programs extract the specified information from each webpage and pass it on to the search engine's database.
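The crawling step described above can be sketched roughly as follows. This is only an illustration, not how any real search engine is implemented: the FAKE_WEB dictionary, the fetch function and the example URLs are all hypothetical stand-ins for real network requests, and "the specified information" is assumed here to be just the visible page text.

```python
from collections import deque
from html.parser import HTMLParser


class PageParser(HTMLParser):
    """Collects the visible text and the outgoing links of one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        # Record the target of every <a href="..."> tag.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        # Keep non-empty text fragments.
        if data.strip():
            self.text_parts.append(data.strip())


# Hypothetical in-memory "web" (URL -> HTML) so the sketch runs offline.
FAKE_WEB = {
    "http://a.example/": '<p>Welcome to A</p><a href="http://b.example/">B</a>',
    "http://b.example/": '<p>Page B</p><a href="http://a.example/">A</a>'
                         '<a href="http://c.example/">C</a>',
    "http://c.example/": "<p>Page C, no links</p>",
}


def fetch(url):
    """Stand-in for an HTTP request; returns None for unknown URLs."""
    return FAKE_WEB.get(url)


def crawl(seed_url, max_pages=100):
    """Breadth-first crawl from seed_url; returns {url: extracted text}."""
    database = {}
    queue = deque([seed_url])
    seen = {seed_url}
    while queue and len(database) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        parser = PageParser()
        parser.feed(html)
        parser.close()
        database[url] = " ".join(parser.text_parts)
        # Newly discovered links are queued so they get visited later.
        for link in parser.links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return database


pages = crawl("http://a.example/")
```

Starting from a single seed page, the crawler discovers the other two pages purely by following links, which is how new websites get found.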
Data Storage and Indexing
The data obtained by the crawlers / spiders is stored in a huge database. The stored data is then categorized and indexed as specified by the search engine administrators.
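One common way to index stored page text is an inverted index, which maps each word to the set of pages that contain it. The article does not say which structure any particular search engine uses, so the following is a minimal sketch under that assumption; the function names and sample pages are made up for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Build an inverted index: word -> set of URLs containing that word."""
    index = defaultdict(set)
    for url, text in pages.items():
        for term in tokenize(text):
            index[term].add(url)
    return index


# Hypothetical stored data: URL -> text captured by the crawler.
sample_pages = {
    "u1": "Search engines index pages",
    "u2": "Pages about cooking",
}
sample_index = build_index(sample_pages)
```

With such an index, finding every page that mentions a word is a single dictionary lookup rather than a scan of the whole database.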
Data Ranking
The indexed pages with similar content are then ranked against one another. The ranking method is likewise specified by the administrators and can depend on several factors, such as keyword relevance and incoming links.
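To make the idea of ranking factors concrete, here is a toy scoring function that combines the two factors named above: keyword relevance (counted as keyword occurrences) and the number of incoming links. The formula, the link_weight parameter and the sample data are assumptions chosen for illustration; real ranking methods are far more elaborate and are not described in this article.

```python
def score(page_text, query_terms, incoming_links, link_weight=0.5):
    """Toy score: keyword occurrences plus a weighted incoming-link count."""
    words = page_text.lower().split()
    keyword_hits = sum(words.count(term.lower()) for term in query_terms)
    return keyword_hits + link_weight * incoming_links


def rank(pages, inlinks, query_terms):
    """Return the URLs ordered from highest to lowest score."""
    return sorted(
        pages,
        key=lambda url: score(pages[url], query_terms, inlinks.get(url, 0)),
        reverse=True,
    )


# Hypothetical data: page text and number of incoming links per page.
sample_pages = {"a": "search engine search", "b": "search tips", "c": "cooking"}
sample_inlinks = {"a": 1, "b": 4, "c": 0}
ordering = rank(sample_pages, sample_inlinks, ["search"])
```

In this sample, page "b" mentions the keyword less often than page "a" but has more incoming links, so it ends up ranked first. Changing link_weight changes the ordering, which is one way different ranking methods produce different results.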
Data Display
Every search engine has a data entry window where the user enters their query. The entered query is compared against the indexed data stored in the database, and the most relevant websites are displayed on the end user's screen in the pre-determined ranking order.
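The query step can be sketched as a lookup against an index followed by a sort on a stored ranking score. The structures below (an index mapping words to sets of URLs, and a rank_score table) are hypothetical, and requiring every query word to appear on a page is just one simple matching rule assumed for this example.

```python
def search(query, index, rank_score, limit=10):
    """Return up to `limit` URLs containing every query word, best-ranked first."""
    terms = query.lower().split()
    if not terms:
        return []
    # Look up the set of pages for each word, then keep pages common to all.
    postings = [index.get(term, set()) for term in terms]
    matches = set.intersection(*postings)
    # Sort by descending rank score; ties are broken by URL for stable output.
    return sorted(matches, key=lambda url: (-rank_score.get(url, 0), url))[:limit]


# Hypothetical index and pre-determined ranking scores.
sample_index = {"search": {"a", "b"}, "engine": {"a"}, "tips": {"b"}}
sample_scores = {"a": 2.5, "b": 3.0}

results = search("search", sample_index, sample_scores)
```

A query for "search" matches both sample pages and lists the higher-scored one first, while a query for "search engine" matches only the page containing both words.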
The specification of how a particular search engine obtains website data, what data it captures and stores, and how it indexes websites, ranks them and relates them to a particular user query is collectively called a search engine algorithm. Every search engine uses a different algorithm, and hence every search engine gives slightly different results for the same query.
Search engines usually tout their algorithms as a competitive advantage: the better the algorithm, the more relevant the search results, which helps the end user find the desired information more easily and quickly.