distributed search

Distributed search is a search engine model in which the tasks of Web crawling, indexing and query processing are distributed among multiple computers and networks.

Originally, most search engines were supported by a single supercomputer. In recent years, however, most have moved to a distributed model. Google Search, for example, relies upon thousands of computers crawling the Web from multiple locations all over the world.

In Google's distributed search system, each computer involved in indexing crawls and reviews a portion of the Web, taking a URL and following every link available from it (minus those marked for exclusion). The computer gathers the crawled results from the URLs and sends that information back to a centralized server in compressed format. The centralized server then consolidates that information in a database, along with information from the other computers involved in indexing.
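The crawl-and-report cycle described above can be sketched in a few lines of Python. This is a minimal illustration rather than Google's actual implementation: the in-memory `TOY_WEB` graph, the `EXCLUDED` set, and the `crawl` and `merge` functions are all hypothetical names invented for the example, and `zlib` with JSON simply stands in for whatever compressed format a real system would use.

```python
import json
import zlib
from collections import deque

# Hypothetical toy "Web": URL -> (page text, outgoing links).
TOY_WEB = {
    "a.example/": ("distributed search basics", ["a.example/1", "a.example/private"]),
    "a.example/1": ("crawling and indexing", ["a.example/"]),
    "a.example/private": ("do not index", []),
    "b.example/": ("query processing", ["b.example/1"]),
    "b.example/1": ("search clusters", []),
}
# Stands in for links marked for exclusion (an assumption for this sketch).
EXCLUDED = {"a.example/private"}


def crawl(seed):
    """Worker: breadth-first crawl from one seed URL, return compressed results."""
    seen, queue, pages = {seed}, deque([seed]), {}
    while queue:
        url = queue.popleft()
        text, links = TOY_WEB[url]
        pages[url] = text
        for link in links:
            # Follow every known link except excluded or already-seen ones.
            if link in TOY_WEB and link not in EXCLUDED and link not in seen:
                seen.add(link)
                queue.append(link)
    return zlib.compress(json.dumps(pages).encode("utf-8"))


def merge(payloads):
    """Central server: decompress each worker's payload into one database."""
    database = {}
    for payload in payloads:
        database.update(json.loads(zlib.decompress(payload).decode("utf-8")))
    return database


# Two workers, each responsible for a different portion of the toy Web.
database = merge(crawl(seed) for seed in ["a.example/", "b.example/"])
print(sorted(database))
```

Each worker only ever touches the pages reachable from its own seed, so the portions can be crawled independently and combined afterwards.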

When a user types a query into the search field, Google's domain name server (DNS) software relays the query to the most suitable cluster of computers, based on factors such as the cluster's proximity to the user or how busy it is. At the recipient cluster, the Web server software distributes the query to hundreds or thousands of computers, which scan the database index simultaneously to find all relevant records. The index server compiles the results, the document server pulls together the titles and summaries, and the page builder creates the search result pages.
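The two stages of query handling described above, choosing a cluster and then fanning the query out across many index machines, might look roughly like the following. Everything here is an assumption made for illustration: the `CLUSTERS` and `SHARDS` data, the scoring formula in `pick_cluster`, and the use of threads in place of separate machines are not taken from Google's system.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical clusters: network distance to the user and current load (0..1).
CLUSTERS = {
    "us-east": {"latency_ms": 20, "load": 0.95},
    "eu-west": {"latency_ms": 90, "load": 0.30},
    "us-west": {"latency_ms": 45, "load": 0.40},
}

# Hypothetical index shards: each maps a term to the IDs of documents containing it.
SHARDS = [
    {"distributed": {1, 2}, "search": {1}},
    {"search": {7, 9}, "index": {7}},
    {"distributed": {12}, "crawler": {13}},
]


def pick_cluster(clusters):
    """Pick the cluster with the lowest latency after penalizing busy ones.

    The scoring formula is made up for this sketch.
    """
    def score(name):
        info = clusters[name]
        return info["latency_ms"] / (1.0 - info["load"])
    return min(clusters, key=score)


def search_shard(shard, terms):
    """Return the IDs in one shard of documents that contain every query term."""
    postings = [shard.get(term, set()) for term in terms]
    return set.intersection(*postings) if postings else set()


def search(query):
    """Scatter the query to all shards in parallel, then gather the hits."""
    terms = query.lower().split()
    with ThreadPoolExecutor(max_workers=len(SHARDS)) as pool:
        partial = pool.map(lambda shard: search_shard(shard, terms), SHARDS)
        hits = set().union(*partial)
    return sorted(hits)


print(pick_cluster(CLUSTERS))
print(search("distributed search"))
```

In this sketch the nearest cluster is passed over because it is almost fully loaded, which mirrors the trade-off between proximity and busyness mentioned in the text. Building titles, summaries, and the result page would be further steps after the gather.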

Some projects, such as Wikia Search (which uses the Grub distributed crawler), are moving towards an even more decentralized search model. As with distributed computing projects such as SETI@home, many distributed search projects are supported by a network of voluntary users whose computers run client software in the background.

This was last updated in April 2008
