Browse Definitions :
Definition

distributed search

Distributed search is a search engine model in which the tasks of Web crawling, indexing and query processing are distributed among multiple computers and networks.

Originally, most search engines were supported by a single supercomputer . In recent years, however, most have moved to a distributed model. Google search, for example, relies upon thousands of computers crawling the Web from multiple locations all over the world.

In Google's distributed search system, each computer involved in indexing crawls and reviews a portion of the Web, taking a URL and following every link available from it (minus those marked for exclusion). The computer gathers the crawled results from the URLs and sends that information back to a centralized server in compressed format. The centralized server then coordinates that information in a database , along with information from other computers involved in indexing.

When a user types a query into the search field, Google's domain name server ( DNS ) software relays the query to the most logical cluster of computers, based on factors such as its proximity to the user or how busy it is. At the recipient cluster, the Web server software distributes the query to hundreds or thousands of computers to search simultaneously. Hundreds of computers scan the database index to find all relevant records. The index server compiles the results, the document server pulls together the titles and summaries and the page builder creates the search result pages.

Some projects, such as Wikia Search (formerly Grub ) are moving towards an even more decentralized search model. Similarly to distributed computing projects such as [email protected] , many distributed search projects are supported by a network of voluntary users whose computers run client software in the background.

This was last updated in April 2008

SearchCompliance

  • information governance

    Information governance is a holistic approach to managing corporate information by implementing processes, roles, controls and ...

  • enterprise document management (EDM)

    Enterprise document management (EDM) is a strategy for overseeing an organization's paper and electronic documents so they can be...

  • risk assessment

    Risk assessment is the identification of hazards that could negatively impact an organization's ability to conduct business.

SearchSecurity

  • cyber espionage

    Cyber espionage, also called cyber spying, is a form of cyber attack that is carried out against a competitive company or ...

  • virus (computer virus)

    A computer virus is malicious code that replicates by copying itself to another program, computer boot sector or document and ...

  • honeypot (computing)

    A honeypot is a network-attached system set up as a decoy to lure cyber attackers and detect, deflect and study hacking attempts ...

SearchHealthIT

SearchDisasterRecovery

  • risk mitigation

    Risk mitigation is a strategy to prepare for and lessen the effects of threats faced by a business.

  • call tree

    A call tree is a layered hierarchical communication model that is used to notify specific individuals of an event and coordinate ...

  • Disaster Recovery as a Service (DRaaS)

    Disaster recovery as a service (DRaaS) is the replication and hosting of physical or virtual servers by a third party to provide ...

SearchStorage

  • cloud storage

    Cloud storage is a service model in which data is transmitted and stored on remote storage systems, where it is maintained, ...

  • cloud testing

    Cloud testing is the process of using the cloud computing resources of a third-party service provider to test software ...

  • storage virtualization

    Storage virtualization is the pooling of physical storage from multiple storage devices into what appears to be a single storage ...

Close