robots.txt

Contributor(s): Matthew Haughn

Robots.txt is a file on a website that tells search engine crawlers which parts of the site they should not access. Robots.txt is a plaintext file, but it uses its own directives and syntax aimed at web crawlers. Though long a de facto convention rather than a formal standard, robots.txt is generally honored by the major search engines; compliance is voluntary, so the file is a request rather than an access control.

Spider programs, such as Googlebot, crawl and index a website subject to instructions set by the site's webmaster. Sometimes a webmaster may have parts of a site that have not been optimized for search engines, or some parts of a website might be prone to exploitation by spammers through, for example, link spam on a page that features user-generated content (UGC). Should a webmaster wish to keep pages hidden from Google search, they can block the pages with a robots.txt file placed in the top-level folder of the site. Robots.txt is also known as “the robots exclusion protocol.” Keeping crawlers away from spammy content means the page will not be considered when determining PageRank and placement in search engine results pages (SERPs).
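As a minimal sketch of how a well-behaved crawler might apply such a file, the Python example below parses a small robots.txt with the standard library's urllib.robotparser module and asks whether given URLs may be fetched. The file contents, the /comments/ path, and the www.example.com URLs are hypothetical and chosen only for illustration; a real file would be served from the site's top-level folder as /robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents (illustrative only): ask every crawler
# to stay out of a section that hosts user-generated content.
ROBOTS_TXT = """\
User-agent: *
Disallow: /comments/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A compliant crawler checks each URL before requesting it.
blocked = parser.can_fetch("Googlebot", "https://www.example.com/comments/thread-1")
allowed = parser.can_fetch("Googlebot", "https://www.example.com/about")
print(blocked, allowed)
```

The check is advisory: nothing in the file stops a crawler that chooses to ignore it.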

The nofollow directive is another way to control web crawler behavior. Nofollow stops crawlers from counting links within pages when determining PageRank. Webmasters can use nofollow to avoid search engine optimization (SEO) penalties. To prevent Googlebot from following any links on a given page of a site, the webmaster can include a nofollow robots meta tag in the head of that page's HTML (not in the robots.txt file); to prevent the bot from following individual links, they can add rel="nofollow" to the links themselves.
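The two forms of nofollow can be sketched as follows. The page-level form is a robots meta tag in the page's HTML head, and the link-level form is a rel attribute on an individual anchor. The Python example below uses the standard library's html.parser to detect each one; the markup, the NofollowScanner class, and the scan helper are hypothetical names introduced for illustration, not part of any search engine's tooling.

```python
from html.parser import HTMLParser

# Hypothetical markup (illustrative only).
PAGE_LEVEL = '<html><head><meta name="robots" content="nofollow"></head></html>'
LINK_LEVEL = (
    '<a href="https://www.example.com/trusted">trusted</a>'
    '<a href="https://www.example.com/ugc" rel="nofollow">user-submitted</a>'
)


class NofollowScanner(HTMLParser):
    """Record page-level and link-level nofollow hints found in HTML."""

    def __init__(self):
        super().__init__()
        self.page_nofollow = False   # set by <meta name="robots" content="nofollow">
        self.nofollow_links = []     # hrefs of anchors carrying rel="nofollow"

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "robots":
            # The content attribute is a comma-separated list of directives.
            directives = [d.strip().lower() for d in (attr.get("content") or "").split(",")]
            if "nofollow" in directives:
                self.page_nofollow = True
        elif tag == "a" and attr.get("href"):
            # The rel attribute is a space-separated list of values.
            if "nofollow" in (attr.get("rel") or "").lower().split():
                self.nofollow_links.append(attr["href"])


def scan(html: str) -> NofollowScanner:
    scanner = NofollowScanner()
    scanner.feed(html)
    return scanner


print(scan(PAGE_LEVEL).page_nofollow)
print(scan(LINK_LEVEL).nofollow_links)
```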

This was last updated in June 2017
