site scraper

A site scraper is a type of software used to copy content from a website.

Site scrapers work much like web crawlers, which fetch pages in the same way for the purpose of indexing websites. The difference is scope: web crawlers cover the whole Web, whereas site scrapers target only the websites a user specifies.

Depending on the particular scraper program and the user's specifications, the software can download anything from selected data to entire websites, and can follow links to other content for further downloads. The data obtained may be saved as text, CSV, HTML or XML files; some scraper tools can also export to a compatible database.
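The link-following and export steps described above can be sketched in a few lines of Python using only the standard library. This is a minimal illustration, not how any particular scraper tool is implemented: the names LinkExtractor, scrape_links and to_csv, the sample HTML and the example.com URLs are all assumptions made for the example, and the page is supplied as a string so that nothing is actually downloaded.

```python
import csv
import io
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collects the target and text of every <a href=...> element in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []      # list of (absolute_url, link_text) tuples
        self._href = None    # URL of the <a> element currently open, if any
        self._text = []      # text fragments seen inside that element

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page's own URL.
                self._href = urljoin(self.base_url, href)
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def scrape_links(html, base_url):
    """Return (url, text) pairs for links that stay on the same host as base_url."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    parser.close()
    site = urlparse(base_url).netloc
    return [(url, text) for url, text in parser.links
            if urlparse(url).netloc == site]


def to_csv(rows):
    """Serialize the scraped rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["url", "text"])
    writer.writerows(rows)
    return buf.getvalue()


# Hypothetical page content standing in for a downloaded document.
SAMPLE_HTML = """
<html><body>
  <a href="/about">About us</a>
  <a href="https://other.example.org/x">Elsewhere</a>
  <a href="news/today.html">Today's news</a>
</body></html>
"""

rows = scrape_links(SAMPLE_HTML, "https://www.example.com/index.html")
csv_text = to_csv(rows)
print(csv_text)
```

A real scraper would fetch each collected URL in turn and repeat the process, which is how it follows links to further content. Restricting the results to the starting host is one simple way to keep the tool on a user-specified website rather than wandering across the Web as a crawler does.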

Content scraping has numerous legitimate purposes but is also often used for data theft and plagiarism. Websites featuring content scraped from other sites are called scraper sites.

Examples of site scrapers include Web Content Extractor, Wget, ScrapeGoat and the Chrome extension Scraper.

This was last updated in February 2014
