document sanitization

Document sanitization is the process of ensuring that only the intended information can be accessed from a document.

In addition to making sure the document text doesn’t openly divulge anything it shouldn’t, document sanitization includes removing document metadata that could pose a privacy or security risk. Document metadata can contain the names of authors and modifiers, creation and modification dates, file size, tracked edits, revision histories and comment exchanges between authors and editors. Because that metadata may contain sensitive information, it's important to safeguard it from unauthorized access.
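To make the idea of document metadata concrete, here is a minimal Python sketch that lists the core properties stored inside a Word .docx file. It relies on the fact that a .docx file is a ZIP archive whose core properties (author, last modifier, timestamps) normally live in the part docProps/core.xml. The function name read_core_properties is an illustrative assumption, and the sketch only reads metadata; it does not remove anything.

```python
import zipfile
import xml.etree.ElementTree as ET


def read_core_properties(docx):
    """Return {property name: text} from docProps/core.xml of a .docx file.

    `docx` may be a file path or a binary file-like object.
    Hypothetical helper for illustration only.
    """
    with zipfile.ZipFile(docx) as archive:
        if "docProps/core.xml" not in archive.namelist():
            return {}  # no core-properties part in this package
        root = ET.fromstring(archive.read("docProps/core.xml"))
    # Tags look like "{namespace}creator"; keep only the local name.
    return {child.tag.split("}")[-1]: (child.text or "") for child in root}
```

Calling read_core_properties("report.docx") on a typical Word file would be expected to return entries such as creator and lastModifiedBy, which are exactly the kinds of values a sanitization pass should review. The file name here is a placeholder.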

A common way to remove metadata from a document is to convert the document to PDF format before releasing it; however, specific steps must be followed to ensure the converted document contains no unintended information. The National Security Agency (NSA) recommends the following six-step process for secure conversion and redaction of Word documents:

  1. Create a copy of the original document.
  2. Turn off “Track Changes” on the copy and remove all visible comments.
  3. Delete any sensitive information from the document that you wish to redact.
  4. Use the Microsoft Office Document Inspector to check for any unwanted metadata.
  5. Save the new document and convert it to a PDF file.
  6. Use the Sanitize Document tool in Acrobat Professional as a second check before releasing the redacted PDF.
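As a supplementary check on step 2 before conversion, the following Python sketch looks inside a .docx copy for leftover tracked changes and comments. It is not part of the NSA procedure and is not a substitute for the Document Inspector or the Acrobat sanitization step. It assumes the standard Office Open XML layout, in which tracked insertions and deletions appear as w:ins and w:del elements in word/document.xml and comments are stored in word/comments.xml. The function name find_leftovers is hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def find_leftovers(docx):
    """Return a list of findings that suggest the copy is not yet clean.

    `docx` may be a file path or a binary file-like object.
    Illustrative sketch: it checks only comments and tracked changes.
    """
    findings = []
    with zipfile.ZipFile(docx) as archive:
        names = set(archive.namelist())
        # A comments part means comments were saved with the document.
        if "word/comments.xml" in names:
            findings.append("comments part present")
        if "word/document.xml" in names:
            root = ET.fromstring(archive.read("word/document.xml"))
            has_insertions = root.find(f".//{{{W_NS}}}ins") is not None
            has_deletions = root.find(f".//{{{W_NS}}}del") is not None
            if has_insertions or has_deletions:
                findings.append("tracked changes present")
    return findings
```

An empty list from find_leftovers means only that these two specific checks passed; other metadata, such as author names or hidden text, still needs the inspection steps listed above.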

See also: metadata management, metadata security

This was last updated in August 2014
