document sanitization

Document sanitization is the process of ensuring that only the intended information can be accessed from a document.

In addition to making sure the document text doesn’t openly divulge anything it shouldn’t, document sanitization includes removing document metadata that could pose a privacy or security risk. Document metadata can contain the names of authors and modifiers, the dates of creation and changes, file size, tracked edits, revision histories and comment exchanges between authors and editors. Because that metadata may contain sensitive information, it's important to safeguard it from unauthorized access.
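
To make the metadata risk concrete, here is a minimal Python sketch that lists the core properties stored inside a .docx file. It assumes the file is a standard Office Open XML package, that is, a ZIP archive with a docProps/core.xml part. The names read_core_properties and make_sample_docx, and the sample author names, are hypothetical and used only for illustration.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Hypothetical sample of the core-properties part found in a .docx package.
SAMPLE_CORE_XML = (
    '<cp:coreProperties'
    ' xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties"'
    ' xmlns:dc="http://purl.org/dc/elements/1.1/">'
    '<dc:creator>Alice Example</dc:creator>'
    '<cp:lastModifiedBy>Bob Example</cp:lastModifiedBy>'
    '<cp:revision>7</cp:revision>'
    '</cp:coreProperties>'
)


def make_sample_docx():
    """Build a tiny in-memory ZIP that mimics the metadata part of a .docx."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("docProps/core.xml", SAMPLE_CORE_XML)
    buffer.seek(0)
    return buffer


def read_core_properties(docx_file):
    """Return {property name: text} for the core properties in a .docx package."""
    with zipfile.ZipFile(docx_file) as package:
        if "docProps/core.xml" not in package.namelist():
            return {}
        root = ET.fromstring(package.read("docProps/core.xml"))
    properties = {}
    for element in root:
        # ElementTree prefixes tags with "{namespace}"; keep only the local name.
        name = element.tag.split("}", 1)[-1]
        properties[name] = (element.text or "").strip()
    return properties


if __name__ == "__main__":
    for name, value in read_core_properties(make_sample_docx()).items():
        print(f"{name}: {value}")
```

Passing a real file path instead of the in-memory sample works the same way, because zipfile.ZipFile accepts either a path or a file-like object. Anything this function returns is information a recipient could also read.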

A common way to remove metadata from a document is to convert it to PDF format before releasing it; however, the conversion must follow a defined process to ensure the resulting document contains no unintended information. The National Security Agency (NSA) recommends the following six-step process for secure conversion and redaction of Word documents:

  1. Create a copy of the original document.
  2. Turn off “Track Changes” on the copy and remove all visible comments.
  3. Delete any sensitive information from the document that you wish to redact.
  4. Use the Microsoft Office Document Inspector to check for any unwanted metadata.
  5. Save the new document and convert it to a PDF file.
  6. Use the Sanitize Document tool in Acrobat Professional as a second check before releasing the redacted PDF.
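
As an additional, independent check on the copy before conversion, the following Python sketch counts tracked changes and comments that remain in a .docx file. It is not part of the NSA process and does not replace the Document Inspector or the Sanitize Document tool. It assumes the standard WordprocessingML layout, in which tracked insertions and deletions are stored as w:ins and w:del elements in word/document.xml and comments as w:comment elements in word/comments.xml. The names build_package, find_leftovers and is_clean are hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by WordprocessingML elements such as w:ins, w:del, w:comment.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def build_package(parts):
    """Build an in-memory ZIP from {part name: XML string} for demonstration."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        for name, xml_text in parts.items():
            package.writestr(name, xml_text)
    buffer.seek(0)
    return buffer


def find_leftovers(docx_file):
    """Count tracked-change and comment elements still present in a .docx."""
    counts = {"insertions": 0, "deletions": 0, "comments": 0}
    with zipfile.ZipFile(docx_file) as package:
        names = package.namelist()
        if "word/document.xml" in names:
            body = ET.fromstring(package.read("word/document.xml"))
            counts["insertions"] = len(list(body.iter(f"{{{W_NS}}}ins")))
            counts["deletions"] = len(list(body.iter(f"{{{W_NS}}}del")))
        if "word/comments.xml" in names:
            comments = ET.fromstring(package.read("word/comments.xml"))
            counts["comments"] = len(list(comments.iter(f"{{{W_NS}}}comment")))
    return counts


def is_clean(docx_file):
    """True when no tracked changes or comments were found."""
    return not any(find_leftovers(docx_file).values())


# Hypothetical sample: one tracked insertion, one tracked deletion, one comment.
SAMPLE_DOCUMENT = (
    f'<w:document xmlns:w="{W_NS}"><w:body><w:p>'
    '<w:ins w:id="1"><w:r><w:t>added</w:t></w:r></w:ins>'
    '<w:del w:id="2"><w:r><w:delText>removed</w:delText></w:r></w:del>'
    '</w:p></w:body></w:document>'
)
SAMPLE_COMMENTS = (
    f'<w:comments xmlns:w="{W_NS}">'
    '<w:comment w:id="0"><w:p><w:r><w:t>check this</w:t></w:r></w:p></w:comment>'
    '</w:comments>'
)

if __name__ == "__main__":
    sample = build_package({
        "word/document.xml": SAMPLE_DOCUMENT,
        "word/comments.xml": SAMPLE_COMMENTS,
    })
    print(find_leftovers(sample))
```

A nonzero count suggests that step 2 was not completed on the copy. A zero count only means these three element types were not found; other metadata may still be present, which is why the inspection and sanitization steps remain necessary.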

See also: metadata management, metadata security

This was last updated in August 2014
