Definition

document sanitization

Contributor(s): Ivy Wigmore

Document sanitization is the process of ensuring that only the intended information can be accessed from a document.

In addition to making sure the document text doesn’t openly divulge anything it shouldn’t, document sanitization includes removing document metadata that could pose a privacy or security risk. Document metadata can contain the names of authors and modifiers, the dates of creation and changes, file size, tracked edits, revision histories and comment exchanges between authors and editors. Because that metadata may contain sensitive information, it's important to safeguard it from unauthorized access.
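To make the idea of document metadata concrete, here is a minimal Python sketch that lists a few core properties stored in a Word .docx file. It assumes the file follows the Office Open XML packaging convention, in which a .docx is a ZIP archive holding author and date fields in docProps/core.xml. The function name `read_core_properties` and the `FIELDS` mapping are hypothetical names chosen for illustration, and the sketch covers only a handful of fields, not everything a document may carry.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the core-properties part of an Office Open XML package.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Hypothetical label -> element mapping; only a small subset of possible fields.
FIELDS = {
    "author": "dc:creator",
    "last_modified_by": "cp:lastModifiedBy",
    "created": "dcterms:created",
    "modified": "dcterms:modified",
    "revision": "cp:revision",
}


def read_core_properties(docx):
    """Return the core metadata fields found in a .docx (path or file object)."""
    with zipfile.ZipFile(docx) as archive:
        if "docProps/core.xml" not in archive.namelist():
            return {}
        root = ET.fromstring(archive.read("docProps/core.xml"))

    found = {}
    for label, tag in FIELDS.items():
        element = root.find(tag, NS)
        if element is not None and element.text:
            found[label] = element.text
    return found
```

Calling `read_core_properties("draft.docx")` on a file (the file name is only an example) returns a dictionary of whichever of these fields are present, which gives a quick view of what a recipient could learn beyond the visible text.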

A common way to remove metadata from a document is to convert it to PDF format before releasing it; however, specific steps must be followed to ensure the converted document contains no unintended information. The National Security Agency (NSA) recommends the following six-step process for secure conversion and redaction of Word documents:

  1. Create a copy of the original document.
  2. Turn off “Track Changes” on the copy and remove all visible comments.
  3. Delete any sensitive information from the document that you wish to redact.
  4. Use the Microsoft Office Document Inspector to check for any unwanted metadata.
  5. Save the new document and convert it to a PDF file.
  6. Use the Sanitize Document tool in Acrobat Professional as a second check before releasing the redacted PDF.
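As a rough illustration of the kind of check involved in steps 2 and 4, the Python sketch below counts leftover tracked insertions, tracked deletions and comments in a .docx copy. It is not part of the NSA process and not a substitute for the Document Inspector or the Acrobat tool. It assumes the Office Open XML layout (main text in word/document.xml, comments in word/comments.xml), it looks only at `w:ins`, `w:del` and `w:comment` elements, and the function name `count_leftovers` is hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace of WordprocessingML elements such as w:ins, w:del and w:comment.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def _count(xml_bytes, local_names):
    """Count elements with the given local names anywhere in an XML part."""
    root = ET.fromstring(xml_bytes)
    return sum(1 for name in local_names for _ in root.iter(f"{{{W}}}{name}"))


def count_leftovers(docx):
    """Count tracked changes and comments remaining in a .docx (path or file object).

    Only tracked insertions/deletions and comments are examined; other
    revision markup and other hidden data are outside this sketch.
    """
    result = {"tracked_changes": 0, "comments": 0}
    with zipfile.ZipFile(docx) as archive:
        names = set(archive.namelist())
        if "word/document.xml" in names:
            result["tracked_changes"] = _count(
                archive.read("word/document.xml"), ("ins", "del")
            )
        if "word/comments.xml" in names:
            result["comments"] = _count(
                archive.read("word/comments.xml"), ("comment",)
            )
    return result
```

If either count is non-zero for the working copy, the copy probably still carries edit history or reviewer notes and should go back through steps 2 to 4 before it is converted to PDF.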

See also: metadata management, metadata security

This was last updated in August 2014

1 comment

Document sanitization can now be automated with a simple add-on to your secure email or web gateway, ensuring that all documents leaving (or entering) your organization are clean.