document sanitization

Contributor(s): Ivy Wigmore

Document sanitization is the process of ensuring that only the intended information can be accessed from a document.

In addition to making sure the document text doesn’t openly divulge anything it shouldn’t, document sanitization includes removing document metadata that could pose a privacy or security risk. Document metadata can contain the names of authors and modifiers, the dates of creation and changes, file size, tracked edits, revision histories and comment exchanges between authors and editors. Because that metadata may contain sensitive information, it's important to safeguard it from unauthorized access.
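To make the kinds of metadata described above concrete, here is a minimal Python sketch that lists a few core properties stored inside a Word .docx file. It assumes the file is a standard Office Open XML package with its core properties in `docProps/core.xml`; the function name `read_core_metadata` and the field labels are hypothetical, chosen only for illustration, and the sketch does not cover every place metadata can live.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by docProps/core.xml in Office Open XML packages.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Illustrative labels (hypothetical) mapped to the core-property elements.
FIELDS = {
    "author": "dc:creator",
    "last_modified_by": "cp:lastModifiedBy",
    "created": "dcterms:created",
    "modified": "dcterms:modified",
    "revision": "cp:revision",
}


def read_core_metadata(docx_file):
    """Return the non-empty core properties of a .docx (path or file-like object)."""
    with zipfile.ZipFile(docx_file) as zf:
        if "docProps/core.xml" not in zf.namelist():
            return {}
        root = ET.fromstring(zf.read("docProps/core.xml"))

    found = {}
    for label, tag in FIELDS.items():
        node = root.find(tag, NS)
        if node is not None and node.text:
            found[label] = node.text
    return found
```

Calling `read_core_metadata("report.docx")` on a file of your own (the file name here is a placeholder) returns a dictionary of whichever of these fields are present, which gives a quick view of what a recipient could learn from the file beyond its visible text.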

A common way to remove metadata from a document is to convert it to PDF format before releasing it; however, certain steps must be followed to ensure the converted document contains no unintended information. The National Security Agency (NSA) recommends the following six-step process for secure conversion and redaction of Word documents:

  1. Create a copy of the original document.
  2. Turn off “Track Changes” on the copy and remove all visible comments.
  3. Delete any sensitive information from the document that you wish to redact.
  4. Use the Microsoft Office Document Inspector to check for any unwanted metadata.
  5. Save the new document and convert it to a PDF file.
  6. Use the Sanitize Document tool in Acrobat Professional as a second check before releasing the redacted PDF.
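As a rough illustration of the kind of check performed around steps 2 and 4, the following Python sketch counts tracked changes and comments that remain in a .docx copy before it is converted. It is not the Document Inspector and not part of the NSA procedure; it is a supplementary check under the assumption that the file is a standard Office Open XML package, where tracked insertions and deletions appear as `w:ins` and `w:del` elements in `word/document.xml` and comments are stored in `word/comments.xml`. The function name `find_residue` is hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace, in ElementTree's {uri}tag notation.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def count_elements(xml_bytes, tag):
    """Count every element with the given qualified tag in an XML part."""
    root = ET.fromstring(xml_bytes)
    return sum(1 for _ in root.iter(tag))


def find_residue(docx_file):
    """Report leftover tracked changes and comments in a .docx (path or file-like)."""
    report = {"tracked_insertions": 0, "tracked_deletions": 0, "comments": 0}
    with zipfile.ZipFile(docx_file) as zf:
        names = set(zf.namelist())
        if "word/document.xml" in names:
            body = zf.read("word/document.xml")
            report["tracked_insertions"] = count_elements(body, W + "ins")
            report["tracked_deletions"] = count_elements(body, W + "del")
        if "word/comments.xml" in names:
            report["comments"] = count_elements(zf.read("word/comments.xml"), W + "comment")
    return report


def is_clean(report):
    """True only when no tracked changes or comments were found."""
    return all(count == 0 for count in report.values())
```

A non-zero count suggests that revisions were not accepted or rejected, or that comments were not removed, before saving. A clean report from this sketch covers only these two items, so it does not replace the Document Inspector or the final check on the PDF.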

See also: metadata management, metadata security

This was last updated in August 2014

1 comment

Document sanitization can now be automated with a simple add-on to your secure email or web gateway, ensuring that all documents leaving (or entering) your organization are clean.