data hygiene

Contributor(s): Ivy Wigmore
This definition is part of our Essential Guide: Guide to managing a data quality assurance program

Data hygiene is the set of processes carried out to keep data clean. Data is considered clean if it is relatively error-free. Dirty data can be caused by a number of factors, including duplicate records, incomplete or outdated data, and the improper parsing of record fields from disparate systems. Errors can be introduced at any stage as data is entered, stored and managed.
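
As a rough illustration of the kinds of dirty data listed above, the following Python sketch flags records that are incomplete, outdated or duplicated. The field names, the one-year staleness threshold and the use of e-mail address as the duplicate key are all assumptions made for this example, not part of any standard.

```python
from datetime import date

REQUIRED_FIELDS = ("name", "email", "updated")  # hypothetical schema
MAX_AGE_DAYS = 365  # assumed staleness threshold


def find_issues(records, today):
    """Return a dict mapping record index to a list of issue labels."""
    issues = {}
    seen_emails = set()
    for i, rec in enumerate(records):
        found = []
        # Incomplete: a required field is missing or empty.
        if any(not rec.get(field) for field in REQUIRED_FIELDS):
            found.append("incomplete")
        # Outdated: the last update is older than the threshold.
        updated = rec.get("updated")
        if updated and (today - updated).days > MAX_AGE_DAYS:
            found.append("outdated")
        # Duplicate: same e-mail (ignoring case and spaces) as an earlier record.
        key = (rec.get("email") or "").strip().lower()
        if key:
            if key in seen_emails:
                found.append("duplicate")
            seen_emails.add(key)
        if found:
            issues[i] = found
    return issues


records = [
    {"name": "Ann Lee", "email": "ann@example.com", "updated": date(2013, 3, 1)},
    {"name": "Ann Lee", "email": "ANN@example.com ", "updated": date(2011, 1, 15)},
    {"name": "", "email": "bob@example.com", "updated": date(2013, 2, 10)},
]
report = find_issues(records, today=date(2013, 4, 1))
print(report)
```

A check like this only reports problems; deciding whether to correct, merge or delete a flagged record is a separate step.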

Data quality is crucial to operational and transactional processes within the enterprise and to the reliability of business analytics (BA) / business intelligence (BI) reporting.

Data scrubbing, also called data cleansing, is the process of amending or removing data in a database that is incorrect, incomplete, improperly formatted, or duplicated. Typically, the process involves updating the data, standardizing it, and de-duplicating records to create a single view of the data, even if it is stored in multiple disparate systems.
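
The standardize and de-duplicate steps can be sketched in a few lines of Python. This is a minimal example under stated assumptions: the two source systems (crm and billing), the field names and the choice of e-mail address as the matching key are hypothetical, and the name normalization is deliberately naive.

```python
def standardize(rec):
    """Return a copy of the record with trimmed, consistently formatted fields."""
    return {
        # Collapse extra whitespace and apply simple title casing (naive for some names).
        "name": " ".join(rec.get("name", "").split()).title(),
        "email": rec.get("email", "").strip().lower(),
        # Keep digits only so differently formatted phone numbers compare equal.
        "phone": "".join(ch for ch in rec.get("phone", "") if ch.isdigit()),
    }


def scrub(*sources):
    """Merge records from several systems into one view keyed by e-mail."""
    merged = {}
    for source in sources:
        for raw in source:
            rec = standardize(raw)
            if not rec["email"]:
                continue  # no usable key; a real pipeline might route this to review
            current = merged.setdefault(rec["email"], rec)
            # Fill empty fields in the kept record from the duplicate.
            for field, value in rec.items():
                if not current[field] and value:
                    current[field] = value
    return list(merged.values())


# Hypothetical extracts from two disparate systems.
crm = [{"name": "  ann   lee", "email": "Ann@Example.com", "phone": ""}]
billing = [{"name": "Ann Lee", "email": "ann@example.com ", "phone": "(555) 010-0199"}]
single_view = scrub(crm, billing)
print(single_view)
```

Here the first record seen for a key is kept and later duplicates only fill its gaps. Other survivorship rules, such as preferring the most recently updated value, are equally plausible and depend on the systems involved.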

This was last updated in April 2013

1 comment

Data hygiene is the state of data validity and reliability relative to its intended purposes. The functions of data hygiene include TL and scrubbing. Data quality and data hygiene are synonymous, and "relatively error-free" is neither a performance measure nor a quality measure, since it cannot itself be measured.