data lineage

Contributor(s): Matthew Haughn

Data lineage is the history of data, including where the data has traveled throughout its existence within an organization. Data lineage is a required part of corporate and government data policy compliance. The history of data is tracked through data lineage documentation and software. Without a way to pinpoint where errors are introduced into the environment, it is difficult for data stewards to identify and fix data quality issues.

With effective tools, documenting data’s entire journey through the organization can ease data governance. Data lineage documentation helps simplify two of the main data governance concerns around the effects of changes in data: root cause analysis and business impact analysis (BIA). Knowing everything that has happened to the data since it was created makes the root causes and impacts of data issues easier to understand.
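As a rough illustration of how lineage documentation supports both analyses, the following minimal sketch models lineage as a directed graph. Walking the graph upstream from a data set lists candidate root causes, and walking it downstream lists what a change would affect. The `LineageGraph` class and the data set names are hypothetical, invented for this example rather than taken from any particular lineage tool.

```python
from collections import defaultdict, deque


class LineageGraph:
    """Directed graph: an edge source -> target means target is derived from source."""

    def __init__(self):
        self._downstream = defaultdict(set)
        self._upstream = defaultdict(set)

    def add_edge(self, source, target):
        self._downstream[source].add(target)
        self._upstream[target].add(source)

    def _walk(self, start, neighbors):
        # Breadth-first traversal that collects every node reachable from start.
        seen = set()
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in neighbors.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        seen.discard(start)
        return seen

    def upstream(self, dataset):
        """Root cause analysis: everything this data set was derived from."""
        return self._walk(dataset, self._upstream)

    def downstream(self, dataset):
        """Impact analysis: everything that depends on this data set."""
        return self._walk(dataset, self._downstream)


# Hypothetical data sets, for illustration only.
graph = LineageGraph()
graph.add_edge("crm_export", "customers_clean")
graph.add_edge("web_orders", "orders_clean")
graph.add_edge("customers_clean", "sales_report")
graph.add_edge("orders_clean", "sales_report")
graph.add_edge("sales_report", "exec_dashboard")

root_cause_candidates = graph.upstream("sales_report")
impacted = graph.downstream("customers_clean")
```

Here `root_cause_candidates` holds the four data sets that feed `sales_report`, and `impacted` holds the report and the dashboard built on it. Storing edges in both directions is a design choice that keeps both questions equally cheap to answer.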

In software development, tracking data lineage can help reconcile the tensions among Agile development best practices, data governance regulations and company data policy. Data lineage tools and procedures help track where data flaws were introduced, which can ease diagnosis and correction. Implementing data lineage tracking can be difficult and is often seen as a low priority. However, earlier correction means less error propagation, so implementing data lineage tools early in the process often proves worth the effort.
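One way to picture how lineage tracking locates the point where a flaw was introduced is to record a small lineage entry after every processing step and then look for the first step whose output fails a quality check. The sketch below assumes a simple in-memory pipeline; `LineageRecord`, `run_with_lineage`, `first_bad_step`, the step functions and the sample rows are all hypothetical names and data made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class LineageRecord:
    step: str       # name of the transformation that ran
    rows_in: int    # number of rows it received
    rows_out: int   # number of rows it produced
    valid: bool     # whether every output row passed the quality check


def run_with_lineage(rows, steps, is_valid):
    """Run each (name, transform) step in order and log one record per step."""
    log = []
    for name, transform in steps:
        rows_in = len(rows)
        rows = transform(rows)
        valid = all(is_valid(row) for row in rows)
        log.append(LineageRecord(name, rows_in, len(rows), valid))
    return rows, log


def first_bad_step(log):
    """Return the name of the earliest step with invalid output, or None."""
    for record in log:
        if not record.valid:
            return record.step
    return None


# Hypothetical transformations; apply_discount contains the flaw.
def add_tax(rows):
    return [{**row, "amount": row["amount"] * 1.2} for row in rows]


def apply_discount(rows):
    # Subtracts a flat discount without checking that the result stays non-negative.
    return [{**row, "amount": row["amount"] - 20} for row in rows]


def round_amounts(rows):
    return [{**row, "amount": round(row["amount"], 2)} for row in rows]


def amount_is_non_negative(row):
    return row["amount"] >= 0


source_rows = [{"amount": 10.0}, {"amount": 25.0}]
steps = [
    ("add_tax", add_tax),
    ("apply_discount", apply_discount),
    ("round_amounts", round_amounts),
]

result, log = run_with_lineage(source_rows, steps, amount_is_non_negative)
culprit = first_bad_step(log)
```

In this run `culprit` is `"apply_discount"`: the log shows valid output after `add_tax` and invalid output from `apply_discount` onward, so the flaw is caught at the step that introduced it instead of being traced back from the final result.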

This was last updated in January 2019
