Data and data management

Definitions of terms related to data, including data warehousing and data management.
  • What is dark data? - Dark data is digital information an organization collects, processes and stores that is not currently being used for business purposes.
  • What is data activation? - Data activation is a marketing approach that uses consumer information and data analytics to help companies gain real-time insight into target audience behavior and plan for future marketing initiatives.
  • What is data aggregation? - Data aggregation is any process whereby data is gathered and expressed in a summary form.
  • What is data analytics (DA)? - Data analytics (DA) is the process of examining data sets to find trends and draw conclusions about the information they contain.
  • What is data architecture? A data management blueprint - Data architecture is a discipline that documents an organization's data assets, maps how data flows through IT systems and provides a blueprint for managing data, as this guide explains.
  • What is data as a service (DaaS)? - Data as a service (DaaS) is an information provision and distribution model in which data files -- including text, images, sounds and videos -- are made available to customers over a network, typically the internet.
  • What is data automation? - Data automation is the use of software tools and infrastructure to streamline data management tasks.
  • What is data cleansing (data cleaning, data scrubbing)? - Data cleansing, also referred to as data cleaning or data scrubbing, is the process of fixing incorrect, incomplete, duplicate or otherwise erroneous data in a data set.
  • What is data curation? - Data curation is the process of creating, organizing and maintaining data sets so people looking for information can access and use them.
  • What is data democratization? - Data democratization makes information in a digital format accessible to the average end user.
  • What is data egress? How it works and how to manage costs - Data egress is when data leaves a closed or private network and is transferred to an external location.
  • What is data governance and why does it matter? - Data governance is the process of managing the availability, usability, integrity and security of the data in enterprise systems, based on internal standards and policies that also control data usage.
  • What is data in use? - Data in use is data that is currently being updated, processed, erased, accessed or read by a system, application, user or device.
  • What is data labeling? - Data labeling is the process of identifying and tagging data samples, most commonly to train machine learning (ML) models.
  • What is data lifecycle? - A data lifecycle is the sequence of stages that a unit of data goes through from its initial generation or capture to its archiving or deletion at the end of its useful life.
  • What is data loss prevention (DLP)? - Data loss prevention (DLP) -- sometimes referred to as 'data leak prevention,' 'information loss prevention' or 'extrusion prevention' -- is a strategy to mitigate threats to critical data.
  • What is data management and why is it important? Full guide - Data management is the process of ingesting, storing, organizing and maintaining the data created and collected by an organization, as explained in this in-depth guide.
  • What is data management as a service (DMaaS)? - Data management as a service (DMaaS) is a type of cloud service that provides enterprises with centralized storage for disparate data sources.
  • What is data masking? - Data masking is a security technique that modifies sensitive data in a data set so it can be used safely in a non-production environment.
  • What is data migration? Definition, strategy and best practices - Data migration is the process of transferring data between data storage systems, data formats or computer systems.
  • What is data orchestration? - Data orchestration is the process of automating, coordinating and organizing the movement of data across an enterprise through business intelligence (BI) and other analytical tools.
  • What is data preparation? An in-depth guide - Data preparation is the process of gathering, combining, structuring and organizing data for use in business intelligence, analytics and data science applications, as explained in this guide.
  • What is data preprocessing? Key steps and techniques - Data preprocessing, a component of data preparation, describes any type of processing performed on raw data to prepare it for another data processing procedure.
  • What is data profiling? - Data profiling refers to the process of examining, analyzing, reviewing and summarizing data sets to gain insight into the quality of data.
  • What is data quality and why is it important? - Data quality measures a data set's condition based on factors such as accuracy, completeness, consistency, timeliness, uniqueness and validity.
  • What is data science as a service (DSaaS)? - Data science as a service (DSaaS) is a form of outsourcing that involves the delivery of information gleaned from advanced analytics applications run by data scientists at an outside company to corporate clients for their business use.
  • What is data science? The ultimate guide - Data science is the process of using advanced analytics techniques and scientific principles to analyze data and extract valuable information for business decision-making, strategic planning and other uses.
  • What is data stewardship? - Data stewardship is the management and oversight of an organization's data assets, which helps provide business users with high-quality, easily accessible and consistent data.
  • What is data storytelling? - Data storytelling is the process of translating complex data analyses into understandable terms to inform a business decision or action.
  • What is data transformation? Definition, types and benefits - Data transformation is the process of converting data from one format -- such as a database file, Extensible Markup Language (XML) document or Excel spreadsheet -- into another format.
  • What is data validation? - Data validation is the practice of checking the integrity, accuracy and structure of data before it is used for or by one or more business operations.
  • What is data? - In computing, data is information translated into a form that is efficient for movement or processing.
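  • What is database normalization? - Database normalization is the process of organizing the tables of a relational database to reduce redundancy and improve data integrity; it is intrinsic to most relational database schemas.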
  • What is demand planning and why is it important? - Demand planning is the process of forecasting the demand for a product or service so it can be produced and delivered more efficiently and to the satisfaction of customers.
  • What is denormalization and how does it work? - Denormalization is the process of adding precomputed redundant data to an otherwise normalized relational database to improve read performance.
  • What is deterministic/probabilistic data? - Deterministic and probabilistic are opposing terms that can be used to describe customer data and how it is collected.
  • What is digital health (digital healthcare)? - Digital health, also known as digital healthcare, is the use of digital technologies in healthcare.
  • What is dimensionality reduction? - Dimensionality reduction is a process and technique to reduce the number of dimensions -- or features -- in a data set.
  • What is electronic data processing (EDP)? - Electronic data processing (EDP) refers to the gathering of data using electronic devices, such as computers, servers and internet of things (IoT) technologies.
  • What is employee self-service (ESS)? - Employee self-service (ESS) is a widely used human resources technology that enables employees to perform many job-related functions that were once largely paper-based, or otherwise maintained by management, administrative or HR staff.
  • What is enterprise content management? Guide to ECM - Enterprise content management is a set of defined processes, strategies and tools that enables a business to obtain, organize, store and deliver critical information to its employees, business stakeholders and customers.
  • What is explainable AI? - Explainable AI (XAI) is artificial intelligence (AI) programmed to describe its purpose, rationale and decision-making process in a way that the average person can understand.
  • What is Extract, Load, Transform (ELT)? - Extract, Load, Transform (ELT) is a data integration process for transferring raw data from a source server to a data system (such as a data warehouse or data lake) on a target server and then preparing the information for downstream uses.
  • What is FHIR (Fast Healthcare Interoperability Resources)? - Fast Healthcare Interoperability Resources (FHIR) is an interoperability standard developed by Health Level Seven International (HL7) to facilitate the exchange of healthcare information between various entities involved in the healthcare ecosystem.
  • What is GDPR? Compliance and conditions explained - The General Data Protection Regulation (GDPR) is legislation that updated and unified data privacy laws across the European Union (EU).
  • What is health informatics? - Health informatics is the practice of applying insight gained from acquiring and analyzing health and biomedical data to help clinicians make better healthcare-related decisions and improve patient care.
  • What is health IT (health information technology)? - Health IT (health information technology) is the area of IT involving the design, development, creation, use and maintenance of information systems for the healthcare industry.
  • What is IDoc (intermediate document)? - IDoc (intermediate document) is a standard data structure used in SAP applications to transfer data to and from SAP system applications and external systems.
  • What is information rights management (IRM)? - Information rights management (IRM) is a discipline that involves managing, controlling and securing content from unwanted access.
  • What is intelligent document processing (IDP)? - Intelligent document processing (IDP) is a type of workflow automation technology designed to automate the process of extracting data from physical papers and image-based documents.
  • What is intelligent process automation (IPA)? - Intelligent process automation (IPA) is a combination of technologies used to manage and automate digital processes.
  • What is master data management (MDM)? - Master data management (MDM) is a process that creates a uniform set of data on customers, products, suppliers and other business entities from different IT systems.
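  • What is Meditech Medical Information Technology, Inc.? - Meditech (Medical Information Technology, Inc.) is a software company that develops and sells electronic health record (EHR) and other information systems for healthcare organizations.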
  • What is Microsoft Azure and how does it work? - Microsoft Azure, formerly known as Windows Azure, is Microsoft's public cloud computing platform.
  • What is Microsoft Power BI? Uses, features and guide - Microsoft Power BI is a business intelligence (BI) platform that provides nontechnical business users with tools for aggregating, analyzing, visualizing and sharing data.
  • What is Microsoft Visual FoxPro (VFP)? - Microsoft Visual FoxPro (VFP) is an object-oriented programming (OOP) environment with a built-in relational database engine.
  • What is natural language query (NLQ)? - Natural language query (NLQ) is a capability that enables users to ask questions within their analytics platforms using ordinary human language instead of query language.
  • What is NoSQL (Not Only SQL database)? - NoSQL is an approach to database management that can accommodate a wide variety of data models, including key-value, document, columnar and graph formats.
  • What is Oracle? - Oracle is one of the largest vendors in the enterprise IT market and the shorthand name of its flagship product, a relational database management system (RDBMS) that's formally called Oracle Database.
  • What is PaaS? Platform as a service definition and guide - Platform as a service (PaaS) is a cloud computing model where a third-party provider delivers hardware and software tools to users over the internet.
  • What is picture archiving and communication system (PACS)? - Picture archiving and communication system (PACS) is a medical imaging technology used primarily in healthcare organizations to securely store and digitally transmit electronic images and clinically relevant reports.
  • What is PL/SQL (Procedural Language/Structured Query Language)? - In Oracle database management, PL/SQL is a procedural language extension to Structured Query Language (SQL).
  • What is population health management (PHM)? - Population health management (PHM) is a data-driven discipline within the healthcare industry that seeks to improve health outcomes for a defined population.
  • What is qualitative data? - Qualitative data is descriptive information that focuses on concepts and characteristics, rather than numbers and statistics.
  • What is records management? - Records management is the supervision and administration of digital or paper records, regardless of format.
  • What is RHIA (Registered Health Information Administrator)? - A registered health information administrator (RHIA) is a certified professional who oversees the creation and use of electronic health records (EHRs).
  • What is SaaS (software as a service)? - Software as a service (SaaS) is a software distribution model in which a cloud provider hosts applications and makes them available to end users over the internet.
  • What is Salesforce Wave Analytics (Salesforce CRM Analytics)? - Salesforce Wave Analytics, now known as CRM Analytics, is a business intelligence (BI) and analytics platform from Salesforce that provides native analytics, visual insights and predictions powered by artificial intelligence (AI) for Salesforce CRM to help businesses make better, data-driven decisions.
  • What is SAP Basis? - SAP Basis is the technical foundation that enables SAP applications to function.
  • What is SAP BW (Business Warehouse)? - SAP Business Warehouse (BW) is a model-driven data warehousing product based on the SAP NetWeaver ABAP platform.
  • What is SAP Data Services? - SAP Data Services is a data integration and transformation software application that enables organizations to capture more meaning and value from their structured and unstructured data.
  • What is secure multiparty computation (SMPC)? - Secure multiparty computation (SMPC) is a form of confidential computing that protects the privacy and security of systems and data sources, while maintaining the data's integrity.
  • What is self-service business intelligence (self-service BI)? - Self-service business intelligence (BI) is an approach to data analytics that enables nontechnical business users to access and explore data sets.
  • What is semantic technology? - Semantic technology is a set of methods and tools that provide advanced means for categorizing and processing data, as well as for discovering relationships within varied data sets.
  • What is sentiment analysis? - Sentiment analysis, also referred to as 'opinion mining,' is an approach to natural language processing (NLP) that identifies the emotional tone behind a body of text.
  • What is SPSS (Statistical Package for the Social Sciences)? - SPSS (Statistical Package for the Social Sciences), known as IBM SPSS Statistics since 2009, is a user-friendly software package used to analyze statistical data and make data-driven decisions.
  • What is stream processing? Introduction and overview - Stream processing is a data management technique that involves ingesting a continuous data stream to quickly analyze, filter, transform or enhance the data in real time.
  • What is Structured Query Language (SQL)? - Structured Query Language (SQL) is a standardized programming language that is used to manage relational databases and perform various operations on the data in them.
  • What is taxonomy in computing? - Taxonomy is the science of classification according to a predetermined system, with the resulting catalog used to provide a conceptual framework for discussion, analysis or information retrieval.
  • What is the Coalition for Secure AI (CoSAI)? - The Coalition for Secure AI (CoSAI) is an open source initiative to enhance the security of artificial intelligence (AI) systems.
  • What is the Driver's Privacy Protection Act (DPPA)? - The Driver's Privacy Protection Act (DPPA) is a United States federal law designed to protect the personally identifiable information of licensed drivers from improper use or disclosure.
  • What is the Gramm-Leach-Bliley Act (GLBA)? - The Gramm-Leach-Bliley Act (GLB Act or GLBA), also known as the Financial Modernization Act of 1999, is a federal law enacted in the United States to control the ways financial institutions deal with the private information of individuals.
  • What is transfer learning? - Transfer learning is a machine learning (ML) technique where an already developed ML model is reused in another task.
  • What is unstructured data? - Unstructured data is information, in many different forms, that doesn't follow conventional data models, making it difficult to store and manage in a mainstream relational database.
  • What is user acceptance testing (UAT)? - User acceptance testing (UAT), also called application testing or end-user testing, is a phase of software development in which the software is tested in the real world by its intended audience.
  • What is user behavior analytics (UBA)? - User behavior analytics (UBA) is the tracking, collecting and assessing of user data and activities using monitoring systems.
  • What is YAML (YAML Ain't Markup Language)? - YAML (YAML Ain't Markup Language) is a data serialization language used as the input format for diverse software applications.
  • workload - In computing, a workload is typically any program or application that runs on a computer.
  • WORM (write once, read many) - In computer media, write once, read many, or WORM, is a data storage technology that allows data to be written to a storage medium a single time and prevents the data from being erased or modified.
  • XML Schema Definition (XSD) - XML Schema Definition or XSD is a recommendation by the World Wide Web Consortium (W3C) to describe and validate the structure and content of an XML document.
  • yobibyte (YiB) - A yobibyte (YiB) is a unit of measure used to describe data capacity as part of the binary system of measuring computing and storage capacity.
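
To make the binary system behind the yobibyte entry concrete, here is a minimal Python sketch. It assumes the standard binary prefixes, in which each step up multiplies by 2^10 (1,024), so one yobibyte is 2^80 bytes. The names BINARY_PREFIXES, bytes_in and to_yobibytes are hypothetical, chosen for illustration only.

```python
# Binary prefixes in ascending order; each step multiplies by 2**10 = 1024.
# (Illustrative helper names, not part of any library.)
BINARY_PREFIXES = ["KiB", "MiB", "GiB", "TiB", "PiB", "EiB", "ZiB", "YiB"]


def bytes_in(unit: str) -> int:
    """Return the number of bytes in one unit of the given binary prefix."""
    exponent = 10 * (BINARY_PREFIXES.index(unit) + 1)
    return 2 ** exponent


def to_yobibytes(num_bytes: int) -> float:
    """Convert a byte count to yobibytes."""
    return num_bytes / bytes_in("YiB")


if __name__ == "__main__":
    print(bytes_in("YiB"))           # exact byte count of one yobibyte (2**80)
    print(to_yobibytes(2 ** 80))     # one yobibyte expressed in YiB
    # A yobibyte is larger than a decimal yottabyte (10**24 bytes).
    print(bytes_in("YiB") / 10 ** 24)
```

The last line shows why the distinction matters: a yobibyte is roughly 21% larger than a decimal yottabyte.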