A data-driven disaster is a serious problem caused by one or more ineffective data analysis processes.
According to the Data Warehousing Institute, data quality problems cost businesses in the United States over $600 billion a year. In addition to the financial burden, problems with data quality and analysis can seriously affect security, compliance, project management and human resource management (HRM), among other areas.
Errors can creep into data analytics at any stage. The data itself may be inadequate in the first place: it could be incomplete, inaccurate, out of date, or an unreliable indicator of what it is intended to represent. Data analysis and interpretation are prone to just as many pitfalls. There can be confounding factors, and the mathematical method can be flawed or inappropriate. Correlation can be mistaken for causation. Statistical significance may be claimed when the data doesn't support it. Even if the data and analytic processes are valid, results may be deliberately presented in a misleading manner to support an agenda.
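To make the point about confounding factors concrete, here is a minimal Python sketch. It uses purely synthetic data and hypothetical variable names (temperature, ice_cream, sunburns) chosen for illustration. Two quantities that never influence each other still appear strongly correlated because both depend on a third factor. Once the linear effect of that factor is removed, the apparent relationship largely disappears.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def residuals(ys, zs):
    """Remove the linear effect of zs from ys (simple least-squares fit)."""
    n = len(ys)
    mz, my = sum(zs) / n, sum(ys) / n
    slope = (sum((z - mz) * (y - my) for z, y in zip(zs, ys))
             / sum((z - mz) ** 2 for z in zs))
    return [(y - my) - slope * (z - mz) for z, y in zip(zs, ys)]

# Synthetic data: the confounder drives both measured quantities.
rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000
temperature = [rng.gauss(0, 1) for _ in range(n)]          # the confounder
ice_cream = [2 * t + rng.gauss(0, 1) for t in temperature]  # depends on temperature only
sunburns = [2 * t + rng.gauss(0, 1) for t in temperature]   # depends on temperature only

raw = pearson(ice_cream, sunburns)
adjusted = pearson(residuals(ice_cream, temperature),
                   residuals(sunburns, temperature))

print(f"correlation ignoring the confounder:    {raw:.2f}")
print(f"correlation after adjusting for it:     {adjusted:.2f}")
```

An analyst who looked only at the first number might conclude that one quantity causes the other. The second number shows that the relationship is explained by the shared factor. Real data is rarely this tidy, and a linear adjustment is only one of several ways to account for a confounder.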
In a broader context, flaws in data-driven processes have been responsible for real disasters such as the explosion of the space shuttle Challenger in 1986 and the shooting down of an Iranian Airbus by the USS Vincennes in 1988.
As businesses deal with huge increases in the amount of data collected, sometimes referred to as big data, there is a corresponding trend toward data-driven decision management (DDDM). Problems arise when insufficient resources are applied to data processes and too much confidence is placed in their validity. To prevent data-driven disasters, it's crucial to continually examine data quality and analytic processes, and to pay attention to common sense and even intuition. When data seems to indicate something that doesn't make logical sense or just seems wrong, it's time to reexamine the source data and the methods of analysis.
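As one illustration of what continually examining data quality might look like in practice, the following Python sketch flags records that are incomplete, implausible or out of date. The function name audit_records, the field names and the thresholds are all hypothetical. A real pipeline would derive them from its own schema and business rules.

```python
from datetime import date, timedelta

def audit_records(records, required, ranges, max_age, today):
    """Return a list of (record_index, field, problem) tuples.

    required: field names that must be present and not None
    ranges:   {field: (low, high)} plausible bounds for numeric fields
    max_age:  timedelta after which a record's 'updated' date is stale
    """
    issues = []
    for i, rec in enumerate(records):
        # Completeness: every required field must have a value.
        for field in required:
            if rec.get(field) is None:
                issues.append((i, field, "missing"))
        # Plausibility: values must fall inside the expected range.
        for field, (low, high) in ranges.items():
            value = rec.get(field)
            if value is not None and not (low <= value <= high):
                issues.append((i, field, "out of range"))
        # Currency: the record must have been updated recently enough.
        updated = rec.get("updated")
        if updated is not None and today - updated > max_age:
            issues.append((i, "updated", "stale"))
    return issues

# Made-up sample records, for illustration only.
records = [
    {"id": 1, "age": 34, "updated": date(2024, 5, 1)},
    {"id": 2, "age": None, "updated": date(2024, 5, 2)},
    {"id": 3, "age": 212, "updated": date(2021, 1, 15)},
]

issues = audit_records(
    records,
    required=["id", "age", "updated"],
    ranges={"age": (0, 120)},
    max_age=timedelta(days=365),
    today=date(2024, 6, 1),
)
for issue in issues:
    print(issue)
```

Checks like these catch only mechanical problems. They cannot tell whether the data is a reliable indicator of what it is meant to represent, which is where human judgment and common sense still come in.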