Data gravity is the ability of a body of data to attract applications, services and other data. The force of gravity, in this context, manifests in the way software, services and business logic are drawn to data in proportion to its mass (the amount of data) and, as a result, end up physically located closer to it. The larger the body of data, the more applications, services and other data it attracts, and the more quickly they are drawn in.
IT expert Dave McCrory coined the term data gravity as an analogy to the way that, under the physical laws of gravity, objects with more mass attract those with less. In this analogy, applications and services also possess gravity, though not as much as a large body of data, and smaller bodies of data naturally exert less pull than larger ones.
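To make the analogy concrete, the sketch below borrows Newton's law of gravitation. It is an illustration only: the attraction function, its units and the constant G are hypothetical stand-ins, not McCrory's own formulation.

```python
# Illustrative Newtonian analogy for data gravity. The units and the
# constant G are arbitrary; only the proportions matter here.

def attraction(data_mass_tb: float, app_mass: float, distance: float) -> float:
    """Metaphorical pull between a body of data and an application.

    Mirrors F = G * m1 * m2 / r^2: more data (data_mass_tb) and closer
    placement (smaller distance) produce a stronger pull.
    """
    G = 1.0  # arbitrary scaling constant for the analogy
    return G * data_mass_tb * app_mass / distance ** 2

# A 500 TB data lake pulls an application 100 times harder than a 5 TB
# store at the same distance:
print(attraction(500, 1, distance=10))  # 5.0
print(attraction(5, 1, distance=10))    # 0.05
```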
In practical terms, moving data farther and more frequently degrades workload performance, so it makes sense to amass data in one place and to locate the associated applications and services nearby. Hyperconvergence illustrates the concept: in a hyper-converged infrastructure, compute, networking and virtualization resources are tightly integrated with data storage within a commodity hardware box.
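The performance cost of distance shows up in a rough transfer-time estimate. The sketch below assumes a 100 TB dataset, a 1 Gbps link and 80 percent effective throughput; all three figures are illustrative.

```python
# Back-of-the-envelope estimate of the cost of moving a large dataset.

def transfer_time_days(data_tb: float, link_gbps: float,
                       efficiency: float = 0.8) -> float:
    """Days needed to move data_tb terabytes over a link_gbps link.

    efficiency models protocol overhead and contention (assumed 80%).
    """
    bits = data_tb * 1e12 * 8                      # terabytes -> bits
    seconds = bits / (link_gbps * 1e9 * efficiency)
    return seconds / 86_400

# Moving 100 TB over a 1 Gbps WAN link takes roughly 11.6 days, a cost
# an application co-located with the data never pays.
print(f"{transfer_time_days(100, 1):.1f} days")
```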
Also, the more data that exists in a given source or repository, the greater its perceived value will be. Software and services are brought to the data as a means of exploiting its value. Similarly, the greater the amount of data, the more other data might be connected to it, increasing its value for analytics.
According to McCrory, data gravity is moving to the cloud. As more and more internal and external business data is moved to the cloud or generated there, data analytics tools increasingly become cloud-based as well.
McCrory differentiates between naturally occurring data gravity and similar effects created through external forces such as legislation, throttling and manipulative pricing, which he refers to as artificial data gravity.
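Egress pricing is one concrete source of artificial gravity. The sketch below uses an assumed rate of $0.09 per GB, a hypothetical figure rather than any particular provider's price list.

```python
# Hedged sketch of how egress pricing pins data in place.

def egress_cost_usd(data_tb: float, rate_per_gb: float = 0.09) -> float:
    """Cost of moving data_tb terabytes out of a cloud at rate_per_gb $/GB."""
    return data_tb * 1000 * rate_per_gb

# At an assumed $0.09/GB, pulling 200 TB out of a provider costs $18,000,
# a bill that discourages migration and keeps workloads beside the data.
print(f"${egress_cost_usd(200):,.0f}")
```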