Data reduction in primary storage (DRIPS) is the application of capacity optimization techniques to data that is in active use, in contrast to data held in storage that is used for backup purposes.
DRIPS uses data reduction techniques, such as data deduplication, data archiving, thin provisioning and compression, that have traditionally been associated with backup storage rather than primary storage. In either case, the purpose is improved storage efficiency, lower costs and better use of available resources.
DRIPS methods include:
Data deduplication, which involves detecting repeated patterns in data and reducing such patterns to a single instance. Data dedupe, as it's often called, is not yet widely used for primary storage, but the space reduction can be substantial. Inline data deduplication, which removes duplicates as data is written, can affect system performance because of its demand on system resources. Post-processing deduplication, on the other hand, requires more disk space because data is first written in full and deduplicated afterward.
Data archiving, which involves moving less frequently used data to slower, less expensive storage. The data involved may be maintained for the sake of records or for possible future use, but quick access is not required. In the case of DRIPS, data would be moved from primary storage to backup media.
Thin provisioning, which involves eliminating the reserve on unwritten blocks of storage, so that physical capacity is consumed only when data is actually written. This allows overprovisioning of storage resources and enables more logical capacity to be created than is physically available. The technique does not actually reduce data; it optimizes how storage is allocated. Thin provisioning is widely implemented by storage vendors.
Compression, which involves finding repeated patterns of similar information that can be reduced and replaced with an optimized data structure. The method consumes processing cycles to compress and decompress data as required. Compression is a mature and widely implemented technology that can significantly reduce storage requirements.
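As a rough illustration of how two of the methods above, deduplication and compression, can work together, the following Python sketch splits data into chunks, keeps a single instance of each repeated chunk, and compresses the unique chunks. It is a minimal sketch under stated assumptions, not a description of how any storage product implements DRIPS: the fixed 4 KB chunk size, the use of SHA-256 fingerprints and zlib, and the function names reduce_data, restore_data and stored_size are all hypothetical choices made for the example.

```python
import hashlib
import zlib

CHUNK_SIZE = 4096  # assumed fixed chunk size in bytes (illustrative only)


def reduce_data(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Deduplicate fixed-size chunks, then compress each unique chunk.

    Returns a chunk store (fingerprint -> compressed chunk) and a recipe,
    the ordered list of fingerprints needed to rebuild the original data.
    """
    store = {}
    recipe = []
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        # Identical chunks produce the same fingerprint.
        fingerprint = hashlib.sha256(chunk).hexdigest()
        if fingerprint not in store:
            # First time this pattern is seen: keep one compressed instance.
            store[fingerprint] = zlib.compress(chunk)
        # Repeats only add a reference, not another copy of the data.
        recipe.append(fingerprint)
    return store, recipe


def restore_data(store, recipe) -> bytes:
    """Rebuild the original data by decompressing chunks in recipe order."""
    return b"".join(zlib.decompress(store[fp]) for fp in recipe)


def stored_size(store) -> int:
    """Bytes occupied by the compressed unique chunks (ignores metadata)."""
    return sum(len(compressed) for compressed in store.values())


if __name__ == "__main__":
    # Three identical chunks plus one different chunk.
    sample = b"A" * CHUNK_SIZE * 3 + b"B" * CHUNK_SIZE
    store, recipe = reduce_data(sample)
    print("logical bytes:", len(sample))
    print("unique chunks:", len(store), "of", len(recipe))
    print("stored bytes:", stored_size(store))
    print("restored correctly:", restore_data(store, recipe) == sample)
```

Hashing and compressing every chunk at write time is the processing cost that inline deduplication and compression impose. A post-processing design would instead write the data in full first and run the same reduction later, which is why it needs more disk space in the meantime.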
The market for data reduction in primary storage is being driven by rising storage costs, which are in turn driven by growth in the amount of data that enterprises handle.