de-anonymization (deanonymization)

De-anonymization is a data mining strategy in which anonymous data is cross-referenced with other data sources to re-identify the anonymous data source. 
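The cross-referencing step can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the records, the names, and the choice of ZIP code, birth year and sex as the shared attributes are hypothetical and do not come from any real data set. The idea is simply to join an "anonymous" table to a named one on the attributes they have in common and keep only the matches that are unambiguous.

```python
from collections import defaultdict

# Hypothetical "anonymized" records: names removed, other attributes kept.
anonymized = [
    {"zip": "02139", "birth_year": 1975, "sex": "F", "diagnosis": "asthma"},
    {"zip": "02139", "birth_year": 1982, "sex": "M", "diagnosis": "diabetes"},
    {"zip": "94110", "birth_year": 1990, "sex": "F", "diagnosis": "migraine"},
]

# Hypothetical public source that lists names with the same attributes.
public = [
    {"name": "Alice Example", "zip": "02139", "birth_year": 1975, "sex": "F"},
    {"name": "Bob Example", "zip": "02139", "birth_year": 1982, "sex": "M"},
    {"name": "Carol Example", "zip": "94110", "birth_year": 1990, "sex": "F"},
    {"name": "Dana Example", "zip": "94110", "birth_year": 1990, "sex": "F"},
]

# Attributes present in both sources (an assumption of this sketch).
SHARED_FIELDS = ("zip", "birth_year", "sex")


def shared_key(record):
    """Build a comparable key from the attributes both sources share."""
    return tuple(record[field] for field in SHARED_FIELDS)


def link_records(anonymized, public):
    """Map each anonymized record index to a name when exactly one person matches."""
    names_by_key = defaultdict(list)
    for person in public:
        names_by_key[shared_key(person)].append(person["name"])

    matches = {}
    for index, record in enumerate(anonymized):
        candidates = names_by_key.get(shared_key(record), [])
        if len(candidates) == 1:  # ambiguous matches are not re-identified
            matches[index] = candidates[0]
    return matches


reidentified = link_records(anonymized, public)
for index, name in reidentified.items():
    print(name, "->", anonymized[index]["diagnosis"])
```

In this toy data the third record stays anonymous because two people share its ZIP code, birth year and sex. That is why the number of people who share a given combination of attributes matters so much.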

Any information that distinguishes one data source from another can be used for de-anonymization. Although the concept of de-anonymization goes back several decades, the term made headlines in 2006 when Arvind Narayanan and Vitaly Shmatikov entered a contest hosted by Netflix, a popular movie-rental service. Narayanan and Shmatikov applied their de-anonymization methodology to a data set that contained the anonymous movie ratings of 500,000 members and were able to identify the Netflix records of a number of specific members. According to Narayanan and Shmatikov, de-anonymization requires data that is abundant, granular and fairly stable across time and context.
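To see why abundant, granular and stable data matters, consider the following simplified Python sketch. It is not the researchers' published algorithm. The data, the user IDs, the tolerance values and the function names are all hypothetical. The sketch scores each anonymous rating history by how many of an outside observer's approximate observations it matches, and accepts the top candidate only when it clearly stands out from the runner-up.

```python
# Each anonymous record maps a movie title to (rating, day number of the rating).
# All values are invented for illustration.
dataset = {
    "user_17": {"Movie A": (5, 100), "Movie B": (2, 130),
                "Movie C": (4, 131), "Movie D": (1, 300)},
    "user_42": {"Movie A": (4, 400), "Movie E": (3, 410)},
    "user_88": {"Movie B": (2, 128), "Movie F": (5, 50)},
}

# What an outside observer roughly knows about one person, for example
# from public reviews: approximate ratings and approximate dates.
auxiliary = {"Movie A": (5, 103), "Movie B": (3, 125), "Movie C": (4, 140)}


def overlap_score(aux, record, rating_tol=1, day_tol=14):
    """Count auxiliary observations that approximately match the record."""
    score = 0
    for movie, (rating, day) in aux.items():
        if movie in record:
            rec_rating, rec_day = record[movie]
            if abs(rec_rating - rating) <= rating_tol and abs(rec_day - day) <= day_tol:
                score += 1
    return score


def best_match(aux, dataset, min_margin=2):
    """Return the best-scoring record ID, or None if it does not stand out."""
    scored = sorted(
        ((overlap_score(aux, record), record_id)
         for record_id, record in dataset.items()),
        reverse=True,
    )
    (top_score, top_id), (second_score, _) = scored[0], scored[1]
    if top_score - second_score >= min_margin:
        return top_id
    return None  # too ambiguous to claim a re-identification


print(best_match(auxiliary, dataset))
```

With three approximate observations, one record stands out. With a single observation, such as `{"Movie B": (2, 129)}`, two records match equally well and the function returns `None`. More observations (abundance), finer detail (granularity) and consistent behavior (stability) all make the correct record easier to separate from the rest.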

As the United States and other nations move forward with open government initiatives, more data is becoming publicly available over the Internet. Much of this data has been scrubbed to create what the government calls “limited data sets.” Personally identifiable information (PII) such as names, addresses and Social Security numbers is removed from limited data sets or obfuscated through a data anonymization process so that the specific source of the data remains anonymous. This assurance of anonymity protects the source's privacy and allows the government to legally share limited data sets with third parties without requiring written permission. Such data has proved to be very valuable for researchers, particularly in health care. Privacy advocates, however, are concerned that even though the data has been scrubbed, so much of it is available that a specific individual’s identity could be rediscovered.
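One way to gauge the risk that privacy advocates describe is to count how many scrubbed records share each combination of remaining attributes. The smallest such group size is often called k in the k-anonymity model, a concept this definition does not otherwise cover. The Python sketch below is an illustration under stated assumptions: the records, the field names and the generalization rules (truncating ZIP codes, rounding birth years to decades) are all hypothetical.

```python
from collections import Counter

# Hypothetical scrubbed records: direct identifiers are already removed.
records = [
    {"zip": "02139", "birth_year": 1975, "sex": "F"},
    {"zip": "02139", "birth_year": 1975, "sex": "F"},
    {"zip": "02141", "birth_year": 1978, "sex": "F"},
    {"zip": "02142", "birth_year": 1971, "sex": "F"},
]

FIELDS = ("zip", "birth_year", "sex")


def group_sizes(records, fields):
    """Count how many records share each combination of field values."""
    return Counter(tuple(record[f] for f in fields) for record in records)


def smallest_group_size(records, fields):
    """Size of the rarest combination; 1 means someone is unique."""
    return min(group_sizes(records, fields).values())


def unique_records(records, fields):
    """Records whose combination of field values appears only once."""
    sizes = group_sizes(records, fields)
    return [r for r in records if sizes[tuple(r[f] for f in fields)] == 1]


def generalize(record):
    """Coarsen values: keep 3 ZIP digits, round birth year down to the decade."""
    return {
        "zip": record["zip"][:3] + "**",
        "birth_year": record["birth_year"] // 10 * 10,
        "sex": record["sex"],
    }


generalized = [generalize(r) for r in records]

print(smallest_group_size(records, FIELDS))      # before coarsening
print(len(unique_records(records, FIELDS)))      # records that stand alone
print(smallest_group_size(generalized, FIELDS))  # after coarsening
```

Coarsening the values makes every record in this toy data set indistinguishable from the others, at the cost of detail that researchers may want. This check also only covers the attributes listed in `FIELDS`. It says nothing about other data sources that could be cross-referenced later.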

See also: association rules, business intelligence, opinion mining, OLAP, fuzzy logic

 

This was last updated in May 2015
