Definition

emergent medical data (EMD)

Kate Brush

Emergent medical data (EMD) is health information inferred about an individual from seemingly unrelated user behavior data. Companies derive EMD with sophisticated methods -- including artificial intelligence (AI) and machine learning (ML) -- that analyze user activities and build a detailed profile of an individual's physical and mental health. The term was coined in 2017 by Mason Marks, an assistant professor at the Gonzaga University School of Law.

Companies can pull emergent medical data from a variety of sources, including Facebook posts, credit card purchases, the contents of emails or a list of videos recently watched on YouTube. Normally, a person looking at this raw data would not see any connection to the user's health. However, AI tools and big data algorithms can be used to transform this previously meaningless information into sensitive medical data.

Collecting EMD offers several potential benefits, including the ability to track the spread of an infectious disease, identify individuals at risk of suicide or homicide and monitor drug abuse. However, the primary appeal of emergent medical data is the opportunity for organizations to enhance behavioral targeting and optimize customer profiling and marketing. Insurance companies can use EMD to estimate an individual's accident risk and calculate insurance premiums. Advertisers can use inferred medical data to deliver behavioral ads based on the individual's apparent medical history.

As a result, EMD has started to raise many concerns around personal and patient privacy. Patient data collected by healthcare providers is protected by the Health Insurance Portability and Accountability Act (HIPAA); however, EMD receives little to no legal protection.

Furthermore, the disclosure of Google's Project Nightingale in November 2019 heightened these concerns and intensified the debate over whether data collected without patient consent can ethically be converted into EMD and used for financial benefit. Project Nightingale is a partnership between Google and healthcare organization Ascension that provided Google with access to over 50 million patient records without doctor or patient knowledge.

What kind of data does EMD include?

When a person interacts with technology, they leave behind a digital footprint of their actions and behaviors. AI, ML and big data algorithms collect and analyze this raw data to create EMD. As a result, almost any activity performed by an individual with technology will create information that can be transformed into sensitive medical data.

Specific examples of information gathered in the emergent medical data mining process include:

Facebook "likes" and comments
Twitter posts
Amazon and Target.com purchases
Uber rides
Instagram posts

All this data can be analyzed to form a profile of the user's mental and physical health; as more information is added, more EMD is generated and the profile becomes clearer.
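The idea that a profile sharpens as signals accumulate can be illustrated with a minimal sketch. It assumes a naive Bayes-style update in which each observed behavior multiplies the odds of a condition by a likelihood ratio. The signal names and ratio values below are invented for illustration and do not come from any real system or study.

```python
import math

# Hypothetical likelihood ratios: how much more often a signal appears among
# people with some condition than among people without it. Invented values.
LIKELIHOOD_RATIOS = {
    "late_night_posts": 2.0,
    "pharmacy_purchases": 1.5,
    "rides_to_clinic": 4.0,
}


def posterior_probability(prior, observed_signals):
    """Combine a prior with signals assumed independent, using log-odds."""
    log_odds = math.log(prior / (1.0 - prior))
    for signal in observed_signals:
        # Signals missing from the table carry no evidence (ratio of 1).
        log_odds += math.log(LIKELIHOOD_RATIOS.get(signal, 1.0))
    return 1.0 / (1.0 + math.exp(-log_odds))


if __name__ == "__main__":
    signals = ["late_night_posts", "pharmacy_purchases", "rides_to_clinic"]
    # Print the estimate after 0, 1, 2 and 3 signals have been observed.
    for count in range(len(signals) + 1):
        print(count, round(posterior_probability(0.05, signals[:count]), 3))
```

Each additional signal moves the estimate further from the prior, which is why individually innocuous data points can add up to a sensitive inference.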

EMD and Google's Project Nightingale

Project Nightingale is a partnership between Google and Ascension -- the second-largest health system in the United States. The project remained secret until November 2019, when it was revealed that Google had gained access to over 50 million Ascension patient medical records through the partnership.

The partnership gives Google unprecedented power to determine correlations between user behavior and personal medical conditions. Patents filed in 2018 reveal that Google aims to use Project Nightingale to increase its emergent medical data mining capabilities and eventually identify or predict health conditions in patients who have not yet seen a doctor.

While Project Nightingale could provide numerous medical benefits and improve the healthcare system, it has also raised concerns regarding personal privacy. It is likely that Google will conceal its discoveries as trade secrets and only share them with its parent company Alphabet and its subdivisions -- including Nest, Project Wing, Sidewalk Labs and Fitbit.

Each of these subdivisions provides a service with a data mining operation that collects and analyzes user information. Therefore, a major concern around Project Nightingale is that its collection of personal medical data could be used to create an unprecedented consumer health surveillance system that spans multiple industries and technologies.

In November 2019, a federal inquiry was opened to investigate Project Nightingale and Google's attempt to collect millions of Americans' protected health information (PHI).

EMD and privacy concerns

The potential risks of EMD are as follows:

Companies can take advantage of loopholes in privacy laws and users' lack of knowledge regarding EMD to access personal medical records that are typically confidential.
Online platforms can benefit by selling EMD to third parties that might go on to use it as the basis for discrimination in a variety of decisions, including employment, insurance, lending and higher education.
The more people learn about EMD, the more likely they are to start changing their online behaviors; this could greatly limit the free exchange of ideas over the internet.
When a company puts users into groups based on medical conditions, it is acting as a medical diagnostician -- a role that should be reserved for trained and licensed professionals.

Emergent medical data has the potential to improve healthcare in a variety of ways. For example, ads for evidence-based recovery centers could be targeted at users with substance abuse issues, and individuals with undiagnosed Alzheimer's disease could be referred to a physician for evaluation. However, collecting the necessary information to make this possible without explicit consent is unethical and violates individual privacy. Furthermore, the information gathered is not protected by HIPAA, and companies that gather the data are not subject to any related penalties.

Health and privacy laws and regulations

The privacy concerns around EMD and personal data collection have prompted discussion of public health ethics and the boundaries for responsible use of emergent medical data. Several governments have responded with new laws to further protect personal data.

In April 2016, the European Union approved the General Data Protection Regulation (GDPR), which increases consumers' rights to control their data and how it is used; businesses that fail to comply are penalized with large fines. The GDPR went into effect in May 2018.

Currently, there are no laws in the United States that regulate the mining of EMD. However, in June 2018, California passed the California Consumer Privacy Act (CCPA), which supports California residents' right to control their personally identifiable information (PII). Under the CCPA, residents have the right to know what personal information is being collected and who is collecting it, as well as the right to access their PII and refuse its sale to third parties.

In addition, U.S. Sens. Amy Klobuchar and Lisa Murkowski recently proposed the Protecting Personal Health Data Act, which aims to protect health data collected by wellness apps, fitness trackers, social media platforms and direct-to-consumer (D2C) DNA testing companies.

While this proposed law could benefit consumers, it also poses the risk of creating an exception for EMD. For example, one section of the act excludes "products on which personal health data is derived solely from other information that is not personal health data, such as Global Positioning System [GPS] data." Therefore, even if the Protecting Personal Health Data Act is passed, companies could still freely continue mining emergent medical data.

One final concern around EMD is its potential to promote algorithmic discrimination, which occurs when a computer system systematically and repeatedly produces errors that create unfair outcomes -- such as favoring one group of users over another more vulnerable group.

The machine learning algorithms used to find EMD sort users into health-related categories that are assigned positive or negative weight -- or importance. Discrimination occurs when, for example, an algorithm designed to find new job candidates assigns negative weight to groups of individuals with disabilities, thus preventing them from accessing job postings and applications for which they could have been qualified. Algorithmic discrimination can also cause companies to wrongfully deny people access to important resources, such as housing and insurance, without realizing it.
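The weighting mechanism described above can be sketched in a few lines. This is a toy illustration, not a description of any real hiring system: the `Candidate` fields, the weight of -0.5 and the visibility threshold of 0.6 are all hypothetical values chosen to make the effect visible.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    skill_score: float          # 0.0 to 1.0, higher is better
    inferred_health_flag: bool  # hypothetical EMD-derived category


HEALTH_FLAG_WEIGHT = -0.5  # hypothetical negative weight on the category
THRESHOLD = 0.6            # hypothetical cutoff for showing a job posting


def ranking_score(candidate):
    """Skill score plus the weight attached to the inferred category."""
    penalty = HEALTH_FLAG_WEIGHT if candidate.inferred_health_flag else 0.0
    return candidate.skill_score + penalty


def sees_job_posting(candidate):
    return ranking_score(candidate) >= THRESHOLD


def selection_rate(candidates, flagged):
    """Share of a group (flagged or unflagged) that is shown the posting."""
    group = [c for c in candidates if c.inferred_health_flag == flagged]
    if not group:
        return 0.0
    return sum(sees_job_posting(c) for c in group) / len(group)


if __name__ == "__main__":
    pool = [
        Candidate("A", 0.8, False),
        Candidate("B", 0.8, True),
        Candidate("C", 0.7, False),
        Candidate("D", 0.9, True),
    ]
    print("unflagged rate:", selection_rate(pool, flagged=False))
    print("flagged rate:", selection_rate(pool, flagged=True))
```

In this sketch, equally or better qualified candidates in the flagged group never see the posting. Comparing selection rates between groups is one simple way an auditor might detect that kind of outcome.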

Examples of emergent medical data

American retail corporation Target hired statisticians to find patterns in its customers' purchasing behavior. Through data mining, they discovered that pregnant shoppers were most likely to purchase unscented body lotion at the start of their second trimester. Prior to this analysis, unscented body lotion had no recognized connection to a health condition. However, using this information, Target was able to reach out to consumers identified as expectant mothers with coupons and advertisements before other companies even learned the women were pregnant.

Recent studies involving Facebook also reveal how nonmedical online behavior can be used to find users with substance abuse issues and other disease risks. For example, one study showed that the use of swear words, sexual words and words related to biological processes on Facebook was indicative of alcohol, drug and tobacco use. Furthermore, the study determined that words relating to physical space -- such as "up" and "down" -- were more strongly linked to alcohol abuse, while angry words -- such as "kill" and "hate" -- were more strongly linked to drug abuse.

Research on Facebook posts also found that the use of religious language -- especially words related to prayer, God and family -- could predict which users had or may develop diabetes. Other areas of focus include analyzing ordinary Facebook posts to determine when specific users feel suicidal.
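The word-based signals described in these studies can be approximated with a simple lexicon count. The sketch below is only an assumption about how such a feature might be computed: the category names are hypothetical, the word lists contain only the example words mentioned above, and a real study would use far larger vocabularies and a trained statistical model rather than raw rates.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon built only from the example words in the text.
LEXICON = {
    "spatial": {"up", "down"},
    "anger": {"kill", "hate"},
    "religious": {"prayer", "god", "family"},
}


def category_rates(posts):
    """Return the fraction of all words in `posts` that fall in each category."""
    counts = Counter()
    total_words = 0
    for post in posts:
        # Lowercase and split into simple word tokens.
        words = re.findall(r"[a-z']+", post.lower())
        total_words += len(words)
        for category, vocabulary in LEXICON.items():
            counts[category] += sum(1 for word in words if word in vocabulary)
    if total_words == 0:
        return {category: 0.0 for category in LEXICON}
    return {category: counts[category] / total_words for category in LEXICON}


if __name__ == "__main__":
    sample_posts = ["I hate Mondays", "Going down town with family"]
    print(category_rates(sample_posts))
```

Rates like these would be only one input among many; on their own they say nothing reliable about any individual.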

Google is working on a smart home system that would collect digital footprints from household members and analyze the data to predict undiagnosed diseases -- including Alzheimer's and influenza -- or substance abuse problems.

This was last updated in March 2020