What is Considered PHI or ePHI?
When you were growing up, did your mother keep a special collection of dinnerware? Maybe she had a particular cabinet she kept it in, or maybe it was just hidden away for special occasions, but the point is, your mom had her reasons and designated those plates as unique and worth further protection.
With data, it works the same way—you can consider protected health information (PHI) / electronic protected health information (ePHI) the fine china among the rest of the information your systems process, and that kind of data necessitates special security to protect it.
The trouble many organizations run into is how to define exactly what PHI and ePHI are. Maybe your mother’s dinnerware is a family heirloom, or maybe it just was expensive—regardless of her reasons, she was likely clear on why those plates needed extra protection.
But with PHI and ePHI, it’s not always so simple. With 20 years of cybersecurity assessments under our belt, we’ve seen plenty of organizations struggle to determine whether what they handle is PHI or ePHI, and unfortunately, it sometimes comes back to bite them during their HIPAA compliance process.
That’s why, in this article, we will clearly define these classifications for you. We’ll go over how to identify this data as well as when it can be de-identified. With this critical knowledge in hand, you’ll more easily recognize what constitutes PHI/ePHI and where it resides, which will serve as a crucial building block for creating your HIPAA compliance program.
What is PHI?
The simplest way to start is with HIPAA’s formal definition of PHI located in 45 CFR 160.103:
“Protected health information means individually identifiable health information.”
By “individually identifiable health information” they mean information that is a subset of health information, including demographic information collected from an individual, and:
- Is created or received by a health care provider, health plan, employer, or health care clearinghouse; and
- Relates to:
  - The past, present, or future physical or mental health or condition of an individual;
  - The provision of health care to an individual; or
  - The past, present, or future payment for the provision of health care to an individual; and
- Identifies the individual, or there is a reasonable basis to believe the information can be used to identify the individual.
Put simply, PHI covers an individual’s health information along with anything that can identify that individual. That includes data such as:
- Birth dates
- Phone numbers
- Social Security numbers
- Health plan beneficiary numbers or any account numbers
- Biometric IDs (finger or voice print)
- Face photographs
- IP addresses or URLs
- Email addresses
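As a rough illustration, the definition quoted above can be read as a three-part test. The Python sketch below is a simplified model under stated assumptions, not a legal determination: the `Record` class, its field names, and the `is_phi` function are hypothetical names introduced here, and in practice each flag would come from a contextual review of how the data is received and used.

```python
from dataclasses import dataclass


@dataclass
class Record:
    # Hypothetical flags; each would be set by a human (often legal) review.
    from_covered_source: bool      # created/received by provider, plan, employer, or clearinghouse
    relates_to_health: bool        # health or condition, provision of care, or payment for care
    identifies_individual: bool    # directly identifies the person
    reasonably_identifiable: bool  # reasonable basis to believe it could identify the person


def is_phi(record: Record) -> bool:
    """Return True only when every part of the quoted definition is met."""
    identifiable = record.identifies_individual or record.reasonably_identifiable
    return record.from_covered_source and record.relates_to_health and identifiable


# A lab result tied to a named patient meets every part of the test.
lab_result = Record(True, True, True, False)
# Aggregate statistics that cannot be traced to a person fail the last part.
aggregate_stats = Record(True, True, False, False)

print(is_phi(lab_result))       # True
print(is_phi(aggregate_stats))  # False
```

The point of the sketch is that identifiability is only one part of the test; the health-related and source conditions must hold as well.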
What is ePHI?
A good understanding of PHI makes defining ePHI considerably easier.
45 CFR 160.103 defines it as “information that comes within paragraphs (1)(i) or (1)(ii) of the definition of protected health information as specified in this section.” Those two paragraphs cover information that is:
- “Transmitted by electronic media;” or
- “Maintained in electronic media.”
So, ePHI is PHI that is transmitted electronically or stored electronically.
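To make the distinction concrete, here is a minimal Python sketch that treats ePHI as PHI filtered by medium. The `Medium` enum and the `is_ephi` function are hypothetical names used only for illustration, and the `OTHER` member is an assumption standing in for non-electronic forms such as paper.

```python
from enum import Enum


class Medium(Enum):
    ELECTRONIC_TRANSMISSION = "transmitted by electronic media"
    ELECTRONIC_STORAGE = "maintained in electronic media"
    OTHER = "any other form or medium"  # assumption: e.g., a paper chart


def is_ephi(is_phi: bool, medium: Medium) -> bool:
    """ePHI is PHI that is transmitted or maintained electronically."""
    electronic = {Medium.ELECTRONIC_TRANSMISSION, Medium.ELECTRONIC_STORAGE}
    return is_phi and medium in electronic


print(is_ephi(True, Medium.ELECTRONIC_STORAGE))  # True
print(is_ephi(True, Medium.OTHER))               # False
```

In this model a paper chart can still be PHI, but it only becomes ePHI once it is scanned, stored, or sent electronically.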
The Difficulty in Classifying PHI/ePHI
Classifying anything, electronic or not, that clearly serves to identify a person is fairly straightforward—the tricky part of categorizing PHI is discerning the “reasonable basis to believe the information can be used to identify the individual.”
That’s because depending on the context and the way information is being used, things like e-mail or mailing addresses may or may not be considered PHI. Consider these two scenarios:
- An organization sends an e-mail or letter to all patients who have a certain medical condition. Those e-mail addresses and mailing addresses would be considered PHI, as they could be used in that context to reasonably identify a person in a way that is tied to a past, present, or future physical or mental health condition.
- A business associate receives e-mail addresses or mailing addresses from a covered entity but does not also receive data tying that information to any past, present, or future physical or mental health condition of an individual. That data would not be considered PHI. It would not be reasonable to believe that an e-mail address or mailing address, with no other connection to health information, could be used to identify an individual’s past, present, or future conditions.
Obviously, not every example will be this straightforward. That’s why the best advice we can give in navigating this murkiness is to involve your legal team and have them review the specific ways you’re receiving and using information. That way, you can be more sure of what data would be considered PHI.
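The two scenarios above can be sketched in Python to show how the same addresses flip classification depending on context. This is a deliberately simplified model: the `ContactList` class, its `linked_condition` field, and the `addresses_are_phi` function are hypothetical, and a real assessment would weigh far more context than a single field.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ContactList:
    addresses: List[str]
    # Hypothetical field: the health condition the list is tied to, if any.
    linked_condition: Optional[str] = None


def addresses_are_phi(contacts: ContactList) -> bool:
    """In this model, addresses are PHI only when tied to health information."""
    return contacts.linked_condition is not None


# Scenario 1: a mailing to all patients with a certain condition.
condition_mailing = ContactList(["pat@example.com"], linked_condition="diabetes")
# Scenario 2: addresses received with no connection to health information.
plain_addresses = ContactList(["pat@example.com"])

print(addresses_are_phi(condition_mailing))  # True
print(addresses_are_phi(plain_addresses))    # False
```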
De-Identifying Data – When is Information No Longer PHI or ePHI?
Now that we understand what PHI and ePHI are, we should mention the privacy rule 45 CFR 164.514. We don’t need to deconstruct the full requirements, which are quite lengthy. Instead, we’ll focus on the list of the 18 identifiers noted in 45 CFR 164.514(b)(2) for data de-identification as well as a related common misconception.
Data de-identification is the process of removing identifiers from health information so as to mitigate privacy risks to individuals. With that in mind, it’s easy to see how some might mistake those 18 “identifiers” in this privacy rule for automatic proof that, whenever one is present, the data is PHI.
But that’s not the case. As we noted above, despite these identifiers, context remains paramount—there must be a “reasonable basis to believe the information can be used to identify the individual.”
So, this list of 18, at face value, merely states that these identifiers could be used to identify someone, and if fully removed, the information would be considered de-identified. But you’ll need to examine the greater context of your data to determine whether it is PHI.
If it is de-identification you’re after though, there are two methods to achieve it in accordance with the HIPAA privacy rule:
- “Expert Determination” Method: Defined in 164.514(b)(1); and
- “Safe Harbor” Method: Defined in 164.514(b)(2).
For more information on de-identification, see the guidance on this topic published by the HHS Office for Civil Rights (OCR).
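As a loose illustration of the idea behind the Safe Harbor method, the Python sketch below drops identifier fields from a record. It is not a compliant implementation: the field names are hypothetical, the set covers only the identifiers named in this article rather than all 18 categories in 164.514(b)(2), and dropping fields alone should not be taken as meeting the rule’s requirements.

```python
# Hypothetical field names covering only the identifiers named in this article;
# the full list in 164.514(b)(2) has 18 categories.
IDENTIFIER_FIELDS = {
    "birth_date",
    "phone_number",
    "ssn",
    "health_plan_beneficiary_number",
    "account_number",
    "biometric_id",
    "face_photo",
    "ip_address",
    "url",
    "email",
}


def strip_identifiers(record: dict) -> dict:
    """Return a copy of the record without the listed identifier fields."""
    return {key: value for key, value in record.items() if key not in IDENTIFIER_FIELDS}


patient = {"email": "pat@example.com", "birth_date": "1980-01-01", "diagnosis": "asthma"}
print(strip_identifiers(patient))  # {'diagnosis': 'asthma'}
```

A field-name filter like this only catches identifiers stored in well-labeled columns; identifiers buried in free-text notes would need a separate review.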
Determining if Data is PHI or ePHI
To recap, when determining if data is PHI or ePHI, you need to ask important questions regarding the following:
- What context surrounds the information’s use
- How the information is stored
- Whether multiple identifiers could be combined to tie back to an individual’s health record.
At the end of the day, trying to link identifiers to an individual’s past, present, or future physical or mental health or condition, and thereby identifying PHI and ePHI, is essentially a game of connecting the dots. But having those answers and a clear understanding of what is considered PHI/ePHI is very important, as it’s the first step in recognizing your organization’s HIPAA scope.
Without accurate knowledge of what data is considered PHI/ePHI, you’re likely to miss relevant data and systems in your risk analysis and risk management program. That program is the building block of HIPAA compliance, and it’s also often a source of violations.
But now that you understand a bit better how to identify the PHI/ePHI in your systems, you’re better positioned to avoid any related penalties. For more information that can help simplify your HIPAA compliance, check out our other content on relevant topics.
About DOUG KANNEY
Doug Kanney is a Principal at Schellman based in Columbus, Ohio. Doug leads the HITRUST and HIPAA service lines and assists with methodology and service delivery across the SOC, PCI-DSS, and ISO service lines. Doug has more than 17 years of combined audit experience in public accounting. Doug has provided professional services for multiple Global 1000, Fortune 500, and regional companies during the course of his career.