HIPAA: De-identification and Minimum Necessary
De-identification and Minimum Necessary [45 CFR § 164.514]
Rule: PHI that has been de-identified according to HIPAA standards is no longer protected health information. Additionally, covered entities must limit PHI uses and disclosures to the minimum necessary to accomplish the intended purpose.
Overview of § 164.514
This section contains three critical privacy protections:
| Protection | Citation | Purpose |
|---|---|---|
| De-identification standards | § 164.514(a)-(c) | Remove individual identifiability from health information |
| Minimum necessary | § 164.514(d) | Limit PHI disclosures to what’s needed |
| Limited data sets | § 164.514(e) | Middle ground between identified and de-identified |
De-identification [§ 164.514(a)-(c)]
Section 164.514(a): Standard
Definition: Health information that does not identify an individual and with respect to which there is no reasonable basis to believe the information can be used to identify an individual is not individually identifiable health information.
Effect: Once properly de-identified, information is:
- No longer PHI
- Not subject to Privacy Rule restrictions
- Can be used and disclosed without authorization
- Not subject to breach notification requirements
Two methods for de-identification:
- Expert Determination [§ 164.514(b)(1)]
- Safe Harbor [§ 164.514(b)(2)]
Section 164.514(b): Implementation Specifications
Method 1: Expert Determination [§ 164.514(b)(1)]
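A person with appropriate knowledge of and experience with generally accepted statistical and scientific principles and methods for rendering information not individually identifiable makes the determination.
Requirements:
- Applies those principles and methods to the dataset
- Determines that the risk is very small that the information could be used, alone or in combination with other reasonably available information, by an anticipated recipient to identify an individual
- Documents the methods and results of the analysis that justify the determination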
Who qualifies as expert:
- Statistician
- Epidemiologist
- Data scientist with privacy expertise
- Anyone with appropriate training and experience
Process:
- Expert analyzes information and context
- Considers:
  - Fields of data
  - Population characteristics
  - Other reasonably available information
  - Combination risk
- Determines re-identification risk
- Documents analysis and conclusion
Standard to meet: Risk is “very small” (not zero)
Advantages:
- Can retain more useful data
- Tailored to specific dataset
- Flexible approach
Disadvantages:
- Requires expert
- More expensive
- Must document thoroughly
- Subject to challenge
Example application:
- Dataset contains zip code, age, and diagnosis
- Expert analyzes: small rural area with unique age/diagnosis combination
- Expert determines: must generalize zip code to 3 digits
- Risk now “very small”
- Documents analysis
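One way an expert might quantify the combination risk in the example above is to count how many records share each combination of quasi-identifiers (a k-anonymity style check). The sketch below is only an illustration under assumptions: the function names, field names, and sample records are hypothetical, and the rule does not prescribe this technique or any numeric threshold.

```python
from collections import Counter

def smallest_group_size(records, quasi_identifiers):
    """Size of the smallest group of records sharing the same quasi-identifier values."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values())

def generalize_zip(record):
    """Return a copy of the record with the zip code cut to its first 3 digits."""
    out = dict(record)
    out["zip"] = record["zip"][:3]
    return out

# Hypothetical sample: three patients in neighboring rural zip codes.
records = [
    {"zip": "12345", "age": 47, "diagnosis": "asthma"},
    {"zip": "12346", "age": 47, "diagnosis": "asthma"},
    {"zip": "12347", "age": 47, "diagnosis": "asthma"},
]
keys = ["zip", "age", "diagnosis"]

before = smallest_group_size(records, keys)  # every record is unique
after = smallest_group_size([generalize_zip(r) for r in records], keys)
print(before, after)
```

A smallest group of 1 means at least one record is unique on those fields. Whether a given group size makes the risk “very small” remains the expert's documented judgment, informed by the population and other available data.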
Method 2: Safe Harbor [§ 164.514(b)(2)]
Remove the following 18 identifiers of the individual and of the individual's relatives, employers, and household members:
| Identifier | Description | Examples |
|---|---|---|
| (A) Names | Any names (full, last, first, maiden) | John Smith, J. Smith |
| (B) Geographic subdivisions | Smaller than state (except initial 3 zip digits if the 3-digit area contains more than 20,000 people) | Street address, city, county, precinct |
| (C) Dates | All dates directly related to individual except year | Birth date, admission date, discharge date, death date |
| (D) Telephone numbers | All telephone numbers | (555) 123-4567 |
| (E) Fax numbers | All fax numbers | (555) 123-4568 |
| (F) Email addresses | All email addresses | john.smith@example.com |
| (G) Social security numbers | All SSNs | 123-45-6789 |
| (H) Medical record numbers | All MRNs | MR-123456 |
| (I) Health plan numbers | All health plan beneficiary numbers | HP-789012 |
| (J) Account numbers | All account numbers | ACCT-345678 |
| (K) Certificate/license numbers | All certificate or license numbers | License #901234 |
| (L) Vehicle identifiers | All vehicle IDs and serial numbers including license plates | ABC-1234, VIN: 1HGBH41JXMN109186 |
| (M) Device identifiers/serial numbers | All device identifiers and serial numbers | Serial: DEV123456 |
| (N) URLs | All web URLs | https://example.com/patient |
| (O) IP addresses | All Internet Protocol addresses | 192.168.1.1 |
| (P) Biometric identifiers | Including finger and voice prints | Fingerprint image, voiceprint recording |
| (Q) Full face photos | Full-face photographic images and any comparable images | Facial photograph |
| (R) Other unique identifiers | Any other unique identifying number, characteristic, or code (other than a re-identification code permitted under § 164.514(c)) | Internal patient ID, study ID |
Additional requirement: The covered entity has no actual knowledge that the remaining information could be used, alone or in combination with other information, to identify an individual
Geographic subdivisions (Identifier B) details:
May retain:
- State
- First 3 digits of zip code IF the geographic unit formed by combining all zip codes with those 3 digits contains more than 20,000 people
Must remove:
- Specific address
- City
- County
- Precinct
- Full zip codes, and 3-digit zip prefixes for areas with 20,000 or fewer people
Special rule for small population zip codes:
- If the 3-digit area has 20,000 or fewer people, replace the prefix with 000
- Keeps the field format consistent without revealing the small area
Example:
- Original: 123 Main St, Smallville, State, 12345
- De-identified: State and 000 if the 3-digit area 123 has 20,000 or fewer people; State and 123 if it has more than 20,000
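The zip code rule can be sketched as a small lookup. The regulatory text keeps the 3-digit prefix only when the area contains more than 20,000 people and substitutes 000 for areas with 20,000 or fewer. The function name and the population table below are hypothetical; a real implementation would need current Census figures for each 3-digit area.

```python
def safe_harbor_zip(zip_code, population_by_zip3):
    """Return the 3-digit prefix if its area has more than 20,000 people, else "000"."""
    prefix = zip_code[:3]
    # Prefixes missing from the table are treated as population 0, so they are suppressed.
    if population_by_zip3.get(prefix, 0) > 20000:
        return prefix
    return "000"

# Made-up population figures, for illustration only.
population_by_zip3 = {"123": 15000, "902": 450000}

small_area = safe_harbor_zip("12345", population_by_zip3)
large_area = safe_harbor_zip("90210", population_by_zip3)
unknown_area = safe_harbor_zip("55555", population_by_zip3)
print(small_area, large_area, unknown_area)
```

Defaulting unknown prefixes to 000 is a conservative design choice in this sketch, not something the rule spells out.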
Dates (Identifier C) details:
Must remove:
- Exact dates (except year)
- Birth date (can keep birth year)
- Admission date
- Discharge date
- Date of service
- Date of death (can keep year)
May retain:
- Year (unless it would indicate an age over 89)
- Age in years (unless >89)
Special rule for individuals ≥90:
- All ages >89 must be removed or aggregated into a single category “90 or older”
- All elements of dates (including year) that indicate an age >89 must be removed
Rationale: Small population of very elderly makes them identifiable
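The age and date rules above might be applied as follows. This is a minimal sketch under assumptions: the function names are hypothetical, and it covers only birth dates and ages, not admission, discharge, or service dates.

```python
from datetime import date

def safe_harbor_age(age_years):
    """Return the age as text, aggregating every age over 89 into "90+"."""
    return "90+" if age_years > 89 else str(age_years)

def safe_harbor_birth_year(birth_date, as_of):
    """Keep only the birth year, and drop even that when it would reveal an age over 89."""
    # Subtract one year if the birthday has not yet occurred in the as_of year.
    had_birthday = (as_of.month, as_of.day) >= (birth_date.month, birth_date.day)
    age = as_of.year - birth_date.year - (0 if had_birthday else 1)
    return None if age > 89 else birth_date.year

as_of = date(2024, 1, 1)
print(safe_harbor_age(47), safe_harbor_age(93))
print(safe_harbor_birth_year(date(1950, 6, 15), as_of))
print(safe_harbor_birth_year(date(1930, 3, 1), as_of))
```

Returning `None` for the oldest patients reflects that a birth year alone would indicate an age over 89.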
Biometric identifiers (Identifier P) details:
Includes (the rule names finger and voice prints; the rest are commonly cited examples):
- Fingerprints
- Voiceprints
- Retina scans
- Iris scans
- Facial geometry
- DNA
- Other unique physical characteristics
Full face photos (Identifier Q) details:
Must remove:
- Full face photographic images
- Any comparable images
May retain:
- Images of body parts other than face
- Images where face is not identifiable
- De-identified images (face obscured)
Other unique identifiers (Identifier R) details:
Catch-all for:
- Internal patient numbers
- Study IDs
- Any other code/characteristic that is unique or could identify
Critical: Even if not listed in A-Q, any unique identifying number, characteristic, or code must be removed
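Several of the identifiers above follow recognizable formats, so a first-pass scrub of free-text fields can be sketched with regular expressions. This is an illustration, not a complete Safe Harbor implementation: the patterns are assumptions tuned to the example formats in the identifier table, and pattern matching cannot find names, addresses, or other identifiers that lack a fixed format.

```python
import re

# Hypothetical patterns for a few format-based identifiers (G, D/E, F, N, O).
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\(\d{3}\)\s*\d{3}-\d{4}"),
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "url": re.compile(r"https?://\S+"),
    "ip": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}

def redact(text):
    """Replace each pattern match with a bracketed label such as [SSN]."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label.upper()}]", text)
    return text

note = "Call (555) 123-4567 or john.smith@example.com, SSN 123-45-6789"
print(redact(note))
```

Output from a scrub like this still needs human or expert review before the data is treated as de-identified.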
Safe Harbor Advantages and Disadvantages
Advantages:
- Clear, objective standard
- No expert required
- Lower cost
- Less documentation
- Defensible
Disadvantages:
- Removes more data
- Less useful dataset
- Less flexibility
- May over-redact
Re-identification [§ 164.514(c)]
Rule: A covered entity may assign a code or other means of record identification that allows de-identified information to be re-identified by the covered entity, provided the conditions below are met.
Conditions:
- Derivation: Code is not derived from or related to information about the individual
- Derivation: Code is not otherwise capable of being translated to identify the individual
- Security: Code is not used or disclosed for any other purpose
- Security: Mechanism for re-identification is not disclosed
Related rule [§ 164.502(d)(2)]:
- Disclosing a re-identification code or key is a disclosure of PHI
- Information that has been re-identified is PHI again and may be used or disclosed only as the Privacy Rule permits
Practical effect:
- Covered entity may keep a “key file” linking codes to identities, but cannot disclose the key or the re-identification mechanism along with the dataset
- Internal re-identification by the covered entity is possible (for research coordination)
- External recipients receive no means to re-identify
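A code that is not derived from individual information can be generated randomly and mapped to the record in a key that stays with the covered entity. The sketch below assumes hypothetical names (`assign_codes`, the sample record numbers). By contrast, a hash of an SSN or MRN would be derived from individual information and would not fit the condition described above.

```python
import secrets

def assign_codes(patient_ids):
    """Map a fresh random code to each patient ID; the mapping is the internal key file."""
    key = {}
    for patient_id in patient_ids:
        code = secrets.token_hex(8)  # 16 hex characters, unrelated to the patient ID
        while code in key:           # regenerate on the (very unlikely) collision
            code = secrets.token_hex(8)
        key[code] = patient_id
    return key

key_file = assign_codes(["MR-123456", "MR-654321"])
released_codes = sorted(key_file)  # only the codes accompany the released dataset
print(len(released_codes))
```

The `key_file` mapping would be stored separately under access controls and never shipped with the released data.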
Minimum Necessary [§ 164.514(d)]
Section 164.514(d)(1): Minimum Necessary Standard
When using or disclosing PHI or requesting PHI from another covered entity, a covered entity or business associate must make reasonable efforts to limit PHI to the minimum necessary to accomplish the intended purpose [§ 164.502(b)]. Section 164.514(d) sets out how to comply.
Applies to:
- Uses of PHI (internal)
- Disclosures of PHI (external)
- Requests for PHI from others
Does NOT apply to:
- Disclosures to or requests by health care provider for treatment
- Uses or disclosures to the individual (or personal representative)
- Uses or disclosures made pursuant to individual’s authorization
- Uses or disclosures required by law
- Disclosures made to HHS for complaint investigation or compliance review
- Uses or disclosures required for compliance with the HIPAA Administrative Simplification rules
Rationale for treatment exception: Clinical judgment requires access to full record
Section 164.514(d)(2)-(3): Implementation
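For uses of PHI by the workforce [§ 164.514(d)(2)]:
Covered entity must:
- Identify persons or classes of persons in workforce who need PHI to carry out their duties
- Identify PHI categories those persons need
- Establish conditions appropriate to that access
- Make reasonable efforts to limit access accordingly
For routine and recurring disclosures [§ 164.514(d)(3)(i)]:
Covered entity must implement policies and procedures (which may be standard protocols) that limit the PHI disclosed to the minimum necessary for the purpose.
For non-routine disclosures [§ 164.514(d)(3)(ii)]:
Covered entity must:
- Develop criteria for determining minimum necessary
- Review each request on individual basis
- Limit disclosure to minimum necessary per criteria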
Examples:
| Scenario | Minimum Necessary Approach |
|---|---|
| Billing department | Access to demographics, insurance, diagnosis codes, procedure codes - NOT clinical notes |
| Registration desk | Access to demographics, insurance - NOT diagnosis or clinical info |
| Physician referral | Minimum necessary does not apply (treatment exception) - share the clinical record relevant to the referral |
| Lawyer subpoena | Only records specified in subpoena - NOT entire medical record |
| Employer fitness-for-duty | Medical opinion on work capability - NOT underlying diagnosis or treatment details |
Section 164.514(d)(3)(iii) and (d)(4): Requests for PHI
When requesting PHI from another covered entity:
Must limit request to:
- Reasonably necessary to accomplish purpose
- Only information needed
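When receiving requests:
May rely on requested disclosure as minimum necessary, if reliance is reasonable under the circumstances, when:
- Request made by public official or agency under § 164.512 who represents that the information is the minimum necessary
- Request made by another covered entity
- Request made by a professional who is a workforce member or business associate of the covered entity, to provide professional services to it, and who represents that the information is the minimum necessary
- Request is supported by research documentation or representations under § 164.512(i)
Reliance rule: Receiving entity can assume requesting entity determined minimum necessary
Limit on reliance: Reliance must be reasonable under the circumstances; a request that plainly exceeds its stated purpose does not qualify
Example:
- Hospital's outside attorney (a business associate) requests “all medical records” to defend a claim
- Hospital may generally rely on the attorney's representation that this is the minimum necessary
- BUT: If litigation is clearly only about a broken arm, relying on a request that sweeps in full psychiatric records is not reasonable
- Hospital should clarify request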
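Section 164.514(d)(5): Entire Medical Record
A covered entity may not use, disclose, or request an entire medical record unless the entire record is specifically justified as the amount reasonably necessary for the purpose.
HHS may examine minimum necessary policies and determinations during a compliance review or investigation.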
Limited Data Sets [§ 164.514(e)]
Section 164.514(e)(1): Standard
A limited data set is PHI that:
- Excludes certain direct identifiers (specified below)
- May be used or disclosed for research, public health, or health care operations
- Only if recipient enters into data use agreement
Purpose: Balance between de-identified (limited utility) and fully identified (privacy risk)
Section 164.514(e)(2): Direct Identifiers Excluded
Limited data set must exclude these 16 identifiers:
| Identifier | Notes |
|---|---|
| (i) Names | Individual, relatives, employers, household members |
| (ii) Postal address info | EXCEPT town/city, state, zip code |
| (iii) Telephone numbers | All telephone numbers |
| (iv) Fax numbers | All fax numbers |
| (v) Email addresses | All email addresses |
| (vi) Social security numbers | All SSNs |
| (vii) Medical record numbers | All MRNs |
| (viii) Health plan beneficiary numbers | All health plan numbers |
| (ix) Account numbers | All account numbers |
| (x) Certificate/license numbers | All certificate or license numbers |
| (xi) Vehicle identifiers | Including license plate numbers |
| (xii) Device identifiers/serial numbers | All device IDs and serial numbers |
| (xiii) URLs | All web URLs |
| (xiv) IP addresses | All Internet Protocol addresses |
| (xv) Biometric identifiers | Finger/voice prints |
| (xvi) Full face photos | And comparable images |
May retain in limited data set:
- Dates (including date of birth, admission, discharge, death)
- Town/city, state, and zip code
- Ages (including those over 89)
- Other unique identifying numbers, characteristics, or codes (not among the 16 exclusions)
Comparison to Safe Harbor:
| Element | Safe Harbor | Limited Data Set |
|---|---|---|
| Names | Remove | Remove |
| Addresses | Remove (except state, 3-digit zip if ≥20k) | Remove (except city, state, full zip) |
| Dates | Remove (except year, unless age >89) | Retain |
| Ages >89 | Aggregate to “90+” | Retain |
| Phone/fax/email | Remove | Remove |
| SSN | Remove | Remove |
| Account numbers | Remove | Remove |
| Photos | Remove | Remove |
Key difference: Limited data sets retain dates and precise geographic information
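To make the comparison concrete, the sketch below builds a limited data set from a record by dropping only the 16 direct identifiers, leaving dates, city, state, and zip in place. The field names and sample record are hypothetical; real schemas would need their own mapping to the regulatory list.

```python
# Hypothetical field names standing in for the 16 direct identifiers in § 164.514(e)(2).
LDS_EXCLUDED_FIELDS = {
    "name", "street_address", "phone", "fax", "email", "ssn", "mrn",
    "health_plan_id", "account_number", "license_number", "vehicle_id",
    "device_serial", "url", "ip_address", "biometric", "photo",
}

def to_limited_data_set(record):
    """Return a copy of the record without the excluded direct-identifier fields."""
    return {k: v for k, v in record.items() if k not in LDS_EXCLUDED_FIELDS}

record = {
    "name": "John Smith",
    "street_address": "123 Main St",
    "city": "Smallville",
    "state": "State",
    "zip": "12345",
    "birth_date": "1950-06-15",
    "admission_date": "2024-01-10",
    "diagnosis": "asthma",
    "ssn": "123-45-6789",
}
limited = to_limited_data_set(record)
print(sorted(limited))
```

The result is still PHI: full dates and the 5-digit zip remain, which is why a data use agreement is required before disclosure.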
Section 164.514(e)(3)-(4): Permitted Uses and Data Use Agreement
Limited data set may be used or disclosed ONLY for:
- Research
- Public health activities
- Health care operations
Data use agreement required:
Covered entity may disclose limited data set ONLY if:
- Recipient enters into data use agreement with covered entity
- Agreement meets requirements of § 164.514(e)(4)
Data use agreement must establish:
| Requirement | Details |
|---|---|
| Permitted uses | Specify uses and disclosures permitted, and who may use or receive the data |
| Limit uses | Recipient may not use or further disclose except as the agreement permits or as required by law |
| No identification | Recipient will not identify or contact individuals |
| Safeguards | Recipient will use appropriate safeguards |
| Report violations | Recipient will report unauthorized uses/disclosures |
| Ensure compliance | Recipient will ensure agents comply with restrictions |
| No further disclosures | Recipient will not disclose to anyone not bound by data use agreement |
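Covered entity responsibility [§ 164.514(e)(4)(iii)]:
- Act on knowledge of a pattern of activity by the recipient that materially breaches the agreement
- Take reasonable steps to cure the breach or end the violation
- If unsuccessful, discontinue disclosure of PHI to the recipient
- If unsuccessful, also report the problem to HHS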
Example data use agreement:
Recipient agrees:
- To use limited data set ONLY for analysis of treatment outcomes
- Not to identify or contact any individual
- To implement administrative, physical, and technical safeguards
- To report any unauthorized use within 5 days
- To ensure subcontractors comply with these terms
- Not to further disclose limited data set
Comparison Chart
| Method | PHI Status | Can Use Without Authorization? | Utility | Difficulty |
|---|---|---|---|---|
| Identifiable PHI | PHI | No (except for TPO or other permitted uses) | High | N/A |
| Limited data set | Still PHI | Yes, with data use agreement for research/public health/ops | Medium-High | Easy |
| Safe Harbor de-identification | Not PHI | Yes | Medium | Easy |
| Expert determination de-identification | Not PHI | Yes | Medium-High | Moderate |
Practical Compliance
For De-identification
Using Safe Harbor method:
- ✅ Review 18 identifiers
- ✅ Remove all listed identifiers
- ✅ Check for other unique identifiers (R)
- ✅ Confirm no actual knowledge dataset could identify individual
- ✅ Document de-identification process
- ✅ Train staff on requirements
Using Expert Determination method:
- ✅ Engage qualified expert
- ✅ Expert analyzes data and context
- ✅ Expert determines re-identification risk is “very small”
- ✅ Expert documents methods and results
- ✅ Retain expert documentation
- ✅ Review periodically as context changes
For Minimum Necessary
Routine/recurring uses:
- ✅ Identify workforce roles
- ✅ Determine PHI needed per role
- ✅ Document role-based access policies
- ✅ Implement technical access controls
- ✅ Review and update as roles change
- ✅ Audit access logs
Non-routine disclosures:
- ✅ Establish criteria for determining minimum necessary
- ✅ Review request individually
- ✅ Limit disclosure to minimum necessary
- ✅ Document decision
- ✅ Train staff on assessment process
Example role-based access:
| Role | Access Needed |
|---|---|
| Receptionist | Demographics, insurance, appointment scheduling |
| Billing clerk | Demographics, insurance, diagnosis codes, procedure codes, charges |
| Nurse | Full clinical record for assigned patients |
| Physician | Full clinical record for patients under care |
| Quality analyst | De-identified or limited data set |
| IT support | System access, not patient information |
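A role table like the one above can be enforced in software by filtering a record down to the sections each role is allowed to see. The sketch below uses hypothetical role names, section names, and a sample record. It shows only the filtering idea, not authentication or audit logging.

```python
# Hypothetical mapping from role to the record sections that role may view.
ROLE_FIELDS = {
    "receptionist": {"demographics", "insurance", "appointments"},
    "billing_clerk": {"demographics", "insurance", "diagnosis_codes",
                      "procedure_codes", "charges"},
    "it_support": set(),  # system access only, no patient information
}

def view_for_role(record, role):
    """Return only the sections the role may see; unknown roles see nothing."""
    allowed = ROLE_FIELDS.get(role, set())
    return {section: value for section, value in record.items() if section in allowed}

record = {
    "demographics": {"state": "State"},
    "insurance": {"plan": "Example Plan"},
    "diagnosis_codes": ["J45.909"],
    "clinical_notes": "Patient reports wheezing.",
}
print(sorted(view_for_role(record, "receptionist")))
print(sorted(view_for_role(record, "billing_clerk")))
```

Denying access by default for unlisted roles keeps a misconfigured or new role from seeing more than intended.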
For Limited Data Sets
Using limited data sets:
- ✅ Remove 16 direct identifiers
- ✅ Confirm purpose is research, public health, or health care operations
- ✅ Draft data use agreement
- ✅ Obtain signed agreement before disclosure
- ✅ Monitor recipient compliance
- ✅ Respond to violations
- ✅ Document process
Common Mistakes
Assuming de-identification = Safe Harbor only:
- Expert determination is valid alternative
- May be better for some datasets
- Both methods legally equivalent
Over-relying on “all medical records” requests:
- Just because someone requests everything doesn’t mean you must provide everything
- If relying on the request is not reasonable under the circumstances, clarify it before disclosing
Confusing limited data set with de-identified:
- Limited data set is still PHI
- Requires data use agreement
- Different restrictions apply
Not documenting de-identification:
- Expert determination requires documented methods and results
- Safe Harbor has no explicit documentation requirement, but a record of the process is the practical way to show compliance
- Cannot prove compliance without documentation
Failing to implement role-based access:
- Minimum necessary requires workforce access limits
- Technical controls needed
- Policies alone insufficient