HIPAA: De-identification and Minimum Necessary

De-identification and Minimum Necessary [45 CFR § 164.514]

Rule: Protected health information (PHI) that has been de-identified according to HIPAA standards is no longer PHI. Additionally, covered entities must limit PHI uses and disclosures to the minimum necessary to accomplish the intended purpose.

Overview of § 164.514

This section contains three critical privacy protections:

Protection | Citation | Purpose
De-identification standards | § 164.514(a)-(c) | Remove individual identifiability from health information
Minimum necessary | § 164.514(d) | Limit PHI disclosures to what’s needed
Limited data sets | § 164.514(e) | Middle ground between identified and de-identified

De-identification [§ 164.514(a)-(c)]

Section 164.514(a): Standard

Definition: Health information that does not identify an individual and with respect to which there is no reasonable basis to believe the information can be used to identify an individual is not individually identifiable health information.

Effect: Once properly de-identified, information:

  • Is no longer PHI
  • Is not subject to Privacy Rule restrictions
  • Can be used and disclosed without authorization
  • Is not subject to breach notification requirements

Two methods for de-identification:

  1. Expert Determination [§ 164.514(b)(1)]
  2. Safe Harbor [§ 164.514(b)(2)]

Section 164.514(b): Implementation Specifications

Method 1: Expert Determination [§ 164.514(b)(1)]

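A person with appropriate knowledge and experience determines that the information is not individually identifiable.

Requirements:

  • Applies generally accepted statistical and scientific principles and methods
  • Determines that the risk is very small that the information could be used, alone or in combination with other reasonably available information, by an anticipated recipient to identify an individual
  • Documents the methods and results of the analysis that justify the determination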

Who qualifies as expert:

  • Statistician
  • Epidemiologist
  • Data scientist with privacy expertise
  • Anyone with appropriate training and experience

Process:

  1. Expert analyzes information and context
  2. Considers:
    • Fields of data
    • Population characteristics
    • Other reasonably available information
    • Combination risk
  3. Determines re-identification risk
  4. Documents analysis and conclusion

Standard to meet: Risk is “very small” (not zero)

Advantages:

  • Can retain more useful data
  • Tailored to specific dataset
  • Flexible approach

Disadvantages:

  • Requires expert
  • More expensive
  • Must document thoroughly
  • Subject to challenge

Example application:

  • Dataset contains zip code, age, and diagnosis
  • Expert analyzes: small rural area with unique age/diagnosis combination
  • Expert determines: must generalize zip code to 3 digits
  • Risk now “very small”
  • Documents analysis
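
One way an expert might quantify the example above is to count how many records share each combination of quasi-identifiers. The sketch below is illustrative only: the records are made up, `smallest_group_size` and `generalize_zip` are hypothetical helper names, and group size (the k in k-anonymity) is just one possible risk measure. The rule itself sets no numeric threshold for “very small”.

```python
from collections import Counter


def smallest_group_size(records, quasi_identifiers):
    """Return the size of the smallest group of records that share the same
    combination of quasi-identifier values (the k in k-anonymity)."""
    groups = Counter(
        tuple(record[field] for field in quasi_identifiers) for record in records
    )
    return min(groups.values())


def generalize_zip(record):
    """Return a copy of the record with the zip code cut to its first 3 digits."""
    return {**record, "zip": record["zip"][:3]}


# Hypothetical records holding only zip code, age, and diagnosis.
records = [
    {"zip": "12345", "age": 34, "diagnosis": "asthma"},
    {"zip": "12346", "age": 34, "diagnosis": "asthma"},
    {"zip": "12347", "age": 34, "diagnosis": "asthma"},
]
fields = ["zip", "age", "diagnosis"]

k_before = smallest_group_size(records, fields)
k_after = smallest_group_size([generalize_zip(r) for r in records], fields)
print(k_before, k_after)  # 1 3
```

A smallest group of 1 means at least one record is unique on these fields. After generalization, every record in this toy dataset shares its combination with two others.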

Method 2: Safe Harbor [§ 164.514(b)(2)]

Remove the following 18 identifiers of the individual and of the individual’s relatives, employers, and household members:

Identifier | Description | Examples
(A) Names | Any names (full, last, first, maiden) | John Smith, J. Smith
(B) Geographic subdivisions | Smaller than a state (except initial 3 zip digits if the area has more than 20,000 people) | Street address, city, county, precinct
(C) Dates | All elements of dates directly related to the individual, except year | Birth date, admission date, discharge date, death date
(D) Telephone numbers | All telephone numbers | (555) 123-4567
(E) Fax numbers | All fax numbers | (555) 123-4568
(F) Email addresses | All email addresses | john.smith@example.com
(G) Social security numbers | All SSNs | 123-45-6789
(H) Medical record numbers | All MRNs | MR-123456
(I) Health plan numbers | All health plan beneficiary numbers | HP-789012
(J) Account numbers | All account numbers | ACCT-345678
(K) Certificate/license numbers | All certificate or license numbers | License #901234
(L) Vehicle identifiers | All vehicle IDs and serial numbers, including license plates | ABC-1234, VIN: 1HGBH41JXMN109186
(M) Device identifiers/serial numbers | All device identifiers and serial numbers | Serial: DEV123456
(N) URLs | All web URLs | https://example.com/patient
(O) IP addresses | All Internet Protocol addresses | 192.168.1.1
(P) Biometric identifiers | Including finger and voice prints | Fingerprint image, voiceprint
(Q) Full face photos | Full-face photographic images and any comparable images | Facial photograph
(R) Other unique identifiers | Any other unique identifying number, characteristic, or code | Internal patient ID, study ID

Additional requirement: The covered entity has no actual knowledge that the remaining information could be used, alone or in combination with other information, to identify the individual

Geographic subdivisions (Identifier B) details:

May retain:

  • State
  • First 3 digits of zip code IF the geographic unit formed by all zip codes sharing those 3 digits contains more than 20,000 people

Must remove:

  • Specific address
  • City
  • County
  • Precinct
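  • Zip code digits beyond the first 3, and the first 3 digits where the area contains 20,000 or fewer people

Special rule for small population zip codes:

  • If the 3-digit area contains 20,000 or fewer people, replace the 3 digits with 000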
  • Maintains consistency without creating new identifier

Example:

  • Original: 123 Main St, Smallville, State, 12345
  • De-identified: State, zip 000 (assuming the area covered by prefix 123 has 20,000 or fewer people)
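
The zip code rule can be sketched as a small function. This is a minimal sketch, assuming a lookup table of population by 3-digit prefix: the table `population_by_prefix` and its figures are hypothetical, and real values would come from Census Bureau data. The threshold follows the regulation's wording (“more than 20,000 people”).

```python
def safe_harbor_zip(zip_code, population_by_prefix):
    """Return the 3-digit zip prefix if its area has more than 20,000 people,
    otherwise "000"."""
    prefix = zip_code[:3]
    # Unknown prefixes are treated as small areas, the conservative choice.
    if population_by_prefix.get(prefix, 0) > 20_000:
        return prefix
    return "000"


# Hypothetical population figures, for illustration only.
population_by_prefix = {"123": 15_000, "902": 500_000}

print(safe_harbor_zip("12345", population_by_prefix))  # 000
print(safe_harbor_zip("90210", population_by_prefix))  # 902
```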

Dates (Identifier C) details:

Must remove:

  • Exact dates (except year)
  • Birth date (can keep birth year)
  • Admission date
  • Discharge date
  • Date of service
  • Date of death (can keep year)

May retain:

  • Year (unless individual is >89 years old)
  • Age in years (unless >89)

Special rule for individuals ≥90:

  • All ages >89 aggregated into single category “90 or older”
  • All date elements (including year) that indicate an age >89 must be removed

Rationale: Small population of very elderly makes them identifiable
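
The age and date rules above can be sketched for the birth-date case, where the year itself reveals the age. The function names and sample dates are hypothetical. The sketch covers only birth dates, so other dates (admission, discharge, service) would need their own handling.

```python
from datetime import date


def age_in_years(birth_date, as_of):
    """Whole years elapsed between birth_date and as_of."""
    years = as_of.year - birth_date.year
    # Subtract one year if the birthday has not yet occurred in as_of's year.
    if (as_of.month, as_of.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


def safe_harbor_age(age):
    """Aggregate every age over 89 into a single category."""
    return "90 or older" if age > 89 else age


def safe_harbor_birth_year(birth_date, as_of):
    """Keep only the birth year, and drop it when it would reveal an age over 89."""
    if age_in_years(birth_date, as_of) > 89:
        return None
    return birth_date.year


as_of = date(2024, 1, 1)  # hypothetical reference date
print(safe_harbor_age(age_in_years(date(1930, 6, 15), as_of)))  # 90 or older
print(safe_harbor_birth_year(date(1930, 6, 15), as_of))         # None
print(safe_harbor_birth_year(date(1980, 3, 2), as_of))          # 1980
```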

Biometric identifiers (Identifier P) details:

The rule names finger and voice prints; the following are commonly treated as biometric identifiers:

  • Fingerprints
  • Voiceprints
  • Retina scans
  • Iris scans
  • Facial geometry
  • DNA
  • Other unique physical characteristics

Full face photos (Identifier Q) details:

Must remove:

  • Full face photographic images
  • Any comparable images

May retain:

  • Images of body parts other than face
  • Images where face is not identifiable
  • De-identified images (face obscured)

Other unique identifiers (Identifier R) details:

Catch-all for:

  • Internal patient numbers
  • Study IDs
  • Any other code/characteristic that is unique or could identify

Critical: Any code or characteristic that is unique and identifying must be removed, even if it is not listed in A-Q

Safe Harbor Advantages and Disadvantages

Advantages:

  • Clear, objective standard
  • No expert required
  • Lower cost
  • Less documentation
  • Defensible

Disadvantages:

  • Removes more data
  • Less useful dataset
  • Less flexibility
  • May over-redact

Re-identification [§ 164.514(c)]

Rule: A covered entity may assign a code or other means of record identification to de-identified information so that the covered entity itself can later re-identify it, but only under the conditions below.

Conditions:

  • Code is not derived from or related to information about the individual
  • Code is not otherwise capable of being translated to identify the individual
  • Covered entity does not use or disclose the code for any other purpose
  • Covered entity does not disclose the mechanism for re-identification

Practical effect:

  • Covered entity may keep an internal “key file” linking codes to identities (for example, for research coordination), but may not disclose the key or the method used to generate the codes
  • Disclosing the code key or other re-identification mechanism counts as a disclosure of PHI [§ 164.502(d)(2)(i)]
  • Information that is re-identified becomes PHI again and may be used or disclosed only as the Privacy Rule permits [§ 164.502(d)(2)(ii)]
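
A re-identification code that satisfies these conditions can be generated randomly rather than computed from patient data. The sketch below is an illustration under stated assumptions: `assign_codes` is a hypothetical helper, the patient IDs are made up, and the resulting key mapping would have to stay inside the covered entity. A hash of a medical record number, by contrast, would be derived from information about the individual.

```python
import secrets


def assign_codes(patient_ids):
    """Map each patient ID to a random code that is not derived from patient data.

    The returned mapping is the re-identification key and must not be disclosed.
    """
    key = {}
    used = set()
    for patient_id in patient_ids:
        code = secrets.token_hex(8)  # 16 random hex characters
        while code in used:          # regenerate on the (very unlikely) collision
            code = secrets.token_hex(8)
        used.add(code)
        key[patient_id] = code
    return key


# Hypothetical internal identifiers.
key = assign_codes(["MR-123456", "MR-654321"])
released_codes = sorted(key.values())  # only the codes travel with the dataset
```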

Minimum Necessary [§ 164.514(d)]

Section 164.514(d)(1): Minimum Necessary Standard

When using or disclosing PHI, or requesting PHI from another covered entity, a covered entity must make reasonable efforts to limit PHI to the minimum necessary to accomplish the intended purpose [§ 164.502(b)].

Applies to:

  • Uses of PHI (internal)
  • Disclosures of PHI (external)
  • Requests for PHI from others

Does NOT apply to:

  • Disclosures to or requests by a health care provider for treatment
  • Uses or disclosures made to the individual (or personal representative)
  • Uses or disclosures made pursuant to the individual’s authorization
  • Uses or disclosures required by law
  • Disclosures to HHS for compliance investigation, review, or enforcement
  • Uses or disclosures required for compliance with the HIPAA Administrative Simplification Rules

Rationale for treatment exception: Clinical judgment requires access to full record

Section 164.514(d)(2)-(3): Implementation

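For uses of PHI by the workforce:

Covered entity must:

  • Identify persons or classes of persons in workforce who need PHI to carry out their duties
  • Identify PHI categories those persons need
  • Identify conditions appropriate to that access
  • Make reasonable efforts to limit access accordingly

For routine and recurring disclosures:

Covered entity must implement policies and procedures (which may be standard protocols) that limit the PHI disclosed to the minimum necessary.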

For non-routine disclosures:

Covered entity must:

  • Develop criteria for determining minimum necessary
  • Review each request on individual basis
  • Limit disclosure to minimum necessary per criteria

Examples:

Scenario | Minimum Necessary Approach
Billing department | Access to demographics, insurance, diagnosis codes, procedure codes - NOT clinical notes
Registration desk | Access to demographics, insurance - NOT diagnosis or clinical info
Physician referral | Treatment exception applies, so minimum necessary does not limit the disclosure; clinical record relevant to the condition may be shared
Lawyer subpoena | Only records specified in subpoena - NOT entire medical record
Employer fitness-for-duty | Medical opinion on work capability - NOT underlying diagnosis or treatment details

Section 164.514(d)(3)(iii) and (d)(4): Requests for PHI

When requesting PHI from another covered entity:

Must limit request to:

  • Reasonably necessary to accomplish purpose
  • Only information needed

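When receiving requests:

May rely on requested disclosure as minimum necessary, if reliance is reasonable under the circumstances, when:

  • Request made by public official who represents that the information is the minimum necessary (for disclosures permitted under § 164.512)
  • Request made by another covered entity
  • Request made by professional who is a workforce member or business associate providing professional services to the covered entity, and who represents that the information is the minimum necessary
  • Request is supported by research documentation or representations under § 164.512(i)

Reliance rule: Receiving entity can assume requesting entity determined minimum necessary

Limit on reliance: Reliance must be reasonable; it is not reasonable if receiving entity knows the request seeks more than the minimum necessary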

Example:

  • Health plan (another covered entity) requests “all medical records” to process a claim
  • Hospital may generally rely on the health plan’s determination that this is the minimum necessary
  • BUT: If the claim is clearly only about a broken arm, hospital knows full psychiatric records are not necessary, so reliance is not reasonable
  • Hospital should clarify request

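Section 164.514(d)(5): Entire Medical Record

A covered entity may not use, disclose, or request an entire medical record unless the entire record is specifically justified as the amount reasonably necessary to accomplish the purpose. HHS may review minimum necessary policies and determinations under its general compliance authority.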

Limited Data Sets [§ 164.514(e)]

Section 164.514(e)(1): Standard

A limited data set is PHI that:

  • Excludes certain direct identifiers (specified below)
  • May be used or disclosed for research, public health, or health care operations
  • May be disclosed only if the recipient enters into a data use agreement

Purpose: Balance between de-identified (limited utility) and fully identified (privacy risk)

Section 164.514(e)(2): Direct Identifiers Excluded

Limited data set must exclude these 16 identifiers:

Identifier | Notes
(i) Names | Individual, relatives, employers, household members
(ii) Postal address info | EXCEPT town/city, state, zip code
(iii) Telephone numbers | All telephone numbers
(iv) Fax numbers | All fax numbers
(v) Email addresses | All email addresses
(vi) Social security numbers | All SSNs
(vii) Medical record numbers | All MRNs
(viii) Health plan beneficiary numbers | All health plan numbers
(ix) Account numbers | All account numbers
(x) Certificate/license numbers | All certificate or license numbers
(xi) Vehicle identifiers | Including license plate numbers
(xii) Device identifiers/serial numbers | All device IDs and serial numbers
(xiii) URLs | All web URLs
(xiv) IP addresses | All Internet Protocol addresses
(xv) Biometric identifiers | Finger/voice prints
(xvi) Full face photos | And comparable images

May retain in limited data set:

  • Dates, with all elements (month, day, year), including date of birth, admission, discharge, death
  • Town/city, state, and zip code
  • Ages (including those over 89)
  • Other unique identifying numbers, characteristics, or codes not on the list above

Comparison to Safe Harbor:

Element | Safe Harbor | Limited Data Set
Names | Remove | Remove
Addresses | Remove (except state, 3-digit zip if area has more than 20,000 people) | Remove (except city, state, full zip)
Dates | Remove (except year, unless age >89) | Retain
Ages >89 | Aggregate to “90+” | Retain
Phone/fax/email | Remove | Remove
SSN | Remove | Remove
Account numbers | Remove | Remove
Photos | Remove | Remove

Key difference: Limited data sets retain dates and precise geographic information
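
The difference between the two approaches can be sketched as two field filters over the same record. All field names here are hypothetical, and a real schema would need its own mapping to the regulatory lists. For simplicity, `safe_harbor_strict` drops zip codes and dates outright instead of generalizing them to 3 digits or to the year.

```python
# Hypothetical field names standing in for some of the 16 direct identifiers.
DIRECT_IDENTIFIERS = {"name", "street_address", "phone", "email", "ssn", "mrn"}
# Fields a limited data set may keep but Safe Harbor restricts.
SAFE_HARBOR_ONLY = {"city", "zip", "birth_date", "admission_date"}


def limited_data_set(record):
    """Drop direct identifiers; keep dates, city, and zip code."""
    return {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}


def safe_harbor_strict(record):
    """Drop direct identifiers plus dates and sub-state geography."""
    dropped = DIRECT_IDENTIFIERS | SAFE_HARBOR_ONLY
    return {k: v for k, v in record.items() if k not in dropped}


record = {
    "name": "John Smith",
    "street_address": "123 Main St",
    "city": "Smallville",
    "state": "State",
    "zip": "12345",
    "birth_date": "1980-03-02",
    "diagnosis": "asthma",
}

print(sorted(limited_data_set(record)))
# ['birth_date', 'city', 'diagnosis', 'state', 'zip']
print(sorted(safe_harbor_strict(record)))
# ['diagnosis', 'state']
```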

Section 164.514(e)(3)-(4): Permitted Uses and Data Use Agreement

Limited data set may be used or disclosed ONLY for:

  • Research
  • Public health activities
  • Health care operations

Data use agreement required:

Covered entity may disclose limited data set ONLY if:

  • Recipient enters into data use agreement with covered entity
  • Agreement meets requirements of § 164.514(e)(4)

Data use agreement must establish:

Requirement | Details
Permitted uses | Specify uses and disclosures permitted
Limit uses | Recipient may not use or disclose except as permitted
No identification | Recipient will not identify or contact individuals
Safeguards | Recipient will use appropriate safeguards
Report violations | Recipient will report unauthorized uses/disclosures
Ensure compliance | Recipient will ensure agents comply with restrictions
No further disclosures | Recipient will not disclose to anyone not bound by data use agreement
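
The table above can double as a review checklist for a draft agreement. The sketch below assumes each clause in a draft has been tagged with a short label; the labels and the `missing_clauses` helper are hypothetical, and a check like this does not replace legal review of the actual wording.

```python
# Hypothetical labels, one per row of the requirements table.
REQUIRED_CLAUSES = {
    "permitted_uses",
    "limit_uses",
    "no_identification",
    "safeguards",
    "report_violations",
    "ensure_compliance",
    "no_further_disclosures",
}


def missing_clauses(agreement_clauses):
    """Return the required clause labels absent from a draft, in sorted order."""
    return sorted(REQUIRED_CLAUSES - set(agreement_clauses))


draft = ["permitted_uses", "safeguards", "no_identification"]
print(missing_clauses(draft))
# ['ensure_compliance', 'limit_uses', 'no_further_disclosures', 'report_violations']
```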

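Covered entity responsibility:

  • If covered entity knows of a pattern of activity or practice by recipient that materially breaches the agreement, take reasonable steps to cure the breach or end the violation
  • If those steps are unsuccessful, discontinue disclosure of PHI to the recipient
  • Report the problem to HHS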

Example data use agreement:

Recipient agrees:

  1. To use limited data set ONLY for analysis of treatment outcomes
  2. Not to identify or contact any individual
  3. To implement administrative, physical, and technical safeguards
  4. To report any unauthorized use within 5 days
  5. To ensure subcontractors comply with these terms
  6. Not to further disclose limited data set

Comparison Chart

Method | PHI Status | Can Use Without Authorization? | Utility | Difficulty
Identifiable PHI | PHI | No (except for TPO or other permitted uses) | High | N/A
Limited data set | Still PHI | Yes, with data use agreement for research/public health/ops | Medium-High | Easy
Safe Harbor de-identification | Not PHI | Yes | Medium | Easy
Expert determination de-identification | Not PHI | Yes | Medium-High | Moderate

Practical Compliance

For De-identification

Using Safe Harbor method:

  1. ✅ Review 18 identifiers
  2. ✅ Remove all listed identifiers
  3. ✅ Check for other unique identifiers (R)
  4. ✅ Confirm no actual knowledge dataset could identify individual
  5. ✅ Document de-identification process
  6. ✅ Train staff on requirements

Using Expert Determination method:

  1. ✅ Engage qualified expert
  2. ✅ Expert analyzes data and context
  3. ✅ Expert determines re-identification risk is “very small”
  4. ✅ Expert documents methods and results
  5. ✅ Retain expert documentation
  6. ✅ Review periodically as context changes

For Minimum Necessary

Routine/recurring uses:

  1. ✅ Identify workforce roles
  2. ✅ Determine PHI needed per role
  3. ✅ Document role-based access policies
  4. ✅ Implement technical access controls
  5. ✅ Review and update as roles change
  6. ✅ Audit access logs

Non-routine disclosures:

  1. ✅ Establish criteria for determining minimum necessary
  2. ✅ Review request individually
  3. ✅ Limit disclosure to minimum necessary
  4. ✅ Document decision
  5. ✅ Train staff on assessment process

Example role-based access:

Role | Access Needed
Receptionist | Demographics, insurance, appointment scheduling
Billing clerk | Demographics, insurance, diagnosis codes, procedure codes, charges
Nurse | Full clinical record for assigned patients
Physician | Full clinical record for patients under care
Quality analyst | De-identified or limited data set
IT support | System access, not patient information
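
A role table like this one can be enforced in software by filtering a record down to the categories a role is allowed to see. The sketch below is a minimal illustration: the role names, category names, and `view_for_role` helper are hypothetical, and it leaves out per-patient assignment checks (for nurses and physicians) and audit logging.

```python
# Hypothetical mapping from role to permitted PHI categories.
ROLE_FIELDS = {
    "receptionist": {"demographics", "insurance", "appointments"},
    "billing_clerk": {
        "demographics", "insurance", "diagnosis_codes", "procedure_codes", "charges",
    },
    "it_support": set(),  # system access only, no patient information
}


def view_for_role(record, role):
    """Return only the parts of the record that the role may access.

    Unknown roles get an empty view (deny by default).
    """
    allowed = ROLE_FIELDS.get(role, set())
    return {k: v for k, v in record.items() if k in allowed}


record = {
    "demographics": {"name": "John Smith"},
    "insurance": {"plan": "HP-789012"},
    "diagnosis_codes": ["J45.909"],
    "clinical_notes": "Patient reports wheezing at night.",
}

print(sorted(view_for_role(record, "receptionist")))  # ['demographics', 'insurance']
print(view_for_role(record, "it_support"))            # {}
```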

For Limited Data Sets

Using limited data sets:

  1. ✅ Remove 16 direct identifiers
  2. ✅ Confirm purpose is research, public health, or health care operations
  3. ✅ Draft data use agreement
  4. ✅ Obtain signed agreement before disclosure
  5. ✅ Monitor recipient compliance
  6. ✅ Respond to violations
  7. ✅ Document process

Common Mistakes

Assuming de-identification = Safe Harbor only:

  • Expert determination is valid alternative
  • May be better for some datasets
  • Both methods legally equivalent

Over-relying on “all medical records” requests:

  • Just because someone requests everything doesn’t mean you must provide everything
  • If you know a request seeks more than the minimum necessary, clarify the request before disclosing

Confusing limited data set with de-identified:

  • Limited data set is still PHI
  • Requires data use agreement
  • Different restrictions apply

Not documenting de-identification:

  • Expert determination requires documented methods and results; documenting a Safe Harbor process is good practice
  • Expert determination especially needs thorough documentation
  • Cannot prove compliance without documentation

Failing to implement role-based access:

  • Minimum necessary requires workforce access limits
  • Technical controls needed
  • Policies alone insufficient

Citation

45 CFR § 164.514 - Other requirements relating to uses and disclosures of protected health information

Sources

Contains public sector information licensed under the Open Government Licence v3.0 where applicable. This is not legal advice. Always refer to official sources for authoritative text.
