Legal framework
The GDPR does not explicitly define “anonymization” in its operative text. The key provision is Recital 26, which states that “the principles of data protection should not apply to anonymous information, namely information which does not relate to an identified or identifiable natural person or to personal data rendered anonymous in such a manner that the data subject is not or no longer identifiable”. In other words, true anonymization takes the data outside the scope of the GDPR, while anything “in between” remains personal data.
Pseudonymization is defined in Art. 4(5) GDPR as “the processing of personal data in such a manner that the personal data can no longer be attributed to a specific data subject without the use of additional information, provided that such additional information is kept separately and is subject to technical and organizational measures to ensure that the personal data are not attributed to an identified or identifiable natural person”. Pseudonymized data remain personal data and are fully subject to the Regulation.
EDPB Guidelines 04/2025 on anonymization are the Board’s attempt to provide a new standard after Opinion 05/2014 of the Article 29 Working Party. They set stricter technical and legal requirements and must be read together with national supervisory authority practice, including the Bulgarian CPDP.
For help applying these guidelines, see our resources at gdprbg.com, where the Innovires Legal GDPR team publishes guides, templates and case notes.
Anonymization vs. pseudonymization — key differences
The differences between the two concepts have direct legal consequences. The table below summarizes the key points:
| Criterion | Anonymization | Pseudonymization |
|---|---|---|
| GDPR scope | OUTSIDE scope (if truly anonymous) | Within scope — still personal data |
| Reversibility | Irreversible | Reversible (with key) |
| Legal basis | Not required | Still required — pseudonymization is not a legal basis |
| DPIA | Not required | May be required |
| Data subject rights | Do not apply | Apply |
| Breach notification | Not required | Required (but lower risk) |
| Third-country transfers | Free | Under GDPR rules |
The practical significance: if you classify a dataset as “anonymous” when it is in fact only pseudonymized, the entire legal basis for processing may be wrongly determined. This is exactly why the EDPB proposes strict identifiability tests.
The three EDPB tests for true anonymization
The three cumulative identifiability tests originate from Opinion 05/2014 of the Article 29 Working Party and are confirmed and expanded in the new Guidelines:
- Single-out — can an individual data subject be isolated from the dataset, even without knowing their name? A unique combination of attributes is already a problem.
- Linkability — can records relating to the same subject across different datasets (or within one dataset) be linked together?
- Inference — can attributes of the subject be inferred with high probability based on other values in the dataset?
True anonymization requires a NEGATIVE answer to ALL THREE tests. If even one of them is positive, the dataset still contains personal data and should be treated, at best, as pseudonymized.
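The single-out test can be automated to a first approximation: any record whose combination of quasi-identifiers is unique in the dataset can be isolated even without a name. The sketch below is a minimal illustration, assuming a list-of-dicts dataset and invented field names; a real audit would also probe linkability against external sources and inference.

```python
from collections import Counter

def singled_out(records, quasi_identifiers):
    """Return records whose quasi-identifier combination is unique
    in the dataset -- a positive 'single-out' result."""
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    counts = Counter(keys)
    return [r for r, key in zip(records, keys) if counts[key] == 1]

rows = [
    {"city": "Sofia",   "birth_year": 1984, "gender": "F"},
    {"city": "Sofia",   "birth_year": 1984, "gender": "F"},
    {"city": "Plovdiv", "birth_year": 1990, "gender": "M"},
]
unique = singled_out(rows, ["city", "birth_year", "gender"])
# The Plovdiv record is isolated by its attribute combination alone,
# even though no name appears anywhere in the data.
```

A non-empty result means the single-out test is positive and the dataset cannot be classified as anonymous without further transformation.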
For practical testing and anonymization audits, our team at gdprbg.com offers a structured methodology including residual-risk documentation.
Anonymization techniques
There is no universal technique. The choice depends on the purpose of the processing, the data type and the acceptable trade-off between utility and protection. The main techniques:
| Technique | Description | Weakness |
|---|---|---|
| Generalization | Replacement with a broader category (e.g. “Sofia” → “Bulgaria”) | Loss of utility |
| Suppression | Removing values or entire records | Incomplete dataset |
| k-anonymity | Each record indistinguishable from k-1 others | Does not protect from inference |
| l-diversity | Diversity of sensitive attributes within a group | Does not cover skewness |
| t-closeness | Group distribution close to the overall distribution | Complex implementation |
| Differential privacy | Mathematical privacy guarantee via added noise | Hard to apply generally |
| Synthetic data | New data with statistics similar to the original | Can still “memorize” rare records |
In practice, several techniques are almost always combined, e.g. generalization + k-anonymity + l-diversity. A documentation-first approach (documenting each transformation) is mandatory for demonstrating compliance to the supervisory authority.
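The combination of generalization with a k-anonymity check can be sketched as follows. This is a deliberately simplified illustration with invented city and birth-year fields; production tooling would use tested libraries and cover l-diversity and t-closeness as well.

```python
from collections import Counter

def generalize(record):
    """Illustrative generalization: coarsen city to country and
    birth year to a decade band (e.g. "Sofia" -> "Bulgaria")."""
    return {
        "region": "Bulgaria" if record["city"] in {"Sofia", "Plovdiv", "Varna"} else "Other",
        "age_band": (record["birth_year"] // 10) * 10,
    }

def k_anonymity(records):
    """Size of the smallest equivalence class over all attributes:
    the dataset is k-anonymous for this k."""
    counts = Counter(tuple(sorted(r.items())) for r in records)
    return min(counts.values())

raw = [
    {"city": "Sofia",   "birth_year": 1984},
    {"city": "Plovdiv", "birth_year": 1987},
    {"city": "Varna",   "birth_year": 1990},
    {"city": "Sofia",   "birth_year": 1993},
]
generalized = [generalize(r) for r in raw]
k = k_anonymity(generalized)  # every record now shares its values with at least one other
```

After generalization, each record is indistinguishable from at least one other (k = 2 here), whereas every raw record was unique. Documenting the transformation and the resulting k is exactly the kind of record the documentation-first approach requires.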
Pseudonymization techniques
- Hashing — one-way transformation, but vulnerable to dictionary and rainbow-table attacks if no salt is used.
- Keyed hash (HMAC) — hash with a secret key, which makes attacks significantly harder; however, compromise of the key renders all records re-identifiable.
- Deterministic encryption — enables JOINs across datasets because the same input yields the same ciphertext.
- Tokenization — replacement of values with random tokens, with the mapping between token and original value stored in a separate token vault.
- Encryption with an externally managed key — classic pseudonymization because the key is stored separately and under a different access regime.
Important: hashing alone, without additional controls, is almost never sufficient for anonymization. The EDPB and the Article 29 WP have consistently treated it as pseudonymization.
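The difference between a plain hash and a keyed hash (HMAC) can be shown in a few lines. This is a sketch with an illustrative hard-coded key and an invented sample identifier; in practice the key would come from a managed secret store on separate infrastructure.

```python
import hashlib
import hmac

def naive_hash(value):
    """Plain SHA-256: deterministic and keyless, so vulnerable to
    dictionary and rainbow-table attacks on low-entropy inputs."""
    return hashlib.sha256(value.encode()).hexdigest()

def keyed_hash(value, key):
    """HMAC-SHA-256: without the secret key, an attacker cannot
    precompute a lookup table of candidate inputs."""
    return hmac.new(key, value.encode(), hashlib.sha256).hexdigest()

key = b"illustrative-key-kept-separately"  # in practice: a managed secret
national_id = "8412105678"                 # invented sample identifier
token = keyed_hash(national_id, key)
# The same input with the same key always yields the same token,
# which still permits JOINs across datasets -- so the result is
# pseudonymization, not anonymization.
```

Note that determinism is a feature for linkage and a risk for privacy: whoever holds the key (or compromises it) can re-identify every record, which is why key management is the decisive control.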
We cover the technical details in our GDPR audits at gdprbg.com — from algorithm choice to key management.
Typical business applications
- Analytics and reporting without exposure of personal data — BI dashboards, management reporting.
- Machine learning — training sets that do not require subject identity.
- Research — scientific publications, clinical research, academic collaboration.
- Data sharing with partners — B2B integrations where identity is not necessary.
- Marketing analytics — privacy-friendly alternatives to GA4 and other tracking solutions.
- Health data for research — under the strict regime of special categories in Art. 9 GDPR.
- Fraud detection — anomaly detection without direct identification.
In all these cases the choice between anonymization and pseudonymization has direct implications for legal basis, retention period and data subject rights.
Risks and limitations
The main risk of anonymization is re-identification — revealing identity by combining with other public or private datasets. Several classic case studies illustrate this:
- Netflix Prize — anonymous movie ratings combined with IMDb profiles led to de-anonymization of subscribers.
- AOL search log — published “anonymous” search queries revealed user identities through query content.
- Medical data — research shows that the combination of ZIP code + date of birth + gender is often enough for unique identification.
Some residual risk always remains: truly absolute anonymization is almost impossible with rich datasets. This is why the EDPB requires a risk-based approach, including assessment of context, the means available to a hypothetical attacker and available external sources.
Step-by-step process
- Define the purpose of processing — what will the output be used for.
- Inventory of personal data in the source dataset — which fields, which categories.
- Risk assessment for re-identification — who are likely “attackers” and what external data they have.
- Choose a technique — anonymization vs. pseudonymization, based on purpose and risk.
- Technical implementation of the chosen transformations.
- Testing — the three EDPB tests (single-out, linkability, inference).
- DPIA if the process is high-risk or involves special categories.
- Documentation — records of every step, including decisions and justifications.
- Regular review — the risk profile changes with new data and techniques.
If your process is complex, request a free GDPR audit at gdprbg.com.
Relation to DPIA
When you use pseudonymization as a risk-mitigation measure in a DPIA (Data Protection Impact Assessment), it reduces residual risk and is usually viewed favourably by the supervisory authority. Importantly, however, pseudonymization does not exempt you from the obligation to perform a DPIA — it is an element of the assessment, not an exception.
The DPIA report should explicitly describe: (i) the chosen pseudonymization method, (ii) the location and protection of the key, (iii) the circle of persons with access and (iv) the procedure for periodic review.
Further guidance and a DPIA template: DPIA guide at gdprbg.com, as well as the DPO role in the anonymization process.
Pseudonymization and data breaches
In a breach of pseudonymized data, the risk to subjects is often objectively lower — the attacker gets tokens or hashes rather than direct identity. Nevertheless, notification to the CPDP within 72 hours under Art. 33 GDPR remains mandatory unless the breach is unlikely to result in a risk to the rights and freedoms of data subjects.
If, alongside the pseudonymized data, the re-identification key (or tokenization vault) is also compromised, the risk increases sharply and notification of the data subjects themselves under Art. 34 GDPR becomes mandatory. Practical tip: store keys with a different provider or at least on separate infrastructure.
Full breach protocol: 72-hour breach protocol at gdprbg.com.
Need help with anonymization or pseudonymization?
Our dedicated GDPR team at gdprbg.com provides assessment, implementation and auditing of anonymization and pseudonymization techniques under EDPB Guidelines 04/2025. Request a free consultation or fill in the form below.