Updated March 23, 2026

GDPR Article 25 in Practice: How We Built Privacy Into AI Infrastructure

Article 25 demands data protection by design — not as an afterthought. Here's how we mapped each requirement to specific infrastructure decisions: NetworkPolicies, column-level encryption, row-level security, and pseudonymous token bridges.

Tags: GDPR, Article 25, privacy-by-design, infrastructure, Kubernetes

What Article 25 Actually Requires

GDPR Article 25(1) requires the controller to implement "appropriate technical and organisational measures, such as pseudonymisation, which are designed to implement data-protection principles [...] in an effective manner and to integrate the necessary safeguards into the processing." These measures must be determined "taking into account the state of the art, the cost of implementation and the nature, scope, context and purposes of processing."

Article 25(2) adds a default-minimisation rule: the controller must ensure that "by default, only personal data which are necessary for each specific purpose of the processing are processed." This applies to the amount of data collected, the extent of processing, the period of storage, and accessibility.

Recital 78 clarifies the intent: controllers should adopt "internal policies and implement measures which meet in particular the principles of data protection by design and data protection by default," including "minimising the processing of personal data, pseudonymising personal data as soon as possible, [and] transparency with regard to the functions and processing of personal data."

Most organisations treat these as policy requirements. We treated them as architecture requirements.

The Split-Knowledge Approach

The Veil enforces a structural invariant: Sandbox A knows WHO (identity, PII). Sandbox B knows WHAT (AI inferences, behavioural data). They cannot communicate directly. The only link between them is the ID Bridge, which issues opaque, non-reversible pseudonymous tokens.

This is not application-level redaction. A redaction bug leaks PII. Our separation is enforced at the Kubernetes network layer -- the AI processing environment has no network route to the identity vault. The enforcement mechanism is infrastructure, not code.

Mapping Article 25 to Infrastructure

| Article 25 Requirement | The Veil Implementation | Specific Technology |
| --- | --- | --- |
| Pseudonymisation "as soon as possible" (Recital 78) | Gateway strips identity before forwarding to AI sandbox; only opaque tokens reach Sandbox B | ContextValidator in Gateway with regex-based PII detection for email, SSN, IBAN, credit card, phone patterns |
| Data minimisation by default (Art. 25(2)) | Sandbox B receives only whitelisted context keys per vertical and use case; all other fields rejected | Per-vertical ContextConfig with allowed_context_keys map and max_context_value_length limits |
| Technical measures at time of processing (Art. 25(1)) | Identity and AI sandboxes run in separate Kubernetes namespaces with deny-all default NetworkPolicies | default-deny-all NetworkPolicy deployed to every namespace; explicit egress rules per service |
| Integrate safeguards into processing (Art. 25(1)) | Column-level encryption on high-sensitivity fields; row-level security isolates tenant data | AES-256-GCM with versioned key prefix in crypto.go; PostgreSQL RLS policy vertical_isolation on identities table |
| Limit accessibility by default (Art. 25(2)) | Sandbox B egress restricted to audit service only; no path to identity data | ai-egress NetworkPolicy allows only dsa-audit:50051 and intra-namespace traffic |
| State of the art measures (Art. 25(1)) | Infrastructure-level enforcement rather than application-level controls | Kubernetes NetworkPolicy with namespaceSelector and per-port restrictions across 7 namespaces |

How the Gateway Strips PII

The Gateway service (dsa-edge) is the only component that sees both the user's identity and the inference token. It never forwards both to the same downstream service. Before any request reaches Sandbox B, two layers of PII filtering run: the per-vertical ContextConfig whitelist, which rejects any key not in allowed_context_keys and any value exceeding max_context_value_length, and the ContextValidator's regex-based detection, which blocks values matching email, SSN, IBAN, credit card, or phone patterns.

On the return path, a separate DLP scanner (dlp.Scanner) checks AI-generated responses before they reach the client. If the LLM hallucinates PII from training data, the scanner replaces matches with tagged placeholders like [REDACTED-EMAIL] or [REDACTED-IBAN], flags dlp_redacted: true in the response, and emits an audit event with pattern match counts.

Column-Level Encryption in the Identity Vault

When we designed the identity vault (Sandbox A), we chose AES-256-GCM for column-level encryption of sensitive fields. The sensitive_fields column in the identities table stores encrypted bytes with a format of [version:1byte][nonce:12bytes][ciphertext+tag]. The version prefix enables key rotation without re-encrypting all rows at once -- new writes use the current key version, and reads check the version byte to select the correct decryption key.

For searchability on encrypted fields, the system computes HMAC-SHA256 blind indices. The email_hash and sensitive_hash columns store these values, allowing exact-match lookups without decrypting every row. This satisfies Article 25's data minimisation principle: the database can locate a record by its blind index without exposing the plaintext to the query engine.

Row-Level Security for Tenant Isolation

PostgreSQL row-level security enforces vertical (tenant) isolation directly at the database layer:

```sql
ALTER TABLE identities ENABLE ROW LEVEL SECURITY;

CREATE POLICY vertical_isolation ON identities
    USING (vertical = current_setting('app.current_vertical', true));

ALTER TABLE identities FORCE ROW LEVEL SECURITY;
```

The FORCE ROW LEVEL SECURITY directive applies even to the table owner, providing defence-in-depth. If app.current_vertical is not set in the session, no rows are visible -- a safe default that prevents accidental cross-tenant data exposure. This maps directly to Article 25(2): data accessibility is restricted by default, not by policy.
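In practice, each tenant-scoped transaction sets the variable before querying. A hypothetical session (the vertical name is illustrative):

```sql
BEGIN;
-- Scope this transaction to one vertical; RLS hides all other rows.
SET LOCAL app.current_vertical = 'lending';
SELECT id FROM identities WHERE email_hash = $1;  -- only 'lending' rows visible
COMMIT;
```

SET LOCAL confines the setting to the transaction, so a pooled connection cannot leak one tenant's scope into the next request.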

Network Isolation: The Core Invariant

The Helm deployment includes 16 NetworkPolicy resources across 7 namespaces (dsa-edge, dsa-identity, dsa-bridge, dsa-ai, dsa-audit, dsa-observability, dsa-ingest). Every namespace starts with a default-deny-all policy that blocks all ingress and egress. Services then receive explicit, minimal egress rules.

The critical policies: the default-deny-all baseline applied to every namespace, and the ai-egress policy, which allows Sandbox B to reach only dsa-audit:50051 and intra-namespace peers. Even a fully compromised AI workload therefore has no network route to the identity vault.
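As a sketch, the ai-egress policy described in the mapping table would look roughly like this; only the dsa-audit:50051 and intra-namespace rules come from the article, while the selector details are assumptions:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: ai-egress
  namespace: dsa-ai
spec:
  podSelector: {}          # applies to every pod in the AI sandbox
  policyTypes:
    - Egress
  egress:
    - to:                  # intra-namespace traffic
        - podSelector: {}
    - to:                  # audit service only; no rule means no route to dsa-identity
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: dsa-audit
      ports:
        - protocol: TCP
          port: 50051
```

Because NetworkPolicy egress rules are additive allowances on top of deny-all, omitting dsa-identity is what enforces the invariant: there is no rule to misconfigure, only one that does not exist.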

A Market Moving Toward Infrastructure Enforcement

The market is validating this direction. Gartner projects AI governance platforms at $492M in 2026, surpassing $1B by 2030. Confidential computing — processing data inside hardware-encrypted enclaves — is projected at $59.4B by 2028. A leader in that space raised $90M from Goldman Sachs with 150+ enterprise customers, a strong signal that enterprises want infrastructure-level guarantees rather than contractual promises.

GDPR enforcement is accelerating in parallel: €2.3B in fines in 2025 alone, a 38% increase year-over-year. Meta's €240M fine for Article 25 design-phase failures — before any breach occurred — set the precedent that architecture choices themselves are enforceable. France's CNIL is building tools (PANAME project) to detect PII in trained models, moving toward infrastructure validation rather than accepting software assurances.

The question is no longer whether infrastructure-level enforcement will be expected. It's whether your architecture will be ready when auditors start asking for it.

What This Means for Auditors

When an auditor asks "how do you implement data protection by design?", the answer is not a policy document. It is a set of Kubernetes manifests, Go source files, and SQL migrations that structurally prevent the AI system from accessing identity data. The verification scope shrinks: instead of auditing every application code path for PII leakage, the auditor checks that NetworkPolicies are deployed and that the identity vault's encryption and RLS migrations have run.

Article 25 says "appropriate technical and organisational measures." We chose to make the technical measures architectural -- enforced by infrastructure that does not depend on developers remembering to call the right function.