Architecture Deep Dive

How The Veil Works

The Veil splits every AI workflow into two network-isolated sandboxes. Sandbox A holds identity data. Sandbox B runs AI inference. They cannot communicate directly — an opaque-token bridge is the only link. The AI model never knows who the data belongs to.

Last Updated: April 2026

The Core Invariant

No single component holds both identity and AI insights. This is not a policy — it is an infrastructure constraint.

Kubernetes NetworkPolicies and Docker network segmentation enforce this boundary. A compromised application process in Sandbox B cannot open a TCP connection to Sandbox A. The network layer drops the packets.
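As an illustration, the boundary can be expressed as a default-deny ingress policy on the identity namespace. This is a minimal sketch, not the deployed policy set; the namespace and label names (`sandbox-a`, `role: gateway`) are assumptions for the example.

```yaml
# Hypothetical policy: deny all ingress into the identity sandbox
# except from the Gateway namespace. A pod in Sandbox B that tries to
# open a TCP connection to Sandbox A has its packets dropped here.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-from-ai-sandbox
  namespace: sandbox-a          # namespace name is illustrative
spec:
  podSelector: {}               # applies to every pod in sandbox-a
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              role: gateway     # only the Gateway namespace may connect
```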

Why Not Just Redact?

Redaction is a software promise. A single regex miss, an edge-case encoding, or an overlooked log statement leaks PII into the AI pipeline. The system is only as strong as every line of redaction code across every release.

The Veil replaces that software promise with an infrastructure guarantee:

| Approach | Failure mode | Blast radius |
| --- | --- | --- |
| Redaction | Bug in regex / serializer | Full PII exposed to AI model |
| The Veil | Application-level bug | Network blocks the connection — AI still cannot reach identity store |

This distinction matters most when the AI is an external cloud provider, when multiple parties need access (auditors, consultants), or when regulation demands architectural enforcement — as GDPR Article 25 and EU AI Act Article 10 do.

The Data Flow

Every request follows the same eight-step path. Identity and inference never share a network hop.

  1. User submits a request to the Gateway (the only public endpoint).
  2. Gateway PII firewall strips identity fields from the payload before any downstream forwarding.
  3. Sandbox A authenticates the user via OIDC, validates authorisation, and issues an opaque pseudonymous token through the ID Bridge.
  4. Gateway forwards the request to Sandbox B carrying only the opaque token and the context payload — no names, no emails, no account numbers.
  5. Sandbox B runs inference (LLM, scoring model, classifier). It processes context against the token. It cannot identify anyone.
  6. Sandbox B returns its result tagged to the pseudonymous token.
  7. Gateway maps the token back to the authenticated user session via the ID Bridge. This is the only point where token and identity coexist — in-memory, never persisted together.
  8. User sees the final result. The AI model never learned who they are.
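The eight steps above can be sketched end to end in a few lines. This is a simplified single-process model for clarity, not the real services: in the actual architecture the ID Bridge, Gateway, and Sandbox B sit behind separate network boundaries. All field names and the `IdBridge` class are illustrative.

```python
import secrets

IDENTITY_FIELDS = {"name", "email", "account_number"}  # illustrative field list


class IdBridge:
    """Issues opaque tokens; the only place tokens map back to identities."""

    def __init__(self):
        self._sessions = {}  # token -> user id, in-memory only

    def issue_token(self, user_id):
        token = secrets.token_hex(16)  # opaque, no derivable link to user_id
        self._sessions[token] = user_id
        return token

    def resolve(self, token):
        return self._sessions[token]


def strip_pii(payload):
    """Gateway PII firewall (step 2): drop identity fields before forwarding."""
    return {k: v for k, v in payload.items() if k not in IDENTITY_FIELDS}


def run_inference(token, context):
    """Sandbox B (steps 5-6): sees only the opaque token and context."""
    assert not IDENTITY_FIELDS & context.keys()  # invariant: no PII here
    return {"token": token, "result": f"score over {len(context)} context fields"}


def handle_request(bridge, user_id, payload):
    context = strip_pii(payload)                  # step 2
    token = bridge.issue_token(user_id)           # step 3
    outcome = run_inference(token, context)       # steps 4-6
    # Step 7: token and identity meet only here, in memory.
    return bridge.resolve(outcome["token"]), outcome["result"]
```

Note that `run_inference` can assert the invariant directly: if any identity field ever reaches it, the request fails rather than leaking.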

Components

Gateway (dsa-edge)

REST edge service and PII firewall. The only component that touches both the user session and the inference path — but it never forwards both identity and token to the same downstream service. In-memory session cache, wait-polling for async results, graceful shutdown.
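The wait-polling behaviour mentioned above can be sketched as a small helper that repeatedly checks for an async inference result until it arrives or a deadline passes. The function name, timeout values, and pending-means-`None` convention are assumptions for this example, not the Gateway's actual API.

```python
import time


def wait_poll(fetch, timeout_s=5.0, interval_s=0.1):
    """Poll `fetch` until it returns a non-None result or the timeout
    elapses. `fetch` is any zero-argument callable that returns None
    while the async inference job is still pending."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        result = fetch()
        if result is not None:
            return result
        time.sleep(interval_s)  # back off between checks
    raise TimeoutError("inference result not ready before deadline")
```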

Sandbox A — Identity Vault (dsa-identity)

Stores and protects identity data. PostgreSQL with row-level security, column-level encryption, OIDC provider for authentication, and DSAR/erasure endpoints for GDPR compliance. Has no network route to Sandbox B.

ID Bridge (dsa-bridge)

Generates opaque pseudonymous tokens (HMAC-SHA256 with random nonce) with no derivable relationship to real identities. Manages epoch-based token rotation (configurable, default 7 days) with 14-day grace periods ensuring zero downtime. Master keys integrate with HashiCorp Vault, AWS KMS, or Azure Key Vault.
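Token generation along these lines can be sketched with the standard library: an HMAC-SHA256 over a random nonce, the rotation epoch, and the subject, keyed by the master key. The token format and field order here are assumptions for illustration; the real bridge's encoding may differ.

```python
import hashlib
import hmac
import os


def issue_token(master_key: bytes, user_id: str, epoch: int) -> str:
    """Sketch of an opaque pseudonymous token: HMAC-SHA256 over a random
    nonce, the rotation epoch, and the user id. The random nonce means two
    tokens for the same user in the same epoch still differ, and without
    the master key no relationship to the real identity is derivable."""
    nonce = os.urandom(16)
    mac = hmac.new(
        master_key,
        nonce + epoch.to_bytes(4, "big") + user_id.encode(),
        hashlib.sha256,
    ).digest()
    # Epoch travels in the clear so rotation/grace-period checks can
    # run without re-linkage.
    return f"{epoch}.{nonce.hex()}.{mac.hex()}"
```

Because the mapping from token back to identity lives only in the bridge's store, verifying or rotating tokens never requires exposing who they belong to.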

Re-linkage — mapping a token back to a real identity — is the most sensitive operation in the platform. Every re-linkage request must pass through a seven-step governance chain:

  1. Legal basis citation — the requester must specify the legal ground (e.g., GDPR Art. 6(1)(c), clinical emergency).
  2. Case reference — a traceable case ID linking the request to a specific business need.
  3. Requester role validation — only designated roles can initiate re-linkage.
  4. Jurisdiction check — re-linkage can be restricted by legal jurisdiction, enforcing data sovereignty.
  5. Four-eyes approval — a second independently authorised individual must approve.
  6. Time-limited access — re-linkage grants access to specific attributes for a configurable window, not permanent access.
  7. Mandatory post-access review — within a configurable period (default 7 days), the access event must be reviewed and signed off.

A break-glass mechanism exists for clinical or operational emergencies. Break-glass bypasses the four-eyes requirement but triggers: elevated audit logging, automatic DPO notification, and mandatory post-access review within 24 hours.
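The governance gate for steps 1-5 and the break-glass bypass can be sketched as a single authorisation check. Everything here is illustrative: the role names, the jurisdiction set, and the request shape are assumptions, and the time-limited grant and post-access review (steps 6-7) are enforced after this gate and omitted.

```python
from dataclasses import dataclass
from typing import Optional

ALLOWED_ROLES = {"dpo", "fraud_investigator", "clinician"}  # illustrative


@dataclass
class RelinkageRequest:
    legal_basis: str          # e.g. "GDPR Art. 6(1)(c)" (step 1)
    case_ref: str             # traceable case ID (step 2)
    requester_role: str       # designated role (step 3)
    jurisdiction: str         # data-sovereignty check (step 4)
    approver: Optional[str]   # second, independent individual (step 5)
    requester: str
    break_glass: bool = False


def authorise(req: RelinkageRequest, allowed_jurisdictions=frozenset({"EU"})) -> bool:
    """Gate for steps 1-5 of the governance chain."""
    if not req.legal_basis or not req.case_ref:
        return False
    if req.requester_role not in ALLOWED_ROLES:
        return False
    if req.jurisdiction not in allowed_jurisdictions:
        return False
    if req.break_glass:
        # Bypasses four-eyes; in the real system this triggers elevated
        # audit logging, DPO notification, and review within 24 hours.
        return True
    # Four-eyes: approver must exist and be a different person.
    return req.approver is not None and req.approver != req.requester
```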

Sandbox B — AI Processing (dsa-ai)

Async inference engine with pluggable LLM adapters — Anthropic Claude, OpenAI, Mistral, or local models via Ollama. Redis job queue for async workloads. Receives only opaque tokens and context. Has no network route to Sandbox A or the identity database.
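A pluggable adapter boundary of this kind might look like the following. The interface and class names are assumptions for illustration; the point is that Sandbox B code depends only on the abstract adapter, so providers can be swapped without touching the privacy posture.

```python
from abc import ABC, abstractmethod


class LLMAdapter(ABC):
    """Adapter boundary: Sandbox B calls this interface, never a
    provider SDK directly, so swapping Claude/OpenAI/Mistral/Ollama
    changes nothing about what data the sandbox receives."""

    @abstractmethod
    def complete(self, token: str, context: dict) -> dict:
        """Run inference on pseudonymised context, tagged to the token."""


class LocalEchoAdapter(LLMAdapter):
    """Trivial stand-in for a real provider, useful for tests."""

    def complete(self, token, context):
        # Result is tagged to the opaque token, never to an identity.
        return {"token": token, "text": f"processed fields {sorted(context)}"}
```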

Audit Service (dsa-audit)

Decision records are appended to a cryptographic hash chain. The application role has INSERT and SELECT only. Deletion is permitted only for GDPR Article 17 erasure requests and retention expiry, executed through audited security-definer functions that write a signed erasure event to a separate append-only log alongside every deletion.
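The hash-chain property can be shown in a few lines: each record's hash covers the previous record's hash, so altering any historical record invalidates every later link. This is a minimal in-memory sketch, not the service's actual schema or signing scheme.

```python
import hashlib
import json


class AuditChain:
    """Append-only hash chain over decision records."""

    GENESIS = "0" * 64

    def __init__(self):
        self._records = []
        self._last_hash = self.GENESIS

    def append(self, event: dict) -> str:
        payload = json.dumps(event, sort_keys=True)
        h = hashlib.sha256((self._last_hash + payload).encode()).hexdigest()
        self._records.append({"event": event, "prev": self._last_hash, "hash": h})
        self._last_hash = h
        return h

    def verify(self) -> bool:
        """Recompute every link; any tampering breaks the chain."""
        prev = self.GENESIS
        for rec in self._records:
            payload = json.dumps(rec["event"], sort_keys=True)
            expected = hashlib.sha256((prev + payload).encode()).hexdigest()
            if rec["prev"] != prev or rec["hash"] != expected:
                return False
            prev = rec["hash"]
        return True
```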

Compliance Mapping

The Veil was designed against specific regulatory requirements, not retrofitted after the fact.

| Regulation | Requirement | The Veil Mechanism |
| --- | --- | --- |
| GDPR Art. 25 | Data protection by design and by default | Pseudonymisation is the default processing mode. Identity isolation is enforced at the network layer. Data minimisation is structural — Sandbox B cannot access more data than needed. |
| GDPR Art. 32 | Security of processing | Column-level encryption in Sandbox A. Network-level isolation via Kubernetes NetworkPolicies (16 policies across 7 namespaces). Append-only audit hash chain. Token rotation with configurable policies. |
| EU AI Act Art. 10 | Data governance for high-risk AI systems | Training and inference data passes through the Gateway PII firewall. Sandbox B only receives pseudonymised context. Full data lineage via the audit hash chain. AI-provider agnostic — swap models without changing the privacy posture. |
| EU AI Act Art. 15 | Accuracy, robustness, and cybersecurity | Infrastructure-level isolation limits the attack surface for adversarial inputs. Pluggable LLM adapters allow model validation without architectural changes. Append-only logs support post-deployment monitoring and anomaly detection. |

Breach Scenario: What an Attacker Gets

Assume full compromise of a single sandbox. What does the attacker walk away with?

| Compromised Component | Attacker obtains | Attacker cannot obtain |
| --- | --- | --- |
| Sandbox A | Names, emails, account numbers, authentication credentials | AI outputs, risk scores, behavioural inferences, model parameters — none of this exists in Sandbox A |
| Sandbox B | AI outputs, pseudonymous tokens, model responses | Real identities — tokens are opaque and non-reversible without the ID Bridge |
| ID Bridge | Token-to-identity mappings | AI outputs or inference results — the Bridge never receives them. An attacker still needs Sandbox B data to build a linked profile. |

No single compromised component produces a linked profile of "person X has risk score Y." An attacker must breach multiple isolated systems across separate network boundaries, each with independent credentials, to reconstruct that link.