Updated March 23, 2026

Privacy-Preserving AI for Financial Services: AML Without Identity Exposure

AML transaction monitoring requires AI to analyze behavioural patterns. It does not require the AI to know customer names, account numbers, or addresses. Here's how split-knowledge architecture enables compliant AML without identity exposure.

Tags: finance, AML, fraud-detection, GDPR, DORA, AMLD6

What AML Transaction Monitoring Actually Needs

An AML model evaluates transaction patterns: amounts, frequencies, merchant categories, geographic distribution, velocity changes, structuring indicators. It flags anomalies — a sudden spike in cross-border transfers, transactions just below reporting thresholds, circular fund flows.

None of this requires the model to know that the account belongs to Marc Schuelke at Hauptstrasse 12, Fuerth. The model needs a consistent identifier to track behavioural patterns over time. It does not need that identifier to be a name, IBAN, or customer number.

This is the gap that split-knowledge architecture fills. The AI processes transaction patterns tagged to opaque tokens. It scores risk on behaviour, not identity. When the model flags suspicious activity, a human investigator triggers a controlled re-linkage to identify the customer for SAR filing.

How Transactions Enter Sandbox B

When a financial institution deploys dual-sandbox architecture for AML monitoring, the transaction flow works as follows:

  1. A customer initiates a transaction. The core banking system holds the full record: account number, customer name, counterparty details, amount, timestamp.
  2. The transaction reaches the Gateway (dsa-edge). The Gateway calls Sandbox A (dsa-identity) to verify the customer's authentication status.
  3. The Gateway calls the ID Bridge (dsa-bridge) to obtain a purpose-scoped token for AML monitoring. The token tok_m2p8q1r7 (purpose: fraud) has no mathematical relationship to the customer's identity or to tokens issued for other purposes like product recommendations.
  4. The Gateway constructs a pseudonymised transaction record: token, amount, merchant category, geographic region (generalised — country level, not street address), timestamp, and transaction type. Account numbers, IBANs, and counterparty names are stripped.
  5. The pseudonymised record flows to Sandbox B (dsa-ai), where the AML model ingests it alongside the token's historical transaction patterns stored in the feature store (Redis).
  6. Sandbox B scores the transaction against its behavioural model and returns a risk assessment tagged to the token.
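
The purpose-scoped token in step 3 can be illustrated with a minimal sketch. It assumes the Bridge issues random tokens and keeps the mapping in a private lookup table, which is one way to get a token with no mathematical relationship to the identity. The `TokenVault` class, the `cust-001` identifier, and the token length are hypothetical and are not taken from the dsa-bridge implementation.

```python
import secrets


class TokenVault:
    """Hypothetical stand-in for the ID Bridge's token store."""

    def __init__(self):
        # (customer_id, purpose) -> token; held only inside the Bridge
        self._by_subject = {}
        # token -> (customer_id, purpose); needed later for controlled re-linkage
        self._by_token = {}

    def issue(self, customer_id: str, purpose: str) -> str:
        key = (customer_id, purpose)
        if key not in self._by_subject:
            # Random rather than derived from the identity, so there is
            # nothing to invert and nothing to correlate across purposes.
            token = "tok_" + secrets.token_hex(8)
            self._by_subject[key] = token
            self._by_token[token] = key
        return self._by_subject[key]


vault = TokenVault()
aml = vault.issue("cust-001", "fraud")
reco = vault.issue("cust-001", "recommendations")

print(aml == vault.issue("cust-001", "fraud"))  # same customer and purpose: stable token
print(aml != reco)                              # different purpose: unrelated token
```

A keyed hash of the customer ID would also yield stable tokens, but anyone holding the key could recompute them. A random token with a stored mapping avoids that, at the cost of making the mapping table the asset to protect.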

At no point does Sandbox B see customer names, account numbers, addresses, or any direct identifier. The model operates on behavioural features and opaque tokens.
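
Step 4 of the flow above can be sketched as an allow-list: the Gateway copies only the fields named in the text into a new record, so anything else is dropped by construction instead of being filtered out. The field names, the `pseudonymise` function, and the sample values are illustrative assumptions; the IBAN is a placeholder.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class PseudonymisedTransaction:
    token: str
    amount: float
    merchant_category: str
    country: str            # generalised to country level
    timestamp: str
    transaction_type: str


def pseudonymise(raw: dict, token: str) -> PseudonymisedTransaction:
    """Build the Sandbox B record from allow-listed fields only."""
    return PseudonymisedTransaction(
        token=token,
        amount=raw["amount"],
        merchant_category=raw["merchant_category"],
        country=raw["country"],
        timestamp=raw["timestamp"],
        transaction_type=raw["transaction_type"],
    )


# Hypothetical core-banking record; the identifying fields never leave the Gateway.
raw = {
    "customer_name": "Marc Schuelke",
    "iban": "DE00 0000 0000 0000 0000 00",   # placeholder, not a real IBAN
    "counterparty_name": "Example GmbH",
    "street_address": "Hauptstrasse 12, Fuerth",
    "amount": 9800.0,
    "merchant_category": "money_transfer",
    "country": "DE",
    "timestamp": "2026-03-23T10:15:00Z",
    "transaction_type": "cross_border_transfer",
}

record = pseudonymise(raw, "tok_m2p8q1r7")
print(asdict(record))
```

An allow-list fails safe: if the core banking system adds a new identifying field later, it does not reach Sandbox B unless someone deliberately adds it to the record type.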

Risk Scoring on Tokens, Not People

The AML model in Sandbox B maintains behavioural profiles keyed to tokens. For tok_m2p8q1r7, the feature store holds:

The model scores this pattern as high risk (0.91). It does not know whether tok_m2p8q1r7 is a student, a business owner, or a politically exposed person. It evaluates the behaviour, not the person.
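
To make scoring on tokens concrete, here is a toy structuring indicator computed from a token-keyed profile. It is a sketch, not the AML model described above: the reporting threshold, the 10% band, the sample amounts, and the in-memory dictionary standing in for the Redis feature store are all assumptions, and the result is a single feature, not the 0.91 risk score.

```python
from collections import defaultdict

REPORTING_THRESHOLD = 10_000.0  # assumed reporting threshold
NEAR_THRESHOLD_BAND = 0.10      # assumption: "just below" means within 10% under it

# token -> list of transaction amounts; stand-in for the feature store
profiles = defaultdict(list)


def record_amount(token: str, amount: float) -> None:
    profiles[token].append(amount)


def structuring_ratio(token: str) -> float:
    """Share of a token's transactions that fall just below the threshold."""
    amounts = profiles[token]
    if not amounts:
        return 0.0
    lower = REPORTING_THRESHOLD * (1 - NEAR_THRESHOLD_BAND)
    near = [a for a in amounts if lower <= a < REPORTING_THRESHOLD]
    return len(near) / len(amounts)


# Hypothetical history for the example token
for amount in (9_500.0, 9_800.0, 9_900.0, 120.0):
    record_amount("tok_m2p8q1r7", amount)

print(structuring_ratio("tok_m2p8q1r7"))  # 3 of 4 transactions are near the threshold: 0.75
```

Nothing in the computation refers to who holds the account. The only key is the token.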

This separation has a regulatory advantage. GDPR Art. 22(1) restricts automated decisions that produce "legal effects" or "similarly significantly affect" individuals. An AML risk score attached to an opaque token is not, by itself, a decision about an identified person. The decision about the person — filing a SAR, freezing an account — happens only after re-linkage, with human involvement.

Re-Linkage for SAR Filing: Four-Eyes Approval

When Sandbox B flags tok_m2p8q1r7 with a risk score above the investigation threshold, the re-linkage protocol activates:

Step 1 — Investigation request: A compliance analyst reviews the pseudonymised behavioural data in Sandbox B's investigation dashboard. They see the transaction patterns, the risk score breakdown, and the model's feature attribution. They determine that a SAR investigation is warranted.

Step 2 — Re-linkage request: The analyst submits a re-linkage request to the ID Bridge, specifying:

Step 3 — Four-eyes approval: The Bridge's policy engine verifies:

Step 4 — Time-limited identity disclosure: The Bridge returns the customer identity with a 30-minute expiry window. The analyst can now complete the SAR with full customer details as required by the Financial Intelligence Unit.

Step 5 — Immutable audit trail: The re-linkage event is logged with: who requested, who approved, which token, which case, timestamp, and expiry. This audit trail satisfies both GDPR accountability (Art. 5(2)) and AMLD6 record-keeping requirements (Art. 40).
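
The five steps above can be sketched as a single guarded lookup. The 30-minute expiry and the audit fields come from the text; the policy check shown (the approver must be a different person from the requester) is only the minimal meaning of "four eyes" and is an assumption about what the Bridge's policy engine verifies. The `IdBridge` and `RelinkRequest` names and all identifiers are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

DISCLOSURE_TTL = timedelta(minutes=30)  # expiry window stated in Step 4


@dataclass(frozen=True)
class RelinkRequest:
    token: str
    case_id: str
    requested_by: str
    approved_by: str


class IdBridge:
    """Hypothetical sketch of the Bridge's re-linkage endpoint."""

    def __init__(self, token_to_identity: dict):
        self._token_to_identity = token_to_identity
        # A plain list here; a real audit trail needs tamper-evident storage.
        self.audit_log = []

    def relink(self, req: RelinkRequest, now: datetime):
        # Four-eyes check (assumed rule): nobody approves their own request.
        if req.approved_by == req.requested_by:
            raise PermissionError("approver must differ from requester")
        identity = self._token_to_identity[req.token]
        expires_at = now + DISCLOSURE_TTL
        self.audit_log.append({
            "requested_by": req.requested_by,
            "approved_by": req.approved_by,
            "token": req.token,
            "case_id": req.case_id,
            "timestamp": now.isoformat(),
            "expires_at": expires_at.isoformat(),
        })
        return identity, expires_at


bridge = IdBridge({"tok_m2p8q1r7": "customer-123"})
now = datetime(2026, 3, 23, 12, 0, tzinfo=timezone.utc)
request = RelinkRequest("tok_m2p8q1r7", "CASE-0001", "analyst-a", "officer-b")
identity, expires_at = bridge.relink(request, now)
print(identity, expires_at.isoformat())
```

A production policy engine would likely also log denied requests and check roles and case status. Those checks are omitted here because the text does not specify them.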

Regulatory Alignment: Three Frameworks, One Architecture

AMLD6 (Anti-Money Laundering Directive 6)

AMLD6 Art. 33 requires obliged entities to report suspicious transactions to their national Financial Intelligence Unit. Art. 40 requires record-keeping of customer due diligence data and transaction records for five years. Art. 42 requires that personal data processed for AML purposes be retained only as long as necessary and subject to appropriate safeguards.

Split-knowledge architecture satisfies all three: the re-linkage protocol enables SAR filing (Art. 33), Sandbox A retains KYC data and Sandbox B retains pseudonymised transaction records for the required period (Art. 40), and the architectural separation is itself the "appropriate safeguard" for retained data (Art. 42).

DORA (Digital Operational Resilience Act)

DORA Art. 6 requires financial entities to maintain an ICT risk management framework. Art. 9 mandates protection and prevention measures. Art. 11 requires ICT response and recovery plans.

The dual-sandbox architecture maps directly to DORA requirements:

| DORA Requirement | Article | How Split-Knowledge Addresses It |
|---|---|---|
| ICT risk management framework | Art. 6 | Each sandbox has a defined risk boundary. Risk assessment is modular: identity risk in A, AI model risk in B, correlation risk in Bridge. |
| Network security management | Art. 9(2) | Kubernetes NetworkPolicies enforce sandbox isolation at the infrastructure layer. Policy-as-code via Helm charts prevents configuration drift. |
| Detection of anomalous activities | Art. 10 | Independent monitoring per sandbox. Cross-sandbox correlation anomalies (timing attacks, access pattern matching) detected at the Bridge. |
| ICT response and recovery | Art. 11 | Each sandbox can be recovered independently. Bridge token rotation is the primary containment mechanism for cross-sandbox incidents. |
| Threat-led penetration testing | Art. 26 | Each sandbox can be pen-tested independently. Testing Sandbox B does not require access to identity data. Reduced scope, focused tests. |
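
As an illustration of the network-isolation row, the sketch below builds a Kubernetes NetworkPolicy manifest that admits ingress to Sandbox B only from the Gateway's namespace. The text says the policies ship as Helm charts; Python is used here only to keep the examples in one language, and the output is JSON, which `kubectl apply -f` accepts. The namespace names `dsa-ai` and `dsa-edge` are assumed to match the component names, and the policy name is hypothetical.

```python
import json


def sandbox_b_ingress_policy() -> dict:
    """NetworkPolicy allowing ingress to Sandbox B only from the Gateway namespace."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "dsa-ai-ingress", "namespace": "dsa-ai"},
        "spec": {
            # Empty selector: the policy applies to every pod in the namespace.
            "podSelector": {},
            "policyTypes": ["Ingress"],
            "ingress": [
                {
                    "from": [
                        {
                            "namespaceSelector": {
                                # Label that Kubernetes sets on each namespace automatically
                                "matchLabels": {"kubernetes.io/metadata.name": "dsa-edge"}
                            }
                        }
                    ]
                }
            ],
        },
    }


print(json.dumps(sandbox_b_ingress_policy(), indent=2))
```

Because a NetworkPolicy that selects a pod denies all ingress not explicitly allowed, traffic from Sandbox A's namespace to Sandbox B is blocked without needing a separate deny rule. Enforcement depends on the cluster's network plugin supporting NetworkPolicy.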

GDPR

For financial services AML processing, three GDPR provisions are critical:

Breach Benefit Analysis: Finance-Specific

Financial services face the most severe regulatory consequences for data breaches. Consider the difference:

Monolithic AML system breach: The attacker obtains customer names, account numbers, transaction histories, AML risk scores, SAR filing status, PEP flags, and beneficial ownership records — all linked. This is a catastrophic breach. GDPR Art. 34 notification to all affected individuals is almost certainly required. The institution faces reputational damage, regulatory sanction, potential AMLD6 violations (disclosure of SAR status is a criminal offence in most EU jurisdictions under tipping-off provisions), and customer lawsuits.

Sandbox B breach: The attacker obtains transaction patterns and AML risk scores tagged to opaque tokens. They see that tok_m2p8q1r7 has a risk score of 0.91 and a pattern consistent with structuring. They cannot determine who this person is. No customer names, no account numbers, no addresses. The breach risk assessment under GDPR Art. 33 is materially different — pseudonymised data with intact safeguards may not trigger individual notification under Art. 34. No tipping-off violation is possible because the SAR filing status is not linked to an identifiable person in the breached system.

Sandbox A breach: The attacker obtains KYC data — names, addresses, ID documents. Serious, but they get no transaction histories, no AML risk scores, no SAR status. Standard identity breach protocols apply. The attacker cannot determine which customers are under investigation.

Bridge breach: The worst case — the attacker can link tokens to identities. Immediate response: rotate all tokens via the Bridge, invalidating old mappings. The Bridge's HSM-protected encryption, network isolation (no internet-facing access), and multi-party key management (2-of-3 quorum for administrative access) make this the hardest target in the architecture.
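
Token rotation as a containment step can be sketched as follows, assuming the Bridge holds a token-to-identity table and replaces every token with a fresh random one. The `rotate_tokens` function and the identifiers are hypothetical. How Sandbox B's stored history is re-keyed after rotation is not described in the text, so the old-to-new map returned here is only one possible approach.

```python
import secrets


def rotate_tokens(token_to_identity: dict):
    """Replace every token with a fresh random one.

    Returns the new token -> identity table and an old -> new map that
    could be used once to re-key pseudonymised history, then destroyed.
    """
    new_mapping = {}
    old_to_new = {}
    for old_token, identity in token_to_identity.items():
        new_token = "tok_" + secrets.token_hex(8)
        new_mapping[new_token] = identity
        old_to_new[old_token] = new_token
    return new_mapping, old_to_new


before = {"tok_m2p8q1r7": "customer-123"}
after, migration = rotate_tokens(before)

print("tok_m2p8q1r7" in after)                 # False: the stolen mapping no longer resolves
print(after[migration["tok_m2p8q1r7"]])        # customer-123
```

The old-to-new map is as sensitive as the mapping table itself, since it lets a holder of the stolen table follow tokens across the rotation. It would need the same protection and a short lifetime.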

The net effect: split-knowledge architecture transforms a single catastrophic breach scenario into three contained scenarios, each with a smaller blast radius and a clearer response playbook.