DSA Scanner

What does your data actually expose?

The DSA Scanner is the entry product for The Veil Core. Run it on your real data to see exactly what PII your AI pipelines are leaking: names, addresses, IBANs, health data, and quasi-identifiers. It is the conversation starter that opens the scoped assessment for The Veil Core.

The Visibility Gap

Most organisations don't know what PII their AI systems actually see. They assume their data is clean, or rely on regex patterns that miss context-dependent identifiers. The gap between what you think is exposed and what actually is exposed creates compliance risk you can't quantify.

Without visibility into what your AI pipelines ingest, you can't assess risk, prove compliance, or prioritise remediation. You need a scan of your real data — not a theoretical assessment.

How the Scanner Works

Step | What Happens | Where
1. Provide Data | Representative records from your systems: CSV, JSON, or ServiceNow export. Data stays on your infrastructure, never uploaded to external services. | Your Environment
2. Three-Layer Detection | Scanner runs known-entity matching (Cologne phonetics + Levenshtein), NER detection (Presidio with spaCy DE/EN), and optionally LLM PII Shield (fine-tuned Qwen 2.5 7B) in parallel. | Scanner Engine
3. Exposure Report | Detailed report showing every PII element found, confidence scores, field-level risk assessment, and remediation recommendations. | HTML Report

Three Layers of Detection

Layer 1 — Known-Entity Matching

Cologne phonetics and Levenshtein distance matching against known identities. Catches name variations, misspellings, and phonetic equivalents that exact-string matching misses.
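
To illustrate the idea behind Layer 1, here is a minimal Python sketch of Cologne phonetics combined with Levenshtein distance. It is not the Scanner's implementation. The function names, the `max_distance` threshold, and the rule "match on equal phonetic code or small edit distance" are assumptions made for the example.

```python
from typing import Optional


def cologne_phonetics(name: str) -> str:
    """Encode a name with Cologne phonetics (Kölner Phonetik)."""
    # upper() turns "ß" into "SS"; umlauts are folded to their base vowels.
    text = name.upper().replace("Ä", "A").replace("Ö", "O").replace("Ü", "U")
    # Simplification: anything outside A-Z (spaces, hyphens, other accents) is dropped.
    letters = [c for c in text if "A" <= c <= "Z"]

    raw = []
    for i, c in enumerate(letters):
        prev = letters[i - 1] if i > 0 else ""
        nxt = letters[i + 1] if i + 1 < len(letters) else ""
        if c in "AEIJOUY":
            raw.append("0")
        elif c == "H":
            continue  # H has no code of its own
        elif c == "B":
            raw.append("1")
        elif c == "P":
            raw.append("3" if nxt == "H" else "1")
        elif c in "DT":
            raw.append("8" if nxt in set("CSZ") else "2")
        elif c in "FVW":
            raw.append("3")
        elif c in "GKQ":
            raw.append("4")
        elif c == "C":
            if i == 0:
                raw.append("4" if nxt in set("AHKLOQRUX") else "8")
            elif prev in set("SZ"):
                raw.append("8")
            else:
                raw.append("4" if nxt in set("AHKOQUX") else "8")
        elif c == "X":
            raw.append("8" if prev in set("CKQ") else "48")
        elif c == "L":
            raw.append("5")
        elif c in "MN":
            raw.append("6")
        elif c == "R":
            raw.append("7")
        elif c in "SZ":
            raw.append("8")

    # Collapse runs of the same digit, then drop every "0" except a leading one.
    collapsed = []
    for digit in "".join(raw):
        if not collapsed or collapsed[-1] != digit:
            collapsed.append(digit)
    return "".join(d for i, d in enumerate(collapsed) if d != "0" or i == 0)


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(previous[j] + 1,                # deletion
                               current[j - 1] + 1,             # insertion
                               previous[j - 1] + (ca != cb)))  # substitution
        previous = current
    return previous[-1]


def match_known_entity(candidate: str, known_names: list[str],
                       max_distance: int = 1) -> Optional[str]:
    """Return the first known name that sounds alike or is within max_distance edits."""
    code = cologne_phonetics(candidate)
    for known in known_names:
        if code and code == cologne_phonetics(known):
            return known
        if levenshtein(candidate.casefold(), known.casefold()) <= max_distance:
            return known
    return None
```

With a hypothetical known-identity list of `["Meyer", "Schmidt"]`, both `match_known_entity("Maier", ...)` and `match_known_entity("Schmitd", ...)` resolve to the known spelling, while an unrelated name such as "Becker" returns `None`.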

Layer 2 — NER Detection

Presidio with spaCy DE/EN models plus custom German recognisers. Detects IBANs, tax IDs, health insurance numbers, addresses, and standard PII categories.
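
As an illustration of what a custom German recogniser might check, the sketch below finds German IBANs with a regular expression and validates them with the standard mod-97 checksum. It uses only the Python standard library and does not show Presidio's API. The pattern and the function names are assumptions made for the example.

```python
import re

# German IBANs: "DE", two check digits, 18 further digits, optionally grouped in fours.
IBAN_DE = re.compile(r"\bDE\d{2}(?:\s?\d{4}){4}\s?\d{2}\b")


def iban_checksum_ok(iban: str) -> bool:
    """Validate an IBAN with the mod-97 check."""
    compact = re.sub(r"\s", "", iban).upper()
    # Move country code and check digits to the end, then map letters to 10..35.
    rearranged = compact[4:] + compact[:4]
    numeric = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(numeric) % 97 == 1


def find_german_ibans(text: str) -> list[tuple[int, int, str]]:
    """Return (start, end, normalised IBAN) for every checksum-valid match."""
    hits = []
    for match in IBAN_DE.finditer(text):
        if iban_checksum_ok(match.group()):
            hits.append((match.start(), match.end(),
                         re.sub(r"\s", "", match.group())))
    return hits
```

Checking the checksum after the pattern match filters out order numbers and reference codes that merely look like IBANs, which a pattern-only rule would flag.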

Layer 3 — LLM PII Shield

Fine-tuned Qwen 2.5 7B running on a dedicated Ollama instance. Catches context-dependent PII that rule-based systems miss. Runs in parallel with Layer 2 for ensemble detection.
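
One way to picture running two detection layers in parallel and combining their output is sketched below. The `Finding` type, the detector signature, and the merge rule (overlapping spans of the same entity type are merged and keep the highest score) are all hypothetical; the Scanner's actual ensemble logic is not described here.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Finding:
    start: int
    end: int
    entity_type: str
    score: float
    source: str


# A detector is any callable that maps text to findings (hypothetical interface).
Detector = Callable[[str], list[Finding]]


def run_ensemble(text: str, detectors: list[Detector]) -> list[Finding]:
    """Run all detectors concurrently and merge overlapping same-type findings."""
    if not detectors:
        return []
    with ThreadPoolExecutor(max_workers=len(detectors)) as pool:
        results = list(pool.map(lambda detect: detect(text), detectors))

    # Sorting by type first puts overlapping same-type spans next to each other.
    findings = sorted((f for r in results for f in r),
                      key=lambda f: (f.entity_type, f.start, f.end))
    merged: list[Finding] = []
    for f in findings:
        last = merged[-1] if merged else None
        if last and last.entity_type == f.entity_type and f.start < last.end:
            merged[-1] = Finding(last.start, max(last.end, f.end), f.entity_type,
                                 max(last.score, f.score),
                                 f"{last.source}+{f.source}")
        else:
            merged.append(f)
    return sorted(merged, key=lambda f: (f.start, f.end))
```

Keeping the `source` of each merged finding makes it possible to see in a report which layer contributed to a detection.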

Layers 1–2 run in the initial scan; Layer 3 (LLM PII Shield) runs inside the full Assessment engagement for The Veil Core. The DSA Scanner is not a self-serve SaaS — every scan is scoped via email before it starts.

Book an Assessment