Everything about how Atlas works, where its data comes from, how it forms its judgments, and how to get the most out of it.
Atlas maps Siemens Energy's publicly disclosed portfolio against cyber-regulation regimes per country. It answers three questions for Irina and the wider SCADA / Comms / Cyber org:
| Data | Source | How it's refreshed |
|---|---|---|
| Countries & tiers | Curated list (SE legal entities + HVDC/Gamesa footprint + extraterritorial regimes) | Manual seed; admins can re-run or promote tier in UI |
| Jurisdictions | Known regulators + frameworks (NCSC, BSI, FERC/NERC, NCA, ANSSI, etc.) | Curated; extensible via admin |
| Portfolio items | siemens-energy.com public product hub - LLM-navigated (no blind crawl) | Admin action "Discover portfolio" |
| Regulation text | Local authoritative baselines (CRA, CAF v3.1/v3.2/v4) + curated summaries (NIS2, NERC CIP, PSTI, NCA OTCC) | Admin action "Ingest local baselines" (~10-15 min) |
| Horizon instruments | Public trackers + government announcements (CRSB, CRA phased deadlines, NERC CIP-015, EU AI Act Annex III, CAF v4 adoption, OTCC v2 draft) | Admin action "Seed regulations" or daily horizon-scan |
| Applicability assertions | Computed: embedding retrieval → rerank → analyst LLM judgment, grounded in regulation chunks | Admin action "Run applicability engine" |
Colour of each country polygon:
Click any country to drill into it. Drag to rotate (auto-rotation pauses for 20s). Scroll / pinch to zoom. A bottom-left legend mirrors the colour meaning.
For a mapped country, Atlas shows:
The computation:
applies → row score = confidence × 1.0partially_applies → row score = confidence × 0.5does_not_apply / uncertain → excludedFor actual preparedness (met / partial / not met per obligation) use the Preparedness Workbook export and fill in the Compliance Status column locally. The workbook's charts and formulas recompute real preparedness as you type.
Instruments that have been announced, drafted, or assigned an effective date within the next 24 months. Examples currently tracked:
Open from the right-hand dock (🌐 icon). Badge number = countries with at least one upcoming instrument. Click an item to jump to that country.
A retrieval-augmented chat scoped to the regulation corpus. Ask things like:
Every factual claim in Atlas's answer is followed by a numeric citation [1], which
maps to a regulation chunk Atlas used as evidence. Click a citation to open the full source text
in a modal.
Atlas will not hallucinate regulation content - if the corpus doesn't cover your question, it will say so and suggest where to look. Questions outside the corpus (weather, personal chat, roleplay, opinions) and classic prompt-injection patterns ("ignore previous instructions…") are politely refused.
Chat responses are rendered with minimal markdown - **bold**, *italic*, `code`, and bullet lists - with inline [N] citations and tappable source cards below each answer.
Three flavours, each with "INDICATIVE · pending review" and the public-data disclaimer on every page:
This is the tool that bridges Atlas (public data) and SE's internal compliance position (confidential data).
What Atlas ships you:
Usage: fill in the Compliance Status column (dropdown: Met / Partially Met / Not Met / N/A / Unknown). All charts, scores, and the dashboard update live. Nothing is uploaded back to Atlas.
Per-product per-country PDF with cover, executive summary, applicable obligations (with Atlas's verdict, confidence, and rationale), verbatim source extracts from the regulation text, a table of obligations that don't apply (with reasons), and a provenance page listing regulation version dates and source URLs. Irina can hand this unchanged to a customer procurement or legal team.
The core Atlas judgment. For each (portfolio item × obligation × country) triple:
bge-reranker-v2-m3analyst role (Gemma 4 26B) for a grounded verdict: applies / partially_applies / does_not_apply / uncertainAt scale we use an embedding-based relevance pre-filter (top-20 most-similar obligations per item, cosine ≥ 0.25) to avoid asking the LLM to judge obviously-irrelevant pairs (e.g. PSTI default-passwords for a subsea transformer).
Runs daily at 03:17 UK time (APScheduler). For each registered horizon source URL:
evidence_chunk_hashes intersects the removed set is marked stale.| Role | Model | Used for |
|---|---|---|
analyst | Gemma 4 26B-A4B-heretic Q8_0 (round-robin across 4 DGX Spark nodes) | Applicability judgment, obligation extraction, Ask-Atlas answers, regulation diff summaries |
coder | Qwen3.6-35B-A3B-heretic (spark-53:8002) | Tool-calling portfolio discovery, structured JSON extraction from siemens-energy.com pages |
embed-large | BGE-large-en-v1.5 (spark-52:8000, 1024-dim) | Chunk embedding for vector retrieval; relevance pre-filter for assertions |
rerank | bge-reranker-v2-m3 (spark-51:8000) | Top-K reranking of retrieved chunks before handing to analyst |
All LLM inference is on-premise (Bill's homelab DGX Spark cluster). No request leaves the
internal network. Thinking mode is explicitly disabled (chat_template_kwargs.enable_thinking = false)
for latency; model temperatures and sampler settings match the fleet's published no-think recipes.
Admins manage users from Actions → Manage users: add, reset passwords, disable, promote/demote, delete.
| Action | What it does | Time |
|---|---|---|
| Seed countries | Idempotent curated seed of 45 countries + 71 jurisdictions | Instant |
| Seed regulations | Curated summaries of NIS2 / CRA / PSTI / NERC CIP / OTCC + horizon instruments | ~15s |
| Ingest local baselines | Full-text ingest of CRA + CAF v3.1 / v3.2 / v4 from the read-only baselines mount - replaces curated summaries with authoritative text, cascade-invalidates dependent assertions | ~10-15 min |
| Discover portfolio | LLM-driven navigation of siemens-energy.com to rediscover portfolio items | ~5-8 min |
| Run applicability engine | Full cross-product with embedding pre-filter → top 20 per item → analyst judgment, 10-way parallel | ~10-20 min |
| Run horizon scan | Fires the daily scheduler manually | ~1-5 min |
Two different concepts, easy to confuse. Atlas today answers one of them:
When v0.3 lands, the globe will have a colour-mode toggle: Regulatory exposure vs SE delivery presence vs Both.
/docs)
Three-pane reader for every regulation Atlas has ingested. Left: instrument
list. Middle: outline built from the document's heading path. Right: clause
text with a language toggle (EN / DE / FR / IT / ES) and in-document semantic
search. Every clause has a copy-link button that yields a permanent URL
like /docs#CAF-v3.1/clause-42 — drop those links into emails or
decks and they open straight to the clause.
Translations are generated on demand, cached on the chunk content hash, and survive future re-ingests. "Show original" toggle keeps the English source one click away.
Separate from regulatory applicability. Tracks where Siemens Energy has publicly-announced project deliveries — HVDC interconnectors, offshore grid connections, gas turbines, transformers, GIS, syncons, hydrogen, grid software. All sources are public (press releases, investor maps, regulator-published project lists). Closes the confusing "country shows nothing" state from v0.2 by answering the other natural question: "does SE actually sell/operate here?".
Actions → 🔮. Pick any horizon instrument and Atlas runs a dry-run cascade: if this instrument came into force today, how many live assertions would go stale, how many portfolio items affected, which in-force instruments overlap with it. Uses embedding similarity over the horizon instrument's summary. Nothing is persisted.
Actions → 🔗. Type a keyword or concept ("supply chain", "incident reporting in 24h", "MFA for remote access") and Atlas returns the semantically equivalent clauses across all ingested regulations. Useful for proving "a control that meets X also covers Y% of Z". Each result deep-links into the Regulation Browser at the exact clause.
Actions → ▶ Play pitch demo. Atlas takes over the camera and flies through UK → Germany → USA → Saudi → India with 50 seconds of voice-over subtitles explaining the value at each stop. Esc to stop. Useful when you want Atlas to tell its own story.
v0.5 shipped 23 features in one pass. None of them replaced existing behaviour - all are additive, reachable from the Actions menu, the header, or keyboard shortcuts. Everything here respects the public-data contract: nothing internal to SE is stored.
| Feature | Where | What it does |
|---|---|---|
| 3D obligation graph | /graph | Force-directed network of obligations, edges between semantically-grouped cross-instrument pairs, coloured by theme. |
| Regulatory Gantt | /gantt | 2020-2030 timeline of every instrument and milestone. Today-line overlay. Purple = active, amber = horizon, dashed = draft. |
| Spec-compare | Actions → Spec-compare | Paste or drop a product spec. Atlas extracts claims, KNN-matches each to regulation clauses, returns obligation map. |
| Obligation → control mapper | Any obligation panel | Click "Controls for this" - maps the obligation to ISO 27001 Annex A / NIST CSF / IEC 62443 candidates. |
| Citation copy | Obligation panel | One-click copy of a formatted citation string to clipboard. |
| TL;DR | Assertion panel | One-line LLM summary of any assertion or clause. |
| Rebuttal generator | Actions → Rebuttal | Paste a client pushback, get a citation-backed counter plus plain-English explanation. |
| Chart wizard | Actions → Chart | Ask for a chart in natural language, get an inline SVG back. |
| Weekly snapshot | Actions → Save snapshot | Point-in-time JSON of state. Lets Irina do reproducible before/after comparisons. |
| Threaded comments | Any target page | Comments (with @mentions) attach to countries, obligations, or instruments. |
| Export sign-off | Export modal | Submit an export for reviewer ack. The approval chain is stored in export_signoffs. |
| Saved views | Actions → Save view | Encodes the current state (country, filters, timeline, dock) into a URL hash for sharing. |
| Model card | Actions → Model card | Public accountability: which LLMs we use, where they run, what for, and calibration state. |
| Transparency report | Actions → Transparency | Quarterly numbers: assertions, calibration agreement %, red-team refusal rate, change events. |
| Payload signing | POST /api/v1/sign | HMAC-SHA256 over any JSON payload for non-repudiation demos. Verify via /verify. |
| Source verifier | Actions → Verify sources | Re-fetches every cited URL and flags 404s, redirects, or content drift vs stored hash. |
| DOCX drag-and-drop | Actions → Drop DOCX | Drop a regulation doc onto Atlas. It extracts text, chunks, embeds, and stages for admin review. |
| In-app feedback | Floating 💬 (bottom-right) | Sends a message straight to Bill's Telegram; also stored in the feedback table. |
| Weekly post composer | Actions → Weekly post | Turns the week's change events into a LinkedIn-ready 3-paragraph post. |
| Voice commands | Shift+V | WebSpeechAPI listener. Spoken phrases route to searches, page jumps, or power-tool actions. |
| Delivery timeline player | Actions → Play timeline | Animated walkthrough of the change-event feed - good for demos. |
| Power tools group | Actions menu | New menu section bundles all the above into one reachable place. |
| Globe tier filter | Sidebar | Hide indexed / watchlist tiers to declutter the globe for presentations. |
Forecast vs actual: ~110 human-hours forecast, ~10.5 minutes Rex-time actual. Compression is above the usual 1h=1min baseline because v0.5 reuses v0.4's routing, modal, dock, and action-menu scaffolding - this is surface work, not new architecture.
Atlas's corpus landed in real-content shape on 2026-04-24. Before this release, only the CAF UK baselines had full text; everything else was 3–6 chunks of curated summary. Now ~30 instruments carry real clause text, and every EU member state Atlas tracks has proper applicability coverage (not just Germany).
| Metric | Before | After |
|---|---|---|
| Instruments with real content | CAF only | ~30 |
| Total obligations | ~1,440 | 6,266 |
| Active assertions | ~1,440 | 2,667 |
| EU member states with coverage | 1 (DEU, against a duplicate stub) | 14 |
| Canonical instruments (deduped) | mixed duplicates | 49 unique |
Noteworthy architecture shifts:
countries.py UNIONs
EU-wide instruments into any member state's view via an
EU_MEMBER_ISO3 constant. See memory-api decision
#55 for the rationale./docs. Regulations
whose authoritative text can't be programmatically fetched
(ANEEL-964 behind Cloudflare, KISA-ISMS-P behind Java form-POSTs,
SOCI-RMP by design) now carry a visible red or amber banner
explaining the access limitation and the resolution path./graph, narrates the network + auto-types a
search, then to /gantt and auto-zooms to 2026,
then back to the globe for the outro.http://192.168.0.251:8080/fetch
with {url, country, session, preload_url, verify_tls}
to grab the content through a country-selected Proton WG exit.
Brazil ANEEL is unwinnable via VPN (Cloudflare blanket-blocks
all datacentre/VPN IPs); residential fetch only.Track key third-party suppliers and which portfolio items they feed into. Atlas joins this against active applicability assertions to produce the supply-chain risk matrix: for every supplier × country × product, how many regulatory obligations fire. The hotspots surface first.
Access via Actions → 🏭 Suppliers + risk matrix. Every supplier carries:
The risk matrix respects the markets filter — a UK-only supplier shows only UK rows, even if the product it feeds is deployed in 10 countries.
Atlas can ingest a regulation end-to-end — fetch the authoritative source, parse it, translate if needed, chunk, embed, extract obligations — from a single admin endpoint. This turns "we have 5 chunks of PSTI" into "we have 262 chunks of CRA with 434 extracted obligations" in one pass.
Sources live in a research manifest (one JSON file per
region at /app/data/research/manifest-*.json) listing
authoritative URL, fallback URLs, format, language, and known gotchas
per instrument. The pipeline is driven by the manifests — adding a
regulation is a manifest edit plus one instrument-placement mapping
in stub_ingest.py.
Endpoints (Actions → Admin → Stub ingestion):
POST /admin/stubs/fetch — download + parse +
stage. Idempotent; skips already-staged files. Short_codes query
param filters.POST /admin/stubs/ingest — process staged files:
translate non-English via Gemma 4/Qwen3.6, chunk, embed, extract
obligations. Requires fetch first.POST /admin/stubs/full — fetch then ingest in one go.
Used by the overnight cron.Known limitations per source:
/app/data/stub_staging/<short_code>/source.bin.publications.europa.eu/resource/celex/... which serves
clean XHTML.verify=False quirk handles these.Actions → ▶ Play pitch demo — a 60-90 second auto-narrated walkthrough. Camera flies between countries, a UK-English voice reads the captions, and the obligation counts in the script are pulled live from the corpus so they never go stale.
Controls bottom-right while playing: ⏸ pause (Space), ⏭ skip to next scene (→), ✕ stop (Esc). Scene progress shows N/12.
The underlying Narrated Tour player is a zero-dep reusable
pattern — drop-in files at /files/appdata/config/shared/narrated-tour/
for any other project that wants the same behaviour.
Maintained by Bill for Irina · all information used is available to the general public, nothing internal to Siemens Energy.