Irina's Atlas ← Back to Atlas

Guide

Everything about how Atlas works, where its data comes from, how it forms its judgments, and how to get the most out of it.

What Atlas is

Atlas maps Siemens Energy's publicly disclosed portfolio against cyber-regulation regimes per country. It answers three questions for Irina and the wider SCADA / Comms / Cyber org:

Where the data comes from

DataSourceHow it's refreshed
Countries & tiersCurated list (SE legal entities + HVDC/Gamesa footprint + extraterritorial regimes)Manual seed; admins can re-run or promote tier in UI
JurisdictionsKnown regulators + frameworks (NCSC, BSI, FERC/NERC, NCA, ANSSI, etc.)Curated; extensible via admin
Portfolio itemssiemens-energy.com public product hub - LLM-navigated (no blind crawl)Admin action "Discover portfolio"
Regulation textLocal authoritative baselines (CRA, CAF v3.1/v3.2/v4) + curated summaries (NIS2, NERC CIP, PSTI, NCA OTCC)Admin action "Ingest local baselines" (~10-15 min)
Horizon instrumentsPublic trackers + government announcements (CRSB, CRA phased deadlines, NERC CIP-015, EU AI Act Annex III, CAF v4 adoption, OTCC v2 draft)Admin action "Seed regulations" or daily horizon-scan
Applicability assertionsComputed: embedding retrieval → rerank → analyst LLM judgment, grounded in regulation chunksAdmin action "Run applicability engine"

The globe

Colour of each country polygon:

Click any country to drill into it. Drag to rotate (auto-rotation pauses for 20s). Scroll / pinch to zoom. A bottom-left legend mirrors the colour meaning.

Country panel (left sidebar)

For a mapped country, Atlas shows:

Regulatory exposure score

The computation:

For actual preparedness (met / partial / not met per obligation) use the Preparedness Workbook export and fill in the Compliance Status column locally. The workbook's charts and formulas recompute real preparedness as you type.

Horizon radar

Instruments that have been announced, drafted, or assigned an effective date within the next 24 months. Examples currently tracked:

Open from the right-hand dock (🌐 icon). Badge number = countries with at least one upcoming instrument. Click an item to jump to that country.

Ask Atlas (chat)

A retrieval-augmented chat scoped to the regulation corpus. Ask things like:

Every factual claim in Atlas's answer is followed by a numeric citation [1], which maps to a regulation chunk Atlas used as evidence. Click a citation to open the full source text in a modal.

Atlas will not hallucinate regulation content - if the corpus doesn't cover your question, it will say so and suggest where to look. Questions outside the corpus (weather, personal chat, roleplay, opinions) and classic prompt-injection patterns ("ignore previous instructions…") are politely refused.

Chat responses are rendered with minimal markdown - **bold**, *italic*, `code`, and bullet lists - with inline [N] citations and tappable source cards below each answer.

Exports

Three flavours, each with "INDICATIVE · pending review" and the public-data disclaimer on every page:

Preparedness Workbook

This is the tool that bridges Atlas (public data) and SE's internal compliance position (confidential data).

What Atlas ships you:

Usage: fill in the Compliance Status column (dropdown: Met / Partially Met / Not Met / N/A / Unknown). All charts, scores, and the dashboard update live. Nothing is uploaded back to Atlas.

Evidence packs

Per-product per-country PDF with cover, executive summary, applicable obligations (with Atlas's verdict, confidence, and rationale), verbatim source extracts from the regulation text, a table of obligations that don't apply (with reasons), and a provenance page listing regulation version dates and source URLs. Irina can hand this unchanged to a customer procurement or legal team.

Applicability assertions

The core Atlas judgment. For each (portfolio item × obligation × country) triple:

  1. Embed the obligation text (BGE-large, 1024-dim)
  2. Retrieve top candidate chunks from the regulation corpus via vector kNN (sqlite-vec)
  3. Rerank with bge-reranker-v2-m3
  4. Ask the analyst role (Gemma 4 26B) for a grounded verdict: applies / partially_applies / does_not_apply / uncertain
  5. Require numeric citations into the evidence chunks; store confidence + rationale
  6. Freeze the evidence chunk hashes so that if the regulation later changes, Atlas can identify every assertion whose evidence moved

At scale we use an embedding-based relevance pre-filter (top-20 most-similar obligations per item, cosine ≥ 0.25) to avoid asking the LLM to judge obviously-irrelevant pairs (e.g. PSTI default-passwords for a subsea transformer).

Change cascade (horizon scan)

Runs daily at 03:17 UK time (APScheduler). For each registered horizon source URL:

  1. HEAD → compare ETag / Last-Modified. Unchanged → bail.
  2. Fetch body → SHA256 hash. Unchanged → bail.
  3. Re-chunk + per-chunk embed diff (cosine < 0.98). Pinpoint what moved.
  4. LLM summary of the diff → change_events row.
  5. Any assertion whose frozen evidence_chunk_hashes intersects the removed set is marked stale.
  6. Telegram digest is sent to Bill.

How the LLMs are used

RoleModelUsed for
analystGemma 4 26B-A4B-heretic Q8_0 (round-robin across 4 DGX Spark nodes)Applicability judgment, obligation extraction, Ask-Atlas answers, regulation diff summaries
coderQwen3.6-35B-A3B-heretic (spark-53:8002)Tool-calling portfolio discovery, structured JSON extraction from siemens-energy.com pages
embed-largeBGE-large-en-v1.5 (spark-52:8000, 1024-dim)Chunk embedding for vector retrieval; relevance pre-filter for assertions
rerankbge-reranker-v2-m3 (spark-51:8000)Top-K reranking of retrieved chunks before handing to analyst

All LLM inference is on-premise (Bill's homelab DGX Spark cluster). No request leaves the internal network. Thinking mode is explicitly disabled (chat_template_kwargs.enable_thinking = false) for latency; model temperatures and sampler settings match the fleet's published no-think recipes.

Users & roles

Admins manage users from Actions → Manage users: add, reset passwords, disable, promote/demote, delete.

Admin actions (what each button does)

ActionWhat it doesTime
Seed countriesIdempotent curated seed of 45 countries + 71 jurisdictionsInstant
Seed regulationsCurated summaries of NIS2 / CRA / PSTI / NERC CIP / OTCC + horizon instruments~15s
Ingest local baselinesFull-text ingest of CRA + CAF v3.1 / v3.2 / v4 from the read-only baselines mount - replaces curated summaries with authoritative text, cascade-invalidates dependent assertions~10-15 min
Discover portfolioLLM-driven navigation of siemens-energy.com to rediscover portfolio items~5-8 min
Run applicability engineFull cross-product with embedding pre-filter → top 20 per item → analyst judgment, 10-way parallel~10-20 min
Run horizon scanFires the daily scheduler manually~1-5 min

Applicability vs commercial presence

Two different concepts, easy to confuse. Atlas today answers one of them:

When v0.3 lands, the globe will have a colour-mode toggle: Regulatory exposure vs SE delivery presence vs Both.

Regulation browser (/docs)

Three-pane reader for every regulation Atlas has ingested. Left: instrument list. Middle: outline built from the document's heading path. Right: clause text with a language toggle (EN / DE / FR / IT / ES) and in-document semantic search. Every clause has a copy-link button that yields a permanent URL like /docs#CAF-v3.1/clause-42 — drop those links into emails or decks and they open straight to the clause.

Translations are generated on demand, cached on the chunk content hash, and survive future re-ingests. "Show original" toggle keeps the English source one click away.

Delivery Footprint Layer

Separate from regulatory applicability. Tracks where Siemens Energy has publicly-announced project deliveries — HVDC interconnectors, offshore grid connections, gas turbines, transformers, GIS, syncons, hydrogen, grid software. All sources are public (press releases, investor maps, regulator-published project lists). Closes the confusing "country shows nothing" state from v0.2 by answering the other natural question: "does SE actually sell/operate here?".

Scenario simulator

Actions → 🔮. Pick any horizon instrument and Atlas runs a dry-run cascade: if this instrument came into force today, how many live assertions would go stale, how many portfolio items affected, which in-force instruments overlap with it. Uses embedding similarity over the horizon instrument's summary. Nothing is persisted.

Obligation cross-walk

Actions → 🔗. Type a keyword or concept ("supply chain", "incident reporting in 24h", "MFA for remote access") and Atlas returns the semantically equivalent clauses across all ingested regulations. Useful for proving "a control that meets X also covers Y% of Z". Each result deep-links into the Regulation Browser at the exact clause.

Auto-narrated pitch demo

Actions → ▶ Play pitch demo. Atlas takes over the camera and flies through UK → Germany → USA → Saudi → India with 50 seconds of voice-over subtitles explaining the value at each stop. Esc to stop. Useful when you want Atlas to tell its own story.

v0.5 show-stopper pack

v0.5 shipped 23 features in one pass. None of them replaced existing behaviour - all are additive, reachable from the Actions menu, the header, or keyboard shortcuts. Everything here respects the public-data contract: nothing internal to SE is stored.

FeatureWhereWhat it does
3D obligation graph/graphForce-directed network of obligations, edges between semantically-grouped cross-instrument pairs, coloured by theme.
Regulatory Gantt/gantt2020-2030 timeline of every instrument and milestone. Today-line overlay. Purple = active, amber = horizon, dashed = draft.
Spec-compareActions → Spec-comparePaste or drop a product spec. Atlas extracts claims, KNN-matches each to regulation clauses, returns obligation map.
Obligation → control mapperAny obligation panelClick "Controls for this" - maps the obligation to ISO 27001 Annex A / NIST CSF / IEC 62443 candidates.
Citation copyObligation panelOne-click copy of a formatted citation string to clipboard.
TL;DRAssertion panelOne-line LLM summary of any assertion or clause.
Rebuttal generatorActions → RebuttalPaste a client pushback, get a citation-backed counter plus plain-English explanation.
Chart wizardActions → ChartAsk for a chart in natural language, get an inline SVG back.
Weekly snapshotActions → Save snapshotPoint-in-time JSON of state. Lets Irina do reproducible before/after comparisons.
Threaded commentsAny target pageComments (with @mentions) attach to countries, obligations, or instruments.
Export sign-offExport modalSubmit an export for reviewer ack. The approval chain is stored in export_signoffs.
Saved viewsActions → Save viewEncodes the current state (country, filters, timeline, dock) into a URL hash for sharing.
Model cardActions → Model cardPublic accountability: which LLMs we use, where they run, what for, and calibration state.
Transparency reportActions → TransparencyQuarterly numbers: assertions, calibration agreement %, red-team refusal rate, change events.
Payload signingPOST /api/v1/signHMAC-SHA256 over any JSON payload for non-repudiation demos. Verify via /verify.
Source verifierActions → Verify sourcesRe-fetches every cited URL and flags 404s, redirects, or content drift vs stored hash.
DOCX drag-and-dropActions → Drop DOCXDrop a regulation doc onto Atlas. It extracts text, chunks, embeds, and stages for admin review.
In-app feedbackFloating 💬 (bottom-right)Sends a message straight to Bill's Telegram; also stored in the feedback table.
Weekly post composerActions → Weekly postTurns the week's change events into a LinkedIn-ready 3-paragraph post.
Voice commandsShift+VWebSpeechAPI listener. Spoken phrases route to searches, page jumps, or power-tool actions.
Delivery timeline playerActions → Play timelineAnimated walkthrough of the change-event feed - good for demos.
Power tools groupActions menuNew menu section bundles all the above into one reachable place.
Globe tier filterSidebarHide indexed / watchlist tiers to declutter the globe for presentations.

Forecast vs actual: ~110 human-hours forecast, ~10.5 minutes Rex-time actual. Compression is above the usual 1h=1min baseline because v0.5 reuses v0.4's routing, modal, dock, and action-menu scaffolding - this is surface work, not new architecture.

v0.6.0 — corpus consolidation

Atlas's corpus landed in real-content shape on 2026-04-24. Before this release, only the CAF UK baselines had full text; everything else was 3–6 chunks of curated summary. Now ~30 instruments carry real clause text, and every EU member state Atlas tracks has proper applicability coverage (not just Germany).

MetricBeforeAfter
Instruments with real contentCAF only~30
Total obligations~1,4406,266
Active assertions~1,4402,667
EU member states with coverage1 (DEU, against a duplicate stub)14
Canonical instruments (deduped)mixed duplicates49 unique

Noteworthy architecture shifts:

Suppliers + risk matrix

Track key third-party suppliers and which portfolio items they feed into. Atlas joins this against active applicability assertions to produce the supply-chain risk matrix: for every supplier × country × product, how many regulatory obligations fire. The hotspots surface first.

Access via Actions → 🏭 Suppliers + risk matrix. Every supplier carries:

The risk matrix respects the markets filter — a UK-only supplier shows only UK rows, even if the product it feeds is deployed in 10 countries.

Stub ingestion pipeline

Atlas can ingest a regulation end-to-end — fetch the authoritative source, parse it, translate if needed, chunk, embed, extract obligations — from a single admin endpoint. This turns "we have 5 chunks of PSTI" into "we have 262 chunks of CRA with 434 extracted obligations" in one pass.

Sources live in a research manifest (one JSON file per region at /app/data/research/manifest-*.json) listing authoritative URL, fallback URLs, format, language, and known gotchas per instrument. The pipeline is driven by the manifests — adding a regulation is a manifest edit plus one instrument-placement mapping in stub_ingest.py.

Endpoints (Actions → Admin → Stub ingestion):

Known limitations per source:

Automatic tour (pitch demo)

Actions → ▶ Play pitch demo — a 60-90 second auto-narrated walkthrough. Camera flies between countries, a UK-English voice reads the captions, and the obligation counts in the script are pulled live from the corpus so they never go stale.

Controls bottom-right while playing: ⏸ pause (Space), ⏭ skip to next scene (→), ✕ stop (Esc). Scene progress shows N/12.

The underlying Narrated Tour player is a zero-dep reusable pattern — drop-in files at /files/appdata/config/shared/narrated-tour/ for any other project that wants the same behaviour.

Non-goals

Maintained by Bill for Irina · all information used is available to the general public, nothing internal to Siemens Energy.