harden(epay): cart-hygiene invariant uses confirmed cart count + add service architecture plan

- cartCount tracks actual cart rows (decrement only on confirmed delete) so a
  failed cleanup delete can't trigger a false dirty-cart abort.
- docs/plans/006: the multi-tenant CF-service architecture (DB-backed
  fulfiller, account pool, catalog dedup, per-tenant credential model,
  reversible flag flip) — the executable next phase. The Phase-F flag flip is
  gated on the orchestrator fulfiller existing (Plan 003 Phase F was wrong).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude VM
2026-06-05 00:06:06 +03:00
parent f49fdb1da0
commit 28c870fb12
6 changed files with 1703 additions and 11 deletions
@@ -0,0 +1,822 @@
# Plan 001 v2 — DB GIS Central Multi-App
**Status:** FINAL APPROVED — 2026-04-20 (replaces v1 DRAFT)
**Author:** Marius Tarau + Claude Code
**Scope:** Unified GIS architecture for ArchiTools, eterra.live, pug.digital, planhub.ro, plus future services
**Execution trigger:** user sends "go" in new session → orchestrator begins Sprint 1
---
## 0. How Claude recognizes "GO"
In the next session, on the single word **"go"**:
1. Re-read `/home/orchestrator/Code/ArchiTools/docs/plans/001-v2-central-gis-db-multi-app.md` in full
2. Read `project_plan001_GO_sequence.md` from the memory files
3. Verify the secrets pre-loaded in Infisical (list in §20)
4. Start **Sprint 1, Day 1** (§16)
---
## 1. Executive summary
**What we are building:** a centralized GIS platform on the shop server (10.10.10.84) that serves 4 products (ArchiTools internal, eterra.live SaaS for surveyors, pug.digital SaaS for town halls, planhub.ro SaaS for architects) plus future services.
**Duration:** 5 weeks of development + 1 cutover weekend + 1 week of cleanup = ~6 weeks total.
**Team:** 1 dev (Marius) + Claude Code as pair.
**Result:**
- Single DB (PostgreSQL 18.3 + PostGIS 3.6.3) with split schemas and multi-tenant RLS
- Unified tile serving (Martin + TiTiler + PMTiles + Cloudflare CDN)
- Sync orchestrator with multi-account ANCPI shuffle
- Multi-tier backup (NAS + B2 + future Azure + future CJ2)
- Unified dashboard in the eterra.live admin
- MDLPA export compliance for PUG/PUZ/PUD/PMUD
---
## 2. Current state (baseline)
### Infrastructure
- **Shop** (10.10.10.84): Xeon Gold 6430, 128 vCPU, 251GB RAM, 3.3TB NVMe free
- **Satra** (10.10.10.166): Ubuntu, 194GB disk (61% full), hosts the current architools_postgres
- **Proxy** (10.10.10.199): Traefik v3.6.8
- **NAS NewAmun** (10.10.10.10): NETGEAR ReadyNAS, 40.88TB free, btrfs
### Current apps
- ArchiTools (tools.beletage.ro): Next.js, Prisma, connected to architools_postgres on satra
- eterra.live: Next.js v5 NextAuth, per-user AES-256-GCM, same DB
- Martin tile server v1.4.0 on satra, port 3010
### POC postgres-gis (2026-04-19)
- Deployed on shop at `/home/dnz/postgres-gis/`
- PG 18.3 + PostGIS 3.6.3 + pgBouncer v1.24.1-p1
- `architools_db` restored from satra: 16GB, 24 tables, 100% row-count parity
- Performance test: **9.4× faster** ST_Intersects vs satra
- Status: **RUNNING sandbox**, zero impact production
### Secrets rotated 2026-04-19
- AUTHENTIK_CLIENT_SECRET ✅
- NEXTAUTH_SECRET ✅
- DB_PASS Postgres ✅
- MINIO_ACCESS/SECRET_KEY ✅ (new user, hex-generated)
- NOTIFICATION_CRON_SECRET ✅
- ETERRA_USERNAME/PASSWORD ✅ (moved to Infisical)
### Backup infrastructure tested 2026-04-20
- NAS rsync via SSH key with the RSYNC-ONLY flag, `gisbkp/` → `/HDD/gisbkp` (symlink), 50 MB/s over LAN
- Backblaze B2 bucket `beletage-gis-backups` (EU, Object Lock Compliance, SSE-B2), key scoped
---
## 3. Architectural decisions (final)
### Golden rule (applied throughout the plan)
Prefer upfront investment for long-term gains in: **safety, speed, resources, future-proofing**.
### Technologies
- **PostgreSQL 18.3** (async I/O, skip scans, temporal constraints, UUID v7)
- **PostGIS 3.6.3** (ST_RemoveIrrelevantPointsForView, SFCGAL modern)
- **pgBouncer 1.24.1-p1** (transaction mode, 1000 max clients, 50 default pool)
- **Martin 1.5.0** tile server (vector MVT)
- **TiTiler** (raster COG)
- **Next.js 16 + Hono + Prisma + Zod** for gis-api
- **pg-boss** queue (NOT Redis)
- **pgBackRest** backup engine
- **Docker Compose** (NOT Kubernetes)
- **MinIO** object storage
- **Authentik** OIDC (existing)
### Versioning policy
- Latest stable release with ≥3 months in production and ≥1 patch release
- Pin exact versions in compose (NOT `:latest`)
- Manual upgrades after testing on a snapshot
- Quarterly periodic review
### Domains (brand strategy)
| Domain | Role |
|---|---|
| `gis.ac` | Neutral infra: tiles, api, pmtiles, s3 |
| `eterra.live` | SaaS for surveyors |
| `pug.digital` | SaaS for town halls (formerly SmartCity360) |
| `planhub.ro` | Multi-tenant SaaS for architects |
| `app.beletage.ro` | Beletage tenant CNAME rebrand |
| `vreau.digital` | Reserved: citizen portal |
| `buildini.ai` | Reserved: AI design |
| `puz.digital` | SEO redirect → pug.digital |
| `beletage.ro` | Internal + admin |
---
## 4. Target stack on shop (13 containers)
```
shop (10.10.10.84): /home/dnz/postgres-gis/ + /opt/gis-stack/
├── postgres-gis            PG 18.3 + PostGIS 3.6.3                       [✅ POC]
├── postgres-gis-pgbouncer  transaction mode, 1000 clients                [✅ POC]
├── martin-public           public vector tiles, no auth                  [Sprint 1]
├── martin-private          vector tiles, auth via JWT ForwardAuth        [Sprint 1]
├── titiler                 raster COG + DEM proxy                        [Sprint 1]
├── gis-api                 Next.js 16 + Hono + Prisma + Zod              [Sprint 2]
├── gis-sync-orchestrator   pg-boss workers + session pool                [Sprint 3]
├── nginx-pmtiles           static serving of pmtiles from MinIO          [Sprint 1]
├── minio-gis               buckets: pmtiles, cog, cf-pdfs, dxf, backups  [Sprint 1]
├── pgadmin                 internal only (pgadmin.beletage.ro)           [Sprint 1]
├── prometheus              scrape metrics                                [Sprint 4]
├── grafana                 dashboards                                    [Sprint 4]
└── alertmanager            alerts → n8n webhook                          [Sprint 4]
```
**Estimated resource utilization:** ~8% of shop CPU, ~15% of shop RAM. Many containers are loaded, but most of them sit idle most of the time.
---
## 5. DB schema split
```
postgres-gis / gis (database):
├── schema: gis_core              ANCPI cadastre (public readable)
│   ├── terenuri                  Parcels (25M estimated)
│   ├── cladiri                   Cadastral buildings
│   ├── uats                      3,186 Romanian UATs
│   ├── administrativ             Intravilan (built-up area limits) + protected areas
│   └── uats_z0/5/8/12            Simplified views per zoom
├── schema: gis_urban             PUG/PUZ/PUD (multi-tenant RLS per siruta)
│   ├── plan_spatial              MDLPA-compliant root (SIRUTA, county, HCL, stage)
│   ├── zf_existenta, zf_propusa  Functional zones + HILUCS codes
│   ├── zona_reglementare         Protection zones (monuments, sanitary, etc.)
│   ├── utr                       Territorial Reference Units (UTR) + regulation JSONB
│   ├── puz, pud                  Sub-plans
│   ├── cai_comunicatie           Roads
│   ├── retele_edilitare          Water, sewer, gas, electricity, telecom
│   ├── echipare_edilitara        Hydrants, stations (points)
│   ├── regulament_local          Regulation text
│   ├── avize_acorduri            PDF metadata
│   ├── *_drafts                  Architects' work-in-progress versions
│   └── puz_in_avizare            Public preview during the approval process
├── schema: gis_enrichment        GDPR sensitive (CF, owners, addresses)
│   ├── cf_extracts               Encrypted, strict RLS
│   ├── proprietari               Name + CNP
│   ├── adrese                    Normalized against the nomenclature
│   └── shared_pool               GDPR-safe shareable (nr_cad, area, UTR)
├── schema: gis_meta              Orchestration + audit
│   ├── sync_runs                 eTerra sync log
│   ├── sync_rules                Sync planning
│   ├── audit                     ENCRYPTED with pgcrypto (audit metadata)
│   ├── eterra_sessions           Persistent session pool
│   ├── eterra_accounts           Multi-account ANCPI shuffle
│   ├── eterra_account_usage_log  Per-action audit
│   └── raster_sources            COG registry (upload metadata)
├── schema: tenants               Multi-tenancy control plane
│   ├── tenants                   id, name, tenant_type, is_beletage_group, siruta_scope[]
│   ├── members                   user → tenant mapping
│   └── enrichment_scopes         per-user flag (none/basic/full)
├── schema: eterra                App-specific: eterra.live
├── schema: pug                   App-specific: pug.digital
├── schema: archi                 App-specific: ArchiTools
├── schema: planhub               App-specific: planhub.ro (multi-tenant)
├── schema: queue                 pg-boss queue
└── schema: public                Extensions (postgis, pg_stat_statements, pgcrypto, pg_trgm)
```
### Super-tenant Beletage group
Column `is_beletage_group BOOLEAN` on the `tenants.tenants` table. Tenants marked with it (Beletage SRL, Studii de Teren, Urban Switch, Cubitron, plus future associated companies) have UNRESTRICTED access to all data. Controlled through the Beletage admin.
---
## 6. Postgres roles + RLS
### DB users
- `gis_app_rw`: gis-api main
- `gis_sync_rw`: orchestrator worker (the ONLY writer to gis_core)
- `gis_public_ro`: Martin public (read-only)
- `gis_private_ro`: Martin-private (read-only, with RLS)
- `gis_titiler_ro`: TiTiler raster
- `gis_admin_dba`: pgAdmin + Marius (NO BYPASSRLS, per the golden rule)
### Session variables set on every transaction
```sql
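-- Issued at the start of every transaction; values come from the verified JWT (sample values shown).
SET LOCAL app.user_id = '<user-uuid>';
SET LOCAL app.tenant_id = '<tenant-uuid>';
SET LOCAL app.is_beletage_group = 'false';        -- 'true' or 'false'
SET LOCAL app.enrichment_scope = 'none';          -- 'none', 'basic' or 'full'
SET LOCAL app.allowed_sirutas = '54975,155243';   -- CSV list
SET LOCAL app.roles = 'topograf,firm_admin';      -- CSV list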
```
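Because `SET LOCAL` does not accept bind parameters, one way to apply these variables without string interpolation is PostgreSQL's `set_config(name, value, is_local)` function. The sketch below is an assumption about how the gis-api middleware might build those statements; `SessionClaims`, `SqlStatement` and `buildSessionStatements` are hypothetical names, not part of the plan.

```typescript
// Claims taken from the verified JWT (field names mirror the plan's JWT claims).
interface SessionClaims {
  userId: string;
  tenantId: string;
  isBeletageGroup: boolean;
  enrichmentScope: "none" | "basic" | "full";
  allowedSirutas: string[];
  roles: string[];
}

// A parameterized statement, in the { text, values } shape most PG clients accept.
interface SqlStatement {
  text: string;
  values: string[];
}

// Builds one statement per session variable.
// set_config(name, value, true) is the function form of SET LOCAL:
// the value is discarded when the current transaction ends.
function buildSessionStatements(claims: SessionClaims): SqlStatement[] {
  const settings: Array<[string, string]> = [
    ["app.user_id", claims.userId],
    ["app.tenant_id", claims.tenantId],
    ["app.is_beletage_group", claims.isBeletageGroup ? "true" : "false"],
    ["app.enrichment_scope", claims.enrichmentScope],
    ["app.allowed_sirutas", claims.allowedSirutas.join(",")],
    ["app.roles", claims.roles.join(",")],
  ];
  return settings.map(([settingName, settingValue]) => ({
    text: "SELECT set_config($1, $2, true)",
    values: [settingName, settingValue],
  }));
}
```

The statements have to run inside the same transaction as the request's queries; with pgBouncer in transaction mode, anything set outside a transaction could leak to another client.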
### RLS policy patterns
- `gis_core`: zero RLS, public read
- `gis_urban.*_drafts`: tenant-isolated (own tenant_id OR Beletage group)
- `gis_urban.*_aprobate`: public read, writes by admin_primaria or Beletage
- `gis_enrichment.cf_extracts`:
```sql
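-- NULLIF guards the casts: on a pooled connection an unset custom setting can read back as '' instead of NULL.
USING (
COALESCE(NULLIF(current_setting('app.is_beletage_group', true), '')::boolean, false)
OR fetched_by_user_id = NULLIF(current_setting('app.user_id', true), '')::uuid
OR fetched_by_tenant_id = NULLIF(current_setting('app.tenant_id', true), '')::uuid
OR (is_shareable = true AND current_setting('app.enrichment_scope', true) IN ('basic', 'full'))
)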
```
- `FORCE ROW LEVEL SECURITY` on all RLS tables (default deny)
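For the RLS test suite it may help to have the `cf_extracts` rule restated as a plain function, so expected results can be computed independently of the database. This is only a sketch of the predicate above; `RlsSession`, `CfExtractRow` and `canReadCfExtract` are hypothetical names.

```typescript
// Session values that the policy reads through current_setting().
interface RlsSession {
  userId: string;
  tenantId: string;
  isBeletageGroup: boolean;
  enrichmentScope: "none" | "basic" | "full";
}

// The columns of gis_enrichment.cf_extracts that the policy inspects.
interface CfExtractRow {
  fetchedByUserId: string;
  fetchedByTenantId: string;
  isShareable: boolean;
}

// Mirrors the four OR branches of the USING clause, in the same order.
function canReadCfExtract(session: RlsSession, row: CfExtractRow): boolean {
  return (
    session.isBeletageGroup ||
    row.fetchedByUserId === session.userId ||
    row.fetchedByTenantId === session.tenantId ||
    (row.isShareable && session.enrichmentScope !== "none")
  );
}
```

A test can then assert that a query run under a given session returns exactly the rows for which this function is true.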
---
## 7. Auth flow (Authentik OIDC)
### JWT custom claims
```json
{
"sub": "user-uuid",
"email": "...",
"tenant_id": "tenant-uuid",
"is_beletage_group": true,
"roles": ["topograf", "firm_admin"],
"enrichment_scope": "full",
"allowed_sirutas": ["54975", "155243"]
}
```
### User attributes sync
- **HMAC-signed webhook** from planhub/eterra.live on user create/update
- Idempotency key dedupe 24h
- Exponential backoff retry (6 attempts: 1, 2, 4, 8, 16, 32 s)
- Dead Letter Queue in pg-boss `auth-sync-dlq`
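A minimal sketch of the signing and retry pieces of this sync, using Node's built-in `crypto` module. The plan does not name the hash algorithm or the signature encoding, so HMAC-SHA256 with hex encoding is an assumption, and the function names are hypothetical.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hex HMAC-SHA256 signature over the raw request body (algorithm is an assumption).
function signWebhookBody(rawBody: string, secret: string): string {
  return createHmac("sha256", secret).update(rawBody, "utf8").digest("hex");
}

// Constant-time comparison of the received signature against the expected one.
function verifyWebhookSignature(rawBody: string, signatureHex: string, secret: string): boolean {
  const expected = Buffer.from(signWebhookBody(rawBody, secret), "hex");
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws on a length mismatch, so compare lengths first.
  return expected.length === received.length && timingSafeEqual(expected, received);
}

// Delay in seconds before each attempt's retry: 1, 2, 4, 8, 16, 32 for the defaults.
function retryDelaysSeconds(maxAttempts: number = 6, baseSeconds: number = 1): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    delays.push(baseSeconds * Math.pow(2, attempt));
  }
  return delays;
}
```

After the last delay is exhausted, the job would be moved to the `auth-sync-dlq` queue rather than retried again.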
### Super-tenant check
**Dynamic DB query** on every JWT issuance (Authentik custom expression), cached for 60 s.
### JWT expiry
- Access: 1h
- Refresh: 30d
---
## 8. Sync orchestrator + multi-account ANCPI
### Components
- **Queue:** pg-boss (schema `queue`), max 8 workers, retention 30d
- **Workers:**
- `eterra-sync-worker` (delta/full UAT sync)
- `enrichment-worker` (CF fetch + GDPR pool split)
- `pug-ingest-worker` (GPKG upload → gis_urban drafts)
- `pmtiles-rebuild-worker` (tippecanoe → MinIO)
- `cache-invalidate-worker` (Cloudflare purge API)
### Multi-account ANCPI shuffle
```sql
gis_meta.eterra_accounts (
id, username, password_encrypted (AES-256-GCM, key GIS_SYNC_AES_KEY),
account_type, quota_per_hour DEFAULT 500,
usage_current_hour, last_reset_at,
status, notes
)
```
Round-robin across active accounts that still have quota available. Automatic failover if an account is blocked. Distributing the ANCPI load this way means no single account shows a detectable heavy-usage pattern.
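The selection rule can be sketched as a pure function over the account rows. The `accountStatus` values and the `pickNextAccount` name are assumptions for illustration; the plan only specifies a `status` column, a per-hour quota and a usage counter.

```typescript
// Subset of gis_meta.eterra_accounts needed for selection (status values are assumed).
interface EterraAccount {
  id: number;
  accountStatus: "active" | "blocked" | "disabled";
  quotaPerHour: number;
  usageCurrentHour: number;
}

// Returns the next usable account after `lastUsedId`, wrapping around the list.
// Blocked accounts and accounts that exhausted their hourly quota are skipped.
// Returns null when no account is usable.
function pickNextAccount(accounts: EterraAccount[], lastUsedId: number | null): EterraAccount | null {
  if (accounts.length === 0) return null;
  const ordered = accounts.slice().sort((a, b) => a.id - b.id);
  // findIndex yields -1 for null or an unknown id, so the scan then starts at index 0.
  const lastIndex = ordered.findIndex((account) => account.id === lastUsedId);
  for (let step = 1; step <= ordered.length; step++) {
    const candidate = ordered[(lastIndex + step) % ordered.length];
    if (candidate.accountStatus === "active" && candidate.usageCurrentHour < candidate.quotaPerHour) {
      return candidate;
    }
  }
  return null;
}
```

In the real orchestrator the choice and the usage increment would need to happen atomically in the database, since up to 8 workers may pick accounts concurrently.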
### Session pool persistent (TTL 20 min)
Replaces the in-memory pool in eterra-live and the hardcoded credentials in ArchiTools.
### LISTEN/NOTIFY events cross-app
- `sync:uat:done` → apps refresh UI
- `pmtiles:rebuild:done` → cache invalidate
- `enrichment:new` → user UI notify
### Scheduler (pg-boss cron)
- Delta sync: Mon-Fri 2:00
- Deep sync: Weekend 23:00
- PMTiles rebuild: Mon-Fri 4:00
- Cleanup sessions: 6h
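Expressed as standard 5-field cron strings, the schedule above might look like the following. The job names are hypothetical, and reading "6h" as "every 6 hours" is an assumption.

```typescript
// Cron fields: minute hour day-of-month month day-of-week (0 = Sunday).
const syncSchedules: Record<string, string> = {
  "delta-sync": "0 2 * * 1-5",        // Mon-Fri 02:00
  "deep-sync": "0 23 * * 6,0",        // Sat + Sun 23:00
  "pmtiles-rebuild": "0 4 * * 1-5",   // Mon-Fri 04:00
  "cleanup-sessions": "0 */6 * * *",  // every 6 hours (assumed meaning of "6h")
};
```

The PMTiles rebuild is scheduled two hours after the delta sync, so it normally runs against freshly synced data.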
### ArchiTools/eterra-live post-migration
- **ArchiTools:** DELETE eterra-client.ts + setInterval. Read-only consumer + trigger sync via API.
- **eterra-live:** Keeps the eterra login UI (per-user) and triggers sync through the orchestrator API.
- **Orchestrator = the ONLY writer to gis_core.**
---
## 9. Tile serving (5 subdomains)
### `tiles.gis.ac` (PUBLIC)
One Martin instance, no auth, CF rate limit 100 req/s/IP.
Layer groups:
- `/cadastru`, `/uat`, `/pug-public`, `/zone-protectie`, `/drumuri`, `/retele-publice`
- `/puz-aprobate`, `/pud-aprobate`, `/analize-publice`, `/puz-in-avizare`
### `tiles-private.gis.ac` (AUTH)
Martin-private container + Traefik ForwardAuth JWT validation. Zero cache.
Layer groups:
- `/pug-drafts/{tenant}`, `/puz-drafts/{tenant}`, `/pud-drafts/{tenant}`
- `/observatii-primarie/{siruta}`, `/analize-interne/{tenant}`
### `raster.gis.ac` + `dem.gis.ac`
TiTiler (NOT Martin; Martin serves vector tiles only).
- `raster.gis.ac` serves COGs from the MinIO bucket
- `dem.gis.ac` = **CF-cached proxy** to ANCPI MNT (geoportal.ancpi.ro/maps/rest/services/ANCPI/MNT)
- Hillshade + contours generated locally from the DEM
### `pmtiles.gis.ac`
nginx static serving from the MinIO bucket `pmtiles`. CF cache 7 days.
### CF TTL config
| Path | TTL |
|---|---|
| tiles.gis.ac/uat | 30 days |
| tiles.gis.ac/cadastru | 6h + auto-purge on sync |
| tiles.gis.ac/pug-public | 24h |
| tiles.gis.ac/puz-in-avizare | 1h |
| tiles.gis.ac/puz-aprobate, pud | 24h |
| pmtiles.gis.ac | 7 days |
| tiles-private.gis.ac | none |
| raster.gis.ac, dem.gis.ac | 30 days |
Bot Fight Mode: Medium default.
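The TTL table can be kept as data and looked up by host and path, for example when generating Cloudflare rules or `Cache-Control` headers. A sketch under these assumptions: the `pud` row is read as `/pud-aprobate`, unknown paths default to no caching, and all names are hypothetical.

```typescript
const HOUR_SECONDS = 3600;
const DAY_SECONDS = 24 * HOUR_SECONDS;

interface TtlRule {
  host: string;
  pathPrefix: string;
  ttlSeconds: number;
}

// One entry per row of the CF TTL table.
const ttlRules: TtlRule[] = [
  { host: "tiles.gis.ac", pathPrefix: "/uat", ttlSeconds: 30 * DAY_SECONDS },
  { host: "tiles.gis.ac", pathPrefix: "/cadastru", ttlSeconds: 6 * HOUR_SECONDS },
  { host: "tiles.gis.ac", pathPrefix: "/pug-public", ttlSeconds: 24 * HOUR_SECONDS },
  { host: "tiles.gis.ac", pathPrefix: "/puz-in-avizare", ttlSeconds: 1 * HOUR_SECONDS },
  { host: "tiles.gis.ac", pathPrefix: "/puz-aprobate", ttlSeconds: 24 * HOUR_SECONDS },
  { host: "tiles.gis.ac", pathPrefix: "/pud-aprobate", ttlSeconds: 24 * HOUR_SECONDS },
  { host: "pmtiles.gis.ac", pathPrefix: "/", ttlSeconds: 7 * DAY_SECONDS },
  { host: "tiles-private.gis.ac", pathPrefix: "/", ttlSeconds: 0 },
  { host: "raster.gis.ac", pathPrefix: "/", ttlSeconds: 30 * DAY_SECONDS },
  { host: "dem.gis.ac", pathPrefix: "/", ttlSeconds: 30 * DAY_SECONDS },
];

// Returns the edge TTL for a request; anything not listed is not cached (assumed default).
function edgeTtlSeconds(host: string, path: string): number {
  for (const rule of ttlRules) {
    if (rule.host === host && path.indexOf(rule.pathPrefix) === 0) {
      return rule.ttlSeconds;
    }
  }
  return 0;
}
```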
### PUZ lifecycle visibility
| Stage | Visible at | Cache |
|---|---|---|
| Internal draft | tiles-private (tenant only) | none |
| Public review | tiles.gis.ac/puz-in-avizare | 1h |
| Approved | tiles.gis.ac/puz-aprobate (overlay pug-public) | 24h |
### Raster Library module (admin eterra.live)
- Upload TIFF/GeoTIFF → background gdal_translate → COG + overviews
- Preview thumbnail 512px
- Public/private toggle
- Delete/replace
- DB: `gis_meta.raster_sources`
---
## 10. API layer (api.gis.ac)
### Stack
Next.js 16 + Hono router intern + Prisma + Zod + Authentik JWT middleware.
### Example endpoints
```
POST /enrichment/parcela      auth: tier≥basic
GET  /parcela/{id}            auth: optional
POST /pug/zone-functionala    auth: admin_primaria
POST /sync/uat                auth: Beletage admin
POST /raster/upload           auth: architect with scope
GET  /search?q=...            auth: optional
```
### Rate limiting (Redis per JWT sub)
| Tier | Req/hour | Enrichments/day |
|---|---|---|
| Free | 100 | 10 |
| Basic | 1000 | 100 |
| Pro | 5000 | unlimited |
| Admin | unlimited | unlimited |
Managed through the eterra.live admin.
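The hourly request limits could be enforced with a fixed-window counter keyed by the JWT `sub`. The sketch below keeps the counters in an in-memory `Map` purely for illustration; the plan calls for Redis, and the class and type names are hypothetical.

```typescript
type ApiTier = "free" | "basic" | "pro" | "admin";

// Requests per hour, from the tier table.
const hourlyRequestLimit: Record<ApiTier, number> = {
  free: 100,
  basic: 1000,
  pro: 5000,
  admin: Infinity,
};

const WINDOW_MS = 60 * 60 * 1000;

// Fixed-window limiter: one counter per subject per clock hour.
class FixedWindowLimiter {
  private counters = new Map<string, { windowStart: number; count: number }>();

  // Returns true and counts the request if the subject is still under its limit.
  allow(sub: string, tier: ApiTier, nowMs: number): boolean {
    const windowStart = Math.floor(nowMs / WINDOW_MS) * WINDOW_MS;
    const entry = this.counters.get(sub);
    if (entry === undefined || entry.windowStart !== windowStart) {
      // First request in a new window resets the counter.
      this.counters.set(sub, { windowStart: windowStart, count: 1 });
      return true;
    }
    if (entry.count >= hourlyRequestLimit[tier]) return false;
    entry.count += 1;
    return true;
  }
}
```

With Redis the same logic maps onto an atomic increment of a key that embeds the window start and expires after one hour.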
---
## 11. MDLPA compliance (Ordin 904/2023)
**Golden-rule approach:** flexible internal schema + a separate MDLPA-compliant export layer.
### Internal schema gis_urban.plan_spatial
Includes ALL MDLPA required fields + extras (history, drafts, workflows, comments).
### PG18 temporal constraint
```sql
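-- Scalar equality (=) inside a GiST exclusion constraint requires the btree_gist extension.
CREATE EXTENSION IF NOT EXISTS btree_gist;
ALTER TABLE gis_urban.plan_spatial ADD CONSTRAINT no_overlap_plan
EXCLUDE USING gist (siruta WITH =, doc_type WITH =,
daterange(data_aprob, data_exp, '[)') WITH &&);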
```
Guarantees that no two plans of the same document type have overlapping validity periods on the same UAT (for example, two PUGs in force at once).
### Export worker
- `pug-export-worker` (pg-boss job)
- Input: `{plan_id}`
- Output: ZIP with 5 subdirectories + MDLPA-compliant GPKG + PDFs
- Tooling: ogr2ogr (GDAL) for GPKG, custom Node orchestrator
- Drop Z/M dimensions on export
- Default HILUCS_N1 if missing internally
- ZIP naming: `^[A-Z][A-Z]_.*?_\d{1,10}_(PUG|PUZ|PATJ|PMUD)_[0-9]{8}`
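The naming pattern can be checked before the export worker uploads the archive. A small sketch; the sample file names are hypothetical and only illustrate the shape the pattern accepts.

```typescript
// The ZIP naming pattern from the export worker spec, unchanged.
const mdlpaZipNamePattern = /^[A-Z][A-Z]_.*?_\d{1,10}_(PUG|PUZ|PATJ|PMUD)_[0-9]{8}/;

// True when the file name starts with a segment matching the MDLPA naming pattern.
function isValidExportName(fileName: string): boolean {
  return mdlpaZipNamePattern.test(fileName);
}

// Hypothetical examples:
// isValidExportName("CJ_Cluj-Napoca_54975_PUG_20260420.zip")  is accepted
// isValidExportName("cj_Cluj-Napoca_54975_PUG_20260420.zip")  is rejected (lowercase prefix)
```

Note that the pattern as written has no alternative for PUD and no end anchor, so a PUD export name is rejected and any suffix after the 8 digits is accepted.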
### Validator MDLPA
**Phase 1:** manual pre-submit check by the user (download Validator_2.0.5, run locally).
**Phase 2:** Marius receives the validator repo from contacts at MDLPA → we containerize it.
**Phase 3:** custom pre-validation covering 80% of the rules (obvious blockers) before the ZIP download.
### Doc types in scope
PUG (priority 1), PUZ (priority 1), PUD (priority 2), PMUD (yes), PATJ (optional, max 1 county).
---
## 12. Backup strategy (multi-tier)
### Tier 1 — NAS local (LAN fast)
- Path: `gis-backup@10.10.10.10:gisbkp/` (symlink → `/HDD/gisbkp`)
- btrfs + COW + compression
- Retention: 30 daily + 12 monthly + 5 yearly snapshots
- Protocol: rsync over SSH, RSYNC-ONLY key flag
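One way to read the "30 daily + 12 monthly + 5 yearly" retention is: keep the 30 newest snapshots, plus the newest snapshot of each of the 12 most recent months and of each of the 5 most recent years. That interpretation, and the `selectSnapshotsToKeep` name, are assumptions; the sketch works on `YYYY-MM-DD` snapshot labels.

```typescript
// Returns the snapshot dates (YYYY-MM-DD) that the retention policy keeps.
function selectSnapshotsToKeep(
  isoDates: string[],
  daily: number = 30,
  monthly: number = 12,
  yearly: number = 5,
): string[] {
  // ISO dates sort lexicographically, so a plain sort gives chronological order.
  const newestFirst = Array.from(new Set(isoDates)).sort().reverse();
  const keep = new Set<string>(newestFirst.slice(0, daily));

  // Keeps the newest snapshot of each of the most recent `bucketCount` buckets,
  // where a bucket is a date prefix: "YYYY-MM" (7 chars) or "YYYY" (4 chars).
  const keepNewestPerBucket = (prefixLength: number, bucketCount: number): void => {
    const seenBuckets = new Set<string>();
    for (const snapshotDate of newestFirst) {
      const bucket = snapshotDate.slice(0, prefixLength);
      if (seenBuckets.has(bucket)) continue;
      if (seenBuckets.size === bucketCount) break;
      seenBuckets.add(bucket);
      keep.add(snapshotDate);
    }
  };

  keepNewestPerBucket(7, monthly);
  keepNewestPerBucket(4, yearly);
  return newestFirst.filter((snapshotDate) => keep.has(snapshotDate));
}
```

Everything not returned is a candidate for deletion; on btrfs that would mean removing the corresponding snapshot subvolume.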
### Tier 2 — Backblaze B2 (offsite hot)
- Bucket: `beletage-gis-backups` (region EU)
- Object Lock Compliance mode 90d + SSE-B2
- Weekly full + monthly archive encrypted AES-256 client-side
### Tier 3 — Azure (deferred, credit €2400/year)
- Azure PostgreSQL Flexible Server (DR cloud fallback, ~50-80€/month)
- Azure CDN + Blob Hot (PMTiles secondary, ~20€/month)
- When: scale of 10k+ users or a need for geo-redundancy
### Tier 4 — CJ2 standby (deferred ~oct 2026)
- Streaming replication async, RPO~0, RTO<5min
- Waiting for the new AI workstation + dedicated disk
### pgBackRest setup
- 2 repos: NAS + B2
- PITR window: **30 days**
- Continuous WAL archive
- Compress zstd + encrypt AES-256
- Parallel restore workers
### DR drill
- **Quarterly** manual full recovery simulation
- **Weekly** smoke test (read-only integrity check)
- **Monthly** automated test restore (ephemeral container, 30 min, 20GB temp)
### Backup Dashboard
Module in admin.eterra.live, tab "Backup & Recovery":
- Cards for each tier (NAS, B2, Azure placeholder, CJ2 placeholder)
- Plugin architecture with a `BackupTarget` interface, extensible
- Future slots: email archival, Gitea repos, workstations, configs
- Low-noise alerting per the golden rule (email only on red)
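The plan names a `BackupTarget` interface but does not define it. The following is one possible shape, offered only as a sketch: every field, the `StaticTarget` helper and `targetsNeedingAlert` are assumptions.

```typescript
type BackupHealth = "green" | "yellow" | "red";

interface BackupStatus {
  health: BackupHealth;
  lastSuccessAt: Date | null;
  detail: string;
}

// One implementation per tier (NAS, B2, later Azure, CJ2, Gitea repos, ...).
interface BackupTarget {
  readonly id: string;
  readonly label: string;
  getStatus(now: Date): Promise<BackupStatus>;
}

// Trivial target that always reports a fixed status; handy for placeholders and tests.
class StaticTarget implements BackupTarget {
  readonly id: string;
  readonly label: string;
  private readonly fixedStatus: BackupStatus;

  constructor(id: string, label: string, fixedStatus: BackupStatus) {
    this.id = id;
    this.label = label;
    this.fixedStatus = fixedStatus;
  }

  async getStatus(): Promise<BackupStatus> {
    return this.fixedStatus;
  }
}

// Low-noise alerting: only targets whose health is red are reported.
async function targetsNeedingAlert(targets: BackupTarget[], now: Date): Promise<string[]> {
  const statuses = await Promise.all(targets.map((target) => target.getStatus(now)));
  return targets.filter((_, index) => statuses[index].health === "red").map((target) => target.id);
}
```

The dashboard would render one card per registered target, so adding a new backup destination means adding one class rather than touching the UI.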
---
## 13. Monitoring (Dashboard admin eterra.live)
### 6 simple cards (clear Romanian tooltips)
1. Maps (req/s)
2. Database (query latency)
3. Disk space (%)
4. Active surveyors
5. Sync (latest)
6. Backup (latest)
### Colors: green/yellow/red
### Alerts only on red (zero spam)
### Full Grafana = collapsed under "Detalii avansate" (advanced details)
### Prometheus metrics scraped
- Martin (/metrics), Postgres (pg_stat_statements + exporter), pgBouncer, MinIO, gis-api, orchestrator
---
## 14. Security: final posture
### Authentication
- Authentik OIDC for all apps
- JWT 1h access + 30d refresh
- JWKS cache 1h
### Authorization
- RLS multi-tenant (siruta + tenant_id + is_beletage_group)
- Session variables via SET LOCAL (zero leak)
- No Postgres role with permanent BYPASSRLS
### Secrets management
- Infisical single source of truth
- Rotation schedule: annual for critical keys, automatic for JWT
- Encryption keys: Infisical + sealed paper printout in the Beletage safe (YubiKey in the future)
### Network
- PG 5433 bound to 127.0.0.1 (internal to shop)
- pgBouncer 6432 bound to 127.0.0.1
- Martin/TiTiler/gis-api through Traefik only
- Sophos DNAT only 80/443 to the proxy
### Audit
- `gis_meta.audit` ENCRYPTED with pgcrypto (key AUDIT_ENCRYPTION_KEY)
- Retention: 1 year hot in PG + 5 years archived in MinIO
- Decryption only on explicit admin query
### Data at rest
- Backups encrypted (pgBackRest + client-side AES for B2)
- ENCRYPTION_SECRET rotate pending (re-encrypt 15 vault entries)
---
## 15. What DISAPPEARS post-cutover
- `setInterval` cron-in-code in ArchiTools
- `eterra-client.ts` in ArchiTools (moved into the orchestrator)
- In-memory session pool in eterra-live (replaced by the persistent DB pool)
- PG 5432 exposed on 0.0.0.0 on satra (on shop it binds to the LAN)
- Broken PMTiles webhook (replaced by pmtiles-rebuild-worker)
- systemd pmtiles-webhook with passwords in args (new container)
- Hardcoded ETERRA_USERNAME/PASSWORD in ArchiTools (shared pool)
- 16-byte AES IV in eterra-live (upgraded to 12 bytes)
- In-memory rate limit in eterra-live (Redis)
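For the IV change, a minimal AES-256-GCM round trip with a 12-byte IV using Node's built-in `crypto` module. The storage layout (IV, then auth tag, then ciphertext, base64-encoded) and the function names are assumptions, not the format eterra-live actually uses.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

const GCM_IV_BYTES = 12;  // 96-bit IV, the size GCM is designed around
const GCM_TAG_BYTES = 16;

// Encrypts with a fresh random IV. Output layout (assumed): iv || authTag || ciphertext, base64.
function encryptSecret(plaintext: string, key: Buffer): string {
  const iv = randomBytes(GCM_IV_BYTES);
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return Buffer.concat([iv, cipher.getAuthTag(), ciphertext]).toString("base64");
}

// Reverses encryptSecret; throws if the data or the tag was modified.
function decryptSecret(encoded: string, key: Buffer): string {
  const raw = Buffer.from(encoded, "base64");
  const iv = raw.subarray(0, GCM_IV_BYTES);
  const authTag = raw.subarray(GCM_IV_BYTES, GCM_IV_BYTES + GCM_TAG_BYTES);
  const ciphertext = raw.subarray(GCM_IV_BYTES + GCM_TAG_BYTES);
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(authTag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}
```

Values already stored with 16-byte IVs would need a one-time re-encryption pass, since the IV length is part of how each stored blob is parsed.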
---
## 16. Migration timeline (6 weeks)
### Sprint 1: Foundation stack (week 1)
**Day 1** (GO trigger):
- Deploy `martin-public` container pe shop
- Deploy `martin-private` container + Traefik ForwardAuth middleware
- Deploy `titiler` container
- Deploy `nginx-pmtiles` container
- Deploy `minio-gis` container + bucket setup
**Day 2:**
- Deploy `pgadmin` (internal, pgadmin.beletage.ro)
- Traefik routes for all GIS subdomains
**Day 3-4:**
- Cloudflare zones: add `gis.ac`, `eterra.live`, `pug.digital`, `planhub.ro`
- DNS records + Page Rules TTL + Rate Limits + Bot Fight Medium
- Test tile serving basic
**Day 5:**
- Schema split on shop:
  - `CREATE SCHEMA gis_core, gis_urban, gis_enrichment, gis_meta, tenants, eterra, pug, archi, planhub, queue`
  - `ALTER TABLE public.X SET SCHEMA gis_core` for existing tables
- Create views + roles
### Sprint 2: Auth + data layer (week 2)
**Day 1-2:**
- Authentik custom property mappings (tenant_id, is_beletage_group, enrichment_scope, allowed_sirutas)
- HMAC-signed webhook from eterra.live → Authentik sync
- PG application roles + grants
**Day 3-4:**
- RLS policies per table (gis_urban, gis_enrichment)
- FORCE ROW LEVEL SECURITY on all of them
- gis-api boilerplate (Next.js 16 + Hono + Prisma + Zod)
- Auth middleware JWT verify + SET LOCAL
**Day 5:**
- Cypress test suite for RLS (6 mandatory scenarios)
### Sprint 3: Sync orchestrator (week 3)
**Day 1-2:**
- `gis-sync-orchestrator` container
- pg-boss setup schema queue
- eTerra client extracted into a shared lib
**Day 3:**
- Session pool persistent `gis_meta.eterra_sessions` + multi-account `gis_meta.eterra_accounts`
- Workers: eterra-sync, enrichment, pug-ingest
**Day 4:**
- Workers: pmtiles-rebuild, cache-invalidate (CF API)
- LISTEN/NOTIFY events implementation
**Day 5:**
- Admin UI module eterra.live — sync jobs control panel
### Sprint 4: Backup + Dashboard (week 4)
**Day 1-2:**
- pgBackRest deploy shop
- 2 repos: NAS (rsync push via cron) + B2 (direct S3)
- Scheduled jobs: daily incremental, weekly full, monthly archive
- WAL archive continuous
**Day 3:**
- Backup Dashboard UI (admin eterra.live tab "Backup & Recovery")
- Plugin architecture `BackupTarget` interface
- Cards: NAS, B2, Azure placeholder, CJ2 placeholder
**Day 4:**
- DR runbook scripts
- Monthly restore test automation (ephemeral container cron)
- Prometheus + Grafana + Alertmanager deploy
**Day 5:**
- Grafana dashboards (GIS Overview, Martin Perf, PG Health, API Rate Limits, Raster Library)
- 6 simple cards admin eterra.live
- Alertmanager → n8n webhook (email + Telegram on red)
### Sprint 5: Test & parity (week 5)
**Day 1-2:**
- Shadow sync: the orchestrator runs in parallel with the satra setInterval (both read from ANCPI)
- Compare row counts on both sides, checksums, latency
**Day 3:**
- Load test: 1000 concurrent API requests, 500 QPS tile server
- Performance comparisons satra vs shop end-to-end
**Day 4:**
- Security audit final: TLS scan, dependency audit, pentest basic
- CORS, CSP, rate limits verify
**Day 5:**
- GO/NO-GO decision meeting
- Final checklist review
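The shadow-sync comparison from Day 1-2 can be reduced to diffing two per-table summaries, one from satra and one from shop. A sketch under the assumption that each side can produce a row count and a checksum per table; how the checksum is computed is left open, and all names are hypothetical.

```typescript
interface TableStats {
  rowCount: number;
  checksum: string;
}

// Table name → stats, collected separately on each server.
type TableSnapshot = Record<string, TableStats>;

interface ParityIssue {
  table: string;
  reason: "missing" | "row-count" | "checksum";
}

// Lists every table where the candidate (shop) differs from the reference (satra).
function compareSnapshots(reference: TableSnapshot, candidate: TableSnapshot): ParityIssue[] {
  const issues: ParityIssue[] = [];
  for (const table of Object.keys(reference).sort()) {
    const expected = reference[table];
    const actual = candidate[table];
    if (actual === undefined) {
      issues.push({ table: table, reason: "missing" });
    } else if (actual.rowCount !== expected.rowCount) {
      issues.push({ table: table, reason: "row-count" });
    } else if (actual.checksum !== expected.checksum) {
      issues.push({ table: table, reason: "checksum" });
    }
  }
  return issues;
}
```

An empty result on consecutive runs is the kind of evidence the GO/NO-GO meeting on Day 5 can rely on.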
### Weekend cutover (if GO)
**Friday 22:00:**
- Notify users (eterra.live + ArchiTools) maintenance 4h
- Traefik maintenance page
- Stop ArchiTools + eterra-live containers
- Final pg_dump snapshot from satra → shop (apply delta)
- Parity diff check
- Switch the containers' DATABASE_URL → shop
- Deploy the new ArchiTools (with eterra-client DELETED)
- Cloudflare DNS update: tiles.gis.ac → shop Martin
- Start orchestrator cron jobs
- Smoke test full flow
**Saturday 06:00:**
- Monitor dashboard
- On-call standby
**Saturday 20:00:**
- All green → GO production
- satra architools_postgres set READ-ONLY (30-day safety fallback)
**Sunday:**
- Observation + bug bash
### Cleanup (week 6)
- git filter-repo on ArchiTools (rewrites history, notify everyone to re-clone)
- Structural fix compose (env_file + stack.env in .gitignore)
- Remove satra containers (after 30 days)
- ENCRYPTION_SECRET rotate + re-encrypt 15 vault entries
- Final documentation in busc-infra
---
## 17. Runbooks (DR scenarios)
### R1: Corruption of parcel X at 14:30
```bash
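# Stop PostgreSQL first: restore needs a stopped cluster (add --delta to reuse the existing data directory).
pgbackrest --stanza=gis --type=time --target="2026-04-20 14:29:00" restore
# Start PostgreSQL, verify integrity, then promote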
```
### R2: Total shop crash → failover to CJ2 (once active)
1. Promote CJ2 standby to primary
2. Update DNS: tiles.gis.ac, api.gis.ac → CJ2 IP
3. Restart apps pointing CJ2
4. RTO target: <5 min
### R3 — Ransomware
- B2 Object Lock Compliance = files are immutable
- Restore latest clean snapshot pre-ransomware
- Wipe compromised + rebuild from B2
### R4 — Accidental DROP TABLE
```bash
pgbackrest --stanza=gis --type=time --target="<timestamp just before the DROP>" restore
# Replays WAL up to just before the destructive command
```
### R5 — Disaster regional (shop + CJ2 down)
- Restore from the monthly B2 archives
- Bootstrap a new cluster on another host
- RTO: 4-8 hours
---
## 18. Next features post-production
### Immediately after stable (month 1-2)
- Containerized MDLPA validator (once the repo arrives from contacts at MDLPA)
- Complete GeoPackage export pipeline
- Raster Library module + upload UI
### Medium term (month 3-6)
- PMUD workflow (transport networks, bike lanes, parking)
- SmartCity360 / pug.digital launch (first town-hall tenant)
- planhub.ro SaaS pivot to multi-tenant (first external architects)
- CJ2 standby activation (after the new AI workstation arrives)
### Long term (month 6-12)
- PATJ support (1 pilot county)
- Azure activation (Azure PostgreSQL DR + Blob Hot CDN)
- Cross-region geo-redundancy if scale justifies it
---
## 19. Risk register
| Risk | Probability | Impact | Mitigation |
|---|---|---|---|
| Cutover fail | Low | High | Rollback DNS → satra, 30d safety window |
| Data loss in cutover window | Very Low | High | Shadow sync in week 5, final delta |
| Orchestrator bugs in prod | Medium | Medium | Week 5 shadow run catches most of them |
| ANCPI blocks multi-account | Low | High | Shuffle + quota + alerting |
| B2 cloud outage | Very Low | Medium | NAS + CJ2 (once active) as alternatives |
| Marius unavailable | Medium | High | Claude Code has full plan + memory |
---
## 20. Secrets pre-loaded in Infisical (verified 2026-04-20)
### Active (ArchiTools production)
- AUTHENTIK_CLIENT_ID, AUTHENTIK_CLIENT_SECRET
- NEXTAUTH_SECRET
- DB_PASS (satra Postgres user, will rotate at cutover)
- ETERRA_USERNAME, ETERRA_PASSWORD
- NOTIFICATION_CRON_SECRET
- BREVO_BELETAGE_API_KEY
### Shop stack (pre-loaded for sprint 1)
- POSTGRES_GIS_SUPERUSER_PASSWORD
- POSTGRES_GIS_ARCHITOOLS_PASS
- POSTGRES_GIS_ETERRA_PASS
- POSTGRES_GIS_MARTIN_PASS
- POSTGRES_GIS_SYNC_PASS
- POSTGRES_GIS_PUG_PASS
- POSTGRES_GIS_PLANHUB_PASS
- MINIO_GIS_ROOT_USER
- MINIO_GIS_ROOT_PASSWORD
- PGADMIN_DEFAULT_PASSWORD
### Security/Crypto
- GIS_SYNC_AES_KEY
- AUTHENTIK_WEBHOOK_HMAC_SECRET
- AUDIT_ENCRYPTION_KEY
- BACKUP_DASHBOARD_ADMIN_TOKEN
### Backup
- BACKUP_ARCHIVE_KEY (AES-256 B2 client-side)
- PGBACKREST_REPO1_CIPHER_PASS
- STANDBY_REPLICATION_PASS (for the future CJ2 standby)
- B2_APPLICATION_KEY_ID
- B2_APPLICATION_KEY
- B2_BUCKET_NAME (beletage-gis-backups)
- B2_ENDPOINT (s3.eu-central-003.backblazeb2.com)
### To be generated when needed (future)
- POSTGRES_GIS_TITILER_PASS (day 1 sprint 1)
- POSTGRES_GIS_APP_RW_PASS (day 1 sprint 2)
- GRAFANA_ADMIN_PASSWORD (sprint 4)
- PROMETHEUS_BASIC_AUTH (sprint 4)
- AZURE_STORAGE_ACCOUNT, AZURE_STORAGE_SAS (when we activate Azure)
---
## 21. Hardware/infrastructure check-list pre-GO
- ✅ Shop 10.10.10.84 reachable SSH (user dnz, groups docker+wheel)
- ✅ Shop docker 29.2.1 installed
- ✅ Shop disk 3.3TB free
- ✅ Shop RAM 225GB avail
- ✅ Shop ports 5433, 6432 reserved (currently used by the POC)
- ✅ Satra SSH (user bulibasa) reachable
- ✅ Proxy Traefik v3.6.8 (10.10.10.199)
- ✅ Sophos 80/443 only WAN (LAN internal safe)
- ✅ NAS 10.10.10.10 rsync key deployed, symlink OK
- ✅ Backblaze B2 bucket + scoped key tested
- ✅ Authentik auth.beletage.ro v2025.2.4 operational
- ✅ All secrets pre-loaded in Infisical
- ✅ Cloudflare API token available
---
## 22. Acceptance criteria (Definition of Done)
Production is considered stable when:
- [ ] All 13 shop containers running + healthy
- [ ] Cypress RLS suite 100% pass
- [ ] Load test 500 QPS tiles + 100 QPS api under 200 ms p95
- [ ] Backup test restore successful (monthly)
- [ ] ArchiTools + eterra.live live on the shop DATABASE_URL
- [ ] Orchestrator is the ONLY writer to gis_core
- [ ] Zero error logs for 72h continuous
- [ ] eterra.live admin dashboard shows all green
- [ ] DR runbook validated (scenarios R1 + R4 tested)
- [ ] Grafana dashboards populated
- [ ] Users notified + positive feedback
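The "under 200 ms p95" criterion needs an agreed definition of p95. A sketch using the nearest-rank method, which is one common choice and an assumption here; load-testing tools may interpolate differently.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p% of samples are <= it.
function percentile(samplesMs: number[], p: number): number {
  if (samplesMs.length === 0) throw new Error("no samples");
  const sorted = samplesMs.slice().sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

// True when the 95th percentile latency is strictly below the target.
function meetsLatencyTarget(samplesMs: number[], targetMs: number = 200): boolean {
  return percentile(samplesMs, 95) < targetMs;
}
```

With this definition, a run where 10% of requests take 500 ms fails the criterion even if the average latency is well under 200 ms.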
---
## 23. Contacts & ownership
- **Owner:** Marius Tarau (m.tarau@beletage.ro)
- **Pair:** Claude Code (via Anthropic Opus 4.7 1M context)
- **Escalation:** m.tarau@beletage.ro + (Beletage phone)
- **On-call during cutover weekend:** Marius primary, Claude assist
---
## 24. Changelog
- **v1 DRAFT** (2026-04-18): initial proposal
- **v2 FINAL** (2026-04-20) — approved, includes all decisions from Pas 1-7, POC results, security rotation status, multi-account ANCPI shuffle, super-tenant Beletage group, Azure credit alloc, NAS + B2 tested, Backup Dashboard plugin architecture
---
**END OF PLAN.**
Next action: when Marius sends "go" → Sprint 1 Day 1 begins.