initial: split from gov-agreg — vreau.digital standalone platform

Moved from gov-agreg/src/pages/achizitii/* to root (drop prefix).
- 22 pages migrated, 127 files total
- All internal links: /achizitii/X → /X (176 occurrences fixed)
- AchizitiiLayout subnav rewritten: /X paths, top-right link to vreaudigital.ro hub
- BaseLayout new (vreau.digital branding, OG tags, site URL)
- astro.config.mjs: site https://vreau.digital, server output (was static)
- docker-compose: port 5096 (vreaudigital is 5095), container vreau-digital
- deploy.sh: paths /opt/vreau-digital, log /var/log/vreau-digital-deploy.log

Backend shared with gov-agreg:
- PostgreSQL satra (same schemas: seap, firms, anaf, anre, ...)
- Photon, Martin tiles
- Infisical /vreaudigital path (DATABASE_URL etc. shared)

build: PASS (npx astro check 0 errors, npm run build 5s vite + 10s server)
Claude VM
2026-05-13 00:10:32 +03:00
commit a6c03a091e
352 changed files with 75295 additions and 0 deletions
@@ -0,0 +1,170 @@
# ASF — Autoritatea de Supraveghere Financiară
Public registries of authorized financial entities — insurers, brokers,
pension funds, asset managers, intermediaries.
## Status (2026-05-10)
**MVP ingest complete: 849 entities, 100% CUI coverage.**
Captures `data.asfromania.ro/scr/ra` via free-text term enumeration.
| Register type | Active | Radiated |
|---|---|---|
| Asigurători (RA-NNN) | 24 | 37 |
| Brokeri (RBK-NNN) | 245 | 543 |
**Cross-source signal (validated):** 69 ASF-licensed firms hold 3,530 SEAP
contracts totaling **€614 mln**. Top: ASIROM (RA-023) — 523 contracts,
€283 mln; ALLIANZ-ȚIRIAC (RA-017) — 467 contracts, €50 mln; GROUPAMA
(RA-009) — 315 contracts, €41 mln. Zero contracts won post-radiere
(positive integrity signal).
Files:
- SQL: `services/seap-scraper/sql/034_asf.sql` — schema `asf` (entitati, scrape_log, mv_entitati_per_cui).
- Scraper: `services/seap-scraper/src/scrape-asf.ts`
- Wrapper: `services/seap-scraper/cron/scrape-asf.sh`
## Source map (ASF registers ecosystem)
| Sub-register | Volume | URL | Status |
|---|---|---|---|
| Asigurători (RA-NNN) + Intermediari principali (RBK-NNN) — active + radiate | ~860 | `data.asfromania.ro/scr/ra` | **Done — this scraper** |
| Intermediari secundari (RIS) | ~variable | `asfromania.ro/ro/a/1704` | TODO |
| Specialiști constatare daune | ~variable | `asfromania.ro/ro/a/1999` | TODO |
| Furnizori programe formare | ~variable | `asfromania.ro/ro/a/2068` | TODO |
| Lectori | ~variable | `asfromania.ro/ro/a/2067` | TODO |
| Piață de capital (SSIF/AOPC/SAI/depozitari) | ~30-50 | `asfromania.ro/ro/a/1705` | TODO |
| Pensii private (Pillar 2 + 3 + administratori) | ~20 | `asfromania.ro/ro/a/2365` + `data.asfromania.ro/scr/adeziuniFP` | TODO |
| Asigurători din SEE (passporting) | ~hundreds | `asfromania.ro/ro/a/2082` | TODO |
## Critical scraping insight (the trick)
The `data.asfromania.ro/scr/ra/cautare` POST endpoint is fronted by Google
reCAPTCHA Enterprise, but **the server only validates the captcha if the
form field `g-recaptcha-response` is present in the body**. When that
field is omitted entirely, the captcha check is skipped and the server
returns full results. (When the field is sent with any value, even an empty
one, the server tries to verify it and rejects the request with
"Verificare captcha eșuată", i.e. "captcha verification failed".)
Fields per response (HTML inside `raspuns`):
- Registration number (RA-XXX / RBK-XXX), unique within each register type
- LEI (20 characters), CUI, RC code (e.g. J40/2226/2006)
- Authorization number + date, registration date, deregistration (radiere) date (NULL while active)
- Type (Societate de asigurare / Intermediar principal)
- Legal form, address, phone, fax
- Authorized classes (general + life, as an array)
- Executives (Conducere executivă)
## Constraints
- Server-side validation: `termen` must be ≥4 characters.
- Free-text search hits multiple fields (denumire, CUI, adresă, județ, classes).
- `sectiune` (1=active / 2=radiate) and `tipCompanie` (0=insurer / 1=broker)
appear to be IGNORED by the search endpoint when `termen` is given —
results span all sections regardless.
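
The constraints above can be handled client-side. The following is a minimal sketch, not the code from `scrape-asf.ts`: the helper names `isUsableTerm` and `sectionStatus` are hypothetical, and it assumes the server counts characters after trimming whitespace. Because `sectiune` appears to be ignored, the active/radiated status is derived from the record's own deregistration date instead of from the request.

```typescript
// Hypothetical helpers; names are illustrative, not taken from scrape-asf.ts.

const MIN_TERM_LENGTH = 4; // the server rejects shorter `termen` values

/** True when a search term satisfies the >=4 character rule (assumes trimming). */
function isUsableTerm(term: string): boolean {
  return term.trim().length >= MIN_TERM_LENGTH;
}

type SectionStatus = "active" | "radiated";

/**
 * Since `sectiune` appears to be ignored by the endpoint, derive the status
 * from the record itself: no deregistration date means still active.
 */
function sectionStatus(radiationDate: string | null): SectionStatus {
  return radiationDate === null ? "active" : "radiated";
}

const candidates = ["ASIGURA", "BUCU", "RA", "  SA "];
const usable = candidates.filter(isUsableTerm);
console.log(usable); // only the terms with at least 4 characters remain
```

Filtering terms before sending them avoids spending requests on queries the server would reject anyway.
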
## Strategy used
1. **Seed phase** — 11 broad terms (ASIGURA, BROKER, BUCU, CLUJ, TIMI, BRAS,
RETRA, RADI, FUZIO, ...) covering active + radiated. Yields ~840 entities.
2. **Gap-fill phase** — for each prefix (RA-, RBK-) compute observed sequence,
probe gaps + 5 entries past the max via direct register-no lookup.
Yields the final ~20 missing.
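
The gap-fill computation in step 2 can be sketched as a pure function. This is an illustration under stated assumptions, not the scraper's actual code: `probeCandidates` and its parameters are hypothetical names, numbering is assumed to start at 1, and the 3-digit zero padding is inferred from examples such as RA-023.

```typescript
// Hypothetical sketch; function and parameter names are illustrative.

/** Left-pad a number with zeros to the given width. */
function pad(n: number, width: number): string {
  let s = String(n);
  while (s.length < width) s = "0" + s;
  return s;
}

/**
 * Given the register numbers observed for one prefix (e.g. "RA-001"),
 * return the numbers worth probing: every gap below the observed maximum
 * plus `lookahead` entries past it.
 */
function probeCandidates(
  prefix: string,
  observed: string[],
  lookahead: number = 5,
  width: number = 3, // assumed padding width, based on examples like RA-023
): string[] {
  const seen: { [n: number]: boolean } = {};
  let max = 0;
  for (const registerNo of observed) {
    if (registerNo.indexOf(prefix) !== 0) continue; // other prefix, skip
    const n = parseInt(registerNo.slice(prefix.length), 10);
    if (isNaN(n) || n <= 0) continue; // not a usable sequence number
    seen[n] = true;
    if (n > max) max = n;
  }
  if (max === 0) return []; // nothing observed for this prefix

  const probes: string[] = [];
  for (let n = 1; n <= max + lookahead; n++) {
    if (!seen[n]) probes.push(prefix + pad(n, width));
  }
  return probes;
}

const probes = probeCandidates("RA-", ["RA-001", "RA-002", "RA-005"], 2);
console.log(probes); // RA-003, RA-004, RA-006, RA-007
```

Each returned value is at least 4 characters long, so it can be used directly as a `termen` lookup.
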
## Next steps (TODO for follow-up agents)
### Quick wins (1-2h each)
1. **Pensii private**: `data.asfromania.ro/scr/adeziuniFP` likely has the same
   captcha-bypass trick. The ~7-15 fund administrators are a small but high-value set
   (NN, BCR Pensii, Allianz-Țiriac Pensii, etc.).
2. **SEE passporting list**: `asfromania.ro/ro/a/2082`. EU-wide insurers
   selling RCA in Romania. Probably an HTML table on the page itself.
### Medium (3-5h)
3. **Piață de capital register** (`SSIF`, `SAI`, `AOPC`, depozitari) —
typically PDF/Excel attachments at `asfromania.ro/uploads/articole/`. ~50
entities total. Replicates the `fonduri.beneficiar_anunt` Excel-parser
pattern.
4. **Intermediari secundari (RIS)** — large (~thousands) but mostly
individuals (no CUI). May not be worth the effort vs. corporate registers.
## Cross-source recipe
**"Asigurători + brokeri ASF cu contracte SEAP"** — financial firms licensed
by ASF that have won state insurance/financial-services contracts.
```sql
-- Recipe: ASF-licensed firms × SEAP wins
SELECT
  a.register_no,
  a.register_type,
  a.section_status,
  a.name AS asf_name,
  a.cui,
  a.data_autorizare,
  a.data_radiere,
  COUNT(DISTINCT n.id) AS seap_contracts,
  SUM(COALESCE(n.awarded_value, n.estimated_value)) AS total_seap_value,
  COUNT(DISTINCT n.authority_cui) AS distinct_authorities,
  MIN(n.publication_date) AS first_seap_win,
  MAX(n.publication_date) AS last_seap_win,
  -- Red-flag: still winning contracts after radiere
  COUNT(*) FILTER (
    WHERE a.data_radiere IS NOT NULL
      AND n.publication_date::date > a.data_radiere
  ) AS contracts_post_radiere
FROM asf.entitati a
JOIN seap.announcements n ON n.supplier_cui = a.cui
WHERE a.cui IS NOT NULL
GROUP BY
  a.id, a.register_no, a.register_type, a.section_status,
  a.name, a.cui, a.data_autorizare, a.data_radiere
ORDER BY total_seap_value DESC NULLS LAST
LIMIT 100;
```
**Companion recipe:** "Brokeri ASF cu datorii ANAF": brokers that appear on the
ANAF debtors list (datornici) while still active in the ASF register. Combines
`asf.mv_entitati_per_cui` with `anaf.datornici_curent`.
```sql
SELECT
  a.register_no,
  a.name,
  a.cui,
  d.suma_totala_datorii,
  d.luna_raportare
FROM asf.mv_entitati_per_cui m
JOIN asf.entitati a ON a.cui = m.cui
JOIN anaf.datornici_curent d ON d.cui = m.cui
WHERE m.nr_active > 0
ORDER BY d.suma_totala_datorii DESC
LIMIT 50;
```
## Schema reference
```
asf.entitati (
  id, register_type, section_status, register_no, name, name_normalized,
  cui, cod_rc, cod_lei, nr_autorizatie,
  data_autorizare, data_inmatriculare, data_radiere,
  tip_companie, forma_juridica,
  adresa, telefon, fax, email, web, observatii,
  clase_autorizate jsonb, conducere jsonb, raw_html,
  fetched_at
)
UNIQUE (register_type, register_no)

asf.mv_entitati_per_cui (cui, nr_total, nr_asigurator, nr_broker, ...)
```
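
Because the free-text search hits several fields, the same entity can plausibly come back for more than one search term. The sketch below shows one way to collapse such overlaps in memory using the same `(register_type, register_no)` key as the UNIQUE constraint. It is an assumption-laden illustration: `EntityRow` and `dedupe` are hypothetical names, and the `register_type` values and entity names in the sample are invented placeholders, not real data.

```typescript
// Hypothetical record shape: only the fields needed for de-duplication.
interface EntityRow {
  register_type: string;
  register_no: string;
  name: string;
}

/** Collapse overlapping results using the (register_type, register_no) key. */
function dedupe(rows: EntityRow[]): EntityRow[] {
  const byKey: { [key: string]: EntityRow } = {};
  for (const row of rows) {
    // Later rows overwrite earlier ones, so the most recent fetch wins.
    byKey[row.register_type + "|" + row.register_no] = row;
  }
  return Object.keys(byKey).map((key) => byKey[key]);
}

// Placeholder rows (invented values) simulating two overlapping searches.
const rows: EntityRow[] = [
  { register_type: "insurer", register_no: "RA-001", name: "Example A" },
  { register_type: "broker", register_no: "RBK-001", name: "Example B" },
  { register_type: "insurer", register_no: "RA-001", name: "Example A (seen again)" },
];

const unique = dedupe(rows);
console.log(unique.length); // 2
```

In the database itself the UNIQUE constraint plays the same role, so an upsert keyed on those two columns would be the natural counterpart.
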
## Refresh policy
Recommended: weekly cron (registry changes are slow: new authorizations
arrive roughly weekly, deregistrations (radieri) roughly monthly).
Estimated full scrape: ~10 min wall-clock.
```cron
# Sunday 3:30 AM
30 3 * * 0 root /opt/vreaudigital/services/seap-scraper/cron/scrape-asf.sh
```