ASF — Autoritatea de Supraveghere Financiară

Public registries of authorized financial entities — insurers, brokers, pension funds, asset managers, intermediaries.

Status (2026-05-10)

MVP ingest complete: 849 entities, 100% CUI coverage. Captures data.asfromania.ro/scr/ra via free-text term enumeration.

| Register type | Active | Radiated |
|---|---|---|
| Asigurători (RA-NNN) | 24 | 37 |
| Brokeri (RBK-NNN) | 245 | 543 |

Cross-source signal (validated): 69 ASF-licensed firms hold 3,530 SEAP contracts totaling €614 mln. Top: ASIROM (RA-023) — 523 contracts, €283 mln; ALLIANZ-ȚIRIAC (RA-017) — 467 contracts, €50 mln; GROUPAMA (RA-009) — 315 contracts, €41 mln. Zero contracts won post-radiere (positive integrity signal).

Files:

  • SQL: services/seap-scraper/sql/034_asf.sql — schema asf (entitati, scrape_log, mv_entitati_per_cui).
  • Scraper: services/seap-scraper/src/scrape-asf.ts
  • Wrapper: services/seap-scraper/cron/scrape-asf.sh

Source map (ASF registers ecosystem)

| Sub-register | Volume | URL | Status |
|---|---|---|---|
| Asigurători (RA-NNN) + Intermediari principali (RBK-NNN), active + radiate | ~860 | data.asfromania.ro/scr/ra | Done (this scraper) |
| Intermediari secundari (RIS) | ~variable | asfromania.ro/ro/a/1704 | TODO |
| Specialiști constatare daune | ~variable | asfromania.ro/ro/a/1999 | TODO |
| Furnizori programe formare | ~variable | asfromania.ro/ro/a/2068 | TODO |
| Lectori | ~variable | asfromania.ro/ro/a/2067 | TODO |
| Piață de capital (SSIF/AOPC/SAI/depozitari) | ~30-50 | asfromania.ro/ro/a/1705 | TODO |
| Pensii private (Pillar 2 + 3 + administratori) | ~20 | asfromania.ro/ro/a/2365 + data.asfromania.ro/scr/adeziuniFP | TODO |
| Asigurători din SEE (passporting) | ~hundreds | asfromania.ro/ro/a/2082 | TODO |

Critical scraping insight (the trick)

data.asfromania.ro/scr/ra/cautare POST endpoint is fronted by Google reCAPTCHA Enterprise but the server only validates the captcha if the form field g-recaptcha-response is present in the body. When that field is OMITTED entirely, the captcha check is skipped and the server returns full results. (When sent with any value, even empty, server tries to verify and rejects with "Verificare captcha eșuată".)

Fields per response (HTML inside raspuns):

  • Registration number (RA-NNN / RBK-NNN), unique within its register type
  • LEI (20 characters), CUI, trade-register (RC) code (e.g. J40/2226/2006)
  • Authorization number + date, registration date, deregistration (radiere) date (NULL while active)
  • Type (Societate de asigurare / Intermediar principal)
  • Legal form, address, phone, fax
  • Authorized classes (general + life — array)
  • Executives (Conducere executivă)
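
Since the registration number is the natural key for every later step, it helps to split it into prefix and sequence early. A minimal sketch, assuming numbers always look like `RA-023` or `RBK-245` and are zero-padded to three digits (the padding width is inferred from the examples in this document, not confirmed). `parseRegisterNo` and `formatRegisterNo` are hypothetical helper names, not functions from scrape-asf.ts.

```typescript
// Hypothetical helpers: not taken from scrape-asf.ts.
type RegisterPrefix = "RA" | "RBK";

interface RegisterNo {
  prefix: RegisterPrefix;
  seq: number;
}

// Split "RA-023" into { prefix: "RA", seq: 23 }; return null for anything else.
function parseRegisterNo(raw: string): RegisterNo | null {
  const match = /^(RA|RBK)-(\d+)$/.exec(raw.trim().toUpperCase());
  if (match === null) return null;
  return { prefix: match[1] as RegisterPrefix, seq: Number(match[2]) };
}

// Rebuild the printable form. The 3-digit padding is an assumption.
function formatRegisterNo(no: RegisterNo, width: number = 3): string {
  let digits = String(no.seq);
  while (digits.length < width) digits = "0" + digits;
  return no.prefix + "-" + digits;
}
```

Anything that does not match the pattern, such as an RC code like J40/2226/2006, yields `null` rather than a partial parse.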

Constraints

  • Server-side validation: termen must be ≥4 characters.
  • Free-text search hits multiple fields (denumire, CUI, adresă, județ, classes).
  • sectiune (1=active / 2=radiate) and tipCompanie (0=insurer / 1=broker) appear to be IGNORED by the search endpoint when termen is given — results span all sections regardless.
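
These constraints have two practical consequences: terms shorter than 4 characters should be dropped before sending, and results from different terms overlap, so they need de-duplication. The sketch below illustrates both under stated assumptions: `ParsedEntity` is a hypothetical minimal row shape, and the register number (which already carries the RA/RBK prefix) is assumed to be a sufficient de-duplication key.

```typescript
// Minimum term length enforced server-side, per the constraint above.
const MIN_TERM_LENGTH = 4;

function isUsableTerm(term: string): boolean {
  return term.trim().length >= MIN_TERM_LENGTH;
}

// Hypothetical minimal shape of one parsed result row.
interface ParsedEntity {
  registerNo: string; // e.g. "RA-023"
  name: string;
}

// Merge result batches from several terms, keeping the first
// occurrence of each register number.
function mergeResults(batches: ParsedEntity[][]): ParsedEntity[] {
  const seen: Record<string, boolean> = {};
  const merged: ParsedEntity[] = [];
  for (const batch of batches) {
    for (const entity of batch) {
      if (Object.prototype.hasOwnProperty.call(seen, entity.registerNo)) continue;
      seen[entity.registerNo] = true;
      merged.push(entity);
    }
  }
  return merged;
}
```

Because the section and company-type filters appear to be ignored, the merge step is where active and radiated rows from the same query get reconciled.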

Strategy used

  1. Seed phase: 11 broad terms (ASIGURA, BROKER, BUCU, CLUJ, TIMI, BRAS, RETRA, RADI, FUZIO, ...) covering active + radiated entries. Yields ~840 entities.
  2. Gap-fill phase: for each prefix (RA-, RBK-), compute the observed number sequence, then probe every gap plus 5 numbers past the maximum via direct register-no lookup. Yields the final ~20 missing entities.
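
The gap-fill computation can be sketched as a pure function over the sequence numbers seen for one prefix. This is an illustration, not the code in scrape-asf.ts: `probeCandidates` is a hypothetical name, and it assumes gaps are counted between the smallest and largest observed numbers.

```typescript
// Given the sequence numbers observed for one prefix (e.g. RA-),
// return the numbers worth probing: every gap between the smallest
// and largest observed value, plus `lookahead` numbers past the max.
function probeCandidates(observed: number[], lookahead: number = 5): number[] {
  if (observed.length === 0) return [];
  const present: Record<number, boolean> = {};
  let min = Infinity;
  let max = -Infinity;
  for (const n of observed) {
    present[n] = true;
    if (n < min) min = n;
    if (n > max) max = n;
  }
  const candidates: number[] = [];
  for (let n = min; n <= max + lookahead; n++) {
    if (!present[n]) candidates.push(n);
  }
  return candidates;
}

// Observed 1, 2, 4, 7: gaps are 3, 5, 6; lookahead adds 8 through 12.
const example = probeCandidates([1, 2, 4, 7]);
// example is [3, 5, 6, 8, 9, 10, 11, 12]
```

Each candidate would then be formatted back into a register number and looked up directly. The lookahead catches entries authorized after the highest number the seed terms happened to surface.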

Next steps (TODO for follow-up agents)

Quick wins (1-2h each)

  1. Pensii private: data.asfromania.ro/scr/adeziuniFP likely has the same captcha-bypass trick. The ~7-15 fund administrators are a small but high-value set (NN, BCR Pensii, Allianz-Țiriac Pensii, etc.).

  2. SEE passporting list: asfromania.ro/ro/a/2082. EU-wide insurers selling RCA in Romania. Probably an HTML table on the page itself.

Medium (3-5h)

  1. Piață de capital register (SSIF, SAI, AOPC, depozitari) — typically PDF/Excel attachments at asfromania.ro/uploads/articole/. ~50 entities total. Replicates the fonduri.beneficiar_anunt Excel-parser pattern.

  2. Intermediari secundari (RIS) — large (~thousands) but mostly individuals (no CUI). May not be worth the effort vs. corporate registers.

Cross-source recipe

"Asigurători + brokeri ASF cu contracte SEAP" — financial firms licensed by ASF that have won state insurance/financial-services contracts.

-- Recipe: ASF-licensed firms × SEAP wins
SELECT
  a.register_no,
  a.register_type,
  a.section_status,
  a.name AS asf_name,
  a.cui,
  a.data_autorizare,
  a.data_radiere,
  COUNT(DISTINCT n.id)                                    AS seap_contracts,
  SUM(COALESCE(n.awarded_value, n.estimated_value))       AS total_seap_value,
  COUNT(DISTINCT n.authority_cui)                         AS distinct_authorities,
  MIN(n.publication_date)                                 AS first_seap_win,
  MAX(n.publication_date)                                 AS last_seap_win,
  -- Red-flag: still winning contracts after radiere
  COUNT(*) FILTER (WHERE a.data_radiere IS NOT NULL
                   AND n.publication_date::date > a.data_radiere) AS contracts_post_radiere
FROM asf.entitati a
JOIN seap.announcements n ON n.supplier_cui = a.cui
WHERE a.cui IS NOT NULL
GROUP BY a.id, a.register_no, a.register_type, a.section_status, a.name, a.cui, a.data_autorizare, a.data_radiere
ORDER BY total_seap_value DESC NULLS LAST
LIMIT 100;
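
The red-flag condition in this recipe is easy to get subtly wrong, so it can be useful to restate it outside SQL for unit testing. A minimal sketch, assuming both dates arrive as ISO 8601 strings (`YYYY-MM-DD`, optionally followed by a time); `countPostRadiere` is a hypothetical helper, not part of the scraper. It mirrors `publication_date::date > data_radiere`: the comparison is strict, so a contract published on the day of radiere itself is not flagged.

```typescript
// Count publications dated strictly after the radiere date.
// Both inputs are assumed to be ISO 8601 strings; only the date
// part (first 10 characters) is compared, like the ::date cast.
function countPostRadiere(dataRadiere: string | null, publicationDates: string[]): number {
  if (dataRadiere === null) return 0; // still active: nothing to flag
  const cutoff = dataRadiere.slice(0, 10);
  let count = 0;
  for (const published of publicationDates) {
    // ISO dates sort lexicographically, so string comparison is enough.
    if (published.slice(0, 10) > cutoff) count++;
  }
  return count;
}
```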

Companion recipe: "Brokeri ASF cu datorii ANAF", i.e. brokers on the ANAF datornici (tax debtors) list that are still active in the ASF register. Combines asf.mv_entitati_per_cui with anaf.datornici_curent.

SELECT
  a.register_no,
  a.name,
  a.cui,
  d.suma_totala_datorii,
  d.luna_raportare
FROM asf.mv_entitati_per_cui m
JOIN asf.entitati a ON a.cui = m.cui
JOIN anaf.datornici_curent d ON d.cui = m.cui
WHERE m.nr_active > 0
ORDER BY d.suma_totala_datorii DESC
LIMIT 50;

Schema reference

asf.entitati (
  id, register_type, section_status, register_no, name, name_normalized,
  cui, cod_rc, cod_lei, nr_autorizatie,
  data_autorizare, data_inmatriculare, data_radiere,
  tip_companie, forma_juridica,
  adresa, telefon, fax, email, web, observatii,
  clase_autorizate jsonb, conducere jsonb, raw_html,
  fetched_at
)
UNIQUE (register_type, register_no)

asf.mv_entitati_per_cui (cui, nr_total, nr_asigurator, nr_broker, ...)

Refresh policy

Recommended: weekly cron (registry changes are slow: new authorizations arrive roughly weekly, deregistrations (radieri) roughly monthly). Estimated full scrape: ~10 min wall-clock.

# Sunday 3:30 AM
30 3 * * 0 root /opt/vreaudigital/services/seap-scraper/cron/scrape-asf.sh