initial: split from gov-agreg — vreau.digital standalone platform
Moved from gov-agreg/src/pages/achizitii/* to root (drop prefix).

- 22 pages migrated, 127 files total
- All internal links: /achizitii/X → /X (176 occurrences fixed)
- AchizitiiLayout subnav rewritten: /X paths, top-right link to vreaudigital.ro hub
- BaseLayout new (vreau.digital branding, OG tags, site URL)
- astro.config.mjs: site https://vreau.digital, server output (was static)
- docker-compose: port 5096 (vreaudigital is 5095), container vreau-digital
- deploy.sh: paths /opt/vreau-digital, log /var/log/vreau-digital-deploy.log

Backend shared with gov-agreg:

- PostgreSQL satra (same schemas: seap, firms, anaf, anre, ...)
- Photon, Martin tiles
- Infisical /vreaudigital path (DATABASE_URL etc. shared)

build: PASS (npx astro check 0 errors, npm run build 5s vite + 10s server)
@@ -0,0 +1,170 @@
# ASF: Autoritatea de Supraveghere Financiară

Public registries of entities authorized by ASF, Romania's Financial
Supervisory Authority: insurers, brokers, pension funds, asset managers,
intermediaries.

## Status (2026-05-10)

**MVP ingest complete: 849 entities, 100% CUI coverage.**
Source: `data.asfromania.ro/scr/ra`, scraped via free-text term enumeration.

| Register type | Active | Struck off (radiate) |
|---|---|---|
| Asigurători / insurers (RA-NNN) | 24 | 37 |
| Brokeri / brokers (RBK-NNN) | 245 | 543 |

**Cross-source signal (validated):** 69 ASF-licensed firms hold 3,530 SEAP
contracts totaling **€614 million**. Top three: ASIROM (RA-023) with 523
contracts, €283 million; ALLIANZ-ȚIRIAC (RA-017) with 467 contracts,
€50 million; GROUPAMA (RA-009) with 315 contracts, €41 million. No contracts
were won after the winner's strike-off (radiere) date, which is a positive
integrity signal.

Files:
- SQL: `services/seap-scraper/sql/034_asf.sql`, which defines schema `asf` (`entitati`, `scrape_log`, `mv_entitati_per_cui`).
- Scraper: `services/seap-scraper/src/scrape-asf.ts`
- Wrapper: `services/seap-scraper/cron/scrape-asf.sh`

## Source map (ASF registers ecosystem)

| Sub-register | Volume (entities) | URL | Status |
|---|---|---|---|
| Asigurători (RA-NNN) + Intermediari principali (RBK-NNN), active + radiate | ~860 | `data.asfromania.ro/scr/ra` | **Done (this scraper)** |
| Intermediari secundari (RIS), secondary intermediaries | varies | `asfromania.ro/ro/a/1704` | TODO |
| Specialiști constatare daune, claims assessment specialists | varies | `asfromania.ro/ro/a/1999` | TODO |
| Furnizori programe formare, training programme providers | varies | `asfromania.ro/ro/a/2068` | TODO |
| Lectori, lecturers | varies | `asfromania.ro/ro/a/2067` | TODO |
| Piață de capital, capital market (SSIF/AOPC/SAI/depozitari) | ~30-50 | `asfromania.ro/ro/a/1705` | TODO |
| Pensii private, private pensions (Pillar 2 + 3 + administratori) | ~20 | `asfromania.ro/ro/a/2365` + `data.asfromania.ro/scr/adeziuniFP` | TODO |
| Asigurători din SEE, EEA insurers (passporting) | hundreds (estimate) | `asfromania.ro/ro/a/2082` | TODO |

## Critical scraping insight (the trick)

The `data.asfromania.ro/scr/ra/cautare` POST endpoint is fronted by Google
reCAPTCHA Enterprise, but **the server only validates the captcha if the
form field `g-recaptcha-response` is present in the body**. When that field
is omitted entirely, the captcha check is skipped and the server returns
full results. When the field is sent with any value, even an empty one, the
server attempts verification and rejects the request with "Verificare
captcha eșuată" ("captcha verification failed").

Fields per response (HTML inside `raspuns`):
- Registration number (RA-XXX / RBK-XXX), unique within each register type
- LEI (20 characters), CUI, trade-register (RC) code (e.g. J40/2226/2006)
- Authorization number and date, registration date, strike-off (radiere) date (NULL while active)
- Type (Societate de asigurare / Intermediar principal)
- Legal form, address, phone, fax
- Authorized classes (general + life, stored as an array)
- Executives (Conducere executivă)

## Constraints

- Server-side validation: `termen` must be at least 4 characters long.
- Free-text search matches across multiple fields (denumire, CUI, adresă, județ, classes).
- `sectiune` (1=active / 2=radiate) and `tipCompanie` (0=insurer / 1=broker)
  appear to be ignored by the search endpoint when `termen` is given;
  results span all sections regardless.
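
These constraints can be checked locally before a request is sent. The sketch below is one possible shape, under stated assumptions: `buildSearchFields` and `SearchFilters` are hypothetical names, only the three parameter names (`termen`, `sectiune`, `tipCompanie`) come from this document, and trimming the term before measuring its length is a guess about how the server counts characters.

```typescript
// Minimum length the server accepts for `termen`, per the constraint above.
const MIN_TERM_LENGTH = 4;

// Optional filters. The notes above report that the endpoint appears to
// ignore both when `termen` is set, so they are kept only for completeness.
interface SearchFilters {
  sectiune?: 1 | 2;    // 1 = active, 2 = radiate
  tipCompanie?: 0 | 1; // 0 = insurer, 1 = broker
}

// Hypothetical helper: validates the term locally and returns the form
// fields as plain strings, ready to be URL-encoded by the caller.
function buildSearchFields(
  termen: string,
  filters: SearchFilters = {},
): Record<string, string> {
  const term = termen.trim(); // assumption: the server ignores outer whitespace
  if (term.length < MIN_TERM_LENGTH) {
    throw new RangeError(
      `termen must be at least ${MIN_TERM_LENGTH} characters, got "${term}"`,
    );
  }
  const fields: Record<string, string> = { termen: term };
  if (filters.sectiune !== undefined) {
    fields["sectiune"] = String(filters.sectiune);
  }
  if (filters.tipCompanie !== undefined) {
    fields["tipCompanie"] = String(filters.tipCompanie);
  }
  return fields;
}
```

Failing fast on short terms avoids spending a request on input the server would reject anyway. Because the filters seem to be ignored, callers should still expect mixed active and radiate results and deduplicate on `(register_type, register_no)`.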

## Strategy used

1. **Seed phase**: 11 broad terms (ASIGURA, BROKER, BUCU, CLUJ, TIMI, BRAS,
   RETRA, RADI, FUZIO, ...) covering active + struck-off entries. Yields ~840 entities.
2. **Gap-fill phase**: for each prefix (RA-, RBK-), compute the observed
   sequence of register numbers, then probe the gaps plus 5 entries past the
   max via direct register-number lookup. Yields the final ~20 missing entities.
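
The gap-fill phase is essentially a set difference over sequence numbers. A minimal sketch follows, assuming sequences start at 1 and register numbers are zero-padded to three digits (as in RA-023). `gapFillCandidates` and `formatRegisterNo` are hypothetical names, not taken from `scrape-asf.ts`.

```typescript
// List the sequence numbers worth probing for one prefix: every number
// missing between 1 and the observed max, plus `lookahead` numbers past it.
function gapFillCandidates(observed: number[], lookahead = 5): number[] {
  if (observed.length === 0) return [];
  const seen = new Set<number>(observed);
  const max = Math.max(...observed);
  const candidates: number[] = [];
  for (let n = 1; n <= max + lookahead; n++) {
    if (!seen.has(n)) candidates.push(n);
  }
  return candidates;
}

// Render a sequence number as a register number, e.g. ("RA", 23) -> "RA-023".
// Assumption: at least three digits, zero-padded, as in the examples above.
function formatRegisterNo(prefix: "RA" | "RBK", seq: number): string {
  return `${prefix}-${String(seq).padStart(3, "0")}`;
}
```

For observed RA numbers 1, 2, 4 and 7 with a lookahead of 2, the candidates are 3, 5, 6, 8 and 9, which would be probed as `RA-003`, `RA-005`, and so on. Each formatted register number is at least six characters long, so it satisfies the 4-character minimum on `termen`.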

## Next steps (TODO for follow-up agents)

### Quick wins (1-2h each)

1. **Pensii private** (private pensions): `data.asfromania.ro/scr/adeziuniFP`
   likely accepts the same captcha-bypass trick. The ~7-15 fund administrators
   are a small but high-value set (NN, BCR Pensii, Allianz-Țiriac Pensii, etc.).

2. **SEE passporting list**: `asfromania.ro/ro/a/2082`. EU-wide insurers
   selling RCA (motor third-party liability) in Romania. Probably an HTML
   table on the page itself.

### Medium (3-5h)

3. **Piață de capital register** (`SSIF`, `SAI`, `AOPC`, depozitari):
   typically PDF/Excel attachments at `asfromania.ro/uploads/articole/`. ~50
   entities total. Can reuse the `fonduri.beneficiar_anunt` Excel-parser
   pattern.

4. **Intermediari secundari (RIS)**: large (~thousands) but mostly
   individuals (no CUI). May not be worth the effort compared with the
   corporate registers.

## Cross-source recipe

**"Asigurători + brokeri ASF cu contracte SEAP"** ("ASF insurers and brokers
with SEAP contracts"): financial firms licensed by ASF that have won state
insurance or financial-services contracts.

```sql
-- Recipe: ASF-licensed firms × SEAP wins
SELECT
  a.register_no,
  a.register_type,
  a.section_status,
  a.name AS asf_name,
  a.cui,
  a.data_autorizare,
  a.data_radiere,
  COUNT(DISTINCT n.id) AS seap_contracts,
  -- Fall back to the estimated value when no awarded value is recorded
  SUM(COALESCE(n.awarded_value, n.estimated_value)) AS total_seap_value,
  COUNT(DISTINCT n.authority_cui) AS distinct_authorities,
  MIN(n.publication_date) AS first_seap_win,
  MAX(n.publication_date) AS last_seap_win,
  -- Red flag: contracts published after the strike-off (radiere) date.
  -- Counted with DISTINCT so it stays comparable to seap_contracts.
  COUNT(DISTINCT n.id) FILTER (
    WHERE a.data_radiere IS NOT NULL
      AND n.publication_date::date > a.data_radiere
  ) AS contracts_post_radiere
FROM asf.entitati a
JOIN seap.announcements n ON n.supplier_cui = a.cui
WHERE a.cui IS NOT NULL
GROUP BY a.id, a.register_no, a.register_type, a.section_status, a.name, a.cui, a.data_autorizare, a.data_radiere
ORDER BY total_seap_value DESC NULLS LAST
LIMIT 100;
```

**Companion recipe:** "Brokeri ASF cu datorii ANAF" ("ASF brokers with ANAF
debts"): brokers on the ANAF debtors (datornici) list that are still active
in the ASF register. Combines `asf.mv_entitati_per_cui` with
`anaf.datornici_curent`.

```sql
SELECT
  a.register_no,
  a.name,
  a.cui,
  d.suma_totala_datorii,
  d.luna_raportare
FROM asf.mv_entitati_per_cui m
JOIN asf.entitati a ON a.cui = m.cui
JOIN anaf.datornici_curent d ON d.cui = m.cui
WHERE m.nr_active > 0
  -- Keep only broker entries that are themselves still active, so the
  -- result matches the recipe title (RBK- prefix, no radiere date)
  AND a.register_no LIKE 'RBK-%'
  AND a.data_radiere IS NULL
ORDER BY d.suma_totala_datorii DESC
LIMIT 50;
```

## Schema reference

```
asf.entitati (
  id, register_type, section_status, register_no, name, name_normalized,
  cui, cod_rc, cod_lei, nr_autorizatie,
  data_autorizare, data_inmatriculare, data_radiere,
  tip_companie, forma_juridica,
  adresa, telefon, fax, email, web, observatii,
  clase_autorizate jsonb, conducere jsonb, raw_html,
  fetched_at
)
UNIQUE (register_type, register_no)

asf.mv_entitati_per_cui (cui, nr_total, nr_asigurator, nr_broker, ...)
```

## Refresh policy

Recommended: weekly cron. The registry changes slowly: new authorizations
arrive roughly weekly and strike-off (radiere) events roughly monthly.
Estimated full scrape: ~10 minutes wall-clock.

```cron
# Sunday 3:30 AM
30 3 * * 0 root /opt/vreaudigital/services/seap-scraper/cron/scrape-asf.sh
```