Moved from gov-agreg/src/pages/achizitii/* to root (drop prefix). - 22 pages migrated, 127 files total - All internal links: /achizitii/X → /X (176 occurrences fixed) - AchizitiiLayout subnav rewritten: /X paths, top-right link to vreaudigital.ro hub - BaseLayout new (vreau.digital branding, OG tags, site URL) - astro.config.mjs: site https://vreau.digital, server output (was static) - docker-compose: port 5096 (vreaudigital is 5095), container vreau-digital - deploy.sh: paths /opt/vreau-digital, log /var/log/vreau-digital-deploy.log Backend shared with gov-agreg: - PostgreSQL satra (same schemas: seap, firms, anaf, anre, ...) - Photon, Martin tiles - Infisical /vreaudigital path (DATABASE_URL etc. shared) build: PASS (npx astro check 0 errors, npm run build 5s vite + 10s server)
8.9 KiB
AFIR Historical Backfill — Plan & Status
Current state (2026-05-09)
| source_year | rows | distinct beneficiars | sum UE (EUR) | fund |
|---|---|---|---|---|
| 2023 | 474,720 | 320,230 | 1,411,870,796 | FEADR |
| 2024 | 563,310 | 316,304 | 1,373,722,134 | FEADR |
| Total | 1,037,930 | — | ~2.79 mld EUR | FEADR |
Schema: fonduri.afir_plati (migration 017_fonduri_afir.sql).
Importer: cron/import-afir-historical.sh + scripts/import-afir-historical.py.
Source survey
AFIR official portal — https://www.afir.ro/rapoarte/beneficiari-de-fonduri-europene/
Two complementary pages:
/date-deschise/— only the most recent two years are linked.- Currently exposes 2023 + 2024 for FEADR (xlsx) and 2023 + 2024 for FEGA (rar).
/beneficiari-fega-si-feadr/— ASP.NET portal athttps://plati.afir.info/Plati/AfisareListaPlatii. Year selector currently exposes only 2023 and 2024. 3.7M total records in the live query interface but no programmatic XLSX dump older than 2023.
data.gov.ro CKAN — searched q=afir, q=fega, q=apia, q=feadr
Findings (relevant package IDs only):
| Dataset | URL | Notes |
|---|---|---|
Date privind proiectele PNDR (a2884dcf-…) |
proiectepndr2020.csv (2014-2020), proiectepndr2013.csv (2007-2013) |
Project-level, not payment-level. Useful for joining contracts/projects but does not replace plati. Worth ingesting separately. |
Contracte AFIR (8845aa0d-…) |
contracte-achizitii-publice-peste-5000-euro-2000.xlsx, centralizator-…2021_2022.xlsx |
Procurement contracts >5K EUR run by AFIR itself; not beneficiary payments. Different schema. |
Lista Fermierilor Campania APIA 2024 (39e5465d-…) |
lista-fermieri-apia-2024.xlsx |
One-off small dataset; APIA campaign list. |
Parcele Agricole APIA LPIS 2025 etc. |
shapefiles (.zip) | Geographic parcels, not payments. Useful later for map overlays. |
Conclusion: data.gov.ro does not have listaplati_2020/2021/2022_* payment dumps. They exist nowhere public.
opendata.afir.info
A separate CKAN-style portal (http://opendata.afir.info/) lists ProiectePNDR2020 (53K views), ProiectePS2027, AchizitiiPrivate2020. The page itself doesn't expose direct download URLs without account login. Worth investigating in next session — it may contain the 2020-2022 payment data behind an export interface.
Importer architecture
Pipeline (FEADR XLSX)
AFIR XLSX ──curl──▶ satra:/tmp/afir-historical-{YEAR}-{FUND}/
│
▼
openpyxl read_only (skips 9 banner rows)
│
▼
pipe-delimited TSV (RO decimals "12.345,67" → "12345.67")
│
▼
\\copy → fonduri.staging_afir
│
▼
DELETE FROM afir_plati WHERE source_year=YEAR (idempotent)
│
▼
INSERT INTO afir_plati (source_year=YEAR, NULLIF + ::numeric casts)
Why pipe delimiter
Beneficiar names contain commas ("FULOP ZOLTAN, GERGELY"), Obiectiv contains
both , and quote chars. Pipe is safer than comma + quoting and the loader
already replaces any literal | in source text with / before serialization.
Idempotency
DELETE WHERE source_year = N runs only on full ingests (not when
LIMIT is set for smoke tests). Re-running for the same year is safe and
produces consistent counts.
Smoke test mode
./import-afir-historical.sh URL YEAR feadr 1000
The 4th arg (LIMIT) skips the DELETE step and truncates the TSV to N rows before COPY, so you can validate end-to-end without trampling production data.
Next-session work
1. FEGA ingest (HIGHEST IMPACT, 30-60 min)
Volume: 2,476,897 rows in 2023 alone, ~580 MB CSV inside 23 MB RAR. Source URLs:
- 2023:
https://www.afir.ro/media/sxcnuvwc/listaplati_2023_fega_corectat.rar - 2024:
https://www.afir.ro/media/dqjddti2/lista-plati-beneificiari-fega-2024.rar
Schema differences vs FEADR XLSX (column-by-column):
| FEADR XLSX (RO header) | FEGA CSV (concat header) | Notes |
|---|---|---|
| Numele beneficiarului | DenumireBeneficiar |
same |
| Numele de familie | NumeFamilie |
same |
| Denumirea societatii-mama si codul de inregistrare fiscala | Cui |
FEGA CSV exposes a real CUI column (mostly empty for natural persons, populated for SRL/PFA — bonus enrichment vs FEADR XLSX) |
| Localitate | Localicate (typo in source) |
same content |
| Codul masurii/tipului de interventie | Masura |
same; FEGA codes look like MICA / scheme acronyms instead of M 06 etc |
| Obiectiv | ObiectivSpecific |
longer descriptions |
| Data inceperii / Data incheierii | DataIncepere / DataSfarsit |
usually empty |
| Cuantum {Operatiune,Total} {FEGA,FEADR} | same 4 columns | decimals already in . format (English-locale, no comma swap needed) |
| Cuantum aferent operatiunii | CuantumAferentOperatiune |
same |
| Cuantum total cofinantare beneficiari | CuantumTotalCofinantareBeneficiar |
same |
| Cuantum total UE Beneficiar | CuantumtotalUEBenefeciar (typo in source) |
same |
Implementation choices:
Option A — augment afir_plati with tip_fond discriminator.
Add ALTER TABLE fonduri.afir_plati ADD COLUMN tip_fond text CHECK (tip_fond IN ('FEADR','FEGA'));
Re-tag existing rows as 'FEADR'. Importer writes both. Uniform downstream query.
Option B — separate table fonduri.fega_plati.
Different cardinality (5x rows), different measure code namespace; some
queries naturally separate. But duplicates the index/MV maintenance burden.
Recommendation: Option A. The schema is identical, the differences are
namespace-of-codes only. A single discriminator keeps things simple, fits
the existing gin_trgm name index, and lets the recipe code do
WHERE tip_fond='FEGA' cheaply (b-tree on tip_fond if needed).
FEGA importer changes vs current FEADR script:
- Download →
unrar x(already installed on satra now:apt install unrarwas run). - New python normalizer
import-afir-historical-fega.py— reads CSV not XLSX; column-name remapping; no RO-decimal swap. - Pass new
FUND=fegaflag → script writestip_fond='FEGA'and uses CSV path. - Cui column passthrough — write directly into the existing
cuicolumn when non-empty, withcui_match_method='afir_self_reported'andcui_match_score=1.0. Skip fuzzy matcher for these.
Volume budget: 2.48M rows × 2 years = ~5M rows. Same staging table works (TRUNCATE between runs). Postgres COPY @ ~100K rows/s → ~25s/year for COPY, plus ~60s for INSERT. Total ~5 min per year.
2. Historical FEADR 2020/2021/2022 (BLOCKED on source)
Status: not publicly available.
Investigation outcome:
- AFIR
/date-deschise/page shows only 2023+2024. plati.afir.infoportal shows only 2023+2024.- data.gov.ro CKAN has no
listaplati_<year>resources.
Options to unblock (in order of cost):
- Email AFIR direct —
comunicare@afir.infoand request the historical payment lists 2020-2022 under Law 544/2001 (FOIA equivalent). They are legally obligated to provide. Expected: 2-4 week response. - Wayback Machine archive — check
https://web.archive.org/web/2023*/afir.ro/rapoarte/beneficiari-de-fonduri-europene/date-deschise/for snapshots that still link to old XLSX files. URLs may still resolve (AFIR media folder is content-addressed:/media/<hash>/file.xlsx). - opendata.afir.info account — the dataset titles
AchizitiiPrivate2020,ProiectePNDR2020suggest historical exports may live here, but the download interface needs login. Apply for an open-data access account.
Estimated row counts when obtained: ~450K-500K per year (extrapolating from 2023 = 475K and 2024 = 563K).
3. APIA-specific datasets (LOWER PRIORITY)
Lista Fermierilor Campania APIA 2024 (small file, ~50K rows expected).
This is a subset of FEGA payments (only certain campaigns), so once FEGA
2024 is ingested, this dataset is partially redundant. Worth ingesting
into a separate fonduri.apia_fermieri table only if it carries the
geographic columns (parcel codes) the FEGA dump lacks.
Geographic LPIS shapefiles (Parcele Agricole APIA LPIS 2025,
Categorii de Folosință) are map data, not payment data — defer to
when we add map overlays to /achizitii/firma/[cui] profile pages.
Files modified/added in this session
- NEW
services/seap-scraper/scripts/import-afir-historical.py— XLSX→TSV normalizer - NEW
services/seap-scraper/cron/import-afir-historical.sh— orchestrator - NEW
services/seap-scraper/AFIR-HISTORICAL-PLAN.md(this file)
fonduri.afir_plati schema unchanged — no migration. The DELETE+INSERT
flow uses the existing table as-is. Adding tip_fond discriminator is
a follow-up migration when FEGA ingest is implemented.