Moved from gov-agreg/src/pages/achizitii/* to root (drop prefix). - 22 pages migrated, 127 files total - All internal links: /achizitii/X → /X (176 occurrences fixed) - AchizitiiLayout subnav rewritten: /X paths, top-right link to vreaudigital.ro hub - BaseLayout new (vreau.digital branding, OG tags, site URL) - astro.config.mjs: site https://vreau.digital, server output (was static) - docker-compose: port 5096 (vreaudigital is 5095), container vreau-digital - deploy.sh: paths /opt/vreau-digital, log /var/log/vreau-digital-deploy.log Backend shared with gov-agreg: - PostgreSQL satra (same schemas: seap, firms, anaf, anre, ...) - Photon, Martin tiles - Infisical /vreaudigital path (DATABASE_URL etc. shared) build: PASS (npx astro check 0 errors, npm run build 5s vite + 10s server)
8.6 KiB
APIA — Lista Fermieri (data.gov.ro CKAN ingest)
Current state (2026-05-10)
| metric | value |
|---|---|
| Schema | apia.fermieri + apia.staging_fermieri + apia.scrape_log + apia.mv_per_cui |
| Migration | services/seap-scraper/sql/036_apia.sql |
| Importer (python) | services/seap-scraper/scripts/import-apia-fermieri.py |
| Importer (bash wrapper) | services/seap-scraper/cron/import-apia-fermieri.sh |
| Rows ingested | 191 (Găgești, jud. Vaslui, campaign 2024) |
| Resources | 1 / 1 discoverable on data.gov.ro |
| Comune | 13 (rezident vs. proprietar — Găgești + diaspora) |
| Suprafață totală | 1 575,17 ha |
| PJ (is_legal_person) | 2 (PFA, SRL) |
| CUI matched (firms.entities) | 1 / 2 (50%) — SC WARDAMA SRL (CUI 28501796) |
| Cross-source AFIR FEGA hits | 1 firmă (WARDAMA, 2 plăți FEGA, 26.28 EUR) |
| Cross-source ANAF datornici | 0 |
Reality check: data.gov.ro APIA scope
The prompt's expectation was 500–700K farmers in a single national XLSX. That dataset does not exist on CKAN. The only published "Lista fermieri APIA" XLSX on data.gov.ro covers a single comuna (Găgești, Vaslui, ~192 farmers).
Why this matters
- AFIR's FEGA dump (
fonduri.afir_plati WHERE tip_fond='FEGA', 4 290 976 rows for 2023+2024) is the actual national farmer-payment dataset. APIA "Lista fermieri" publishes declarations (suprafață, responsabil UAT, centru APIA) — APIA is the paying agency, AFIR records the actual payments. - The two are complementary, not redundant:
- APIA list → "who declared and how many ha"
- AFIR FEGA → "who actually got paid and how much"
- A future-proof importer that auto-discovers any new
lista-fermieri-*package on data.gov.ro is what we built. When more UATs publish, re-run and it ingests them automatically (idempotent onsource_resource_id).
APIA national-level data (unblocked)
The actual national list of beneficiaries lives at https://www.apia.org.ro/ but the site returns HTTP 403 for non-browser User-Agents. Out of scope for this pass. Options to unblock (in cost order):
- Email APIA direct — request structured data under Law 544/2001.
- Browserless / Playwright scraper — render JS, fetch the table. Adds infra cost (one more Docker container, captcha risk).
- Fall back on AFIR FEGA — already ingested; covers the question "who got subsidies in 2023/2024" at national scale, just without the suprafață breakdown.
Schema highlights
CREATE TABLE apia.fermieri (
id bigserial PRIMARY KEY,
campaign_year smallint NOT NULL,
name text NOT NULL,
name_normalized text,
cui text,
cui_match_method text, -- 'exact_norm' | 'trgm_unique'
cui_match_score numeric(4,3),
is_legal_person boolean, -- detected from name shape (SRL/SA/PFA/II/IF/SC/COOPERATIVA)
judet text,
comuna_oras text,
sat text,
centru_apia text, -- e.g. 'MURGENI'
responsabil_uat text, -- UAT employee, not the farmer
suprafata_ha numeric(12,4), -- declared hectares (precedent campaign)
source_dataset_id text NOT NULL,
source_resource_id text NOT NULL,
source_url text NOT NULL,
fetched_at timestamptz NOT NULL DEFAULT now(),
UNIQUE NULLS NOT DISTINCT (campaign_year, name, comuna_oras, sat)
);
Importer pipeline
CKAN package_search?q=lista+fermieri+APIA
│
▼ (jq filter dataset name `lista-fermier*`, format=XLSX)
download XLSX on satra (curl)
│
▼
openpyxl read → header detect → pipe-TSV
(NR.CRT, NUME PRENUME, RESPONSABIL UAT, COMUNA/ORAS, SAT, CENTRU APIA, SUPRAFATA)
│
▼
TRUNCATE apia.staging_fermieri
\\copy apia.staging_fermieri FROM ... pipe-delimited
│
▼
DELETE FROM apia.fermieri WHERE source_resource_id = $RID -- idempotent
│
▼
INSERT ... DISTINCT ON (year, name, comuna, sat) -- in-batch dedupe
ON CONFLICT (...) DO UPDATE -- cross-batch dedupe
│
▼
apia.match_cui() -- exact_norm + trgm fallback
│
▼
REFRESH MATERIALIZED VIEW apia.mv_per_cui
│
▼
INSERT INTO apia.scrape_log (rows_seen, rows_inserted, duration_ms, ...)
Operational
# Full discovery + ingest (default)
./cron/import-apia-fermieri.sh
# Specific year
./cron/import-apia-fermieri.sh 2024
# Smoke test (only first resource)
./cron/import-apia-fermieri.sh 2024 1
Idempotent: re-running re-deletes by source_resource_id and re-inserts. Safe to put on
a monthly cron — new UAT publications are picked up automatically.
Cross-source recipes
1. "Fermier (PJ) primește subvenții și are datorii la stat"
SELECT
f.name,
f.cui,
f.comuna_oras,
f.suprafata_ha AS ha_declarate,
d.suma_datorata_lei
FROM apia.fermieri f
JOIN anaf.datornici d ON d.cui = f.cui
ORDER BY d.suma_datorata_lei DESC NULLS LAST;
-- Currently: 0 hits (only 1 PJ matched in this dataset). Will scale with more UATs.
2. "Fermier APIA × FEGA AFIR plăți reale"
SELECT
f.name,
f.cui,
f.comuna_oras,
f.suprafata_ha AS ha_declarate_apia,
COUNT(a.id) AS plati_fega,
ROUND(SUM(a.ue_total)::numeric, 2) AS total_fega_eur,
ROUND((SUM(a.ue_total) / NULLIF(f.suprafata_ha, 0))::numeric, 2) AS eur_per_ha
FROM apia.fermieri f
JOIN fonduri.afir_plati a
ON a.cui = f.cui
AND a.tip_fond = 'FEGA'
GROUP BY f.name, f.cui, f.comuna_oras, f.suprafata_ha
ORDER BY total_fega_eur DESC;
-- Validated: SC WARDAMA SRL (28501796) → 2 plăți FEGA, 26.28 EUR pentru 1.1 ha.
3. "Outlier EUR/ha — fermă cu plăți disproporționate"
SELECT *
FROM (
SELECT
f.name,
f.cui,
f.suprafata_ha,
SUM(a.ue_total) AS total_fega_eur,
SUM(a.ue_total) / NULLIF(f.suprafata_ha, 0) AS eur_per_ha
FROM apia.fermieri f
JOIN fonduri.afir_plati a ON a.cui = f.cui AND a.tip_fond = 'FEGA'
GROUP BY f.name, f.cui, f.suprafata_ha
) x
WHERE eur_per_ha > 500
ORDER BY eur_per_ha DESC
LIMIT 50;
-- Threshold 500 EUR/ha is high for plăți FEGA directe (~150-300 EUR/ha standard);
-- > 500 = atipic (cuplate cu măsuri de mediu sau scheme speciale).
4. "Fermier (PF) cu suprafață mare în mai multe comune"
SELECT
name,
array_agg(DISTINCT comuna_oras) AS comune,
COUNT(*) AS aparitii,
SUM(suprafata_ha) AS total_ha
FROM apia.fermieri
WHERE is_legal_person IS NOT TRUE
GROUP BY name
HAVING COUNT(*) > 1
ORDER BY total_ha DESC;
-- Detectează "fermieri-fantomă" cu același nume în mai multe UAT-uri.
5. "Cross UAT — responsabili APIA cu cele mai multe ferme"
SELECT
responsabil_uat,
centru_apia,
COUNT(*) AS n_ferme,
SUM(suprafata_ha) AS ha_totale
FROM apia.fermieri
WHERE responsabil_uat IS NOT NULL
GROUP BY responsabil_uat, centru_apia
ORDER BY ha_totale DESC NULLS LAST;
-- Operational view — cine la APIA gestionează ce volum.
Files added in this pass
- NEW
services/seap-scraper/sql/036_apia.sql - NEW
services/seap-scraper/scripts/import-apia-fermieri.py - NEW
services/seap-scraper/cron/import-apia-fermieri.sh - NEW
services/seap-scraper/APIA-PLAN.md(this file)
No edits to lib/, pages/, or any existing scraper. Slot 036 chosen to
avoid collision with parallel agents who picked 035 for Curtea de Conturi
and GNM (Garda Mediu). 022/023 remain reserved by other parallel agents.
Next steps (low priority until more data)
- Watch CKAN for new resources — set up monthly cron to re-run discovery.
- Browserless scraper for apia.org.ro — only worth it if national lists are needed for a specific recipe page. Otherwise FEGA covers the same question at national scale.
- Geographic enrichment — the LPIS shapefiles (
Parcele Agricole APIA LPIS 2025) could overlay on a map view of /achizitii/firma/[cui]; defer to map-feature work. - judet field population — currently NULL. When more UATs ingest, derive from centru_apia mapping (centre APIA → judet is 1-N but enumerable).