Be Civic — Architecture

Canonical system specifications for the Be Civic project.

This sub-spec covers the non-negotiable principles (§3), the architecture overview with its system diagram and storage split (§4), the repository layout (§5), the reference consumer (§13) and its agent interface split (§13.1), the anti-patterns catalogue (§16), settled decisions (§17), open questions (§18), out-of-scope items (§19), and the full consumer-side runtime specification (§24).

For all JSON schemas (skill frontmatter, submission schemas, MDX tags), see schemas.md. For the PII pipeline and consumer-side state contract, see privacy.md. For the state machine and branch policy, see lifecycle.md. For the website rendering substrate, see website.md. For the trust model, citation rot, provider protocol, and MCP server, see protocol.md.

3. Non-negotiable principles

  1. Every factual claim cites a source. Articles of the Code, Moniteur belge refs, INAMI / SPF / commune URLs. No uncited assertions in any skill body.
  2. Freshness is explicit. Each skill carries last_verified and verification_notes in frontmatter. Staleness handling lives in agent guidelines, not in the skill itself: agents treat skills past a recency threshold as suspect and re-fetch volatile values from cited sources.
  3. Skills scaffold, they don't replace iteration. Skills surface statutory hooks, document hierarchy, and known pitfalls. They do not enumerate every commune or interpret edge cases. When in doubt, prefer fewer branches and clearer escalation prompts.
  4. No PII in any contribution. PII protection is structural, not promissory. Schema-level bans on identity-shaped fields, hard length caps on free-text, salted hashed per-IP correlation only (daily-rotating salt for rate limits; per-artefact salt scoped to an artefact's alpha/beta lifetime for self-validation prevention and distinct-IP counting), no request-body logging anywhere, consumer pre-flight scrub (regex from canonical rules + LLM contextual judgment), Worker hard-gate scrub on submit, NER on commit. NER detection holds the submission in a human-review queue (the same queue handling injection-flag quarantines); it is not auto-revert. PII never reaches the public corpus without human eyes when NER flags. See §8.5 (see privacy.md).
  5. Disclaim authority. Skills are starter scaffolds. Users must verify with their commune. This statement appears at the top of every skill and in the README.
  6. Substrate-agnostic data shapes. The skill format and submission protocol are specified independently of any specific agent runtime or backend. GitHub + Cloudflare Workers + D1 are the v1 substrate; the data shapes and protocol do not assume them.

  6b. Everything open, no closed parts (corpus side). The user's LLM is the runtime; it needs to read the design (composition graph, scrub mechanism, state machine, validation protocol) to use the system correctly. There is no "closed core" to protect because the architecture requires transparency to function. All source, docs, schemas, prompts, and design narratives are public under CC-BY-4.0. (The provider-integration protocol layer described in §21 (see protocol.md) is an additive commercial layer; its provider contracts and partner-relationship details are not part of the public corpus, but the eligibility criteria and the editorial-firewall principles are.)

  7. Rejected submissions never leave private state. Submissions go to the Worker, not to a public surface like a GitHub Issue. On rejection at the Worker (regex / schema fail / capability mismatch / self-validation) the submission is returned as 4xx with a category-only error and never persists anywhere. On NER detection at commit, the submission is held in the review queue rather than auto-published; if the maintainer discards it, the public corpus is unaffected.

  8. Platform-agnostic by capability declaration. Be Civic tests capability declarations, not vendor behaviour. The four capability tiers per submission type (§6.7 (see schemas.md)) are vendor-neutral. Agents self-classify against those declarations, disclose limitations to the user, and recommend stepping up to a capable mode in their own ecosystem if they fall below tier (Anthropic → Claude Code; OpenAI → ChatGPT with code interpreter / Codex CLI; Google → Gemini code execution; Mistral → Le Chat with Canvas; etc.). Be Civic does not maintain per-vendor adapters; it maintains a recommendations page that consumers' agents can amend as ecosystems shift. (Per D.1 redirect, 2026-04-27.)

  9. Designed for generic users, not the maintainer. Skills are written for a generic third-country expat using a generic capable agent. The maintainer is a test user, not the target. Skills do not embed maintainer-specific context, examples, or assumptions. Agent capability is declared explicitly per skill (see §6.1, §6.7) so consumers can determine compatibility.

  10. Honor-system contribution, made visible; opting out is first-class. There is no enforcement mechanism for submission. The system's data quality depends on consumer agents honestly submitting validated content. The README documents this explicitly; auto-generated activity dashboards (per-skill at point of use, plus global) make the honor system tangible. The submission contract requires only a one-off framed message at the user's first session, explaining that submissions are anonymous, automatic, and reviewable in the local submissions log, with cancellation available within the 24-hour staging window. No per-event consent prompts; the burden is on the consuming agent to scrub correctly and on the user to set the policy once.

**Opt-out is first-class.** Users who decline to submit (privacy-sensitive populations, refugees, anyone who simply doesn't want to) must receive the same skill-loading and guidance experience as users who do submit. The system must be useful for users who never submit anything. The honor-system framing is about aggregate data quality across the user population, not about pressuring individual users. Consumer agents that degrade UX for non-submitters are non-compliant.

  11. Customer-side state only. Be Civic operates with no server-side per-customer state of any kind; customer-side state MUST satisfy all three of the following clauses: the customer can read it directly with standard tools, the customer can delete it unilaterally as a single artifact, and the agent can both read and write it in-session (§8.7 (see privacy.md)). This is a Tier C invariant: changing it requires explicit protocol amendment.

  12. Deterministic submission paths for code-driven signals. LLM-driven submission is permitted only where semantic judgment is required (for example, drafting an observation gap_description). Submission of session-lifecycle events, hook-fired analytics, and structured catalogue updates MUST follow deterministic code paths with no LLM in the submission loop (§24; see also §6.2.1 (see schemas.md) and §6.2.5 (see schemas.md) on the analytics endpoint).

  13. Plugin-as-bootstrap. The primary consumer-side delivery mechanism is a Cowork plugin distributed via becivic.be. Installing the plugin is the canonical onboarding path; it delivers the harness skill, all shipped procedure skills (or, under the live-fetch architecture, the meta-skills that fetch procedures), starter state files, and schemas in a single install gesture: no zip download, no folder selection at install time, no manual paste of Project Instructions. The plugin runtime exposes the plugin install root as ${CLAUDE_PLUGIN_ROOT} and a writable plugin-data location as ${CLAUDE_PLUGIN_DATA}; the harness uses these for read-only plugin assets and plugin-internal state respectively. User-picked-parent BeCivic/ folders for per-procedure state are created later by bc-onboarding via mcp__cowork__request_cowork_directory after form submit (see cowork-plugin.md §3.3). Paste-prompt sampler and free-tier read-only paths are retained as degraded fallbacks and are explicitly not the recommended entry point. Cowork-specific instantiation lives in cowork-plugin.md; sibling harnesses (ChatGPT app, etc.) will get their own harness specs.

  14. Be Civic is a tool for the user's agent, not an agent itself. The user installs the plugin; the user's own agent loads it; Be Civic gives that agent the verified Belgian procedures. The user never speaks to "Be Civic"; they speak to their agent, which uses Be Civic. Customer-facing copy MUST reflect this framing: "your agent will use Be Civic to walk through your apostille" is correct; "Be Civic will help you with your apostille" is not. Trust posture depends on this: the user's data stays in their agent's environment, the agent is the one running the procedure, and Be Civic is the verified-content layer the agent reads. This is consistent with principle 11 (customer-side state only) and principle 13 (plugin-as-bootstrap).
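
The "salted hashed per-IP correlation" in principle 4 can be illustrated with a keyed hash. What follows is a minimal Python sketch under stated assumptions, not the Worker's implementation: the HMAC-SHA256 construction, the SERVER_SECRET placeholder, and all three function names are hypothetical. It shows one salt that rotates each calendar day (rate limits) and another scoped to a single artefact (self-validation prevention and distinct-IP counting).

```python
import hashlib
import hmac
from datetime import date

# Hypothetical secret for illustration; a real deployment would load it from a secret store.
SERVER_SECRET = b"example-secret-not-for-production"


def daily_salt(day: date) -> bytes:
    """Salt that changes every calendar day (rate limiting)."""
    return hmac.new(SERVER_SECRET, b"day:" + day.isoformat().encode(), hashlib.sha256).digest()


def artefact_salt(artefact_id: str) -> bytes:
    """Salt scoped to one artefact (self-validation prevention, distinct-IP counts)."""
    return hmac.new(SERVER_SECRET, b"artefact:" + artefact_id.encode(), hashlib.sha256).digest()


def ip_token(ip: str, salt: bytes) -> str:
    """Keyed hash of the IP; only this token is kept, never the raw address."""
    return hmac.new(salt, ip.encode(), hashlib.sha256).hexdigest()


# The same IP yields unlinkable tokens on different days and for different artefacts.
monday = ip_token("203.0.113.7", daily_salt(date(2026, 5, 11)))
tuesday = ip_token("203.0.113.7", daily_salt(date(2026, 5, 12)))
per_artefact = ip_token("203.0.113.7", artefact_salt("val-00001"))
```

Because the daily salt is derived from the date, tokens from different days cannot be joined, which limits correlation to the rate-limit window.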

4. Architecture overview

                ┌─────────────────────────────────────────┐
                │  Consumer-side (personal agent + user)  │
                │  ─ self-classifies against capability   │
                │    requirements (§6.7); recommends      │
                │    stepping up if below tier            │
                │  ─ reads relevant skill from becivic.be  │
                │    (one canonical.md per skill at its    │
                │    current status: alpha/beta/stable)   │
                │  ─ applies user context                 │
                │  ─ guides user through admin task       │
                │  ─ runs pre-flight validation locally   │
                │    (regex scrub + cross-ref script)     │
                │  ─ POSTs one of five feedback types     │
                │    + analytics + rating (post-2026-05-15):│
                │    concern | amendment | validation |   │
                │    draft | feedback | rating | analytics │
                │  ─ logs cancel_token + cancel_url to    │
                │    submissions.jsonl                    │
                │  ─ NO GitHub account needed             │
                └────────────────┬────────────────────────┘
                                 │  HTTPS POST /<submission-type>
                                 ▼
                ┌─────────────────────────────────────────┐
                │  Staging service @ becivic.be            │
                │  (Cloudflare Worker (api/) + cron       │
                │   Worker (tools/staging-worker/))       │
                │  ─ seven endpoints (/concerns,          │
                │    /amendments, /drafts, /validations,  │
                │    /feedback-channel, /ratings,         │
                │    /analytics) + /feedback envelope     │
                │  ─ target_type-keyed schema validation  │
                │  ─ capability self-declaration check    │
                │  ─ regex scrub (canonical rules)        │
                │  ─ identity-field ban (defensive)       │
                │  ─ self-validation prevention           │
                │    (validations only)                   │
                │  ─ rate limits per IP per day           │
                │  ─ stage in KV with 24h TTL             │
                │    (concerns/amendments/drafts/         │
                │    feedback/ratings); validations       │
                │    apply immediately                    │
                │  ─ cohort_anchor Worker-stamp on        │
                │    target_type ∈ {skill, path} rows     │
                │  ─ return cancel_token                  │
                │  ─ scheduled commit Worker (5–15 min)   │
                │  ─ catalogue/concern/validation/        │
                │    rating writes go to D1; draft and    │
                │    amendment (target_type=skill | path) │
                │    open PRs on GitHub via GitHub App    │
                │    credentials                          │
                └────────────────┬────────────────────────┘
                                 │  D1 INSERT
                                 │  + GitHub App API: open PR /
                                 │    auto-merge on CI green
                                 ▼
                ┌─────────────────────────────────────────┐
                │  Repository state (canonical, on main)  │
                │   skills/<id>/canonical.md  (one file   │
                │     per skill; status in frontmatter:   │
                │     draft | alpha | beta | stable)      │
                │   skills/<id>/process.mmd               │
                │   data/communes.json                    │
                │   data-snapshot/  (D1 backup snapshots; │
                │     fallback for build-time tag         │
                │     resolution; see §6.10)              │
                │   docs/submission-contract-v<N>.mdx     │
                │   index.mdx (landing; see §20.2)        │
                │   agents.mdx (agent entry; see §13.1)   │
                └────────────────┬────────────────────────┘
                                 │  on commits:
                                 ▼
                ┌─────────────────────────────────────────┐
                │  D1 (catalogues + signals)              │
                │   volatile_values   (val-NNNNN entries) │
                │   references        (ref-NNNNN entries) │
                │   concerns          (con-NNNNN entries; │
                │                      target_type keyed) │
                │   amendments        (amd-NNNNN entries; │
                │                      target_type keyed) │
                │   drafts            (drf-NNNNN entries; │
                │                      target_type keyed) │
                │   validations       (val_*; target_type │
                │                      keyed, 6 values)   │
                │   feedback_channel  (fbk-NNNNN entries; │
                │                      operator triage)   │
                │   ratings           (rtg-NNNNN entries; │
                │                      three-axis stars)  │
                │   analytics_events  (anl-NNNNN entries; │
                │                      aggregate-only)    │
                │  ─ INSERT-with-supersede update model   │
                │    (current row WHERE superseded_at IS  │
                │    NULL); full history queryable        │
                │  ─ daily JSONL backup snapshot to Git   │
                │    under data-snapshot/ (archival only) │
                └────────────────┬────────────────────────┘
                                 │
                                 ▼
                ┌─────────────────────────────────────────┐
                │  Commit-side checks                     │
                │  ─ Presidio NER on submitted prose      │
                │    fields (notes, rationale,            │
                │    injection_reason)                    │
                │    → on flag: hold in review queue      │
                │      (§8.5)                             │
                │  ─ Cross-ref validator (§10.1)          │
                │  ─ State machine (§9):                  │
                │    automated promotion of skills        │
                │    and paths (frontmatter status edits  │
                │    via PR) and catalogue rows (D1       │
                │    UPDATE); threshold-driven, no human  │
                │    in loop except on draft PRs          │
                │    (target_type=skill | path)           │
                └────────────────┬────────────────────────┘
                                 │
                                 ▼
                ┌─────────────────────────────────────────┐
                │  Apex router (site/router-worker.js)    │
                │  ─ owns becivic.be/ apex routing        │
                │  ─ /api/* → staging Worker (above)      │
                │  ─ everything else → renderer (below)   │
                └────────────────┬────────────────────────┘
                                 │
                                 ▼
                ┌─────────────────────────────────────────┐
                │  Renderer Worker (§20)                  │
                │  Cloudflare Workers Static Assets +     │
                │  custom Worker entry (worker.ts);       │
                │  serves everything that isn't /api/*:   │
                │  ─ becivic.be/  (marketing landing)     │
                │  ─ becivic.be/agents  (agent entry,     │
                │    ~40 lines + manifest.json + per-     │
                │    endpoint pages; §13.1)               │
                │  ─ becivic.be/skills/<id>               │
                │    one canonical URL per skill;         │
                │    status banner in frontmatter when    │
                │    not stable                           │
                │  ─ becivic.be/docs/*                    │
                │  ─ llms.txt + llms-full.txt             │
                │  ─ <VV>, <Ref>, <Observations> MDX      │
                │    components resolved at build/fetch   │
                │    time (build-fetch from /api/*;       │
                │    fallback to data-snapshot/) — §20.3  │
                │  ─ runtime hooks: HTMLRewriter beacon,  │
                │    _redirects dedup (§20.3)             │
                └────────────────┬────────────────────────┘
                                 │  in parallel:
                                 ▼
                ┌─────────────────────────────────────────┐
                │  MCP Worker (§23)                       │
                │  mcp.becivic.be — stateless;            │
                │  createMcpHandler; ~6 intent tools      │
                │  wrapping becivic.be/api/*              │
                └────────────────┬────────────────────────┘
                                 │
                                 ▼
                ┌─────────────────────────────────────────┐
                │  Maintainer review queue                │
                │  (constitutional court only)            │
                │  ─ injection-flag quarantines           │
                │  ─ NER-held submissions                 │
                │  ─ skill_draft PR review (S31)          │
                │  ─ provider-eligibility decisions (§21) │
                │  ─ protocol amendments                  │
                │  ─ NOT routine corpus growth            │
                └─────────────────────────────────────────┘
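
The staging-service box in the diagram lists an ordered set of gate checks followed by 24h staging and a cancel_token. The sketch below illustrates that flow in Python under explicit assumptions: the banned field names, the single e-mail pattern standing in for the canonical regex rules, the error category strings, and the in-memory dict standing in for KV are all hypothetical. Validations, which the diagram says apply immediately, are left out.

```python
import re
import secrets

STAGING_TTL_SECONDS = 24 * 60 * 60                    # 24h staging window
STAGED_TYPES = {"concerns", "amendments", "drafts"}   # subset of the staged endpoints
BANNED_FIELDS = {"name", "email", "phone"}            # hypothetical identity-shaped fields
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")    # stand-in for the canonical regex rules


def gate(kind: str, body: dict, staged: dict, now: float) -> dict:
    """Run the checks in order; a rejection carries only a category and stores nothing."""
    if kind not in STAGED_TYPES or not isinstance(body.get("target_type"), str):
        return {"ok": False, "error": "schema"}
    if body.keys() & BANNED_FIELDS:
        return {"ok": False, "error": "identity_field"}
    if any(isinstance(v, str) and EMAIL_RE.search(v) for v in body.values()):
        return {"ok": False, "error": "pii_regex"}
    token = secrets.token_urlsafe(16)
    staged[token] = {"kind": kind, "body": body, "expires_at": now + STAGING_TTL_SECONDS}
    return {"ok": True, "cancel_token": token}


def cancel(token: str, staged: dict, now: float) -> bool:
    """Drop a staged submission if it is still inside the staging window."""
    entry = staged.get(token)
    if entry is None or now >= entry["expires_at"]:
        return False
    del staged[token]
    return True
```

A rejected submission never touches the staging store, which mirrors principle 7: the caller receives a category such as "pii_regex" and nothing is persisted.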

Storage split. Authored prose lives in Git; records live in D1.

| Item | Where | Reasoning |
|---|---|---|
| Skills (canonical.md, process.mmd) | Git | Authored prose; renderer Worker emits HTML; agents read via URL or content-negotiated .md |
| Volatile values | D1 | Records, indexation churn |
| References | D1 | Records, citation maintenance |
| Observations | D1 | High volume, body is short, vote-rankable |
| Validations | D1 | Pure records, very high volume; target_type keyed across skills/VVs/refs/observations |
| Votes (on concerns) | D1 | Subset of validations, target_type='observation' (the wire surface name preserved per the locked OPEN-13 wire-vs-render split; the underlying D1 lookup goes to the concerns table) |

A daily backup routine dumps every D1 table to JSONL under data-snapshot/ in Git for archival. The renderer build does not depend on the dump in the steady-state path; it is the fallback for build/fetch-time MDX-tag resolution when /api/* is unreachable from the build environment (§6.10 (see schemas.md), §20.3 (see website.md)).
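
The fallback described above can be sketched as a resolver that prefers the live API and reads the JSONL snapshot only when the API is unreachable. This is an illustrative Python sketch, not the renderer's build code: the row keys id and value, the injected fetch callable, and both function names are assumptions; only superseded_at and the data-snapshot/ location come from this spec.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Optional


def resolve_from_snapshot(snapshot: Path, entry_id: str) -> Optional[dict]:
    """Return the current (non-superseded) row for entry_id from a JSONL dump."""
    current = None
    with snapshot.open(encoding="utf-8") as fh:
        for line in fh:
            if not line.strip():
                continue
            row = json.loads(line)
            if row.get("id") == entry_id and row.get("superseded_at") is None:
                current = row
    return current


def resolve_tag(entry_id: str, fetch: Callable[[str], dict], snapshot: Path) -> Optional[dict]:
    """Prefer the live API; fall back to the snapshot on network-level failure."""
    try:
        return fetch(entry_id)
    except OSError:  # covers ConnectionError and urllib's URLError
        return resolve_from_snapshot(snapshot, entry_id)


def unreachable(_entry_id: str) -> dict:
    """Simulates /api/* being unreachable from the build environment."""
    raise ConnectionError("api unreachable")


with tempfile.TemporaryDirectory() as tmp:
    snap = Path(tmp) / "volatile-values.jsonl"
    snap.write_text(
        '{"id": "val-00001", "value": "old", "superseded_at": "2026-01-01"}\n'
        '{"id": "val-00001", "value": "new", "superseded_at": null}\n',
        encoding="utf-8",
    )
    resolved = resolve_tag("val-00001", unreachable, snap)
```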

Unified status state machine. Every skill, volatile value, and reference advances through a single status enum (§9 (see lifecycle.md)). The skill's status lives in canonical.md frontmatter; the catalogue row's status lives in D1.

| Status | Meaning | Path |
|---|---|---|
| draft | Skeleton or in 24h staging | not yet committed (or skeleton frontmatter) |
| alpha | Committed, awaiting initial validations | canonical.md (skills) or current row in D1 (catalogues) |
| beta | Promoted from alpha; broader validation underway | same path; status advanced |
| stable | Consensus-promoted; the canonical version served as default | at /skills/<id> (skills); current row WHERE superseded_at IS NULL (catalogues) |
| quarantined | Pulled for cause; not rendered; audit-only | maintainer-action terminal state; body replaced with quarantine notice |
| deprecated | Superseded or no longer recommended; may remain readable with superseded_by | maintainer-action terminal state |
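
The status enum and its two kinds of transition can be sketched as follows. This Python sketch is illustrative only: it assumes promotion advances one rung at a time and that maintainers may move any non-terminal entry into a terminal state. The promotion thresholds themselves are defined in §9 (see lifecycle.md) and are deliberately not modelled, and the function names are hypothetical.

```python
from enum import Enum


class Status(str, Enum):
    DRAFT = "draft"
    ALPHA = "alpha"
    BETA = "beta"
    STABLE = "stable"
    QUARANTINED = "quarantined"
    DEPRECATED = "deprecated"


PROMOTION_LADDER = [Status.DRAFT, Status.ALPHA, Status.BETA, Status.STABLE]
TERMINAL = {Status.QUARANTINED, Status.DEPRECATED}


def promote(current: Status) -> Status:
    """Advance one rung; stable and terminal states do not promote further."""
    if current in TERMINAL or current is Status.STABLE:
        return current
    return PROMOTION_LADDER[PROMOTION_LADDER.index(current) + 1]


def maintainer_action(current: Status, target: Status) -> Status:
    """Assumption: maintainers move any non-terminal entry into a terminal state."""
    if target not in TERMINAL:
        raise ValueError("maintainer actions only target terminal states")
    if current in TERMINAL:
        raise ValueError("terminal states have no outgoing transitions")
    return target


def shows_banner(status: Status) -> bool:
    """Skill pages carry a banner while the skill is not yet stable."""
    return status in {Status.DRAFT, Status.ALPHA, Status.BETA}
```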

Rollback works by D1 supersession (catalogue rows: a fresh INSERT marks the prior row's superseded_at) or by git revert (skill bodies). The status enum includes quarantined (pulled for cause, not rendered, audit-only) and deprecated (superseded, may remain readable with superseded_by) as maintainer-action terminal states alongside the four promotion states (draft | alpha | beta | stable). History remains queryable via the history endpoints (§9 (see lifecycle.md), §28 design conversation S36).
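
The INSERT-with-supersede model can be tried out locally with SQLite, the engine D1 is built on. The sketch below uses Python's standard sqlite3 module with an in-memory database; the column names other than superseded_at, the timestamps, and the helper names are assumptions for illustration, not the production schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE volatile_values (
        row_id        INTEGER PRIMARY KEY AUTOINCREMENT,
        entry_id      TEXT NOT NULL,   -- e.g. val-00001
        value         TEXT NOT NULL,
        created_at    TEXT NOT NULL,
        superseded_at TEXT             -- NULL marks the current row
    )
    """
)


def supersede(db: sqlite3.Connection, entry_id: str, value: str, now: str) -> None:
    """Close the current row and insert the new one in a single transaction."""
    with db:
        db.execute(
            "UPDATE volatile_values SET superseded_at = ? "
            "WHERE entry_id = ? AND superseded_at IS NULL",
            (now, entry_id),
        )
        db.execute(
            "INSERT INTO volatile_values (entry_id, value, created_at) VALUES (?, ?, ?)",
            (entry_id, value, now),
        )


def current(db: sqlite3.Connection, entry_id: str):
    row = db.execute(
        "SELECT value FROM volatile_values WHERE entry_id = ? AND superseded_at IS NULL",
        (entry_id,),
    ).fetchone()
    return row[0] if row else None


def history(db: sqlite3.Connection, entry_id: str) -> list:
    return db.execute(
        "SELECT value, superseded_at FROM volatile_values WHERE entry_id = ? ORDER BY row_id",
        (entry_id,),
    ).fetchall()


supersede(conn, "val-00001", "first value", "2026-01-01T00:00:00Z")
supersede(conn, "val-00001", "second value", "2026-02-01T00:00:00Z")
# Rollback is just another supersession that re-inserts the earlier value.
supersede(conn, "val-00001", "first value", "2026-03-01T00:00:00Z")
```

Nothing is ever updated in place except superseded_at, so the full history stays queryable while the current row is always the one with a NULL superseded_at.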

Trust shape: a skill at stable status is citation-grade and consensus-validated; a skill at alpha or beta is visible to consumers but flagged with a banner; observations are signal, not edits, and are surfaced inline via the <Observations> MDX component on each skill page (§6.10 (see schemas.md)).

Editorial neutrality, commercial sustainability. Be Civic is operated by an independent company. The corpus is published under CC-BY-4.0 (§17). The provider-integration protocol layer (§21 (see protocol.md)) is a v1.x ambition that funds the work; eight editorial-firewall principles (§21 (see protocol.md)) keep the corpus partner-blind, prevent ranking or recommendation, and make removal-for-cause contractually mandatory and revenue-blind. Skills cite authoritative sources and describe processes; they never name specific providers in the body.

Human / agent entry split. becivic.be/ is human-facing landing and explanatory copy. becivic.be/agents is the agent entry — overview (~40 lines after S52 implementation), with protocol details in /agents/manifest.json and per-endpoint pages at /agents/submit/* (§13.1). Both are served by the same renderer Worker per §20 (see website.md). Three reinforcing channels (llms.txt nav order, MCP tool surfacing, per-skill body line) direct agents to /agents first. (Per G.12.)

5. Repository layout

be-civic/                                 # public repo (github.com/Be-Civic/be-civic); the corpus
├── README.md                            # what the repo is, who it's for, how skills are loaded
├── CONTRIBUTING.md                      # rules for human and agent contributors
├── LICENSE                              # CC-BY-4.0 — applies to everything (code AND content); see §17
├── index.mdx                            # marketing landing rendered at becivic.be/; see §20
├── agents.mdx                           # agent entry overview (becivic.be/agents); ~40 lines after S52 — see §13.1
├── docs.json                            # navigation source-of-truth read by the renderer; auto-regenerated on commits (state-machine validator)
├── site/                                # bc-infra-side renderer source (multi-site monorepo: core/ + sites/<site-id>/); deployed as Cloudflare Workers Static Assets; see §20 and §22
├── api/                                 # Cloudflare Worker source for the four POST endpoints under /api/* (observations, skill-amendments, skill-drafts, validations); see §10 and §17
│   ├── index.html
│   ├── style.css
│   ├── fonts/                           # self-hosted Manrope woff2 (700, 800), latin subset
│   ├── logo/                            # light.svg, dark.svg
│   └── favicon.svg
├── docs/
│   ├── submission-contract-v<N>.mdx     # versioned contribution contract (5 feedback types + analytics + rating)
│   ├── skill-conventions.md             # file structure, frontmatter, body layering
│   ├── agent-guidelines.md              # how consuming agents should behave
│   ├── retraction-protocol.md           # incident response for skills and submissions
│   ├── reference-consumer.md            # capability requirements + per-ecosystem capable-mode recommendations
│   ├── activity/                        # auto-generated, regenerated by activity-dashboard.yml
│   │   ├── global.json
│   │   └── global.md
│   └── website/                         # internal design brief for the human-facing surface (not served by the renderer; lives in bc-operations only); see §20
├── schemas/
│   ├── skill.schema.json                # JSON Schema for skill frontmatter
│   ├── observation.schema.json          # JSON Schema for observation submissions
│   ├── skill-amendment.schema.json      # JSON Schema for amendment submissions
│   ├── skill-draft.schema.json          # JSON Schema for draft submissions
│   ├── validation.schema.json           # JSON Schema for validation submissions
│   ├── communes.schema.json             # JSON Schema for the commune list
│   ├── types.json                       # canonical I/O types
│   ├── categories.json                  # category list (open enum, deterministic guards per G.3)
│   ├── source-classes.json              # primary / secondary / tertiary source allowlists (§15.2)
│   └── regex-rules.schema.json          # JSON Schema for tools/scrub/regex-rules.json
├── data/
│   └── communes.json                    # REFNIS-derived commune list (pinned nomenclature date)
├── data-snapshot/                       # daily JSONL backup of D1 catalogues + signals; archival + build-time tag-resolution fallback (see §6.10)
│   ├── volatile-values.jsonl
│   ├── references.jsonl
│   ├── observations.jsonl
│   ├── validations.jsonl
│   └── votes.jsonl
├── skills/
│   ├── index.json                       # auto-generated; activity stats per skill
│   └── <skill-id>/                      # one folder per skill
│       ├── canonical.md                 # the skill body; frontmatter status: draft | alpha | beta | stable
│       ├── process.mmd                  # mermaid graph (required when branching)
│       └── templates/                   # optional sample documents
│
│   # Volatile values, references, observations, validations, and votes
│   # (on observations) are not files in this tree — they live in D1 and
│   # are referenced from skill bodies via <VV>, <Ref>, <Observations> MDX
│   # components (§6.10). Daily JSONL backup snapshots are written to
│   # data-snapshot/ above for archival and as a build-time fallback.
├── tests/
│   └── fixtures/
│       ├── observations/{valid,invalid}/*.json
│       ├── skill-amendments/{valid,invalid}/*.json
│       ├── skill-drafts/{valid,invalid}/*.json
│       ├── validations/{valid,invalid}/*.json
│       ├── pii-samples/*.txt
│       └── nrn-checksum/*.json
├── .github/
│   ├── ISSUE_TEMPLATE/
│   │   ├── skill-error-report.yml       # accuracy concerns; routes to maintainer
│   │   └── bug-report.yml
│   └── workflows/
│       ├── ner-on-commit.yml            # Presidio NER on prose fields → review queue
│       ├── state-machine.yml            # scans validations/*, updates frontmatter, regenerates docs.json
│       ├── skills-index.yml             # regenerates skills/index.json
│       ├── activity-dashboard.yml       # regenerates docs/activity/global.{json,md}
│       ├── communes-refresh.yml         # quarterly REFNIS diff
│       ├── citation-linkcheck.yml       # monthly skill-citation URL check
│       └── deploy-worker.yml            # deploys tools/staging-worker/ on push
├── tools/
│   ├── scrub/
│   │   ├── regex-rules.json             # canonical detector rules (Worker AND consumer)
│   │   ├── presidio-recognizers/        # Belgian-specific NER recognisers (commit-side Action only)
│   │   └── checksum/                    # NRN, IBAN, BCE mod-97 implementations
│   ├── staging-worker/                  # v1 reference impl of staging service
│   │   ├── worker.ts                    # POST /<type>, DELETE /<type>/{id}, scheduled commit
│   │   ├── wrangler.toml                # Cloudflare Worker config; KV namespace bindings
│   │   ├── setup.sh                     # one-shot: create KV namespace, configure secrets
│   │   └── README.md                    # deployment instructions
│   └── scripts/
│       ├── validate-cross-refs.ts       # deterministic graph + frontmatter validator (§10.1, G.10)
│       ├── audit-categories.ts          # monthly orphan + edit-distance audit (G.3)
│       ├── state-machine-tick.ts        # called by state-machine.yml; advances proposal frontmatter
│       └── regenerate-docs-json.ts      # called by state-machine.yml; rebuilds renderer navigation
├── launch/                              # internal launch / brand / channel planning artefacts (mockups, brand exploration); not served by the renderer (lives in bc-operations); see §20
└── runtime/                              # gitignored — local Action-side state if needed
    └── ner-review-queue.log              # appended by ner-on-commit Action when holding; audit only
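
The tree lists tools/scrub/checksum/ as holding mod-97 implementations for NRN, IBAN, and BCE numbers. As an illustration of the NRN case only, here is a minimal Python sketch of the publicly documented national-register check: the last two digits equal 97 minus the first nine digits modulo 97, with a "2" prefixed to those nine digits for people born in 2000 or later. The function name is hypothetical and this is not the repository's code.

```python
def nrn_is_valid(nrn: str) -> bool:
    """Check the mod-97 control number of a Belgian national register number."""
    digits = "".join(ch for ch in nrn if ch in "0123456789")
    if len(digits) != 11:
        return False
    body, check = digits[:9], int(digits[9:])
    # Born before 2000: check = 97 - (body mod 97).
    # Born from 2000 onward: same formula with "2" prefixed to the body.
    return check in (97 - int(body) % 97, 97 - int("2" + body) % 97)
```

A detector can use such a check to cut false positives: an 11-digit string that fails the checksum is unlikely to be a real NRN, so only checksum-valid matches need to be treated as identity data.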

Note: meta-validate-skill-graph is not a skill in v1. The graph validator is the deterministic script tools/scripts/validate-cross-refs.ts (per G.10) — invoked from three points: consumer pre-flight (via tool_execution), Worker hard-gate, and PR-CI on main. A markdown-wrapper skill would have added drift surface for zero LLM-judgment value.

One canonical.md per skill. There is no proposals/ directory and no archive/ directory. A skill's lifecycle moves through draft → alpha → beta → stable via a single status field in canonical.md frontmatter (§6.1 (see schemas.md), §9 (see lifecycle.md)). Rollback of a skill body is a git revert against canonical.md; rollback of a catalogue entry (volatile value, reference) is a fresh INSERT in D1 that supersedes the prior row (§9 (see lifecycle.md), S29). History is preserved in both cases — Git for skill bodies, D1's superseded_at column for catalogue rows — and surfaced via the history endpoints (§9 (see lifecycle.md) / S36).

Self-rendered hosting. The repository is consumed by the renderer Worker at bc-infra/site/renderer/ (Cloudflare Workers Static Assets binding). The renderer serves becivic.be/agents from agents.mdx, becivic.be/ from index.mdx, and one canonical URL per skill from skills/<id>/canonical.md — the status frontmatter field (§6.1 (see schemas.md)) drives an in-page banner when the skill is at draft, alpha, or beta. Custom MDX components (<VV>, <Ref>, <Observations>) are resolved at build/fetch time, primarily by build-fetching from /api/*, with a fallback to data-snapshot/ (§6.10 (see schemas.md), §20.3 (see website.md)). llms.txt, llms-full.txt, and content-negotiated markdown are emitted by the renderer build pipeline. The /mcp endpoint moves to the dedicated MCP Worker at mcp.becivic.be (§23 (see protocol.md)).

Agent-vs-human surface split. Agents enter at becivic.be/agents (overview ~40 lines + machine-readable manifest + per-endpoint pages, per §13.1); humans enter at becivic.be/ (marketing landing). Both are served by the same renderer Worker on the same apex; path-routed at Cloudflare via the apex router worker in site/router-worker.js. The /api/* namespace is reserved for the existing Cloudflare Worker staging service (per §4). MCP-protocol clients hit mcp.becivic.be directly. Architecture summary in §20 (see website.md); full design brief in docs/website/requirements.md.
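
The routing split described above reduces to a small decision on host and path. The following Python sketch is only an illustration of that decision; the real router is the JavaScript Worker at site/router-worker.js, and the function name and returned labels here are hypothetical.

```python
def route(host: str, path: str) -> str:
    """Return which Worker would handle a request under the split described above."""
    if host == "mcp.becivic.be":
        return "mcp-worker"
    if path == "/api" or path.startswith("/api/"):
        return "staging-worker"
    # Everything else on the apex, including / and /agents, goes to the renderer.
    return "renderer-worker"
```

Matching on "/api/" with the trailing slash, rather than on the bare prefix "/api", keeps an unrelated path such as /apiary on the renderer.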

File encoding and formatting conventions (apply to every JSON, JSONL, YAML, and Markdown / MDX file in this repo):

  • UTF-8, no BOM
  • LF line endings only on committed files (enforced via .gitattributes); local user-data files (sessions.jsonl, submissions.jsonl, feedback-buffer-*.jsonl) MAY use the host's native line endings, and readers MUST tolerate either
  • JSON files end with a trailing newline after the closing }; JSONL files end with \n after the last line
  • Skill frontmatter is YAML 1.2 (strict mode — yes/no/on/off are strings, not booleans). Single document per file; no anchors or custom tags
  • Committed observation / validation files are pretty-printed JSON, 2-space indent, trailing newline. The Worker is the canonical writer
  • NIS5 codes in YAML MUST be quoted ("21013", not 21013) so they remain strings; CI rejects unquoted numeric NIS5 codes
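
A minimal sketch of how these conventions might look in helper code. All three function names are hypothetical, and the `nis5` key used in the YAML check is an assumed field name chosen only for illustration.

```typescript
// Hypothetical helpers; the names and the `nis5` key are assumptions, not repo code.

/** Committed observation / validation file: 2-space indent, LF, trailing newline. */
function serializeCommittedJson(value: unknown): string {
  return JSON.stringify(value, null, 2) + "\n";
}

/** Read a local JSONL file, tolerating LF or CRLF endings and blank trailing lines. */
function parseJsonlTolerant(text: string): unknown[] {
  return text
    .split(/\r?\n/)
    .filter((line) => line.trim().length > 0)
    .map((line) => JSON.parse(line) as unknown);
}

/** Detect an unquoted 5-digit code on a line shaped like `nis5: 21013`. */
function hasUnquotedNis5(yamlLine: string): boolean {
  return /^\s*nis5\s*:\s*\d{5}\s*$/.test(yamlLine);
}
```

The line-based check is only a heuristic for a CI lint; a real check would more likely walk the parsed YAML document.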

13. Reference consumer: harness skill package

The reference consumer ships as an agentskills.io-conformant skill package: a harness skill (becivic) plus one or more procedure skills (becivic-<id>), installed at ~/.claude/skills/ and symlinked to ~/.agents/skills/ per §15.9 (see skills.md). The canonical deployment surface is Claude Desktop's Cowork tab with the Project's connected folder set to ~/.be-civic/. The full specification of the consumer-side runtime is in §24; this section records what the reference consumer is and what it is not.

What it is. A two-layer skill package that connects a customer to the Be Civic corpus. The harness owns conversation lifecycle, capability detection, profile routing, observation buffering, scrub, and submission. Procedure skills carry the regulation for one administrative process each. Together they implement the session lifecycle contract from /agents/feedback-template, the observation-buffer protocol, the validate-then-stage submission pattern, and the three-tier returning-user adaptation (§24.5). The harness is the one implementation surface every customer interacts with and is held to the full protocol conformance obligations in §15.7 (see skills.md). Procedure skills are governed by §15.1 (see skills.md) (drafting protocol) and §15.8 (see skills.md) (body discipline).

What it is not. A per-vendor adapter project. The D.1 redirect from 2026-04-27 stands: Be Civic does not maintain Anthropic, OpenAI, Google, Mistral, or long-tail vendor adapters. The harness is agentskills.io-conformant so that it installs across any conformant client without vendor-specific adapter code. Capability-tier detection (§24.4) makes the harness gracefully degrade on clients that lack filesystem access or MCP tools without requiring a separate adapter for each.

What ships with v1.

  • becivic harness skill (SKILL.md, status: stable from day one per §15 meta-skill seeding).
  • nationality-application procedure skill (canonical.md, targeting beta at v1 launch; see §14 (see README.md) Phase 4 deliverable 26).
  • becivic.be/agents/bootstrap.zip: the Cowork bootstrap zip per §24.6.1.
  • mcp__becivic__get_skill_graph MCP tool and GET /api/skill-graph HTTP endpoint serving the skills-graph from D1 per §24.2.
  • becivic.be/agents page rewritten per §24.6 to serve as both the T0 boot loader and the install hub for T2.
  • GET /api/skills/<skill_id>/observations HTTP endpoint per §24.7.

Per-ecosystem capable-mode recommendations (retained from the prior §13 framing as a narrower reference; the authoritative tier table is §24.4):

  • Anthropic: Claude Desktop Cowork tab (T2, canonical; T3 with mcp.becivic.be connected); Claude Code (T4, hooks confirmed); free-tier Chat with the Project installed (T1, degraded with upgrade prompt).
  • Any agentskills.io-conformant client: T1 or higher, depending on filesystem and MCP availability.
  • Paste-prompt sampler (T0): any AI that can fetch URLs (ChatGPT, Gemini, Le Chat, free-tier Claude Chat without the Project).

There is no standalone reference-consumer codebase outside the skill package. The harness IS the reference consumer.

13.1 Agent interface: overview, manifest, per-endpoint split (S52)

The /agents page is split into three layers, optimised for just-in-time context injection rather than a monolithic preamble:

  • /agents: overview (approximately 40 lines). What Be Civic is, where to read what, links to manifest and per-endpoint pages, and the T0 boot-loader behaviour per §24.6. Loaded by every agent on first contact.
  • /agents/manifest.json: machine-readable schema for capability tiers, endpoint signatures (including GET /api/skill-graph per §24.2 and GET /api/skills/<skill_id>/observations per §24.7), session-start required fields, retry policy, scrub-rules location. Loaded once per session.
  • /agents/submit/{observation,amendment,draft,validation}: per-endpoint deep references holding full schema, examples, and error responses for that one submission type. Loaded only when the agent intends to submit that type.

This shape replaces the round-6 monolithic agents.mdx (approximately 430 lines). The schema and architecture are recorded here as a forward declaration; the implementation lives in the round-7 plan. See design-conversation S52.
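
To make the just-in-time loading order concrete, here is a small sketch of which pages an agent would fetch for a session. The function name `agentPagesToLoad` and the `SubmissionPage` type are hypothetical; only the URL paths come from the three-layer split above.

```typescript
// Hypothetical sketch; only the URL paths are taken from the spec text.
type SubmissionPage = "observation" | "amendment" | "draft" | "validation";

/** Pages an agent loads, in order; the per-endpoint page only when it intends to submit. */
function agentPagesToLoad(intendedSubmission?: SubmissionPage): string[] {
  const base = "https://becivic.be";
  const pages = [`${base}/agents`, `${base}/agents/manifest.json`];
  if (intendedSubmission !== undefined) {
    pages.push(`${base}/agents/submit/${intendedSubmission}`);
  }
  return pages;
}
```

A read-only session therefore loads two resources, and a submitting session loads exactly one more.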

16. Anti-patterns

  • Trying to enumerate every commune. Skill body says "verify with your commune"; submission accumulation captures variance.
  • Hardcoding deadlines or fees in skill prose without flagging them as volatile_values.
  • Bundling legal advice. Skills surface statutory text and document workflows; they do not interpret edge cases.
  • LLM at the Worker or in the commit-side NER Action. The receiving end is deterministic; LLM is consumer-side only.
  • Single-file shared JSON for observations / amendments / drafts / validations. Per-submission files only.
  • Embedded submission contract in skill bodies (drift, versioning ambiguity).
  • Trusting submitting_agent strings for rate-limiting or reputation. Worker uses hashed IP only.
  • Persisting submitter identity (plaintext IP, GitHub login, anything) to the public corpus. Anonymous by construction.
  • Retrying a failed Worker submission with a fresh id. Retries must reuse the original id so dedup works.
  • Submitting amendments without running pre-flight validation. The Worker will reject; the consumer wastes the user's turn.
  • Falsely flagging injection to DoS a proposal. Repeated false flags ban the IP permanently (per G.6).
  • Assuming NER auto-reverts. An NER flag now sends the submission to the human review queue (per G.14); submitters learn the outcome via the status endpoint.
  • Self-validating one's own proposal. Worker rejects on IP-hash match (per G.7).
  • End-of-session bundled observations. Loses abandoned-session signal. Submit events as they happen.
  • Mixing origin processes into the main skill body. A US birth certificate is its own sub-skill.
  • Adding commune-specific sub-skills for minor variance. Sub-skills only when divergence is structural.
  • Vendor-coupled submission contracts. Contract instructions written to exploit a specific model's behaviour break compatibility across runtimes (per Principle 8 reframe).
  • Designed-for-maintainer skills. Maintainer-specific examples or assumptions creep in unnoticed.
  • Consumers without a submissions log. Even with the one-off start message, the user must be able to review what's been submitted.
  • Maintainer running individual-submission review in steady state. Maintainer is constitutional-court only — quarantines, NER holds, protocol amendments. The state machine handles the rest.
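
The retry anti-pattern above is easiest to see against a toy dedup store. The sketch below is an in-memory stand-in, not the Worker's staging code; `InMemoryStaging` and `StagedRecord` are hypothetical names.

```typescript
// Hypothetical in-memory stand-in for id-based dedup; not the Worker implementation.
interface StagedRecord {
  id: string;
  body: string;
  receivedAt: number;
}

class InMemoryStaging {
  private readonly records = new Map<string, StagedRecord>();

  /** Submitting an id that is already staged returns the existing record unchanged. */
  submit(id: string, body: string, now: number): { record: StagedRecord; created: boolean } {
    const existing = this.records.get(id);
    if (existing !== undefined) {
      return { record: existing, created: false };
    }
    const record: StagedRecord = { id, body, receivedAt: now };
    this.records.set(id, record);
    return { record, created: true };
  }

  size(): number {
    return this.records.size;
  }
}
```

A retry that reuses the original id leaves the store at one record; a retry with a fresh id would add a second copy of the same submission, which is exactly the duplication the rule is meant to prevent.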

17. Settled decisions

Reorganised by area for ease of reference.

Round 6 (2026-05-03)

  • 2026-05-03: Round 6 architectural collapse. Collapse proposal/canonical/archive to one canonical.md per skill; catalogues + signals (volatile values, references, observations, validations, votes) move to D1; build-time MDX-tag resolution (<VV>, <Ref>, <Observations>); state machine automated from day 1 (with maintainer review only on skill_draft PRs); status enum unified to draft | alpha | beta | stable. Cite design-conversation-r6 S1–S36.
  • 2026-05-03: CC-BY-4.0 licence. Replaces CC0 (rejected for EU jurisdiction incompatibility and sui generis database right friction); also rejects CC-BY-SA-4.0 (recursive share-alike semantics strain agent-excerpting and the anonymous-AI-contribution model). CC-BY-4.0 is the lowest-friction attribution licence for agent excerpting (a citation footer in chat satisfies attribution); compatible with EU jurisdiction; allows commercial use including by Be Civic; preserves brand/credit accumulation. Cite design-conversation-r6 S38.
  • 2026-05-03: Commercial posture. Be Civic is operated by an independent company that publishes a public corpus under CC-BY-4.0 and operates a provider-integration protocol layer (§21 (see protocol.md)) as v1.x ambition. Eight editorial-firewall principles (§21 (see protocol.md)) keep skill content partner-blind, prevent ranking or recommendation, and make removal-for-cause contractually mandatory and revenue-blind. The protocol layer's v1.x scope stops at documentation and application forms completed inside the conversation (S45); full transaction completion is further-out ambition. Cite design-conversation-r6 S37–S49 and docs/product-vision/press-release.md v4 (commit 50c25d3).

Round 7 (2026-05-04)

  • 2026-05-04: Self-rendered Cloudflare Worker site replaces Mintlify. Renderer at bc-infra/site/renderer/ ships markdown→HTML build pipeline, Pagefind static search, light/dark theme, mobile drawer, TOC active-tracking, runtime analytics-beacon injection via HTMLRewriter, and _redirects build-time deduplication. Cutover happens at the same time as this round-7 spec rebase lands. Bare apex becivic.be only; no www. Cite design-conversation-r6 S50.
  • 2026-05-04: End-state-first; v1/v1.x split collapsed. The round-6 hedge that deferred D1 storage, MDX-tag resolution, and state-machine automation to v1.x is reversed. Every architectural component (D1, automated state machine, agent-interface compression, multi-site monorepo, MCP Worker, bc-docs privatisation) ships in v1, before walks resume. Volume only makes round-6 work harder; walks compound on a fixed substrate. Cite design-conversation-r6 S51.
  • 2026-05-04: Agent interface redesigned around just-in-time context. The monolithic /agents page (~430 lines) splits into three layers: /agents ~40-line overview, /agents/manifest.json machine-readable schema, and per-endpoint pages at /agents/submit/*. Optimised for what an agent loads at decision time, not what a human reads end-to-end. Implementation deferred to round-7 implementation plan; spec §13.1 records the architecture. Cite design-conversation-r6 S52.
  • 2026-05-04: API-first; agents don't need GitHub; bc-docs target visibility = private. All submission flows through becivic.be/api/*; agents never need GitHub credentials. Customer-facing pages stop directing humans to GitHub for contributions. bc-docs target end-state is private; corpus content remains CC-BY-4.0 regardless of repo visibility. Privatisation itself is implementation-phase work (post-cutover, after dependent links are cleaned up). Cite design-conversation-r6 S53.
  • 2026-05-04: MCP server at mcp.becivic.be; createMcpHandler; ~6 intent-oriented tools; API-primary. Stateless Cloudflare Worker using createMcpHandler (sub-100 LOC; no Durable Objects). Tool surface ≈ 6 intent tools (e.g., find-skill, read-skill, search-corpus, submit-observation, validate, get-current-status), each routing to existing API endpoints. The public API at becivic.be/api/* is the primary surface; MCP is a thin overlay. Replaces the round-6 expectation of Mintlify-auto-generated /mcp. Cite design-conversation-r6 S54.
  • 2026-05-04: Multi-site monorepo core/ + sites/<site-id>/; per-site wrangler.toml. bc-infra/site/core/ holds shared rendering primitives and Worker scaffolding; bc-infra/site/sites/<site-id>/ holds per-site config, theme, content scope, and wrangler.toml. Site becivic is the default and only live site at launch. Future experimental verticals (e.g. tailored "buying a house" deployments) instantiate as new entries. Scaffold ships in v1 even without experiments; restructure is much cheaper at zero-tenant scale. Cite design-conversation-r6 S55.
  • 2026-05-04: Cloudflare cleanup; bare apex; remove www and Mintlify-era residue. Target topology is becivic.be apex (renderer + API via router-worker.js) plus mcp.becivic.be subdomain (MCP Worker). Mintlify domain bindings retire with the cutover. www.becivic.be removed entirely; bare apex is canonical. Renderer staging on workers.dev subdomain remains as a CI artefact, not user-visible. Cite design-conversation-r6 S56.
  • 2026-05-04: Renderer specifics — HTMLRewriter beacon, _redirects dedup, MDX-tag resolution path. (a) Cloudflare Web Analytics token held as a Worker Secret (not Plaintext, not in wrangler.toml [vars]); beacon <script> injected at request time via HTMLRewriter into HTML responses. (b) Renderer auto-generates per-skill canonical-URL redirects from the corpus index; operator hand-authors high-priority redirects in docs.json; build script de-duplicates with hand-authored entries taking precedence. (c) MDX tags resolve at build/fetch time (primary path: build pulls from /api/* and embeds resolved HTML; snapshot fallback in data-snapshot/ covers API-unreachable case). Raw .md served via content negotiation keeps tags unresolved. Schema in §6.10 (see schemas.md); mechanics in §20 (see website.md). Cite design-conversation-r6 S57.
  • 2026-05-04: Combine bc-docs + bc-infra into a single repo. Once S53 made bc-docs private, the public/private justification for separate repos disappeared. Combine into the existing Be-Civic/be-civic repo (canonical name; matches brand domain becivic.be). bc-infra's content (api/, site/, tools/staging-worker/, mcp/, .github/workflows/*) moves into the bc-docs repo at top-level via merge with --allow-unrelated-histories to preserve git history; bc-infra GitHub repo is archived post-merge. Removes the cross-repo repository_dispatch PAT requirement, the schema-vendoring CI step, and the renderer's bc-docs clone-at-build pattern. bc-operations stays separate (local-only planning). Cite design-conversation-r6 S58.
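
The redirect de-duplication rule in the renderer-specifics decision (hand-authored entries take precedence over auto-generated ones) can be sketched as follows. `Redirect` and `mergeRedirects` are hypothetical names, and keying the de-duplication on the source path is an assumption.

```typescript
// Hypothetical sketch; keying on the `from` path is an assumption.
interface Redirect {
  from: string;
  to: string;
}

/** Merge redirect lists; a hand-authored entry replaces a generated one with the same source. */
function mergeRedirects(generated: Redirect[], handAuthored: Redirect[]): Redirect[] {
  const bySource = new Map<string, Redirect>();
  for (const redirect of generated) {
    bySource.set(redirect.from, redirect);
  }
  for (const redirect of handAuthored) {
    bySource.set(redirect.from, redirect); // later write wins, so hand-authored takes precedence
  }
  return Array.from(bySource.values());
}
```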

Hosting and infrastructure

  • Domain: becivic.be — language-neutral Belgian umbrella. Bare apex only (no www). Self-rendered via a Cloudflare Worker at bc-infra/site/renderer/ (S50). MCP at separate Worker on subdomain mcp.becivic.be (S54).
  • Repo: target end-state is a single private repo at Be-Civic/be-civic containing both content (skills/, agents/, docs/, data/, schemas/, tools/scripts/, tools/scrub/) and infra (api/, site/, tools/staging-worker/, mcp/, .github/workflows/*) per S58. Per S53, bc-docs goes private; the round-7 implementation combines bc-infra into bc-docs (preserving history via --allow-unrelated-histories merge) so a single repo holds both surfaces. Corpus content remains CC-BY-4.0 regardless of repo visibility. bc-operations stays separate as local-only planning. Single founder/maintainer initially; main-only branching with PR review.
  • Staging service: Cloudflare Worker for the API (source in api/) + scheduled Cloudflare Worker for the cron commit job (source in tools/staging-worker/). Single Cloudflare account; D1 database for catalogues + signals; single GitHub App.
  • Multi-site monorepo (S55): bc-infra/site/core/ holds shared rendering primitives, build pipeline, and Worker scaffolding; bc-infra/site/sites/<site-id>/ holds per-site config, theme, content scope, and wrangler.toml. Site becivic is the default and only live site at launch. Future experimental verticals instantiate as new entries under sites/.
  • Human / agent entry split (per G.12): becivic.be/ for humans; becivic.be/agents for agents (overview ~40 lines, see §13.1); mcp.becivic.be for MCP-protocol clients.
  • Routing topology (per §20 (see website.md)): the router Worker (site/router-worker.js) path-routes between the renderer Worker (everything that's not /api/*) and the staging Worker (/api/*). The renderer Worker (site/renderer/) handles all human-facing paths including /agents, /skills/*, /docs/*, /llms.txt, /llms-full.txt, /sitemap.xml, /robots.txt. The MCP Worker on mcp.becivic.be is independently routed.
  • Brand typeface: Manrope (self-hosted woff2, latin subset only, weights 700 + 800). Per §20 (see website.md) and docs/website/requirements.md. ExtraBold (800) for display + wordmark + favicon; Bold (700) for inline marks + headings; system sans for body copy.
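
The routing topology above reduces to a single path test at the apex. The sketch below is illustrative only: `pickUpstream` and the `Upstream` labels are hypothetical, and sending a bare `/api` (no trailing slash) to the renderer is an assumption the spec does not settle.

```typescript
// Hypothetical sketch of the apex path split; not the contents of site/router-worker.js.
type Upstream = "staging-worker" | "renderer-worker";

/** /api/* goes to the staging Worker; every other path goes to the renderer Worker. */
function pickUpstream(pathname: string): Upstream {
  return pathname.startsWith("/api/") ? "staging-worker" : "renderer-worker";
}
```

MCP traffic never reaches this function, because mcp.becivic.be is routed independently by hostname.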

Submission and staging

  • Five feedback types + analytics + rating in v1 (per G.1, post-2026-05-15 taxonomy normalization): concern, amendment, validation, draft, feedback, plus the separate analytics stream and the parallel rating channel (Lock A, sprint 2026-W23). Each typed feedback type carries target_type as the sole discriminator over skill | volatile_value | reference | path | path_source | skill_graph | observation. Each has its own schema, endpoint, and capability tier. Pre-2026-05-15 the framing was "four submission types" (observation, skill_amendment, skill_draft, validation).
  • amendment payload shape is typed by target_type and content.amendment_subtype (per §6.2.2 (see schemas.md), post-2026-05-15): for target_type=skill, amendment_subtype=body carries a unified body_diff + pinned skill_commit, and amendment_subtype=frontmatter carries a frontmatter_change with field_path + typed proposed_value. For target_type=path | path_source, amendment_subtype=field_edit | source_add. For target_type=volatile_value | reference, the content is a scalar/object correction on the fast-path. The body is prose, so a diff is the right shape; frontmatter is structured, so a field-based change is the right shape. Pure-diff and pure-field-based shapes were both rejected because neither serves both targets cleanly.
  • No GitHub account required for users. Submissions go to becivic.be/api/<type>; the Worker holds GitHub App credentials and commits.
  • 24-hour server-side staging in Cloudflare KV. After the window closes, the scheduled Worker commits.
  • Cancellation: DELETE endpoint with cancel_token.
  • Idempotent submissions: retrying with the same id returns the existing record.
  • Capability tier per feedback type (per G.2, post-2026-05-15 normalization): concern = multi_turn + structured_output; amendment (target_type=skill | path | path_source) = + web_fetch + tool_execution; amendment (target_type=volatile_value | reference) = concern tier (lighter, no web_fetch / tool_execution); validation (target_type ∈ {skill, volatile_value, reference, path, path_source}) = + web_fetch + tool_execution; validation (target_type=observation) = concern tier (lighter); draft = + file_read; feedback and rating = concern tier.
  • Self-validation prevention (per G.7): Worker rejects validation submissions whose per-artefact-salted IP-hash matches the original artefact's submitter IP-hash. The per-artefact salt is generated on first alpha commit / D1 INSERT and persists until the artefact reaches stable or is superseded; the IP record is destroyed at termination.
  • Per-IP rate limits (per G.6): validations 10/day, validations w/ injection_flag 2/day, all types combined 50/day.
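
The per-IP daily limits listed above could be checked along these lines. This is a sketch under stated assumptions: the names are hypothetical, the counters are assumed to be keyed by the daily-salted IP hash elsewhere, and a flagged validation is assumed to count against all three limits.

```typescript
// Hypothetical sketch; counter storage and the overlap between limits are assumptions.
const DAILY_LIMITS = { all: 50, validations: 10, flaggedValidations: 2 } as const;

interface DailyCounts {
  all: number;
  validations: number;
  flaggedValidations: number;
}

interface IncomingSubmission {
  type: string;
  injectionFlag?: boolean;
}

/** True when accepting `incoming` keeps this IP hash within every applicable daily limit. */
function withinRateLimit(counts: DailyCounts, incoming: IncomingSubmission): boolean {
  if (counts.all >= DAILY_LIMITS.all) return false;
  if (incoming.type === "validation") {
    if (counts.validations >= DAILY_LIMITS.validations) return false;
    if (
      incoming.injectionFlag === true &&
      counts.flaggedValidations >= DAILY_LIMITS.flaggedValidations
    ) {
      return false;
    }
  }
  return true;
}
```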

State machine (corpus-growth)

  • Unified status enum (round 6): draft | alpha | beta | stable. The skill's status lives in canonical.md frontmatter; the catalogue row's status lives in D1. There is no separate version_status / corpus_status / proposal_id field.
  • Promotion thresholds (per G.5, first-pass; time anchors measured from cohort_started_at):
    • alpha → beta: ≥3 confirms, 0 rejects, ≥48h since cohort start, ≥3 distinct IPs
    • beta → stable: ≥10 confirms, ≥14 days since cohort start, confirm rate >85%, ≥10 distinct IPs
    • rollback: rejects exceed confirms by ≥2 ⇒ D1 supersession (catalogue rows) or git revert (skills)
    • quarantine: ≥1 injection_flag from a non-submitter ⇒ D1 supersession or git revert + maintainer-review issue
  • State machine is automated from day 1 (S31). PR-opening + auto-merge for skill flows; direct D1 UPDATE for catalogue rows. Maintainer review is required only on skill_draft PRs (new skill creation).
  • Cohort reset is encoded in data (S25). Skills reset cohort when frontmatter version bumps; same-version edits are maintenance. Catalogue rows reset cohort on every fresh INSERT (the supersession itself is the reset).
  • History endpoints (S36): GET /api/skills/<id>/history, GET /api/volatile-values/<uid>/history, GET /api/references/<uid>/history — full audit trail.
  • Observation hide threshold (S20): net_score ≤ -3 flips hide_threshold_breached = 1. Hard-coded in api/_lib/submit.ts (HIDE_THRESHOLD = -3) and surfaced via the <Observations> MDX tag; hidden rows render behind a click-to-reveal and are not removed. The value sets a deliberately asymmetric bar: a single downvote is noise, whereas minus three requires concerted disagreement (at least three more rejects than confirms) before suppression kicks in. Trivially easy suppression would let a single bad-faith voter bury legitimate observations; a threshold of larger magnitude (e.g. -5) would leave clearly rejected content visible too long. Revisit after 90 days of v1 traffic if either failure mode shows up.
  • Compaction job is deferred to v1.1 — not required for v1 corpus growth.
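
The first-pass thresholds above can be read as one decision function. The sketch below is not the shipped state machine: every name except HIDE_THRESHOLD is hypothetical, the confirm rate is assumed to be confirms / (confirms + rejects), and checking quarantine before rollback before promotion is an assumed ordering.

```typescript
// Hypothetical sketch of the G.5 thresholds; evaluation order and rate formula are assumptions.
type SkillStatus = "draft" | "alpha" | "beta" | "stable";
type Decision = "quarantine" | "rollback" | "promote" | "hold";

interface CohortSignals {
  confirms: number;
  rejects: number;
  distinctIps: number;
  hoursSinceCohortStart: number; // measured from cohort_started_at
  nonSubmitterInjectionFlags: number;
}

const HIDE_THRESHOLD = -3;

function decideTransition(status: SkillStatus, s: CohortSignals): Decision {
  if (s.nonSubmitterInjectionFlags >= 1) return "quarantine";
  if (s.rejects - s.confirms >= 2) return "rollback";
  const total = s.confirms + s.rejects;
  const confirmRate = total === 0 ? 0 : s.confirms / total;
  if (status === "alpha" && s.confirms >= 3 && s.rejects === 0 &&
      s.hoursSinceCohortStart >= 48 && s.distinctIps >= 3) {
    return "promote";
  }
  if (status === "beta" && s.confirms >= 10 && s.hoursSinceCohortStart >= 14 * 24 &&
      confirmRate > 0.85 && s.distinctIps >= 10) {
    return "promote";
  }
  return "hold";
}

/** An observation is hidden behind click-to-reveal once its net score reaches the threshold. */
function isObservationHidden(netScore: number): boolean {
  return netScore <= HIDE_THRESHOLD;
}
```

Under these assumptions a beta skill with 10 confirms and 2 rejects holds, because its confirm rate of roughly 83% does not exceed 85%.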

PII protection — structural by construction (per G.14)

  • Identity-shaped fields banned at schema level on all submission types.
  • Hard length caps on free-text fields (concern body ≤500; concern report (target_type=path) ≤2000; rationale ≤500 on amendments and on validation rejects; commit_message ≤200 on drafts; injection_reason ≤300 on validations; feedback body ≤2000; rating would_be_5_stars ≤500). See §6.2 (see schemas.md) for the full caps table.
  • Per-IP correlation as salted hash, never plaintext. Two salt scopes: daily-rotating (rate limits / DoS defence) and per-artefact (self-validation prevention + state-machine distinct-IP counting; destroyed on artefact termination).
  • No request-body logging anywhere.
  • Three-stage scrub: consumer pre-flight (regex + LLM contextual) + Worker hard-gate regex + commit-side NER.
  • NER-on-commit holds in review queue (NOT auto-revert) — same human-review path as injection-flag quarantines.
  • Anonymous submissions in the committed corpus.
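
As an illustration of the hard length caps, the sketch below keeps the caps in one table and checks a field against it. The dotted keys and the function name are hypothetical, and counting Unicode code points (rather than bytes or UTF-16 units) is an assumption; §6.2 in schemas.md remains the authority for the caps table.

```typescript
// Hypothetical sketch; key names and the code-point counting rule are assumptions.
const LENGTH_CAPS: Record<string, number> = {
  "concern.body": 500,
  "concern.report": 2000, // target_type=path
  "amendment.rationale": 500,
  "validation.rationale": 500, // on rejects
  "draft.commit_message": 200,
  "validation.injection_reason": 300,
  "feedback.body": 2000,
  "rating.would_be_5_stars": 500,
};

/** True when `text` is longer than the cap registered for `field`. */
function exceedsLengthCap(field: string, text: string): boolean {
  const cap = LENGTH_CAPS[field];
  if (cap === undefined) {
    throw new Error(`no length cap registered for ${field}`);
  }
  return Array.from(text).length > cap;
}
```

Throwing on an unknown field keeps the check fail-closed: a new free-text field cannot slip through without a cap being registered for it.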

Architecture

  • Skill composition graph: one concept (skill), open-enum category with deterministic guards, requires DAG with typed inputs/outputs. No kind field; no asymmetric main/sub rule.
  • One canonical.md per skill. Skills live as one folder per skill: skills/<id>/canonical.md plus process.mmd when branching. There is no proposals/ directory; there is no archive/ directory. Lifecycle moves through a single status enum on the canonical body.
  • Catalogues + signals in D1 (per S28). Volatile values, references, observations, validations, and votes (validations on observations) live in D1. Daily JSONL backup snapshots in Git under data-snapshot/ for archival. Authored skill prose stays in Git.
  • Build-time MDX-tag resolution. <VV>, <Ref>, <Observations> resolved by the renderer at build/fetch time (S57); primary path build-fetches from /api/*, fallback path reads data-snapshot/. Schema for the tags themselves is in §6.10 (see schemas.md); resolution mechanics live in §20 (see website.md).
  • Catalogue UID convention (per §6.11 (see schemas.md)): 3-letter prefix + dash + 5-digit zero-padded sequence (val-NNNNN, ref-NNNNN, obs-NNNNN). D1 auto-assigns on INSERT. Agents never mint uids; PR-CI orchestrates uid assignment.
  • requires.id resolves to skill_id only. The consumer loads each required skill at its current status. Skills at status: deprecated or status: quarantined are not valid targets.
  • Granularity heuristic: a unit becomes its own skill file when it is (a) referenced by ≥2 main skills, or (b) self-contained and needs its own diagram.
  • Type system: initial types in schemas/types.json, extensible via PR.
  • Categories: open enum + deterministic guards (per G.3).
  • Skill index: status / origin aware; auto-regenerated.
  • Activity dashboards: per-skill (point-of-use) AND global; both auto-regenerate.
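
The catalogue UID convention (3-letter prefix, dash, 5-digit zero-padded sequence) is simple enough to validate with one pattern. The helpers below are hypothetical and only illustrate the shape; per the convention above, D1 assigns uids on INSERT and agents never mint them. Accepting 0 as a sequence value is an assumption.

```typescript
// Hypothetical helpers illustrating the uid shape; real uids are assigned by D1.
type UidPrefix = "val" | "ref" | "obs";

const UID_PATTERN = /^(val|ref|obs)-(\d{5})$/;

/** Render a uid such as val-00042 from a prefix and a sequence number. */
function formatUid(prefix: UidPrefix, sequence: number): string {
  if (!Number.isInteger(sequence) || sequence < 0 || sequence > 99999) {
    throw new RangeError(`sequence out of range: ${sequence}`);
  }
  return `${prefix}-${("00000" + sequence).slice(-5)}`;
}

/** Parse a uid, returning null when it does not match the convention. */
function parseUid(uid: string): { prefix: UidPrefix; sequence: number } | null {
  const match = UID_PATTERN.exec(uid);
  if (match === null) return null;
  return { prefix: match[1] as UidPrefix, sequence: Number(match[2]) };
}
```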

Skills and content

  • Submission contract: global, versioned (docs/submission-contract-v<N>.mdx), referenced by skills via submission_contract_version pointer.
  • Citation handling (per §6.10 (see schemas.md)): <Ref>label</Ref> MDX wrapper tags in body resolve at build time against the D1 references catalogue; multilingual rendering at runtime; Justel numac shortcut for Belgian federal law.
  • Source classes: authoritative sources broadly defined (S46) — legal text where it exists, official admin pages and guidance where the procedure lives in practice, professional-body guidance where it applies. Primary (citation-grade), secondary (context-grade, citable with rationale), tertiary (read freely, never cited).
  • Volatile values: named scalars only in v1; stored in D1 with INSERT-with-supersede update mechanism (S29).
  • Communes source: Statbel REFNIS, pinned by nomenclature_date, refreshed quarterly.
  • Citation rot: monthly link-check Action against the references catalogue; optional archived_url per reference entry.
  • Meta-skills are maintainer-seeded (per §15 (see skills.md)): category: meta skills (meta-draft-l1-skill, meta-decompose-process) are committed directly to canonical.md with status: stable from day one and bypass the consumer submission flow. Updates land via direct maintainer PR — not via an amendment submission — consistent with the maintainer's constitutional-court role.

Trust model and review

  • Trust tiers: A (consumer-AI submission of any of five feedback types + rating — automated; consensus via state machine), B (protocol amendment — maintainer review), C (constitutional / retraction — slow maintainer review). Post-2026-05-15 normalization replaces the prior "four submission types" framing.
  • Maintainer is constitutional court only. Six classes of human-review item: injection-flag quarantines (G.6), NER-held submissions (G.14), draft PR review (target_type=skill | path; S31), operator-private feedback triage (post-2026-05-15; §6.2.5 (see schemas.md)), provider-eligibility decisions (§21 (see protocol.md)), and protocol amendments. Routine corpus growth is not maintainer-gated.
  • meta-validate-skill-graph is not a skill (per G.10) — it's the deterministic script tools/scripts/validate-cross-refs.ts.

Consumer

  • Reference consumer is a documentation surface (per D.1 redirect), not a per-vendor adapter project.
  • Per-ecosystem capable-mode recommendations at becivic.be/agents. Agent self-classifies, recommends stepping up if below tier, falls back to advice-only mode otherwise (per D.3 redirect).
  • Pre-flight validation is consumer-side: tool_execution of the cross-ref script (preferred) or rules checklist via LLM judgment.

Test

  • Test fixtures: ship with the validator; one set per submission type (tests/fixtures/<type>/{valid,invalid}/).
  • PII detection scope: every string field across all submission types.
  • Test users: 5–10 users with varied origin countries and platforms across the four ecosystems.

Licence and openness

  • CC-BY-4.0 for everything — code, schemas, skill content, diagrams, docs, site copy, internal design narratives. Decided 2026-05-03 (S38). The "Round 6" subsection above carries the full rationale.
  • Everything open — the LLM IS the runtime; the agent that loads a skill at session time also reads the design context.
  • Rejected alternatives:
    • CC0 (was previously decided in round 5 but rejected at round 6 per S38) — EU jurisdiction incompatibility; sui generis database right friction; PD-dedication ambiguity for Belgian-resident maintainer.
    • CC-BY-SA-4.0 — recursive share-alike semantics strain agent excerpting (every AI chat is technically a derivative); strain on the anonymous-AI-contribution model; potential deterrent to public-administration co-authorship.
    • MIT — legally ambiguous on prose.
    • Closed-source for parts — defeats the agent-as-runtime model.

Round 8 (2026-05-11)

  • S59: Observation event-type enum collapsed to 3 values; S8 and S17 reversed. The round-6 decisions S8 and S17 that collapsed structured event types into a single free-text body field are reversed: the spec adopts a structured 3-value taxonomy (volatile_value, accuracy_concern, skill_surface) because deterministic compute over each type has measurable routing value; observation schema bumps from v2 to v3; session_pause is removed from observations entirely and becomes harness-local resume state; session_outcome is removed from observations and moved to the new analytics endpoint (§6.2.5 (see schemas.md), S60). See §6.2.1 (see schemas.md).
  • S60: Separate POST /api/analytics endpoint with own schema (v1) and own D1 table; all-or-nothing opt-in for v1. Three analytics event types ship in v1 (session_start, step_transition, session_outcome); analytics submission is fully deterministic with no LLM in the path; opt_in_consent: true is a required constant; orphan sessions older than 72 hours receive an inferred session_outcome: abandoned_inferred submission by harness code on the next session preamble. See §6.2.5 (see schemas.md).
  • S61: session_id is a client-side correlation token only; D1 stores opaque recovery_token per row instead. The Worker generates and returns an opaque recovery_token on first staging; the agent stores it locally; D1 NEVER stores session_id; the recovery endpoint becomes GET /api/feedback/sessions/<recovery_token>. See §6.2.1 (see schemas.md) and §8 (see privacy.md).
  • S62: Harness consumer obligations codified in §15.7 as a MUST/SHOULD/MAY protocol-conformance contract. Key obligations include: implementing the four-phase session lifecycle, resolving <USER_DATA_DIR> per spec, fetching scrub rules on session start, running Layer 1 scrub on every buffer write, and delivering the first-session disclosure per §3 principle 10. See §15.7 (see skills.md).
  • S63: Consumer-side runtime specified in §24; two-layer model (harness plus per-procedure skills); T0-T4 tier model adopted with T1-out pivot. T2 (Cowork connected folder with ~/.be-civic/ as writable state) is the canonical v1 supported target; T3 (T2 plus MCP) is the recommended mode; T1 (free-tier Chat Project, read-only Project Files) is demoted to a degraded fallback with an explicit upgrade prompt because the agent cannot write customer-side state, breaking §3 principle 11 and §8.7 (see privacy.md); T4 (hooks environment) is deferred post-v1. See §24.
  • S64: Skills-graph is live-served via GET /api/skill-graph (HTTP) and mcp__becivic__get_skill_graph (MCP); static becivic.be/agents/skills-graph.json artifact deprecated. The live-served model eliminates freshness lag, pushes filtering to the server (optional query parameters: applies_to, status, customer_locale, profile_match), and aligns with the source-of-truth pattern used by every other corpus surface; tools/build-graph.ts is retained for DOT/SVG human-inspection output only. Schema lives at bc-docs/schemas/skill-graph.schema.json. See §24.
  • S65: Skill body discipline invariant codified in §15.8; customer-facing skill bodies MUST NOT contain implementation paths, design rationale, operator-internal sprint references, or dogfooding notes. Skill bodies carry regulation in customer voice only; W20 procedure skill drafts require cleanup against this invariant before v1 ship. See §15.8 (see skills.md).
  • S66: Consumer-side state contract codified in §8.7; 16-axis profile.json catalogue; document-content-discard rule in §8.9. Customer-side state qualifies only when the customer can read it with standard tools, the customer can delete it unilaterally as a single artifact, and the agent can both read and write it in-session; Project Memory and free-tier Chat Project Files fail this contract; only Cowork connected folder (T2 and above) qualifies. The document-content-discard rule (§8.9 (see privacy.md)) requires the harness to extract routing fields only when a customer provides a document and to discard document content (numbers, names, photos, full addresses, dates of birth) immediately. See §8.7 (see privacy.md), §8.9 (see privacy.md).
  • S67: Spec cross-reference check added to every CEO, engineering, and design review; process gate lands in playbooks/sprint-cycle.md. Every design document MUST name which spec sections it touches and either provide a conformance statement or propose a spec amendment; a spec amendment, once approved, MUST land before the implementation that depends on it; this is a playbook process decision, not a spec architectural decision, and is recorded here for auditability.
  • S68: OSS-alignment v1 committed; install-time symlink at ~/.agents/skills/becivic/ and three SKILL.md metadata frontmatter fields. The install step creates ~/.agents/skills/becivic/ pointing to ~/.claude/skills/becivic/ for agentskills.io cross-client discoverability; three metadata fields are required in harness SKILL.md frontmatter: privacy-contact, data-retention, and opt-in-required. See §15 (see skills.md).
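
The install-time symlink in S68 can be sketched as follows. This is a minimal illustration, not the shipped installer: the helper name link_agents_skill_dir and its home parameter are hypothetical, and the sketch assumes a filesystem where the current user may create symlinks.

```python
import os
import tempfile
from pathlib import Path


def link_agents_skill_dir(home: Path) -> Path:
    """Make <home>/.agents/skills/becivic a symlink to <home>/.claude/skills/becivic.

    Hypothetical helper. Idempotent: an existing correct link is left alone,
    a stale link is replaced, and a real file or directory is never overwritten.
    """
    target = home / ".claude" / "skills" / "becivic"
    link = home / ".agents" / "skills" / "becivic"
    target.mkdir(parents=True, exist_ok=True)
    link.parent.mkdir(parents=True, exist_ok=True)
    if link.is_symlink():
        if link.resolve() == target.resolve():
            return link  # already installed
        link.unlink()  # stale link pointing elsewhere
    elif link.exists():
        raise FileExistsError(f"{link} exists and is not a symlink")
    os.symlink(target, link, target_is_directory=True)
    return link


if __name__ == "__main__":
    # Demonstrate against a throwaway directory instead of the real home.
    with tempfile.TemporaryDirectory() as scratch:
        print(link_agents_skill_dir(Path(scratch)))
```

Refusing to overwrite a non-symlink is a design choice of this sketch: it avoids silently deleting a skills directory that another client installed.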

Round 8 follow-up (2026-05-12)

  • SG1: S17 "no maintainer-curated 'Known surprises' section" decision reversed; both surfaces coexist. The round-6 decision (S17) that removed a dedicated "Known surprises" section from skill bodies is reversed. The operator rationale: "there are often things that need to be surfaced in the skill that don't fit neatly in some other part of the skill, and they're a little bit detailed." The new framing is a two-surface model: the Known surprises section is maintainer-curated, carries stable well-understood pitfalls that don't fit cleanly into other body sections, and is committed directly by the maintainer as part of the canonical body; the <Observations> MDX tag is the rendered layer for community-discovered issues submitted via the consumer submission flow and sorted by net score (§6.10 (see schemas.md)). The two surfaces are complementary, not competing: maintainer-curated surprises are authoritative and version-stable; community observations are dynamic and reputation-weighted. Known surprises is now listed as a canonical body section in §6.1 (see schemas.md). The research-report.md §11 failure-modes catalog feeds candidate content for Known surprises into the skill via the deferred promote mode (§3 (see build-tools.md)); on first draft the maintainer manually reviews and promotes relevant failure modes. Commit d1fd7c1 implements the schema change.

Path Directory (2026-05-12)

  • S69: Paths introduced as a new top-level concept alongside skills. A path is a route to obtain something the citizen needs: a document, a deeplink to an interactive tool, a form on a portal, a calculator, or a commune service desk. Skills are procedures (multi-step, multi-party, often citing law and branching by user category); paths are routes (the agent navigates them on the customer's behalf where possible, or hands off cleanly to the customer where it cannot). The two concepts compose: a procedure skill declares requires_paths for the documents and tools it needs, and the Path Directory tells the harness where each target lives and how to reach it across multiple sources in priority order. The discriminator between path and skill is whether the target is the output of a complex multi-step procedure (skill) or is reachable via a portal, deeplink, form, calculator, or commune visit (path). See §6.12 (see schemas.md) for the path schema; see §24.9 for the traversal algorithm. (D1, D2.a per the 2026-05-12 design conversation.)
  • S70: Paths catalogue is served as one JSON document from bc-docs/paths/index.json. The catalogue is agent-traversable and keyed by path_id for O(1) lookup; the MCP server and becivic.be/api/paths HTTP endpoint serve the same content. One file at V0 scale (approximately 7 to 8 entries); sharding by theme is deferred until the catalogue exceeds approximately 1MB raw. The catalogue uses the same MCP / HTTP / web-fetch fallback chain as the skills-graph per §24.4.1. (D2.b per the 2026-05-12 design conversation.)
  • S71: pth-NNNNN UID convention for path entries. Path entries follow the existing catalogue UID convention (§6.11 (see schemas.md)): 3-letter prefix pth plus a dash plus 5-digit zero-padded sequence. Authority on minting is the same as for val-, ref-, and obs- entries: D1 auto-assigns on INSERT, agents never mint, PR-CI orchestrates uid assignment. (D2.b cont., OQ3 resolved 2026-05-12.)
  • S72: Source class is a closed enum; validation templates encoded in schema by source_class via oneOf and if/then discriminators. The source_class enum at V0 covers brussels-tier1-quicklink, brussels-tier2-inquiry, brussels-tier3-noauth, flanders-api-page, wallonia-sitemap-page, federal-anonymous-form, federal-auth-handoff, partner-portal, and offline. Each class drives a procedure template and a validation-path template, both encoded into the JSON schema using the discriminator pattern already in use for observation.v3. Any JSON-Schema validator can check the shape; PR-CI gets the constraint for free. New source_class values are protocol-level changes added via spec amendment, not by individual path authors. (D22 per the 2026-05-12 reconciliation discussion.)
  • S73: Each path source carries an explicit actor block declaring who does what. Every entry in sources[] carries an actor block with two closed-enum fields, actor.primary (agent, user, or both) and actor.handoff.when (none, auth-wall, captcha, confirmation, physical-presence, full-takeover), plus three plain-English text fields (agent_responsibility, user_responsibility, resumption) that the harness presents at the handoff moment. The actor block replaces implicit handoff cues with structural shape and lets the harness present handoffs in plain English without inferring them from procedure.kind. Schema-encoded constraints tie the handoff value to consistent fields (handoff none implies primary agent; handoff auth-wall implies auth.method != none; handoff physical-presence implies source_class: offline; and so on). (D24 per the 2026-05-12 reconciliation discussion.)
  • S74: No blanket credentials prohibition on the harness; per-source instructions carry safe-by-default behaviour. The earlier consideration of a §15.7 obligation forbidding the harness from ever handling credentials is dropped. The actor block's per-source agent_responsibility, user_responsibility, and resumption text carry the safe-by-default behaviour for credential exchange (the customer authenticates; the agent does not attempt to authenticate; agent runtimes that legitimately receive a one-time token from a customer-owned password manager handle that narrow case via per-source instructions). Future tiers may broaden agent-side credential handling in specific, named cases without needing a blanket exception. (D25 per the 2026-05-12 reconciliation discussion.)
  • S75: First-contact framing about document handling is agent-platform-neutral. The plain-English line the harness delivers on first contact about document handling avoids platform-specific disclosure (no mention of "cloud", "Anthropic servers", or vendor-specific storage). It says: the document file stays where the customer put it; categorical routing fields go into the customer's profile on their machine; nothing else from the document is kept in the agent's notes. Agent-platform privacy policies handle their own layer; the Be Civic framing is portable across runtimes. (D26 per the 2026-05-12 reconciliation discussion.)
  • S76: Spec text describes behaviours in agent-platform-neutral terms wherever possible. The genericity principle applies to body prose across the spec: platform-specific examples (vendor key-value stores, vendor-specific chat tabs) are introduced as bracketed examples after a generic phrase (e.g., "vendor key-value stores [for example, Project Memory in Anthropic platforms]"; "read-only file surfaces [for example, free-tier Chat in Anthropic platforms]"). Platform-specific language is preserved in the canonical-target tier mapping (§24.4), in the Cowork bootstrap onboarding (§24.6), in path resolution (§8.7.3 (see privacy.md)), and in cowork-plugin.md, because in those subsections the platform is the canonical-target reality and abstracting it would lose meaning. (D23 per the 2026-05-12 reconciliation discussion.)
  • S77: V0 ships the catalogue plus harness wiring in v1; spec text does not enumerate catalogue size. V0 ships the schema, the catalogue (at the operator's discretion: approximately 7 to 8 paths for nationality-application's Belgian-side documents, optionally including casier-judiciaire as a high-demand quick-win), and the harness wiring in v1. The spec describes the concept; the catalogue grows with corpus authoring. The friend-tester bundle covers V0 catalogue plus harness wiring end-to-end. (D19 per the 2026-05-12 reconciliation discussion.)
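
The actor-block constraints named in S73 can be sketched as a plain consistency check. This is illustrative only: the canonical constraints live in the JSON schema (§6.12 (see schemas.md)). The function name actor_block_errors is hypothetical, and the placement of auth.method and source_class at the top level of each source entry is an assumption. Only the three implications spelled out in S73 are checked.

```python
PRIMARY = {"agent", "user", "both"}
HANDOFF_WHEN = {
    "none", "auth-wall", "captcha", "confirmation",
    "physical-presence", "full-takeover",
}


def actor_block_errors(source: dict) -> list[str]:
    """Return human-readable violations for one sources[] entry (hypothetical helper)."""
    actor = source.get("actor", {})
    primary = actor.get("primary")
    when = actor.get("handoff", {}).get("when")
    errors = []
    if primary not in PRIMARY:
        errors.append(f"actor.primary {primary!r} is not in the closed enum")
    if when not in HANDOFF_WHEN:
        errors.append(f"actor.handoff.when {when!r} is not in the closed enum")
    # Implications stated in S73.
    if when == "none" and primary != "agent":
        errors.append("handoff none implies primary agent")
    # Assumed layout: auth.method sits on the source entry; absent counts as none.
    if when == "auth-wall" and source.get("auth", {}).get("method", "none") == "none":
        errors.append("handoff auth-wall implies auth.method != none")
    if when == "physical-presence" and source.get("source_class") != "offline":
        errors.append("handoff physical-presence implies source_class offline")
    return errors
```

An empty list means the entry passed the three checks; it does not imply full schema conformance.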

Tag resolution model (2026-05-12)

  • S78: MDX tag format is wrapper-tags with value or label as children. Volatile-value tags (<VV>) and reference tags (<Ref>) carry the catalogue uid in the uid="..." attribute and the current value (or, for <Ref>, the inline citation label) as children. The renderer substitutes children at build time when the catalogue changes; the wrapping tag remains in every emission as the agent's signal that the content is volatile or cited. This preserves the four-signal contract: current value (children), semantic name (name attribute), catalogue uid (uid attribute), and "this content is volatile or cited" marker (the tag itself). Decided 2026-05-11. Reconciled into §6.10 (see schemas.md) and §20.3 (see website.md) on 2026-05-12.

  • S79: Build-time renderer-side resolution. Tag substitution happens at the renderer Worker's build step. The build fetches from /api/volatile-values and /api/references over HTTP (primary path); data-snapshot/volatile-values.jsonl and data-snapshot/references.jsonl serve as the fallback when the API is unreachable at build time. The MCP read_skill tool stays a thin proxy that forwards the resolved markdown from the build output; no per-request catalogue fetches occur in the MCP Worker. Request-time substitution with a short Worker-side cache remains a fallback if catalogue change rate ever exceeds build cadence; this is not a v1 concern. Decided 2026-05-11.

  • S80: Unresolved-tag emission. When a tag's uid does not resolve to an active catalogue row (row absent, or status: deprecated), the renderer emits the tag with the sentinel children [unresolved] and a data-resolution-status="unresolved" attribute. This lets agents detect the broken link and surface the gap to the customer. Refusal-to-build at PR-CI time is a stricter alternative reserved for a future round if validator stricture is needed. Decided 2026-05-12 (reasonable default applied during reconciliation).

  • S81: Re-citation uses bracket form. First citation of a reference in a skill body MUST use the full <Ref> wrapper with all attributes. Subsequent re-citations of the same reference within the same skill body MAY use the lighter `[ref-id]` bracket form. The renderer resolves the bracket form to an in-page anchor link targeting the first <Ref> instance. The bracket form carries no attributes and is valid only after the full <Ref> has been introduced in the same body. Decided 2026-05-12 (reasonable default applied during reconciliation).

  • S82: Catalogue uses kebab-case agency-prefixed naming convention. Catalogue rows use names in the form dvz-handling-fee-d-visa-eur (kebab-case, agency-explicit prefix) per walking-procedure.md §Catalogue conventions. Alpha skills authored under the legacy snake_case agency-implicit convention (e.g. federal_registration_fee_eur) are migrated to the catalogue convention during the W2.C corpus rebase. Migration direction: alpha skill frontmatter and body prose adopt catalogue-row names, not the reverse. Decided 2026-05-12.

  • S83: Tag-only edits do not bump skill version. When a walker converts `[ref-id]` brackets to <Ref> wrapper tags, or populates empty children with the current value from the catalogue, the body change is semantically equivalent to the prior form and MUST NOT bump the skill's version field per §6.1 cohort-reset rules. PR-CI validates this invariant. Decided 2026-05-12.
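
The wrapper-tag substitution (S78) and unresolved-tag emission (S80) can be sketched with a regular expression. This is a simplified illustration, not the renderer Worker's code: the catalogue is modelled as a dict keyed by uid, and the row fields status and text are assumptions of the sketch. Nested tags and attribute values containing ">" are not handled.

```python
import re

TAG_RE = re.compile(r"<(VV|Ref)\b([^>]*)>(.*?)</\1>", re.DOTALL)
UID_RE = re.compile(r'\buid="([^"]*)"')
UNRESOLVED_ATTR = ' data-resolution-status="unresolved"'


def resolve_tags(body: str, catalogue: dict) -> str:
    """Replace tag children with catalogue text; keep the wrapping tag in place."""

    def substitute(match: re.Match) -> str:
        tag, attrs, _old_children = match.groups()
        # Drop a marker left by an earlier build so repeated runs stay stable.
        attrs = attrs.replace(UNRESOLVED_ATTR, "")
        uid_match = UID_RE.search(attrs)
        row = catalogue.get(uid_match.group(1)) if uid_match else None
        if row is None or row.get("status") == "deprecated":
            # Row absent or deprecated: emit the sentinel children and marker.
            return f"<{tag}{attrs}{UNRESOLVED_ATTR}>[unresolved]</{tag}>"
        return f"<{tag}{attrs}>{row['text']}</{tag}>"

    return TAG_RE.sub(substitute, body)
```

Because the wrapper survives every emission, running the function again after the catalogue changes updates the children in place, including turning a previously unresolved tag back into a resolved one.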

Feedback taxonomy normalization (2026-05-15)

  • S84: Submission taxonomy normalised to 5 feedback types + analytics + rating. The pre-2026-05-15 route-shaped taxonomy (observation + skill_amendment + skill_draft + path_amendment + path_draft + path_validation + validation + analytics) collapses into five type-shaped feedback types (concern | amendment | validation | draft | feedback) + the unchanged analytics stream + a new rating channel (Lock A, sprint 2026-W23). Each typed feedback type carries target_type as the sole discriminator over skill | volatile_value | reference | path | path_source | skill_graph | observation. The same amendment shape covers skill body diffs, frontmatter edits, volatile-value scalar corrections, reference URL updates, and path field edits — keyed by target_type. Wire reduction: route count from 5 entry-points → 4 typed + 1 new feedback-channel + analytics + ratings; MCP write tools from 8 → 6. The <Observations> aggregator walks every catalogue / path / source uid the body cites and surfaces concerns + pending amendments at render time, with a <CohortStats> header derived from D1 (NOT materialised in canonical frontmatter; locked G4). Hard cutover, no aliases, pre-launch. See 2026-05-15-feedback-taxonomy-normalization.md for the full proposal; §6.2 (see schemas.md) for the wire shape; §9 (see lifecycle.md) for state-machine semantics.

  • S85: S61 reversed pre-launch. The recovery_token component proposed in the 2026-05-11 cluster-2 amendment (S61) never landed in code; the spec carried a 5-site contradiction with the live session_id recovery endpoint. Pre-launch reversal: session_id remains the recovery key end-to-end; the recovery_token concept is dropped from the spec entirely. D1's validations.session_id column persists; the proposed recovery_token column on concerns was never created and is permanently dropped from the migration sequence. KV salt-key conventions get a cosmetic rename (proposal:* → submission:*, observation:* → concern:*) but the salt mechanics are unchanged. Settled decision S61 stays in the audit trail; this entry records the reversal. See 2026-05-15-feedback-taxonomy-normalization.md §"S61 reversal" for the substance.

  • S86: Rating feedback surface — Lock A, sprint 2026-W23. The 2026-05-10 feedback-surface design (per bc-operations/docs/agent-ux/2026-05-10-feedback-surface-design.md) ships in v1 as a new 6th rating feedback type, first-class parallel to concern / amendment / validation / draft / feedback. Three target_types (skill, agent_protocol, session); per-axis model — exactly one of three star fields populated per submission (skill_quality_stars | agent_protocol_stars | user_experience_stars); optional would_be_5_stars anchor text per the 5-star-prompting rule. Aggregates into <CohortStats> on the skill canonical (skill_quality_avg / skill_quality_n); agent_protocol and session ratings land on the operator-private /api/_internal/rating-stats surface. Implementation distributed across the W23 sprint workers. See §6.2.7 (see schemas.md).
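
The per-axis rule in S86 (exactly one of three star fields populated per submission) can be sketched as below. The wire shape is defined in §6.2.7 (see schemas.md); this check is illustrative. The function name rating_errors is hypothetical, and the 1 to 5 integer range is an assumption inferred from the 5-star wording.

```python
STAR_FIELDS = (
    "skill_quality_stars",
    "agent_protocol_stars",
    "user_experience_stars",
)


def rating_errors(submission: dict) -> list[str]:
    """Check the exactly-one-axis rule for a rating submission (hypothetical helper)."""
    populated = [f for f in STAR_FIELDS if submission.get(f) is not None]
    errors = []
    if len(populated) != 1:
        errors.append(
            f"exactly one star field must be populated, found {len(populated)}"
        )
    for field in populated:
        value = submission[field]
        # Assumed range: whole stars from 1 to 5; booleans are rejected.
        if isinstance(value, bool) or not isinstance(value, int) or not 1 <= value <= 5:
            errors.append(f"{field} must be an integer from 1 to 5")
    return errors
```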

Auto-version-bumping (2026-05-15)

  • S87: Auto-version-bumping workflow shipped with 8 OPENs locked. A version-bump.yml workflow inspects every commit on main and bumps the artefact's version field according to its status (0.0.x draft / 0.1.x alpha / 0.2.x beta / 1.0.x stable; patch increments per commit; only minor+ bumps reset cohort). All 8 OPENs locked at proposal-author recommendations (per operator directive 2026-05-15): 1.0.x stable with patches allowed; cohort locks at →stable; no cancel-in-progress; squash-merge collapse (main's version = bumps observable on main); git log audit trail (no separate artefact); previous_stable_sha for quarantine demote; warning (not error) on first-deploy migration; optional rebase script as companion; Be Civic Bot <bot@becivic.be> bot identity differentiated via commit-message prefix. The state-machine bot bundles status flips with version bumps in a single PR. See 2026-05-15-auto-version-bumping.md for the full proposal; §9.7 (see lifecycle.md) for the workflow contract.
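
The status-to-version mapping in S87 can be sketched as a pure function. This is not the version-bump.yml workflow itself: the names next_version and resets_cohort are hypothetical, and restarting the patch number at 0 when the status band changes is an assumption of the sketch.

```python
# Version band (major, minor) per status, as listed in S87.
BANDS = {"draft": (0, 0), "alpha": (0, 1), "beta": (0, 2), "stable": (1, 0)}


def next_version(current: str, status: str) -> str:
    """Return the version after one more commit for an artefact in `status`."""
    major, minor, patch = (int(part) for part in current.split("."))
    band = BANDS[status]
    if (major, minor) == band:
        # Same band: patch increments per commit.
        return f"{major}.{minor}.{patch + 1}"
    # Status flipped: move to the new band (assumed to restart at patch 0).
    return f"{band[0]}.{band[1]}.0"


def resets_cohort(old: str, new: str) -> bool:
    """Only minor-or-higher bumps reset the validation cohort."""
    return old.split(".")[:2] != new.split(".")[:2]
```

For example, next_version("0.1.4", "alpha") yields "0.1.5" without a cohort reset, while next_version("0.1.5", "beta") yields "0.2.0" and does reset the cohort.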

18. Open questions

  • Validation consensus thresholds (per G.5 first-pass values): ≥3 confirms / 0 rejects / 48h / ≥3 IPs for alpha→beta; ≥10 confirms / 14d / 85% / 10 IPs for beta→stable; rejects-exceed-confirms-by-2 for rollback. Revisit at end of v1's first 90 days based on observed validation volume; tighten or loosen accordingly.
  • Injection-flag false-positive rate calibration. Single flag → quarantine is aggressive; how often will benign skills get quarantined by malicious or careless flaggers? Tune after first 90 days; consider raising the threshold to ≥2 distinct IPs flagging if FP rate is high.
  • Concern-compaction job necessity. Whether the v1.1 deferred job is actually needed, or whether direct amendment (target_type=skill) submissions from observant consumer AIs are sufficient. Likely the latter; review at end of v1's first 90 days.
  • Worker rate-limit thresholds. 10 validations/IP/day, 2 with injection_flag, 50 total — initial values, tunable.
  • Rolled-back artefact reinstatement path. A skill body or catalogue row rolled back in alpha may have been correct but unlucky. Currently the only recourse is a fresh amendment (target_type=skill) submission (or a fresh catalogue INSERT) with the same content. Consider a re-submission protocol that ties to the rolled-back artefact and accepts validations from the prior round; defer to v1.1.
  • Multi-region skill strategy (regional fork as one skill with branching body, vs. three sub-skills referenced from one main) — decide per skill at drafting time.
  • Test user recruitment specifics — operational, decide closer to recruitment time.
  • Submissions log location — decided 2026-05-05: project-local first (<output_dir>/.be-civic/), <USER_DATA_DIR>/be-civic/ as fallback when no project context available.
  • Honor-system accountability signal. No "expected vs. actual submission ratio" metric (no skill-load telemetry by design). Drift from "everyone submits" → "few submit / many spam" is invisible until quality collapses. Accept as deliberate gap; document in docs/threat-model.md.
  • OSS-positioning trigger. Open-source the harness and/or per-procedure skills when triggered by a named condition (customer-privacy survey signal, substantive product traction, or first legal review of Be Civic as a product); the spec records that the trigger condition must be named before the decision is taken.
  • Hooks in Cowork tab. Verify whether Anthropic's Cowork tab supports skill frontmatter hooks; if it does, fold the T3/T4 distinction and promote hook-fired analytics to the standard T3 path; if it does not, keep T4 as Claude Code tab only. See §24.
  • Per-event vs all-or-nothing analytics opt-in. v1 ships all-or-nothing opt-in for analytics (S60); revisit and consider per-event granularity if consent-fatigue evidence or per-event-opt-in requests emerge in the first 90 days. See §6.2.5 (see schemas.md).
  • outputs field type granularity beyond document_artefact. The Path Directory uses the skill type system for outputs. document_artefact covers V0; finer typology (document_artefact_pdf, document_artefact_paper, tool_url, calculator_url) may emerge from real catalogue authoring. Resolve at v1.1 once V0 entries have been authored and any divergence is observed in practice. (From the 2026-05-12 Path Directory reconciliation, OQ4.)
  • Cross-region path entries for documents that genuinely differ across regions. The recommended pattern is one path entry with multiple sources covering regional divergence (sources[].audience.regions). Split into separate path entries only when document semantics are materially different (rare). Real-world authoring cases — composition de ménage in all four regions, regional casier judiciaire variants — will validate the one-entry-multi-source pattern; revisit if a clean split is needed. (From the 2026-05-12 Path Directory reconciliation, OQ5.)
  • Current renderer behaviour on <VV> and <Ref> source tags. Walking-procedure.md states that raw .md keeps tag form, but no alpha skill currently carries any tags to test against. A probe is needed: drop a <VV> tag into a draft skill on dev.becivic.be and hit it with Accept: text/markdown to confirm whether the renderer strips the tag, leaves it with empty children, or preserves it as authored. This determines how much of the W2.B renderer change is new work versus an adjustment to existing behaviour. Implementation-verification task, not a design decision. Resolve in the W2.B renderer-change PR.
  • PR-CI uid minting. Walking-procedure.md §Phase 7 states that PR-CI mints uids when a <VV name=".." uid="">value</VV> arrives with an empty uid attribute. The implementation in tools/scripts/ needs verification. If the script does not exist, add the task to W2.A scaffolding. Verification task, not a design decision.
  • Single-PR vs. staged rollout for W2. W2.A (walking-procedure rework and bc-corpus-creator), W2.B (renderer change), and W2.C (corpus rebase of the 47 alpha skills) can ship as one atomic PR (coherence advantage: no intermediate state where the renderer expects tags but the corpus has none) or as separate staged PRs (review-load advantage: each piece is smaller and independently reviewable). Operational preference; resolve at the start of round 9 implementation before any W2 branch is opened.
  • Skill-walks in flight that require pause or rebase coordination. Check feat/walk-* branches before locking the timing of the W2.C corpus rebase. Any walk in progress needs either a pause (walker waits for W2.A to lock) or a mid-walk rebase onto the new tag format. Operational, not a spec question.
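
The first-pass validation consensus thresholds in the first open question above can be sketched as a decision function. The readings here are assumptions, not settled semantics: 48h and 14d are taken as minimum time spent in the current status, 85% as the share of confirms among all votes, and the rollback rule as applying in any status. The names Tally and decide are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Tally:
    confirms: int
    rejects: int
    distinct_ips: int
    hours_in_state: float


def decide(status: str, tally: Tally) -> str:
    """Return "promote", "rollback", or "hold" under the first-pass thresholds."""
    # Rollback: rejects exceed confirms by 2 or more.
    if tally.rejects - tally.confirms >= 2:
        return "rollback"
    if status == "alpha":
        # alpha -> beta: >=3 confirms, 0 rejects, 48h, >=3 distinct IPs.
        if (tally.confirms >= 3 and tally.rejects == 0
                and tally.hours_in_state >= 48 and tally.distinct_ips >= 3):
            return "promote"
    elif status == "beta":
        # beta -> stable: >=10 confirms, 14 days, 85% confirm share, 10 distinct IPs.
        total = tally.confirms + tally.rejects
        share = tally.confirms / total if total else 0.0
        if (tally.confirms >= 10 and tally.hours_in_state >= 14 * 24
                and share >= 0.85 and tally.distinct_ips >= 10):
            return "promote"
    return "hold"
```

Keeping the thresholds in one function makes the planned 90-day tuning a small change to constants rather than to control flow.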

19. Out of scope (do not build in v1)

  • Per-vendor adapters beyond the capability-tier documentation surface (per D.1 redirect — Be Civic does not maintain Anthropic / OpenAI / Google / Mistral adapters)
  • Filesystem-less mode as a full-participation tier (per D.3 redirect — degrades to advice-only mode; observation-only at most)
  • Coverage of non-Belgian processes (architecture allows it via the umbrella; v1 doesn't pursue it)
  • Coverage of non-government / corporate processes
  • Reputation systems beyond IP-hash rate-limit counters
  • Translated skill bodies (lang/<code>/ reserved for future; v1 is English-only with agent-side translation of user-facing prose; citations rendered multilingually per G.13)
  • Web submission form for humans (CLI tooling sufficient if needed; agents are the primary submitter)
  • Hosted database (Dolt or otherwise)
  • Cross-repo skill references (reserve a repo field in submission schemas for future, don't build)
  • Polished consumer integrations (Cowork plugin, Cursor extension, etc.)
  • Submission archival / retention policies — decide once data exists
  • Operational scaling provisions — decide after first month of real use
  • Custom domain SSL beyond Cloudflare's automatic — Cloudflare's free SSL is sufficient
  • Multi-region Worker deployment — single-region is fine for our scale
  • Email notifications, webhook integrations, third-party integrations
  • Public health page (per F.13 — defer to v1.1)
  • Stress-test regression suite (per F.16 — defer to v1.1)
  • Switch-mutualité L1 (per D-vision.1 — defer to v1.1 unless drafted as part of mutualite-enrolment in Phase 4)
  • Compaction job extracting patterns from observation clusters (per §9.4 (see lifecycle.md) — deferred to v1.1)
  • Cross-machine state sync (customer-driven export and import via tools such as rsync or syncthing is permitted; server-mediated sync where Be Civic infrastructure holds or relays customer state is not; see §3 principle 11 and §8.7 (see privacy.md)).
  • Server-side per-customer state of any kind (also stated as §3 principle 11; the customer-side state contract in §8.7 (see privacy.md) is the architectural mechanism that enforces this boundary).
  • Cross-vendor deterministic execution (Be Civic ships agentskills.io-conformant skills; vendor-specific deterministic surfaces such as tool-use pipelines tied to a single provider's runtime are v2+ per-vendor work).
  • Web-only and cloud-runtime clients where the agent has no access to a writable local filesystem (the customer-side state contract in §8.7 (see privacy.md) requires write-back capability; cloud-runtime agents that cannot write ~/.be-civic/ cannot satisfy the contract and are not a v1 target tier).
  • Free-tier Claude Chat tab as a primary delivery channel (the agent has read-only access to Project Files in the free-tier Chat interface, which breaks the customer-side state contract requiring write-back capability per §8.7 (see privacy.md) and §3 principle 11; free-tier Chat is supported only as a T0/T1 sampler with an explicit upgrade prompt directing the customer to open the Project in the Cowork tab; it is not a v1 supported tier). See §24.
  • Static becivic.be/agents/skills-graph.json artifact (replaced by the live-served skill-graph endpoint GET /api/skill-graph and the MCP tool mcp__becivic__get_skill_graph per S64; tools/build-graph.ts is retained for DOT/SVG human-inspection output only). See §24.
  • Hand-authored translations of Path Directory catalogue entries beyond the 4 official languages (FR, NL, EN, DE). The catalogue title and description fields use the same multilingual shape as procedure-skill metadata; translation into additional languages is deferred. Agent-side translation of catalogue text into the customer's preferred language remains permitted at runtime.
  • Server-side eligibility evaluation for sources[].audience.predicates and path.applies_to. V0 treats profile_match as a client-side filter: the harness evaluates predicates against the customer's profile.json and skips ineligible sources locally. Server-side evaluation (the catalogue server returns only eligible sources, given a profile snapshot) lands at v1.1+ once skill body discipline and the predicate language have stabilised. See §24.9 traversal step three.

24. Consumer-side runtime

Sections 1 through 23 describe what the server does: the staging service, the submission schemas, the state machine, the PII scrub, the validation-consensus protocol, and the MCP intent surface. They describe the protocol from the corpus side. This section describes the protocol from the consumer side: the runtime an agent operates to connect a real user to the corpus and to satisfy the submission contract.

The consumer-side runtime ships as a two-layer skill package: one harness skill (becivic) that owns the conversation, plus one or more procedure skills (becivic-<id>) that carry the regulation. Both are agentskills.io-conformant SKILL.md packages installed at ~/.claude/skills/ (and symlinked to ~/.agents/skills/ per §15.9 (see skills.md)). The harness is the implementation surface every customer touches and is held to the obligations in §15.7 (see skills.md). Procedure skills are governed by the drafting protocol in §15.1 (see skills.md) and the body discipline in §15.8 (see skills.md).

This chapter is normative. Where §24 conflicts with prior chapters on consumer-side behaviour, §24 controls.

24.1 Harness and procedure skills

The consumer runtime is composed of exactly two kinds of skill. The split is structural, not stylistic: it determines which obligations apply to which body and which body the harness loads when.

The harness skill (becivic). The harness is the session orchestrator. It owns:

  • Session lifecycle (initialise, buffer, present, submit) per the feedback-template four-phase contract.
  • Capability detection per §24.4 and tier-appropriate setup guidance.
  • Profile routing: reading profile.json (see C7 amendment, §8.7 (see privacy.md)) to supply only the minimum context each procedure skill needs.
  • Observation buffering, L1 scrub, and validate-then-stage submission per §15.7 (see skills.md).
  • Contract framing on first contact per §3 principle 10.
  • Skill routing via the skills-graph (§24.2).

The harness body MUST NOT carry regulation for any specific administrative process. Regulation lives in procedure skills.

Procedure skills (becivic-<id>). A procedure skill carries the regulation for one administrative process. It owns:

  • The authoritative procedural body: required documents, steps, citations, volatile values, branching by region and legal category.
  • Frontmatter metadata used for routing: id, summary, tags, applies_to, prerequisites, related, profile_requirements, status.

A procedure skill MUST NOT carry session state, submission logic, scrub logic, orphan-recovery logic, harness-internal mechanics, or state-path references. A procedure body that contains any of these has mixed the two layers and is non-conformant per §15.8 (see skills.md).

Activation and invocation. The harness auto-activates on any session where the user describes a Belgian administrative task. The trigger set is a keyword list published at becivic.be/agents/triggers.json and SHOULD be matched by the agentskills.io client's description-match mechanism. Procedure skills are loaded on demand by the harness via Skill-tool invocation, routed through the skills-graph (§24.2).

The harness invokes a procedure skill exactly once per procedure per session, or once per active-procedure switch. The procedure body, once loaded, persists in context on every subsequent turn without further explicit invocation, per skill-tool persistence semantics.

Co-loaded procedures. Multiple procedure skills MAY be co-loaded in a single session when the customer's case crosses procedures (for example, registering a change of address while also preparing a nationality declaration). The harness is responsible for keeping the procedure bodies contextually separate when presenting steps to the customer and for attributing observations to the correct skill_id at submission time. The attribution heuristic (ask the customer, or infer from step context) is at the harness's discretion.

24.2 Skills-graph

The skills-graph is the routing metadata linking procedure skills. It is the source of truth the harness queries to identify which procedure skill to invoke for a given customer request.

Live-served, not a static artifact. The skills-graph is served live from D1 via two parallel surfaces. The prior round-7 draft proposed a static becivic.be/agents/skills-graph.json regenerated by CI; that approach introduced a freshness gap between state-machine status promotions and the next CI build, forced client-side filtering as the only filtering option, and broke the source-of-truth-aligned pattern used by every other corpus surface. Live serving costs the renderer approximately one D1 query per session start with edge caching, and gains live freshness, server-side filtering, and parity with the submit_observation / get_skill_observations HTTP-and-MCP pattern.

Surfaces. The graph is served from D1 via two parity paths:

  1. MCP tool mcp__becivic__get_skill_graph (T3 path). Filterable by applies_to, status, customer_locale, and optionally profile_match.
  2. HTTP endpoint GET https://becivic.be/api/skill-graph (T0, T1, and T2-no-MCP parity path). Returns the same JSON. Filters are passed as query-string parameters with identical names and semantics.

Both surfaces call the same D1 query and the same serializer. Both responses carry Cache-Control: public, max-age=60, s-maxage=60 so that bursts of session starts in the same minute hit Cloudflare edge cache rather than D1.
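
A T0/T1 client that cannot use MCP might build the HTTP request URL as sketched below. The endpoint and filter names come from this section; the helper name skill_graph_url is hypothetical, and encoding a multi-valued filter as a repeated query parameter is an assumption, since the wire encoding for lists is not specified here.

```python
from urllib.parse import urlencode

BASE_URL = "https://becivic.be/api/skill-graph"
FILTERS = ("applies_to", "status", "customer_locale", "profile_match")


def skill_graph_url(**filters) -> str:
    """Build the GET URL for the skill-graph endpoint (hypothetical helper)."""
    unknown = set(filters) - set(FILTERS)
    if unknown:
        raise ValueError(f"unknown filters: {sorted(unknown)}")
    params = []
    for name in FILTERS:  # fixed order keeps URLs cache-friendly
        value = filters.get(name)
        if value is None:
            continue
        values = [value] if isinstance(value, str) else list(value)
        # Assumption: lists are sent as repeated parameters (status=a&status=b).
        params.extend((name, item) for item in values)
    return f"{BASE_URL}?{urlencode(params)}" if params else BASE_URL
```

Emitting parameters in a fixed order means identical filter sets produce identical URLs, which helps requests within the 60-second window share an edge-cache entry.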

Schema. The response shape is defined in schemas/skill-graph.schema.json (new file; see §14 (see README.md) build sequence). The schema describes the response shape returned by both surfaces; it does not describe a file-on-disk artifact. Changes to the schema MUST be accompanied by a spec amendment to this section.

No static artifact. No static skill-graph.json is generated by CI or published by the renderer. The legacy tools/build-graph.ts script is retained for DOT and SVG output used in human inspection and design documents; its JSON-emission branch is removed. The renderer URL becivic.be/agents/skills-graph.json is decommissioned.

Response shape.

{
  "generated_at": "2026-05-11T15:00:00Z",
  "filters_applied": {"applies_to": ["nationality"], "status": ["stable", "beta"]},
  "nodes": [
    {
      "id": "nationality-application",
      "title": "Belgian nationality declaration (art. 12bis)",
      "summary": "Declare Belgian nationality after 5+ years of registered residence.",
      "description": "...",
      "status": "beta",
      "canonical_url": "https://becivic.be/skills/nationality-application",
      "tags": ["nationality", "citizenship", "art-12bis"],
      "applies_to": "nationality",
      "prerequisites": [],
      "related": ["apostille-foreign-document-hague"],
      "profile_requirements": ["region", "commune_nis5", "residency_history", "nationality_status", "employment_history"]
    }
  ],
  "edges": [
    { "from": "nationality-application", "to": "apostille-foreign-document-hague", "type": "related" }
  ]
}

Node fields.

| Field | Type | Description |
| --- | --- | --- |
| id | string | Skill id, matching the procedure skill's id frontmatter field |
| title | string | Human-readable title |
| summary | string | One sentence describing the administrative process |
| description | string | Longer-form description as it appears in find_skill results |
| status | enum | One of draft, alpha, beta, stable per §6.1 (see schemas.md) |
| canonical_url | URL | Canonical URL of the skill body |
| tags | string[] | Topic tags matched against customer task descriptions |
| applies_to | string | Closed enum: nationality, residency, employment, civic-status, social-security, education, taxation, family, housing, travel-documents, meta. One value per skill. Fine-grained routing constraints (region, civil status, origin country) live in profile_requirements and in the procedure body, not on this axis. |
| prerequisites | string[] | Skill ids that MUST be completed before this skill can proceed |
| related | string[] | Adjacent skill ids the customer may also need |
| profile_requirements | string[] | profile.json field names this skill needs for routing (the harness asks only for fields missing from the current profile) |

Edge semantics.

  • prerequisites: the named skill MUST be completed before this skill can proceed. Example: apostille-foreign-document-hague is a prerequisite of nationality-application when the customer's foreign documents come from a country that is party to the Hague Apostille Convention.
  • related: the named skill is adjacent. The harness SHOULD surface related skills to the customer after the current skill's primary steps are complete.
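
The edges array in the response shape appears to mirror each node's prerequisites and related lists. The following Python sketch shows one way that derivation could work. It assumes edges point from the declaring skill to the named skill, and it uses "prerequisite" as the edge type for prerequisites; that label is a guess, since the example response only shows the "related" type. derive_edges is a hypothetical helper, not part of the specified serializer.

```python
def derive_edges(nodes):
    """Build edge records from each node's prerequisites and related lists."""
    edges = []
    for node in nodes:
        for target in node.get("prerequisites", []):
            # "prerequisite" as a type label is an assumption; the spec's
            # example response only shows the "related" type.
            edges.append({"from": node["id"], "to": target, "type": "prerequisite"})
        for target in node.get("related", []):
            edges.append({"from": node["id"], "to": target, "type": "related"})
    return edges


nodes = [
    {
        "id": "nationality-application",
        "prerequisites": [],
        "related": ["apostille-foreign-document-hague"],
    }
]
print(derive_edges(nodes))
```

Run against the example node, this yields the single related edge shown in the response shape.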

Filter semantics.

  • applies_to (optional array of category enum values): return only skills in these categories. Default: all categories.
  • status (optional array): return only skills in these lifecycle statuses. Default: ["stable", "beta"], matching the §24.3 runtime-composition rule that draft and alpha are gated behind explicit customer opt-in.
  • customer_locale (optional, one of NL, FR, DE, EN): used to disambiguate procedure variants where a skill has region-specific or commune-language-specific branches. The harness MAY pass the customer's administration_language from profile.json.
  • profile_match (optional object): a partial profile.json snapshot. When present, the server returns only skills whose profile_requirements are satisfied and whose statutory eligibility constraints (encoded as a server-side rules table per skill) are met given the supplied profile fields. v1 implementation MAY treat profile_match as a flag that defers to client-side filtering (return all skills, let the harness filter); v1.1 and later SHOULD add server-side eligibility evaluation as the skill body discipline matures.
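
Because a v1 server MAY defer profile_match to the harness, a client-side filter is a plausible part of any harness. The sketch below covers only the applies_to and status filters with their documented defaults. It leaves out customer_locale and profile_match, whose evaluation depends on rules not specified here, and it does not re-add the routing-floor meta-skill node. apply_filters and the example-draft-skill id are hypothetical names used for illustration.

```python
DEFAULT_STATUSES = ("stable", "beta")


def apply_filters(nodes, applies_to=None, status=None):
    """Return nodes matching the applies_to and status filters.

    applies_to=None means all categories; status=None means the
    documented default of stable and beta.
    """
    wanted_status = set(status) if status is not None else set(DEFAULT_STATUSES)
    wanted_categories = set(applies_to) if applies_to is not None else None
    return [
        node
        for node in nodes
        if node["status"] in wanted_status
        and (wanted_categories is None or node["applies_to"] in wanted_categories)
    ]


nodes = [
    {"id": "nationality-application", "status": "beta", "applies_to": "nationality"},
    {"id": "meta-no-skill-fallback", "status": "stable", "applies_to": "meta"},
    # Hypothetical draft skill, present only to show the default status gate.
    {"id": "example-draft-skill", "status": "draft", "applies_to": "housing"},
]
print([n["id"] for n in apply_filters(nodes)])
print([n["id"] for n in apply_filters(nodes, applies_to=["nationality"])])
```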

D1 source of truth. Node fields are columns on the existing skills catalogue table in D1 (the same source that find_skill and read_skill query). applies_to, tags, prerequisites, related, and profile_requirements are added as columns; the migration is part of the v1 build sequence (§14 (see README.md)). Frontmatter on each procedure skill carries the same values and is the input to D1 row creation through the skill-deploy CI pipeline.

Why two surfaces. The harness operates at T0 and T1 as well as T2 and T3 (§24.4). T0 and T1 harnesses do not have MCP tools available and need an HTTP-fetchable surface. The HTTP path serves the same JSON; only the transport differs. The status-enum values returned by either surface MUST agree with the C1 observation-taxonomy amendment.

Routing floor: meta-skill always present. Every skills-graph response MUST include the no-skill-fallback meta-skill node (id: meta-no-skill-fallback, applies_to: meta, status: stable) when the meta-skill ships. The harness routes to this node whenever the customer's request matches no other node after filters are applied. The meta-skill is the routing floor: a corpus where the meta-skill is absent is a publishing error and the harness MAY refuse to proceed until it is present. See §24.8 for the meta-skill's behaviour contract.
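
The routing-floor rule can be sketched as a small selection function. This is an illustration under stated assumptions: matches_request is a hypothetical predicate standing in for the harness's task matching, and the sketch takes the "MAY refuse to proceed" option by raising when the meta-skill node is absent.

```python
META_FALLBACK_ID = "meta-no-skill-fallback"


class CorpusPublishingError(Exception):
    """Raised when the graph response lacks the routing-floor node."""


def route(nodes, matches_request):
    """Pick the first matching procedure node, else the meta-skill node.

    matches_request is a hypothetical predicate standing in for the
    harness's task-matching logic.
    """
    by_id = {node["id"]: node for node in nodes}
    if META_FALLBACK_ID not in by_id:
        # The spec says the harness MAY refuse to proceed; this sketch does.
        raise CorpusPublishingError(f"{META_FALLBACK_ID} missing from graph")
    for node in nodes:
        if node["id"] != META_FALLBACK_ID and matches_request(node):
            return node
    return by_id[META_FALLBACK_ID]


graph_nodes = [
    {"id": "nationality-application", "tags": ["nationality"]},
    {"id": "meta-no-skill-fallback", "tags": []},
]
print(route(graph_nodes, lambda n: "housing" in n["tags"])["id"])
```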

Path Directory fetched alongside the skills-graph. At session start the harness also fetches the Path Directory: the catalogue of document, tool, and commune-handoff routes that procedure skills declare via requires_paths (§6.12 (see schemas.md), §15 (see skills.md)). Both surfaces use the same MCP / HTTP / web-fetch fallback chain documented in §24.4.1: MCP tool first (mcp__becivic__get_path_directory at T3), HTTP endpoint next (GET /api/paths), and a web fetch of the canonical catalogue at becivic.be/paths/index.json as the floor. The same edge-cached Cache-Control shape applies. The harness queries the Path Directory whenever a procedure declares requires_paths OR when the customer asks for a specific document, deeplink, calculator, or commune service that may be catalogued. The traversal algorithm the harness then applies is §24.9.

24.3 Runtime composition

The consumer-side runtime operates as follows from first contact through session close. This subsection is the canonical sequence; §24.5 specialises it for returning customers, and §24.6 specialises it for the first-ever entry through the boot-loader.

Phase 1: session initialisation.

  1. The harness detects the capability tier (§24.4) silently, without prompting the customer. The detected tier governs every subsequent capability-gated choice.
  2. The harness generates a ses_<UUIDv7> session id. The id is a purpose-generated random token; it is not derived from any customer attribute.
  3. The harness resolves <USER_DATA_DIR> per the rule in §8.6 (see privacy.md): $XDG_DATA_HOME/be-civic/ if $XDG_DATA_HOME is set; ~/.local/share/be-civic/ if $XDG_DATA_HOME is absent on Linux or macOS; ~/.be-civic/ as the final fallback. On a filesystem-less runtime, the harness operates in advice-only mode (no observation submission, no profile persistence) and announces this limitation in plain language at session start.
  4. The harness fetches canonical scrub rules from becivic.be/scrub-rules.json and caches them for the session. If the fetch fails after two retries, the harness MUST NOT submit any observations during the session. This rule is non-optional.
  5. The harness scans for orphan buffer files in <USER_DATA_DIR>/sessions/ per the feedback-template orphan-scan protocol. Orphan sessions older than 72 hours trigger a deterministic session_outcome: abandoned_inferred submission at the start of the next preamble. This path is code-only; no LLM judgment.
  6. The harness fetches the skills-graph from https://becivic.be/api/skill-graph (T0, T1, T2 path) or via mcp__becivic__get_skill_graph (T3 path). On fetch failure, the harness falls back to a locally cached graph stored under <USER_DATA_DIR>/cache/skill-graph.json. If no cache exists, the harness operates without graph-assisted routing and asks the customer to describe their task so it can match against locally-known procedure names.
  7. The harness presents contract framing per §24.5 (the form depends on first_contact, returning, or continuing tier).
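
The <USER_DATA_DIR> rule in step 3 can be sketched as a pure function. The function name and its parameters are illustrative and are not taken from preamble.py. An empty XDG_DATA_HOME is treated as unset, following the XDG Base Directory convention, and POSIX-style paths are used to keep the output predictable.

```python
from pathlib import PurePosixPath


def resolve_user_data_dir(env, platform, home):
    """Resolve <USER_DATA_DIR> following the three-step rule in step 3.

    env: mapping of environment variables
    platform: a sys.platform-style string ("linux", "darwin", "win32", ...)
    home: the customer's home directory as a string
    """
    xdg = env.get("XDG_DATA_HOME")
    if xdg:  # an empty value is treated as unset
        return PurePosixPath(xdg) / "be-civic"
    if platform.startswith("linux") or platform == "darwin":
        return PurePosixPath(home) / ".local" / "share" / "be-civic"
    # Final fallback for every other case.
    return PurePosixPath(home) / ".be-civic"


print(resolve_user_data_dir({"XDG_DATA_HOME": "/data"}, "linux", "/home/a"))
print(resolve_user_data_dir({}, "darwin", "/Users/a"))
```

A real harness would pass os.environ, sys.platform, and the expanded home directory, and would still need the write probe from §24.4 before relying on the resolved location.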

Phase 2: procedure loading.

The harness matches the customer's task description against the skills-graph tags and applies_to fields. It selects the single most relevant procedure skill and invokes it via the Skill tool. If the graph identifies prerequisites for the selected skill that are not yet completed for this customer, the harness informs the customer before loading the procedure.

Status gating. The harness checks the procedure skill's status at load time:

  • stable and beta skills are surfaced to customers by default.
  • draft and alpha skills are gated behind explicit customer opt-in, presented as: "This procedure is in testing. It may be incomplete. Do you want to proceed with this version?" A customer who declines is directed to the closest stable or beta alternative, or told that no ready procedure exists yet for their case.
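
A minimal sketch of the status gate, assuming the harness tracks a per-procedure opt-in flag. The gate function and its return labels are hypothetical; the prompt text is the one quoted above.

```python
DEFAULT_SURFACED = {"stable", "beta"}
OPT_IN_REQUIRED = {"draft", "alpha"}

OPT_IN_PROMPT = (
    "This procedure is in testing. It may be incomplete. "
    "Do you want to proceed with this version?"
)


def gate(status, customer_opted_in=False):
    """Return "load" or "ask_opt_in" for a skill status at load time."""
    if status in DEFAULT_SURFACED:
        return "load"
    if status in OPT_IN_REQUIRED:
        # Testing-stage skills load only after explicit customer opt-in.
        return "load" if customer_opted_in else "ask_opt_in"
    raise ValueError(f"unknown status: {status!r}")


print(gate("beta"))
print(gate("alpha"))
print(gate("alpha", customer_opted_in=True))
```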

Phase 3: during the session.

The harness appends observations to the session buffer at <USER_DATA_DIR>/sessions/<session_id>/observations-buffer.jsonl as they surface during conversation. It applies L1 scrub (regex pass from the cached scrub rules plus LLM judgment over free-text fields for PERSON names, addresses, dates of birth, and business names tied to individuals) before each append. The harness does not submit per-event; submission happens at session close.
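
The regex half of the L1 scrub and the append-only buffer can be sketched as follows. The rule shape (a list of pattern and replacement pairs), the note field, and the example pattern are assumptions made for illustration; the actual format of scrub-rules.json is not defined in this section, and the LLM judgment pass is omitted.

```python
import json
import re
import tempfile
from pathlib import Path


def scrub_text(text, rules):
    """Apply each regex rule in order; the rule shape is hypothetical."""
    for rule in rules:
        text = re.sub(rule["pattern"], rule["replacement"], text)
    return text


def append_observation(buffer_path, observation, rules, text_fields=("note",)):
    """Scrub free-text fields, then append one JSON line to the buffer."""
    cleaned = dict(observation)
    for field in text_fields:
        if isinstance(cleaned.get(field), str):
            cleaned[field] = scrub_text(cleaned[field], rules)
    path = Path(buffer_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(cleaned, ensure_ascii=False) + "\n")
    return cleaned


# Illustrative rule: a dotted number shaped like 00.00.00-000.00.
rules = [{"pattern": r"\b\d{2}\.\d{2}\.\d{2}-\d{3}\.\d{2}\b", "replacement": "[REDACTED]"}]

with tempfile.TemporaryDirectory() as tmp:
    buffer = Path(tmp) / "sessions" / "ses_example" / "observations-buffer.jsonl"
    saved = append_observation(
        buffer,
        {"skill_id": "nationality-application", "note": "Clerk asked for 00.00.00-000.00 again"},
        rules,
    )
    lines = buffer.read_text(encoding="utf-8").splitlines()

print(saved["note"])
```

Scrubbing before the write, rather than at submission time, means unscrubbed text never reaches disk, which matches the "before each append" wording above.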

Phase 4: session close.

Per the feedback-template contract: the harness presents the buffer to the customer in plain language, obtains approval per item, and then executes the two-call validate-then-stage submission. This is mandatory for any session that buffered at least one item. The harness MUST NOT submit without customer review. The validate-then-stage pattern is described in §6.7 (see schemas.md) and §15.7 (see skills.md) obligation 1.

Degradation mid-session. If a capability detected at session start is lost mid-session (for example, an MCP connection drops or <USER_DATA_DIR> becomes unwritable), the harness downgrades gracefully to the highest still-available tier and informs the customer of the change. It does not ask the customer to take action unless the change affects in-progress work (for example, a pending observation can no longer be submitted deterministically; in that case the harness asks whether to retry via HTTP or hold for the next session).

24.4 Capability tiers

The harness detects its operating tier silently before any customer interaction. Detection probes run in tier order; the harness uses the highest tier it confirms.

v1 support summary. T2 is the canonical v1 target. T3 is the recommended v1 mode. T1 is a degraded fallback with an explicit upgrade prompt. T0 is a sampler for customers who have not yet installed the Project. T4 is post-v1.

| Tier | Detection probe | Customer experience | v1 support |
| --- | --- | --- | --- |
| T0 (paste-prompt sampler) | No harness skill activated; customer pasted a prompt referencing becivic.be/agents | Agent fetches becivic.be/agents, operates in advice-only mode, keeps the observation buffer in conversation memory, submits via HTTP at session close. Works on any AI that can fetch URLs (ChatGPT, Gemini, Le Chat, free-tier Claude Chat without the Project). No persistent state. | sampler only |
| T1 (read-only skill mode) | Harness SKILL.md is loaded; write probe for <USER_DATA_DIR> fails | Skill is loaded, Skill-tool procedure routing works, HTTP submission works. No persistent state across sessions; the customer's experience is single-session like T0. This is the free-tier Claude Chat tab with the Project installed: skill content is readable, Project Files are readable, but the agent cannot write Project Files, so profile.json and memory/ cannot persist. | degraded; explicit upgrade prompt |
| T2 (Cowork; canonical) | T1 confirmed; write probe for <USER_DATA_DIR> succeeds | T1 capabilities plus persistent profile.json, memory/, and per-session buffers at <USER_DATA_DIR>/sessions/<session_id>/. This is the Claude Desktop Cowork tab with a connected folder at ~/.be-civic/. The supported target tier for v1. | canonical |
| T3 (Cowork plus MCP) | T2 confirmed; agent environment contains tools matching the mcp__becivic* prefix | T2 capabilities plus deterministic MCP submission tools (mcp__becivic__submit_observation, mcp__becivic__get_skill_graph, mcp__becivic__get_skill_observations, etc.). Customer has connected mcp.becivic.be via account Settings > Integrations. The recommended mode for v1. | recommended |
| T4 (Cowork plus MCP plus hooks) | T3 confirmed; skill frontmatter hooks are active | T3 plus UserPromptSubmit hook for MEMORY.md injection per turn and PostToolUse hook for analytics events. Confirmed for the Claude Code tab; Cowork hook support is an open question (§18). | post-v1 |
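
The detection probes in the table can be read as a ladder in which each tier requires the one below it. The sketch below takes already-collected probe results as inputs; the function names and parameters are hypothetical, and how each probe is actually run is left to the harness.

```python
def has_becivic_mcp(tool_names):
    """True when any tool name carries the mcp__becivic prefix."""
    return any(name.startswith("mcp__becivic") for name in tool_names)


def detect_tier(skill_loaded, data_dir_writable, tool_names=(), hooks_active=False):
    """Return the highest tier confirmed by probes evaluated in tier order."""
    if not skill_loaded:
        return "T0"
    if not data_dir_writable:
        return "T1"
    if not has_becivic_mcp(tool_names):
        return "T2"
    if not hooks_active:
        return "T3"
    return "T4"


print(detect_tier(True, True, ["mcp__becivic__get_skill_graph"]))
print(detect_tier(True, False, ["mcp__becivic__get_skill_graph"]))
```

Evaluating in tier order means a harness without a writable data directory stays at T1 even when MCP tools are present, since T3 is defined as T2 plus MCP.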

Path-traversal capabilities by tier. Two declared_capabilities apply to Path Directory work (per §6.7 (see schemas.md) tiering): path_traversal (the harness can read the Path Directory, evaluate eligibility predicates against the customer's profile, attempt sources in priority order, and submit per-attempt validations) and path_handoff (the harness can present the structured actor.handoff text to the customer and pause for the customer's "done" signal before resuming). Both capabilities are declared at T2 and above, since both depend on customer-side state (the harness needs profile.json to evaluate eligibility predicates deterministically) and on a multi-turn surface (handoffs require the customer to act between turns and signal completion). T0 sampler harnesses MAY route paths in advice-only mode — naming the source, the URL, and the plain-English handoff text — but MUST NOT execute audited deliveries (audited_document_delivery: true sources) because they have no persistent consent surface and each audited call costs the customer a real document delivery. T1 harnesses inherit the T0 audited-delivery prohibition for the same reason: no write-back means no recorded consent. T2 and T3 harnesses MAY execute audited deliveries after the per-source consent gate in §24.9. The traversal algorithm itself is in §24.9.
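
The tier rules for Path Directory work reduce to two small checks. This sketch uses hypothetical function names. It treats T4 as declaring both capabilities ("T2 and above") but leaves T4 out of the audited-delivery check, since the text only names T2 and T3 there; that reading is an assumption.

```python
def declared_capabilities(tier):
    """Capabilities a harness declares for Path Directory work at a tier."""
    if tier in ("T2", "T3", "T4"):
        return {"path_traversal", "path_handoff"}
    # T0 and T1 may still route paths in advice-only mode.
    return set()


def may_execute_audited_delivery(tier, consent_given):
    """Audited deliveries need T2 or T3 plus the per-source consent gate."""
    # T4 is post-v1 and is not named in the text, so it is excluded here.
    return tier in ("T2", "T3") and consent_given


print(sorted(declared_capabilities("T2")))
print(may_execute_audited_delivery("T1", consent_given=True))
```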

Detection-first principle. The harness detects what it can without asking the customer. The only question it asks at session start is "what brought you here today?". It does not ask "what tools do you have available?" or "can you write files?". It probes silently and reports the detected tier as part of its opening framing so the customer can verify and correct.

Upgrade guidance. If the harness detects T0 or T1, it MUST offer the customer an upgrade path at the end of the opening framing. The framing acknowledges the price, names the specific capability gap (file write-back), and promises to maximise the current-tier experience anyway. The harness MUST source the current Cowork price from its own training knowledge with an "as of <month year>" qualifier so future agents can detect when their knowledge is out of date. Suggested language:

  • From T0: "Installing the Be Civic Project in Claude Desktop's Cowork tab lets you resume across sessions and keeps your information on your own machine. Cowork costs around 20 EUR per month (as of May 2026; check current pricing). I'll do my best to help in this session."
  • From T1: "The best Be Civic experience is in Claude Desktop's Cowork tab on your computer. There I can save your progress to a folder on your machine and pick up next session. In free-tier Chat I can only help with one question at a time and I cannot save files for you. Cowork costs around 20 EUR per month (as of May 2026; check current pricing). I'll do my best to maximise this session regardless."

The harness moves on without waiting for an upgrade decision. The harness MUST NOT promise cross-session memory at T0 or T1 (see §24.6 stateless-client policy).

T1 framing rationale. Free-tier Claude Chat with the Be Civic Project installed gives the agent skill content and read access to Project Files but no write-back. That breaks the customer-side state contract (C7 amendment §8.7 (see privacy.md)), which requires the agent to update profile.json and memory/ files in place. Project Memory is a vendor-managed key-value store; it is not customer-side state in the sense of §3 principle 11 because the customer cannot inspect or delete it as a single artifact. T1 is therefore supported as a degraded fallback rather than a target mode.

24.4.1 T1 graceful degradation patterns

T1 (free-tier Chat with the Be Civic Project installed) is a read-only advice tier. The agent reads skill content from the Project but cannot write any state. There is no supported "manual paste" workaround: the harness MUST NOT instruct the customer to copy file contents from chat, save them manually, and re-upload on the next turn. That pattern was trialled as a fallback during pre-plugin design and retired with the plugin-as-bootstrap commitment (2026-05-17 amendment, principle 13).

At T1 the harness:

  • Delivers advice-only guidance for the session, citing the relevant skill body.
  • MUST NOT attempt to write profile.json, memory/, or any session state file.
  • MUST NOT promise cross-session continuity.
  • At session end, offers a single upgrade prompt: "The full Be Civic experience — including saving your notes between sessions — is available via the Be Civic plugin in Claude Desktop's Cowork tab. Here is the install link."

T1 does not satisfy the §3 principle 11 customer-side-state contract. It is offered as a degraded fallback only.

Degradation chain. When a tier capability is detected at session start and then lost mid-session (for example, an MCP connection drops, or filesystem write access is revoked), the harness downgrades gracefully and informs the customer of the change. The harness MUST NOT ask the customer to take action unless the change affects in-progress work. The fallback order for any data fetch or submission is:

  1. MCP tool call if the corresponding mcp__becivic* tool is in the agent's tool list.
  2. HTTP endpoint at becivic.be/api/<path> if MCP is unreachable. This is the §24.7 parity path.
  3. Web fetch of the canonical content from becivic.be (skill bodies at becivic.be/skills/<id>, agents.mdx at becivic.be/agents, etc.) if HTTP API is also unreachable. The harness MAY cache fetched skills locally for the rest of the session and, at T2+, persist them to <USER_DATA_DIR>/skills-cache/ (see §8.7.2 (see privacy.md)) so they can be re-used as actual skills, not just scratch markdown.

The web-fetch fallback is the floor: as long as becivic.be is reachable, the harness can operate.
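
The three-step fallback order can be sketched as an ordered list of fetchers tried in turn. The fetcher callables here are stand-ins; real ones would wrap the MCP tool call, the HTTP endpoint, and the web fetch. fetch_with_fallback is a hypothetical helper name.

```python
def fetch_with_fallback(fetchers):
    """Try each (name, callable) in order; return the first success."""
    failed = []
    for name, fetch in fetchers:
        try:
            return name, fetch()
        except Exception:  # broad on purpose in this sketch
            failed.append(name)
    raise RuntimeError(f"all fallbacks failed: {failed}")


def mcp_unavailable():
    # Stand-in for a dropped MCP connection.
    raise ConnectionError("MCP connection dropped")


def http_ok():
    # Stand-in for a successful HTTP parity call.
    return {"nodes": []}


source, graph = fetch_with_fallback(
    [("mcp", mcp_unavailable), ("http", http_ok), ("web_fetch", lambda: {})]
)
print(source)
```

Returning the name of the surface that answered lets the harness tell the customer which path is in use after a downgrade.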

T4 reserved for post-v1. v1 does not depend on hooks. Hooks are layered on top of T3 once Cowork tab hook support is confirmed (§18). Until then, T3 is the highest mode v1 supports as a recommended configuration.

24.5 Three-tier returning-user adaptation

The harness adapts its opening framing based on the customer's history. The adaptation reads only local state (profile.json and the active_procedures list under <USER_DATA_DIR>/memory/); the harness does not query any Be Civic server for customer history. The harness MUST NOT promise an adaptation it cannot deliver: at T0 and T1 the harness has no persistent state and therefore always operates as if first_contact.

first_contact. profile.json does not exist or is empty; active_procedures is absent. The harness delivers full contract framing: what Be Civic is, what observations are, what the buffer protocol is, that document content is not retained beyond routing fields. The document-content-discard rule is stated in plain language: "If you show me a document, I will read the parts relevant to routing your case and will not keep a copy." This tier maps to the disclosure required by §3 principle 10.

returning. profile.json exists with at least one populated field; active_procedures is either empty or does not include a procedure matching the current request. The harness opens with abbreviated contract reference ("I have notes on your situation from our previous session") and asks one question: "Has anything changed since we last spoke?" It does not re-ask routing fields it already has. It pre-populates routing from the existing profile.

continuing. profile.json exists; active_procedures contains a procedure id matching the customer's current request. The harness loads memory/procedure_progress_<id>.md at session start and resumes from the last recorded step. It skips contract framing (already given in a prior session). It opens with: "We were working on [procedure title]. You were at [last recorded step]. Shall we pick up there?"

multi_active (extension of continuing). profile.json exists; active_procedures contains MULTIPLE procedure ids that are all in flight. The customer's current request typically matches one of them, but the harness MUST keep state for ALL active procedures in conversation memory simultaneously, not only the currently-focused one. On session start the harness loads every memory/procedure_progress_<id>.md referenced in active_procedures. On customer pivot ("actually, let's switch to the address-change") the harness:

  1. Persists the current procedure's progress state to its procedure_progress_<id>.md file before switching focus.
  2. Surfaces the target procedure's last recorded step from its own progress file.
  3. Confirms the pivot with the customer: "Picking up [target procedure title] from [last step]. Your nationality progress is saved; we can return to it anytime."

The harness does NOT close out a procedure on pivot. A procedure leaves active_procedures only when (a) the customer explicitly closes it as completed, abandoned, or no-longer-relevant, or (b) the §8.8 active-window retention timer expires. Observations submitted during the session carry the skill_id of the procedure they apply to, not the procedure currently in focus; the harness MUST attribute observations correctly even when the customer is pivoting (e.g., an observation about nationality-application filed five minutes after a pivot to address-change still carries skill_id: nationality-application).
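
The opening-framing choice can be sketched as a classification over local state only. The function name is hypothetical. One case is not settled by the text: several active procedures with a request that matches none of them. This sketch treats that case as returning, which is an assumption.

```python
def classify_opening(tier, profile, active_procedures, requested_id):
    """Pick the opening framing from local state alone."""
    if tier in ("T0", "T1"):
        # No persistent state at these tiers, so always first contact.
        return "first_contact"
    populated = bool(profile) and any(
        value not in (None, "", [], {}) for value in profile.values()
    )
    if not populated:
        return "first_contact"
    active = list(active_procedures or [])
    if requested_id in active:
        return "multi_active" if len(active) > 1 else "continuing"
    # Assumption: a non-matching request is "returning" even when
    # several procedures are in flight.
    return "returning"


print(classify_opening("T2", {"region": "brussels"}, [], "nationality-application"))
print(
    classify_opening(
        "T3",
        {"region": "brussels"},
        ["nationality-application", "address-change"],
        "address-change",
    )
)
```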

State layout. The file layout for profile.json, memory/, and sessions/ is defined in the C7 amendment (§8.7 (see privacy.md)). The harness MUST use the layout defined there and MUST NOT introduce additional state locations.

Routing fields versus narrative. profile.json stores only the enum-and-categorical routing fields catalogued in §8.7.1 (see privacy.md) (the 16-axis routing catalogue). Narrative content lives in memory/ files. The harness MUST NOT write narrative content to profile.json and MUST NOT write routing fields to memory/ files.

24.6 Onboarding: post-plugin first-contact flow

The 2026-05-17 plugin-as-bootstrap amendment (principle 13) locked the install gesture. The universal post-plugin onboarding flow has three beats: gate-classification + confirmation, branded form delivery, and folder-mount-after-submit. The normative universal flow is specified across §24.3 (runtime composition), §24.5 (returning-user adaptation), §15.7 (harness obligations), and protocol.md §23.2 (MCP wire surface).

The Cowork-host-specific instantiation — mcp__visualize__show_widget for form delivery, sendPrompt for form-payload return, mcp__cowork__request_cowork_directory for folder mount, on-disk fallback HTML at ${CLAUDE_PLUGIN_ROOT}/skills/bc-onboarding/references/onboarding.<locale>.html — is documented in cowork-plugin.md §3. Sibling harnesses (a future ChatGPT-app harness, etc.) will get analogous flow documents in their own harness specs.

24.6.1 Cowork plugin install (primary path)

The primary call-to-action on becivic.be is a link to install the Be Civic plugin via the Cowork Plugins directory (or the marketplace-by-GitHub-URL fallback). The plugin install delivers the complete consumer-side runtime — meta-skills, drafter sub-agents, scripts, schemas, starter state templates — under the Cowork plugin runtime in one gesture.

Plugin contents (summary). The plugin bundle contains the harness CLAUDE.md template, the be-civic gate skill and the bc-* peer skills, the drafter sub-agents, the preamble / scrub / scan scripts, the profile and concern schemas, and the alpha-operational data files (privacy attachment, privacy snippet, mini-header strings). Full file layout is documented in cowork-plugin.md §2.1.

${CLAUDE_PLUGIN_ROOT} path resolution. The Cowork plugin runtime sets ${CLAUDE_PLUGIN_ROOT} to the plugin's installed (read-only) folder root and ${CLAUDE_PLUGIN_DATA} to a writable plugin-data location. The harness resolves these at session start (see preamble.py). User-picked-parent BeCivic/ folders for per-procedure state are created later by bc-onboarding via mcp__cowork__request_cowork_directory after form submit; the platform-conventional <USER_DATA_DIR> lookup in privacy.md §8.7.3 applies only as a degraded-mode fallback for harnesses without the plugin.

Skill-matching at session start uses in-thread parallel tool calls (α path, per 2026-05-17 onboarding-rebuild-design). mcp__becivic__find_skill and mcp__becivic__read_skill resolve live against the becivic.be catalogue (or HTTPS / WebFetch per §24.4.1 fallback chain). The plugin does not ship a local snapshot of procedure skill bodies — procedure content is MCP-delivered. Only the meta-skills (be-civic gate, bc-onboarding, bc-discovery, bc-document-handler, bc-path-traversal, bc-session-close, bc-dossier-compilation) and harness CLAUDE.md template live in the plugin bundle.

Customer steps.

  1. Open or sign in to Claude Desktop on a paid plan. The Cowork tab requires a paid plan.
  2. In Claude Desktop, go to the Cowork tab and open Plugins. Search for "Be Civic" and click Install. (Pre-launch: install via Personal → Add marketplace from GitHub and paste the plugin repo URL. Pre-launch fallback: download a release zip and use My Uploads.)
  3. Optional: in Settings > Integrations > MCP, add mcp.becivic.be. One-time step.
  4. Open a session in the Cowork tab and describe the administrative task. The harness auto-activates.

The customer does not download a zip, select a folder, or paste any text into Project Instructions at install time. Steps 1–4 are the complete setup flow. The folder picker (mcp__cowork__request_cowork_directory) fires later — after the user accepts the confirmation gate and submits the onboarding form — per cowork-plugin.md §3.3.

After step 4 (session opens). The harness invokes bc-onboarding. The session-start UX is lead-with-form — a single batched form built from the matched procedure's inputs: frontmatter, presented immediately. Below the form sit two elements: (a) a single bundled "I'm part of the Be Civic alpha" consent checkbox covering anonymous telemetry + observation submission ("part of the pre-launch deal" framing); (b) a collapsible "Why I'm doing it this way" block carrying the 4-clause trust contract, default closed. The 4 clauses (anonymity, document discipline, forward-only state, review-before-submit) then surface just-in-time as one-line teaches when their trigger-action fires.

Subsequent sessions. When the user opens a previously-mounted BeCivic/<project>/ subfolder as a Cowork project, the harness CLAUDE.md (copied by bc-onboarding at folder-mount time) loads as Project Instructions and drives the session. The opening framing takes the abbreviated form described in §24.5; the returning-user mini header fires per cowork-plugin.md §3.9.

24.6.2 Paste-prompt sampler (T0 fallback)

The paste-prompt path is retained as a sampler for customers who want to try Be Civic without installing the Project. The prompt template is:

"Help me with [task]. Read https://becivic.be/agents and follow Be Civic's protocol."

When an agent receives a message that includes the becivic.be/agents URL, or when it identifies a Belgian administrative task and finds no harness skill loaded, it operates under T0 semantics.

T0 boot-loader behaviour.

  1. The agent fetches becivic.be/agents.
  2. The /agents page identifies T0 from the absence of a harness skill, presents tier-detection instructions, and directs the agent to fetch the skills-graph via the HTTP endpoint (§24.2).
  3. The agent operates in advice-only mode: it keeps the observation buffer in conversation memory, presents the buffer to the customer at session close per §24.3 Phase 4, and submits approved items via HTTP POST. It does not accumulate profile state.
  4. At session close, the agent offers the customer an upgrade to T2: "Installing the Be Civic Project in Claude Desktop's Cowork tab keeps your notes between sessions. Here is the link." One sentence; no further pressure.

Stateless-client policy. The harness MUST NOT promise cross-session memory at T0 or T1. Both tiers are stateless from the customer's perspective, and the customer is told this explicitly in the opening framing.

Why Cowork specifically. Cowork tab is the lowest-priced Anthropic surface that gives the agent write access to a customer-owned folder. This satisfies the §3 principle 11 customer-side-state requirement without server-side per-customer state. Free-tier Chat with a Project gives read-only file access to the agent, which is insufficient.

Non-Claude clients. The install path described in this subsection is Claude.ai-specific. Other agentskills.io-conformant clients have their own install paths. Specifying each client's install path is out of scope for v1 and is deferred to v1.5 and later (§18). The /agents boot-loader page SHOULD describe the general agentskills install mechanism and link to the relevant client's documentation where possible.

24.7 HTTP-agent observation fetch parity

The MCP server at mcp.becivic.be exposes a get_skill_observations tool that returns pending and committed observations for a given skill id. Agents that operate at T0 or T1 have no MCP tools and therefore no equivalent path. Without an HTTP parity endpoint they cannot surface <Observations> content to customers, which breaks the "community observations surface to future customers" feedback loop.

Endpoint. GET /api/skills/<skill_id>/observations.

  • Returns the same JSON shape as the MCP get_skill_observations tool response.
  • Authentication requirements and rate limits are identical to the existing observation-submission endpoints (§6.7 (see schemas.md)).
  • The endpoint MUST be listed in agents/manifest.json so that T0 and T1 agents discover it at session start rather than hardcoding the URL.

Filter parameters. The endpoint accepts the same filter parameters as the MCP tool: status (e.g. pending, committed), since (ISO timestamp), limit (integer). Default values agree with the MCP tool defaults.

Caching. Responses carry Cache-Control: public, max-age=30, s-maxage=30 so that bursts of session starts within the same 30-second window hit edge cache. The TTL is shorter than the skills-graph TTL because observation content is more time-sensitive.
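
A T0 or T1 harness could build the request URL for this endpoint roughly as follows. The base host is a parameter because agents are expected to discover the endpoint from agents/manifest.json rather than hardcode it; the default used here and the function name are assumptions for illustration.

```python
from urllib.parse import quote, urlencode


def observations_url(skill_id, status=None, since=None, limit=None,
                     base="https://becivic.be"):
    """Build the observations URL, adding only the filters that are set."""
    params = {}
    if status is not None:
        params["status"] = status
    if since is not None:
        params["since"] = since  # ISO timestamp, percent-encoded below
    if limit is not None:
        params["limit"] = limit
    path = f"/api/skills/{quote(skill_id, safe='')}/observations"
    query = urlencode(params)
    return f"{base}{path}?{query}" if query else f"{base}{path}"


print(observations_url("nationality-application", status="pending", limit=20))
```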

Conformance. Any harness operating at T0 or T1 MUST use this endpoint to populate <Observations> content in procedure-skill bodies. A T0 or T1 harness that displays a procedure skill body without resolving its <Observations> tags is non-conformant.

24.8 Procedure-not-in-corpus handling

When the skill-graph (§24.2) returns zero matches for the customer's current request — that is, the requested procedure has no Be Civic skill — the harness MUST route to the no-skill-fallback meta-skill, identified in the skill-graph as meta-no-skill-fallback. The meta-skill is a first-class node in the corpus and follows the same lifecycle as other procedure skills (draft → alpha → beta → stable per §9 (see lifecycle.md)).

The meta-skill MUST:

  1. Acknowledge openly. State to the customer in plain language: "Be Civic doesn't have a verified skill for this procedure yet. I'll work through it with you using general knowledge and live research, but treat what I say as a starting point to verify with your commune, not as Be Civic's confirmed guidance." The acknowledgement is non-negotiable and runs even if the customer is at T2/T3 with full state persistence.

  2. Run a Be-Civic-style process. Apply the same discipline that procedure skills follow: cite sources inline (Belgilex links, commune procedure pages, Moniteur belge official text where applicable), make conservative claims, identify failure modes, end with a "verify with your commune" checklist. The customer experience MUST be substantively similar to a verified-skill procedure; the difference is in the source confidence level, not the surface behaviour.

  3. Research as the session progresses. The meta-skill carries explicit research-method guidance: prefer authoritative sources (Belgilex, commune sites, official Moniteur text); avoid blog posts, social media, or third-party legal commentary as primary sources; treat sources older than 12 months as a freshness concern; cite source dates inline. The meta-skill's body documents the research method; the harness applies it via the same skill-loading mechanism used for verified skills.

The meta-skill SHOULD:

  1. Reuse existing corpus skills for sub-parts. Many Belgian admin processes share sub-procedures (apostille, sworn translation, residence-permit renewal, commune attestation). When the meta-skill identifies a sub-part for which a verified Be Civic skill exists, it MUST load that skill via the standard skill-graph routing and use its verified guidance for the sub-part. The meta-skill MUST flag to the customer which parts of the session are verified-from-corpus and which are research-as-we-go. This both improves the customer's confidence and surfaces sub-part reuse opportunities for the skill-draft submission step (item 2 below).

  2. Offer a skill-draft submission at session end. After the session has worked through the procedure end-to-end, the meta-skill SHOULD offer to package the worked-through procedure as a skill_draft submission per §6.2 (see schemas.md). Submission is OPT-IN and requires explicit customer agreement; the customer reviews the draft text before it leaves their machine. The draft enters the standard skill-drafting protocol (§15 (see skills.md)) for community validation. The submission inherits the per-item review gate from obligation 8 of §15.7 (see skills.md).

Routing. The harness queries the skill-graph at session start (§24.2). If the graph response has zero nodes matching the customer's request after applying filters, the harness routes to meta-no-skill-fallback. The meta-skill node MUST be present in every skill-graph response with status: stable once it ships; the harness depends on it as a routing floor and a missing meta-skill node is a corpus-publishing error.
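The routing floor can be sketched in TypeScript as follows. This is a minimal illustration, not the harness implementation: the GraphNode shape, the matchesRequest flag, and the routeRequest helper are assumptions introduced here; only the node id meta-no-skill-fallback and the status: stable requirement come from the text above. The sketch also assumes the meta-skill has already shipped.

```typescript
// Hypothetical shape of a skill-graph node; field names are assumptions.
interface GraphNode {
  id: string;
  status: "draft" | "alpha" | "beta" | "stable";
  matchesRequest: boolean; // assumed: computed after the harness applies its filters
}

const FALLBACK_ID = "meta-no-skill-fallback";

// Returns the ids of the matching skills, or the fallback meta-skill when nothing matches.
function routeRequest(nodes: GraphNode[]): string[] {
  const fallback = nodes.find((n) => n.id === FALLBACK_ID);
  if (fallback === undefined || fallback.status !== "stable") {
    // A missing (or non-stable) floor node is treated as a corpus-publishing error.
    throw new Error("corpus-publishing error: meta-no-skill-fallback missing or not stable");
  }
  const matched = nodes.filter((n) => n.id !== FALLBACK_ID && n.matchesRequest);
  return matched.length > 0 ? matched.map((n) => n.id) : [FALLBACK_ID];
}
```

Checking for the floor node before looking at matches means a broken graph response fails loudly even on requests that would have matched a verified skill.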

Implementation reference. The drafting-skills skill (existing in bc-docs/skills/<existing-id>; reviewed at v1.1 implementation time for fit) provides the canonical guidance for the skill-draft submission flow. The meta-skill's session-end submission step delegates to that skill's instructions; the meta-skill does not duplicate the drafting protocol. If the drafting-skills skill is missing or stale, the meta-skill's submission step degrades to producing a draft markdown blob and instructing the customer to copy/paste it into a manual skill-draft submission form (becivic.be/agents/submit/draft).

Customer-side state. Active sessions running the meta-skill MUST be reflected in profile.json active_procedures like any other procedure — the entry has the form no-skill-fallback:<short-slug-describing-customer-request> so the procedure_progress file can be reloaded in subsequent sessions if the customer returns to it. Once a draft is submitted (or the customer declines to submit), the entry is moved to memory/archive/ per §8.8 (see privacy.md).
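As an illustration of the entry form, the key could be derived from the customer's request as below. The slug rules (lowercase ASCII, hyphen separators, a 40-character cap) and the helper name fallbackProcedureKey are assumptions made for this sketch; the spec only fixes the no-skill-fallback: prefix and asks for a short descriptive slug.

```typescript
// Builds the active_procedures key for a session running the fallback meta-skill.
// Slug rules here are illustrative assumptions, not spec requirements.
function fallbackProcedureKey(request: string): string {
  const slug = request
    .toLowerCase()
    .normalize("NFKD")
    .replace(/[\u0300-\u036f]/g, "") // drop combining accents left by NFKD
    .replace(/[^a-z0-9]+/g, "-") // collapse everything else into single hyphens
    .replace(/^-+|-+$/g, "") // trim leading and trailing hyphens
    .slice(0, 40)
    .replace(/-+$/g, ""); // re-trim in case the cut landed on a hyphen
  if (slug === "") {
    throw new Error("request produced an empty slug");
  }
  return `no-skill-fallback:${slug}`;
}

const key = fallbackProcedureKey("Register a Vélo cargo!");
// key === "no-skill-fallback:register-a-velo-cargo"
```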

24.9 Path traversal algorithm

The Path Directory (§6.12 (see schemas.md)) catalogues the routes by which the customer obtains documents, reaches interactive tools, and hands off to commune service desks. Each path entry carries one or more sources, ordered by priority, each with its own eligibility predicates, authentication shape, procedure template, validation signals, and an explicit actor block declaring who does what (the harness, the customer, or both) and how the handoff is presented when responsibility shifts. This subsection specifies the algorithm the harness applies to traverse a path on the customer's behalf. The algorithm is deterministic in its sequencing; the LLM-driven judgment is restricted to authoring per-attempt validation rationales and to interpreting customer responses at handoff points.

The traversal algorithm. Given a path id and the customer's profile, the harness proceeds as follows.

First, the harness loads the path entry from the Path Directory catalogue. If the path does not exist, the harness reports the missing entry to the customer in plain English and does not synthesise a fallback route from general knowledge; missing paths are corpus-publishing concerns and the customer is told the corpus does not yet cover this case.

Second, the harness checks whether the path applies to the customer. The path's coarse-grained applies_to block is evaluated against the customer's profile. If the customer's situation does not match — for example, a marriage-certificate path on a customer who has never been married — the harness reports the mismatch to the customer in plain English and returns without attempting any source.

Third, the harness filters the sources by their per-source audience predicates. Each predicate is structured as a (field, op, value) triple; the harness evaluates each one against the customer's profile fields. Sources whose audience does not match the customer are removed from consideration entirely: they are never tried, never validated, never offered. This is the eligibility-first invariant. If no source remains after filtering, the harness asks the customer whether to search online directly or visit their commune.
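A minimal sketch of the predicate filter, assuming a dotted field path rooted at user and a small operator set. Only the eq operator appears in this document; the in operator, the helper names, and the fail-closed treatment of missing fields are assumptions of the sketch.

```typescript
type Scalar = string | number | boolean;

type Predicate =
  | { field: string; op: "eq"; value: Scalar }
  | { field: string; op: "in"; value: Scalar[] }; // "in" is an assumed operator

interface Source {
  id: string;
  audience: { predicates: Predicate[] };
}

// Resolves a dotted path such as "user.commune.region"; returns undefined if any hop is missing.
function resolvePath(root: unknown, path: string): unknown {
  let current: unknown = root;
  for (const key of path.split(".")) {
    if (typeof current !== "object" || current === null) return undefined;
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}

function predicateHolds(p: Predicate, root: unknown): boolean {
  const actual = resolvePath(root, p.field);
  if (p.op === "eq") return actual === p.value;
  return p.value.some((v) => v === actual);
}

// Eligibility-first: a source that fails any predicate is dropped and never tried.
function eligibleSources(sources: Source[], profile: object): Source[] {
  const root = { user: profile };
  return sources.filter((s) => s.audience.predicates.every((p) => predicateHolds(p, root)));
}
```

A missing profile field resolves to undefined and therefore fails the predicate, so the filter errs on the side of excluding a source rather than probing it.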

Fourth, the harness orders the eligible sources. Sources are partitioned into non-fallback and fallback groups; within each group they are sorted by priority in descending order; the final ordering is non-fallback first, then fallback. The fallback_only: true flag is honoured: such sources are tried only after all non-fallback options have been attempted or declined. By spec invariant (§17 settled decisions), every source_class: offline source carries fallback_only: true, which means commune visits and postal or email requests sit at the bottom of every traversal. This is the commune-last invariant.
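The partition-then-sort rule can be sketched as below. The Source shape is trimmed to the fields the ordering needs, and the orderSources name and the up-front invariant check are assumptions of the sketch rather than spec requirements.

```typescript
interface Source {
  id: string;
  priority: number;
  source_class: string;
  fallback_only?: boolean;
}

// Orders eligible sources: non-fallback first, then fallback, each group by priority descending.
function orderSources(eligible: Source[]): Source[] {
  for (const s of eligible) {
    // Catalogue invariant from the text: every offline source carries fallback_only: true.
    if (s.source_class === "offline" && s.fallback_only !== true) {
      throw new Error(`offline source ${s.id} must carry fallback_only: true`);
    }
  }
  const byPriorityDesc = (a: Source, b: Source): number => b.priority - a.priority;
  // filter() returns fresh arrays, so sorting them does not mutate the caller's list.
  const primary = eligible.filter((s) => s.fallback_only !== true).sort(byPriorityDesc);
  const fallback = eligible.filter((s) => s.fallback_only === true).sort(byPriorityDesc);
  return [...primary, ...fallback];
}
```

Because the partition happens before the sort, a fallback source can never outrank a non-fallback one, whatever priority value it declares.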

Fifth, the harness attempts each source in order. Before each attempt the harness checks two consent gates. If the source is flagged audited_document_delivery: true, the harness MUST present the consent surface in plain English and obtain the customer's explicit agreement before invoking the source, because every audited invocation generates a real document delivery and costs the customer a real action. If the source carries an authentication wall (actor.handoff.when == auth-wall), the harness MUST present the agent_responsibility, user_responsibility, and resumption text from the source's actor block before handing off. The harness MUST NOT hand off silently. Once consent and handoff text have been presented, the harness executes the source's procedure: a deeplink after authentication, a captcha-gated form, a typed-API fetch, a federal anonymous form walk-through, or an offline instruction set, depending on source_class. When the procedure completes, the harness evaluates the source's validation_path against the observed result: a content-type and PDF-magic check for tier-1 quickLinks, a form-success page check for anonymous federal forms, the customer's verbal confirmation for offline routes, and so on.

Sixth, the harness submits a validation for every attempt. On success it submits verdict: confirm against the source id. On failure it submits verdict: reject with a structured rationale naming the failure signal (404, redirect to the auth root, captcha unsolvable, page text matched a service-down string, customer reported the document was not produced, and so on). Per-attempt validation is non-optional: it is how the catalogue learns which sources are currently working and which have rotted. This is the validation-per-attempt invariant.
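One possible shape for the per-attempt validation is sketched below. The target, verdict, and rationale keys follow the worked example later in this subsection; the FailureSignal tags and the rationale wording are assumptions for illustration, and the canonical submission schema lives in schemas.md.

```typescript
// Failure signals named in the text; the tag names themselves are assumptions.
type FailureSignal =
  | { kind: "http_status"; status: number }
  | { kind: "auth_root_redirect" }
  | { kind: "captcha_unsolvable" }
  | { kind: "service_down_text"; matched: string }
  | { kind: "customer_reported_missing" };

type Validation =
  | { target: string; verdict: "confirm" }
  | { target: string; verdict: "reject"; rationale: string };

function rationaleFor(signal: FailureSignal): string {
  switch (signal.kind) {
    case "http_status":
      return `HTTP ${signal.status} on source URL`;
    case "auth_root_redirect":
      return "redirected to the auth root";
    case "captcha_unsolvable":
      return "captcha unsolvable";
    case "service_down_text":
      return `page text matched service-down string: ${signal.matched}`;
    case "customer_reported_missing":
      return "customer reported the document was not produced";
  }
}

// One validation per attempt: confirm on success, reject with a named signal on failure.
function buildValidation(target: string, failure?: FailureSignal): Validation {
  return failure === undefined
    ? { target, verdict: "confirm" }
    : { target, verdict: "reject", rationale: rationaleFor(failure) };
}
```

Modelling the failure as a tagged union keeps the rationale tied to an observed signal rather than to free-form text composed at submission time.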

Seventh, if a source succeeds, the harness records the outcome on the customer's procedure progress (the artefact obtained, the source that produced it, the timestamp) and returns. If a source fails, the harness moves to the next source in the ordering and repeats from the consent-gate step. If every source has been tried and failed, the harness asks the customer to choose between searching online themselves, visiting their commune, or skipping the document for the session.
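Steps five through seven can be sketched as a single loop. The Harness interface below is hypothetical: its four callbacks stand in for the consent surface, the handoff text, procedure execution plus validation_path evaluation, and validation submission. The sketch is synchronous for brevity, and it assumes that a source declined at the consent gate counts as not attempted and so receives no validation.

```typescript
type Verdict = "confirm" | "reject";

interface Source {
  id: string;
  audited_document_delivery?: boolean;
  actor?: { handoff?: { when: string } };
}

// Hypothetical harness capabilities; none of these names come from the spec.
interface Harness {
  askConsent(source: Source): boolean;
  presentHandoff(source: Source): void;
  execute(source: Source): { ok: boolean; failureSignal?: string };
  submitValidation(target: string, verdict: Verdict, rationale?: string): void;
}

type Outcome =
  | { status: "obtained"; sourceId: string; at: string }
  | { status: "exhausted" };

function attemptSources(ordered: Source[], h: Harness): Outcome {
  for (const source of ordered) {
    // Consent is asked per call, immediately before each audited invocation.
    if (source.audited_document_delivery === true && !h.askConsent(source)) {
      continue; // declined: move on without invoking the source
    }
    if (source.actor?.handoff?.when === "auth-wall") {
      h.presentHandoff(source); // never hand off silently
    }
    const result = h.execute(source);
    if (result.ok) {
      h.submitValidation(source.id, "confirm");
      return { status: "obtained", sourceId: source.id, at: new Date().toISOString() };
    }
    h.submitValidation(source.id, "reject", result.failureSignal ?? "unspecified failure");
  }
  return { status: "exhausted" }; // caller shows the all-sources-exhausted prompt
}
```

The caller is expected to pass sources that have already been filtered and ordered, and to record an obtained outcome on the customer's procedure progress.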

Three invariants. The algorithm is constrained by three invariants that the harness MUST satisfy at every traversal step.

The first invariant is eligibility-first. Sources whose audience predicates do not match the customer's profile are never tried. The harness MUST NOT attempt a source on the chance that the predicate is wrong; if the predicate is wrong, the correct response is to submit an observation raising an accuracy_concern against the catalogue entry, not to probe the source. Probing audited-delivery sources at random is specifically prohibited.

The second invariant is commune-last. The harness MUST NOT suggest a commune visit until every online source has been tried and either failed its validation_path or been declined by the customer at a consent gate. The single exception is when the customer explicitly chooses to visit the commune at the all-sources-exhausted prompt. The reason is operational: commune visits cost the customer half a day of work, transit, and queueing; online sources cost minutes or seconds. The catalogue's traversal order encodes that cost asymmetry.

The third invariant is consent-before-audited-delivery. Sources flagged audited_document_delivery: true produce real, audited document deliveries each time they are invoked. The harness MUST obtain the customer's explicit consent before each such invocation. Consent is per-call, not per-session: a customer who agreed to fetch a marriage certificate has not consented to fetch a residence certificate; the harness asks again. Testing harnesses MUST NOT probe these sources blindly; integration tests use fixture sources flagged audited_document_delivery: false and source_class values reserved for testing.

Worked example: certificat-residence-historique. A customer is preparing a nationality declaration under article 12bis. The procedure skill nationality-application declares requires_paths: [{id: certificat-residence-historique, role: submission, timing: pre-filing}]. The customer's profile records commune.region = brussels, commune.nis5 = "21005" (Etterbeek), and has_eid = true.

The harness loads the certificat-residence-historique path entry. The path's applies_to block matches the customer (any resident with at least three months of registered residence); the harness proceeds to source filtering.

The path lists three sources. The first is brussels-tier1-quicklink with source_class: brussels-tier1-quicklink, audience.predicates = [{field: "user.commune.region", op: "eq", value: "brussels"}], priority: 90, audited_document_delivery: true, and actor.handoff.when: auth-wall. The second is a federal handoff with source_class: federal-auth-handoff, audience matching all federal residents, priority: 60, actor.handoff.when: full-takeover. The third is offline with source_class: offline, audience matching the customer's commune, priority: 10, fallback_only: true, actor.handoff.when: physical-presence.

The harness evaluates audience predicates against the customer's profile. The Brussels source matches (user.commune.region == brussels); the federal handoff matches (all residents); the offline source matches (commune in the audience list). All three are eligible.

The harness partitions and orders: non-fallback group is [Brussels tier-1 (priority 90), federal handoff (priority 60)], ordered by priority descending; fallback group is [offline (priority 10)]. The final ordering is Brussels tier-1, then federal handoff, then offline.

The harness attempts the Brussels tier-1 source first. The source is flagged audited_document_delivery: true, so the harness presents the consent surface: "The Brussels Irisbox quicklink generates a real, audited residence certificate each time we call it. Want me to use this route? You will need to authenticate with your eID or itsme." The customer agrees. The harness also presents the actor.handoff text: "I'll give you the link. You sign in with itsme or your eID, and your residence certificate will download as a PDF. Save it to your Be Civic folder and tell me when you have it." The customer authenticates, downloads the PDF, saves it, and says "got it". The harness evaluates the validation_path: the customer's "got it" satisfies the success signal for this source class. The harness submits validation: {target: brussels-tier1-quicklink, verdict: confirm} and records the artefact on the customer's procedure progress. The traversal ends.

If the Brussels tier-1 source had returned a 404 or a redirect to the Irisbox root (one of its declared failure signals), the harness would have submitted validation: {target: brussels-tier1-quicklink, verdict: reject, rationale: "404 on quicklink URL"} and proceeded to the federal handoff source. If the federal handoff also failed, the harness would have offered the customer the all-sources-exhausted prompt: "All online routes failed. Would you like me to walk you through visiting your commune, search online for another route, or skip this document for now?" The offline source is only proposed at this prompt, never preemptively.

Caching and re-traversal. The harness MAY cache a successful source attempt on the customer's procedure progress so that subsequent sessions know which source worked. Re-traversal in a later session SHOULD start from the previously-successful source if the customer needs the document again, but MUST still apply the consent gate before audited deliveries: the customer's consent to an earlier fetch does not extend to subsequent fetches. If the previously-successful source fails on re-traversal, the harness submits a reject validation and proceeds through the full ordering as on a first traversal.
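A minimal sketch of the re-traversal ordering, assuming the cached source id is read from the customer's procedure progress. The retraversalOrder name is hypothetical. The function only reorders; consent gates still run inside the attempt loop, so a cached success never bypasses consent.

```typescript
interface Source {
  id: string;
}

// Moves the previously-successful source to the front and keeps the rest of the
// first-traversal ordering intact, so a failure falls through to the full ordering.
function retraversalOrder(ordered: Source[], cachedSourceId?: string): Source[] {
  const cached = ordered.find((s) => s.id === cachedSourceId);
  if (cached === undefined) {
    // No cache, or the cached source is no longer eligible: behave as a first traversal.
    return [...ordered];
  }
  return [cached, ...ordered.filter((s) => s.id !== cachedSourceId)];
}
```

Passing the already-filtered ordering as input means a cached source that has since become ineligible is silently ignored rather than retried.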

Appendix — Judgment calls during rework

Recorded for prior passes — round-5 historical context. Round-6 supersedes most of these decisions; entries are retained as audit trail but should not be read as current architectural authority. Current architecture is described in §3–§5, §13, §16–§19, §24; current settled decisions are §17.

Source: archive/specification-pre-split-2026-05-11.md §"Judgment calls during rework". Not included in any sub-spec at the 2026-05-11 split; appended here as architecture.md is the closest home for round-5 architectural deliberation. Added 2026-05-11 post-split reconciliation.

  1. Worker storage of proposal submitter IP-hash beyond 24h staging. RESOLVED (2026-04-26 reconciliation). The self-validation prevention check (G.7) requires the Worker to know the original submitter's IP-hash for the entire time a proposal sits in alpha or beta. Resolution: the Worker keeps a per-proposal hashed-IP record using a per-proposal salt (separate from the daily-rotating salt used for rate limits). The proposal's state is the retention key — when the proposal reaches stable, rolled_back, or quarantined, the per-proposal IP record (and its salt) is destroyed. This makes self-validation prevention work uniformly across the proposal's lifetime regardless of how many daily salt rotations have elapsed for rate-limit purposes. Documented in §7 (see protocol.md) (trust signals), §8.3 (see privacy.md) (validation pipeline + self-validation prevention), and §17 (settled decisions).

  2. docs.json regeneration as part of state-machine Action. I asserted this lives in tools/scripts/regenerate-docs-json.ts and is called by the state-machine Action whenever version_status changes. Mintlify reads docs.json for navigation; this keeps the alpha/beta/stable nav in sync. Alternative: have Mintlify read frontmatter directly. Mintlify versioning per G.4 has no inheritance, so explicit regeneration of docs.json is the correct pattern.

  3. Roll-back retains the proposal directory. RESOLVED (2026-04-26 follow-on refinement). Rolled-back and quarantined proposals are not deleted and not moved. They stay where they were committed at skills/<id>/proposals/<proposal-id>/proposal.md with version_status: rolled_back or quarantined. The archive/ directory is reserved exclusively for superseded stable canonicals. The regenerate-docs-json Action drops these proposals from nav so they stop being served, but the file persists indefinitely for audit. Recoverable later if proven correct: a maintainer manually moves version_status back to alpha; no special un-rollback flow in v1. Pinned in §5, §9.2 (see lifecycle.md), §17.

  4. skill_amendment produces a full proposal.md, not a delta file. The submission carries a typed change payload (unified diff for body, field_path + proposed_value for frontmatter, or add / modify / remove for references — see §6.2.2 (see schemas.md) hybrid shape), but the committed proposal is a full materialised post-amendment skill body. The consumer composes it via tool_execution against the canonical content (applying the diff for body, splicing the typed value for frontmatter, mutating the references list for references). This makes the alpha/beta serving model uniform: every proposal at proposals/<pid>/proposal.md is a complete renderable skill. Alternative: store the change payload and apply at serving time — rejected as more complex for no obvious gain.

  5. Skill-amendment commit writes both the proposal markdown AND a .meta.json sidecar. The sidecar carries the original amendment_id, declared_capabilities, references array, amendment_type, and the type-specific change payload (body_diff / frontmatter_change / references_change) — data that's useful for audit but doesn't belong in the skill body's frontmatter. Pattern is the same for skill_draft proposals.

  6. held-for-review/<type>/<id>/ path for NER holds. I introduced this directory; it's not in any prior section. The alternative was leaving the file at its canonical path with a frontmatter held_for_review: true. I chose a separate directory because (a) Mintlify navigation can exclude the directory cleanly via docs.json, (b) it's harder to accidentally fetch via a content-link, (c) the maintainer reviewing the queue has one obvious place to look. Spec author may prefer the in-place flag pattern.

  7. The version_status and proposal_id validation rules in CI (§10.1 (see lifecycle.md)) — I asserted these as required CI checks. They follow naturally from G.4 but were not explicitly written in the open-questions doc. Recorded.

  8. Daily-salt IP-hash distinct-IP counting in state machine. RESOLVED (2026-04-26 reconciliation). Same root concern as judgment call 1. Resolution: state-machine distinct-IP counting uses the per-proposal salt rather than the daily-rotating salt. Validations from the same IP across different days hash to the same value within a proposal's lifetime, so distinct-IP de-duplication is straightforward. The daily-rotating salt is retained for rate-limit counters only (where the short window is intentional). The per-proposal salt is destroyed when the proposal terminates, so the IP record does not outlive the state-machine work that needed it. Documented in §7 (see protocol.md) (trust signals), §8.3 (see privacy.md) (distinct-IP counting paragraph), and §17.

  9. The meta-skills set drops meta-validate-skill-graph. Per G.10 this is settled; I removed all spec mentions of it as a skill and replaced with the script reference. The category: meta skills in v1 are meta-draft-l1-skill and meta-decompose-process only (per G.11). The two-skill set is correct for v1; G.11 also lists the two prose entry pages and the contract as the rest of the meta surface.

  10. §13 reduced significantly. The previous §13 described a tools/reference-consumer/ codebase with core + adapters per platform. Per D.1 redirect this is dropped; the new §13 is a documentation surface. I retained the docs/reference-consumer.md file as the documentation home and pointed becivic.be/agents (Mintlify) as the public surface. There is no shipped reference-consumer code in v1.

  11. Section §11 (Citation rot) preserved verbatim per the rework map. No changes.

  12. Section §6.4 (see schemas.md) (communes) preserved with minor formatting. The detailed language tables in the previous spec were operationally useful but the rework map said "no change" by implication; I kept the agent logic and field rules but trimmed the long language-table prose. If the spec author wants the full tables back, they can be re-pasted from the prior version verbatim.

  13. Section §8.2 (see privacy.md) (submission contract) is intentionally short — describes the contract's role and structure but does not duplicate content. The parallel rewrite of docs/submission-contract-v1.mdx is the source of truth for contract content. I included the alpha/beta UX excerpts because the spec needs to be self-contained on the wording the state machine references.

  14. Phase 0.5 (Mintlify) inserted as its own phase. The original v14 sequencing had no Mintlify step; I inserted Phase 0.5 between Phase 0 (DNS / Cloudflare / GitHub App) and Phase 1 (schemas). This matches the rework map's "Phase 0.5 = Mintlify OSS Program application + custom domain" instruction.

  15. compaction.yml and tools/compaction/* removed from §5 layout. Compaction is deferred to v1.1; I removed the directory entries to avoid implying they exist. The compaction config / prompts directories will be added in v1.1 if the job is built.

  16. Section §3 principle 4 collapsed two prior paragraphs. The old principle 4 was about consumer-side LLM scrub + Worker regex + commit NER auto-revert. The new principle 4 incorporates G.14's structural framing (schema bans, length caps, hashed IPs, no body logging, NER held-for-review). It is denser but the items are pinned individually rather than as a single sentence.

  17. Section §17 expanded from the prior §17. The original "settled decisions" list was largely preserved; I added new sub-sections for Mintlify, the four submission types, capability tiers per type, state machine, structural PII protection, citation handling per G.13, and the maintainer-as-constitutional-court pattern. Removed mentions of compaction as the primary corpus-growth path.

  18. Section §6.5 (see schemas.md) example output — I kept the per-skill index entry format and added an active_proposals array. The aggregates per proposal (confirms, rejects, distinct_ips) are duplicated from validations/<pid>/*.json for fast index access; the state-machine Action regenerates this on each tick.

  19. Section §6.8 NER-on-commit description — I describe the held-for-review path inline at §6.8 (cross-referencing §8.5 (see privacy.md)) and again in detail at §8.5 (see privacy.md). Slight redundancy is intentional: §6.8 is the rules file and the three scrub points; §8.5 (see privacy.md) is the operational detail of the third point.

  20. Anti-pattern about "Maintainer running individual-submission review in steady state" — added as a new anti-pattern. The maintainer-as-constitutional-court pattern is a substantial behavioural reframe; explicitly calling it out as an anti-pattern should help.

  21. skill_amendment payload shape — field-based vs diff-based. RESOLVED (2026-04-26 follow-on refinement) — hybrid, typed by amendment target. Earlier drafts had an unresolved tension: the prior field_name + proposed_value shape worked cleanly for typed frontmatter scalars but was impractical for body changes (a 2000-line proposed_value defeats the "small, validatable amendments" intent). A pure-diff shape would have over-served frontmatter (where the schema is structured, not prose) and references (which is registry-shaped). Resolution: the payload is hybrid, dispatched on amendment_type ∈ {body, frontmatter, references}. Body uses unified diff (body_diff); frontmatter uses field_path + typed proposed_value (frontmatter_change); references uses add / modify / remove operations (references_change). Multi-target amendments split into multiple submissions, one per amendment_type. Documented in §6.2.2 (see schemas.md) (with §6.2.2a "why hybrid" rationale, and §6.2.2b–d per-type subsections), §6.2 free-text caps table (rationale ≤500 chars applies across all variants), §8.3 (see privacy.md) Worker pipeline, §12 (see lifecycle.md) fixtures, and §17 (settled decisions).

  22. Cross-references to in-flight proposals. RESOLVED (2026-04-26 follow-on refinement). Earlier drafts implicitly assumed requires.id always points at a stable skill, which would have blocked any new skill_draft whose dependency was itself drafted recently and not yet promoted. Resolution: cross-refs may target either a stable skill folder OR the proposal_id of a currently-in-flight alpha / beta proposal. The cross-ref validator (validate-cross-refs.ts) accepts both forms; rolled-back, quarantined, retracted, and deprecated proposals are not valid targets. The consumer AI loading the graph loads each required skill at its current version_status and validates them in the same session; the alpha disclosure (G.9) applies recursively when both main and required skill are in alpha. State-machine promotion of a dependency does not auto-promote the consumer. Documented in §6.6 (see schemas.md), §10.1 (see lifecycle.md) (cross-ref validation), and §17 (settled decisions, Architecture).

  23. Hybrid amendment shape — Worker race on body diffs. New judgment call surfaced by Fix 1. When a body amendment is composed against a particular canonical body and another body amendment lands first, the diff may fail to apply at Worker time (or even earlier at pre-flight, if the canonical was updated between drafting and submission). I asserted a 409 with {error: "diff_apply_failed"} and "the consumer either re-bases or files fresh" as the resolution. Alternative considered: server-side three-way merge — rejected as too magical for v1 and likely to introduce subtle semantic errors. Spec author may want to revisit if the failure rate is high in practice.

  24. References operations — orphan-citation guard. New judgment call surfaced by Fix 1. The references_change.operation: remove path rejects if any body section still cites the [ref-id] token. I asserted this as a hard Worker check rather than a soft warning, because allowing the orphan would silently break the §6.1 (see schemas.md) references-block rendering. Alternative: allow with a warning and let CI catch it on next render — rejected because Worker-side rejection gives a deterministic same-turn error to the consumer, whereas CI-side rejection arrives at PR time after the user has moved on.
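For readers tracing judgment calls 21 and 24, the hybrid payload and the orphan-citation guard can be pictured with the sketch below. The amendment_type values and the body_diff, frontmatter_change, and references_change keys are taken from those entries; the ref_id field, the reference_still_cited error code, and the checkAmendment helper are assumptions made for illustration, and schemas.md remains the authority for the actual shape.

```typescript
// Hybrid payload dispatched on amendment_type, per judgment call 21.
type Amendment =
  | { amendment_type: "body"; body_diff: string }
  | { amendment_type: "frontmatter"; frontmatter_change: { field_path: string; proposed_value: unknown } }
  | {
      amendment_type: "references";
      references_change: { operation: "add" | "modify" | "remove"; ref_id: string }; // ref_id is assumed
    };

type CheckResult = { ok: true } | { ok: false; error: string };

// True when the body still cites the [ref-id] token.
function isStillCited(body: string, refId: string): boolean {
  return body.includes(`[${refId}]`);
}

// Orphan-citation guard, per judgment call 24: a hard rejection, not a warning.
function checkAmendment(a: Amendment, canonicalBody: string): CheckResult {
  if (
    a.amendment_type === "references" &&
    a.references_change.operation === "remove" &&
    isStillCited(canonicalBody, a.references_change.ref_id)
  ) {
    return { ok: false, error: "reference_still_cited" }; // hypothetical error code
  }
  return { ok: true };
}
```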

Cross-references

Cross-doc references are inlined throughout this document in the form §X.Y (see <file>.md). The list below was the pre-reconciliation manifest from the 2026-05-11 split, retained for audit; it can be deleted at the next split-or-merge cycle.

  • §6.1 (Skill schema / status enum / alpha banner) — see schemas.md §6.1
  • §6.2.1 (Observation schema / session_id removal / recovery_token) — see schemas.md §6.2.1
  • §6.7 (Agent capability requirements / declared_capabilities tiers) — see schemas.md §6.7
  • §6.10 (MDX-tag schema / VV, Ref, Observations components) — see schemas.md §6.10
  • §6.11 (Catalogue UID convention) — see schemas.md §6.11
  • §7 (Trust model / maintainer review queue / tier definitions) — see protocol.md §7
  • §8.3 (Worker ingestion pipeline / KV staging / rate limits) — see privacy.md §8.3
  • §8.5 (NER held-for-review path) — see privacy.md §8.5
  • §8.7 (Consumer-side state contract / profile.json) — see privacy.md §8.7
  • §8.9 (Document-content-discard rule) — see privacy.md §8.9
  • §9 (State machine / status transitions) — see lifecycle.md §9
  • §9.2 (Promotion thresholds) — see lifecycle.md §9.2
  • §10.1 (CI rules / cross-ref validator) — see lifecycle.md §10.1
  • §14 (Initial deliverables and sequencing) — see README.md §14
  • §15.1 (Skill-drafting protocol) — see skills.md §15.1
  • §15.7 (Harness consumer obligations) — see skills.md §15.7
  • §15.8 (Skill body discipline) — see skills.md §15.8
  • §15.9 (OSS-alignment frontmatter / install-time symlink) — see skills.md §15.9
  • §20 (Website rendering / renderer Worker) — see website.md §20
  • §20.3 (MDX-tag resolution mechanics) — see website.md §20.3
  • §21 (Provider-integration protocol layer) — see protocol.md §21
  • §23 (MCP server) — see protocol.md §23