ver-0.7
This commit is contained in:
@@ -1,10 +1,12 @@
|
||||
# echo-memory — v0.5
|
||||
# echo-memory — v0.7
|
||||
|
||||
Persistent memory for Claude / CoWork sessions via the **ECHO** Obsidian vault, driven over the [Obsidian Local REST API](https://github.com/coddingtonbear/obsidian-local-rest-api). No MCP server — the skill makes direct REST calls.
|
||||
Persistent memory for Claude / CoWork sessions via the **ECHO** Obsidian vault, driven over the [Obsidian Local REST API](https://github.com/coddingtonbear/obsidian-local-rest-api). No MCP server — the skill makes direct REST calls, now through a bundled validated client (`scripts/echo.sh`).
|
||||
|
||||
Built for **Jason Stedwell** (Director of Technical Services / Systems Engineer, MPM / ALABAMA wISP), who is both the **operator** and the **architect** of this vault. This is a personal plugin, not for distribution.
|
||||
|
||||
This repository (`jason/echo-v.05`) holds the **v0.5 source** of the plugin: the tracked source tree at `echo-memory.plugin.src/` and the built `echo-memory.plugin` package artifact rebuilt from it on each version bump.
|
||||
This repository (`jason/echo-v.05`) holds the plugin **source** (tracked tree at `echo-memory.plugin.src/`), the built `echo-memory.plugin` package artifact (rebuilt on each version bump), and a credential-free A/B `eval/` harness.
|
||||
|
||||
**0.7 in one line:** the prose-and-raw-curl skill grew an executable spine — a status-checking API client, a machine-readable routing manifest the linter enforces, deterministic bootstrap/migrate scripts, an advisory multi-writer lock, four slash commands, and an eval harness. See the [0.7.0 version-history entry](#version-history) for the full list.
|
||||
|
||||
---
|
||||
|
||||
@@ -26,11 +28,11 @@ Three consequences follow:
|
||||
|
||||
The plugin is hardcoded for a single personal endpoint:
|
||||
|
||||
- **Server:** `https://echoapi.alwisp.com` (reverse proxy → backend Obsidian Local REST API). This is the **only** valid endpoint — never use LAN addresses (`10.x`, `192.168.x`, `:27124`); those belong to retired memory systems.
|
||||
- **Auth:** a bearer token stored in the skill / `references`. The key lives only in the plugin — **never inside a vault note** (per the vault's own safety rules).
|
||||
- **Server:** `https://echoapi.alwisp.com` (reverse proxy → backend Obsidian Local REST API). This is the **only** valid endpoint — never use LAN addresses (`10.x`, `192.168.x`, `:27124`); those belong to retired memory systems. Override per-run with `ECHO_BASE`.
|
||||
- **Auth:** a bearer token stored in the skill / `references` and injected by `scripts/echo.sh`. The key lives only in the plugin — **never inside a vault note** (per the vault's own safety rules). Override with `ECHO_KEY`.
|
||||
- **TLS:** the endpoint presents a valid certificate, so `-k` is not needed.
|
||||
|
||||
Vault paths are addressed **at the root** (e.g. `GET /vault/_agent/context/current-context.md`).
|
||||
Vault paths are addressed **at the root** (e.g. `GET /vault/_agent/context/current-context.md`). Prefer `scripts/echo.sh` over raw `curl`: it injects auth, checks HTTP status (a failed write exits non-zero instead of looking like success), retries transient 5xx, verifies PUTs, and does idempotent appends.
|
||||
|
||||
---
|
||||
|
||||
@@ -46,17 +48,30 @@ Vault paths are addressed **at the root** (e.g. `GET /vault/_agent/context/curre
|
||||
```
|
||||
echo-v.05/
|
||||
├── echo-memory.plugin ← built, installable plugin (zip artifact, rebuilt on version bump)
|
||||
└── echo-memory.plugin.src/ ← tracked source tree
|
||||
├── echo-memory-<version>.plugin ← versioned build artifacts (history)
|
||||
├── eval/ ← credential-free A/B eval harness (0.6 vs 0.7); not bundled
|
||||
│ ├── mock_olrapi.py ← deterministic mock of the REST API + fault injection
|
||||
│ ├── run_eval.py ← orchestrator (runs the real echo.sh vs modeled raw curl)
|
||||
│ └── README.md
|
||||
└── echo-memory.plugin.src/ ← tracked source tree (the plugin)
|
||||
├── .claude-plugin/plugin.json ← manifest (name, version, description)
|
||||
├── README.md ← plugin-level README
|
||||
├── commands/ ← slash commands: echo-load, echo-save, echo-triage, echo-health
|
||||
└── skills/echo-memory/
|
||||
├── SKILL.md ← operating procedure (authoritative)
|
||||
├── references/
|
||||
│ ├── operating-contract.md ← durable principles + safety rules
|
||||
│ ├── operating-contract.md ← durable principles + safety rules + concurrency
|
||||
│ ├── bootstrap.md ← bootstrap / repair / migrate manifest
|
||||
│ ├── vault-layout.md ← canonical layout + frontmatter conventions
|
||||
│ ├── routing-map.md ← complete endpoint→logic routing map (human authority)
|
||||
│ ├── api-reference.md ← REST endpoint patterns + routing map
|
||||
│ └── session-log-template.md
|
||||
├── scripts/ ← executable logic (NEW in 0.7)
|
||||
│ ├── echo.sh ← validated API client (auth, status-check, retry, verify, lock)
|
||||
│ ├── routing.json ← canonical machine-readable route manifest (linter enforces it)
|
||||
│ ├── vault-lint.sh ← read-only invariant checker (Vault Health)
|
||||
│ ├── bootstrap.sh ← deterministic, idempotent vault setup/repair
|
||||
│ └── migrate.sh ← deterministic schema migration (dry-run by default)
|
||||
└── scaffold/ ← verbatim files the bootstrap writes into the vault
|
||||
├── echo-vault.md ← bootstrap marker
|
||||
├── README.vault.md ← thin human signpost
|
||||
@@ -64,7 +79,29 @@ echo-v.05/
|
||||
└── templates/ ← 8 canonical note templates
|
||||
```
|
||||
|
||||
**Division of responsibility:** `SKILL.md` owns day-to-day *procedure* (loading order, search-first, triage, scope switching, PATCH rules). `references/operating-contract.md` owns the durable, client-independent *principles and safety rules*. The other references are the canonical layout, API, and bootstrap specs.
|
||||
**Division of responsibility:** `SKILL.md` owns day-to-day *procedure* (loading order, search-first, triage, scope switching, PATCH rules) and points at the bundled tooling. `references/operating-contract.md` owns the durable, client-independent *principles, safety rules, and concurrency model*. `scripts/routing.json` is the machine-readable source of truth for routing; `references/routing-map.md` is its human-readable authority. The other references are the canonical layout, API, and bootstrap specs.
|
||||
|
||||
---
|
||||
|
||||
## Bundled tooling (0.7)
|
||||
|
||||
Executable logic ships under `skills/echo-memory/scripts/`; the agent prefers it over hand-built curl.
|
||||
|
||||
| Tool | Purpose |
|
||||
|------|---------|
|
||||
| `echo.sh` | The validated API client. `get/ls/map/search/put/post/append/patch/fm/bump/delete/lock/unlock`. Injects auth, **checks HTTP status** (non-zero exit on ≥400 — a failed write can't masquerade as success), one bounded retry on transient 5xx/connection errors, read-back verify on PUT, idempotent `append` (read-before-POST), correct `::` heading targets. |
|
||||
| `routing.json` | The **canonical machine-readable** route manifest — one regex pattern per valid destination plus retired paths. The single source of truth for "what may be written where"; `vault-lint.sh` enforces it. |
|
||||
| `vault-lint.sh` | Read-only invariant checker (Vault Health). Real YAML parsing, clock injected via `ECHO_TODAY`, exits `3` if the vault isn't bootstrapped (instead of falsely reporting "clean"). |
|
||||
| `bootstrap.sh` | Deterministic, idempotent, probe-before-write vault setup/repair (`--dry-run` to preview). Resolves the scaffold relative to itself, so it works from any CWD. |
|
||||
| `migrate.sh` | Deterministic schema migration. Dry-run by default; destructive steps gated behind `--apply` and printed first. |
|
||||
|
||||
### Slash commands
|
||||
|
||||
`/echo-load` (cold-start read), `/echo-save <text>` (route + persist), `/echo-triage` (drain the inbox), `/echo-health` (run the linter) — explicit, reproducible entry points to the procedures below.
|
||||
|
||||
### Concurrency (shared vault)
|
||||
|
||||
ECHO is read/written by multiple clients (Claude Code **and** CoWork). Single-line files (`heartbeat/last-session.md`, `current-context.md::Scope`) and append targets (`inbox.md`, `## Agent Log`) assume one writer at a time. Before an overlapping burst of writes, take the cooperative advisory lock (`echo.sh lock <id>` → `_agent/locks/vault.lock`, TTL-reclaimable) and release it at session end. Idempotent append and status-checked writes are the second line of defense.
|
||||
|
||||
---
|
||||
|
||||
@@ -107,7 +144,8 @@ echo-v.05/
|
||||
├── health/ ← YYYY-MM-vault-health.md (monthly self-maintenance audit)
|
||||
├── templates/ ← canonical note templates (seeded from the plugin's scaffold/)
|
||||
├── skills/ ← active / archived capability catalog
|
||||
└── heartbeat/ ← last-session.md pointer (read at load, written at session end)
|
||||
├── heartbeat/ ← last-session.md pointer (read at load, written at session end)
|
||||
└── locks/ ← vault.lock — cooperative advisory multi-writer lock (echo.sh lock/unlock)
|
||||
```
|
||||
|
||||
### Memory model
|
||||
@@ -177,7 +215,7 @@ The journal is one append-only time-series stream; rollups are coarser-grained e
|
||||
|
||||
### Vault Health (monthly)
|
||||
|
||||
Agent self-maintenance (not a journal entry), written to `_agent/health/YYYY-MM-vault-health.md`. The bundled `scripts/vault-lint.sh` mechanically asserts the invariants — folder↔status mismatch, duplicate slugs across lifecycle folders, wikilinks in frontmatter, duplicate `## Agent Log` headings, stale active projects (`updated:` > 30 days), and aging inbox items (> 14 days). Findings are reported, not auto-fixed.
|
||||
Agent self-maintenance (not a journal entry), written to `_agent/health/YYYY-MM-vault-health.md`. Run `scripts/vault-lint.sh` (or `/echo-health`) with `ECHO_TODAY` = the conversation's date so stale/aging math uses one clock. It mechanically asserts: folder↔status mismatch, duplicate slugs across lifecycle folders, wikilinks in frontmatter (swept across all folders), duplicate `## Agent Log` headings, stale active projects (`updated:` > 30 days), aging inbox items (> 14 days), **paths matching no route in `routing.json` or sitting at a retired path**, and **frontmatter integrity** (missing required fields, `updated` < `created`, future dates, wikilinks leaking into `source_notes`). Exit codes: `0` clean · `1` violations · `2` unreachable · `3` not bootstrapped. Findings are reported, not auto-fixed.
|
||||
|
||||
---
|
||||
|
||||
@@ -188,6 +226,7 @@ Agent self-maintenance (not a journal entry), written to `_agent/health/YYYY-MM-
|
||||
- **Bump `updated:` on substance.** PATCH the frontmatter `updated:` to today after a meaningful content change (status update, decision, scope switch). Skip it for routine log appends — bump on substance, not heartbeat.
|
||||
- **Preserve `created:`.** It is the earliest known date the entity was tracked anywhere — never reset it to "today" when merging.
|
||||
- **No `[[wikilinks]]` in frontmatter** — YAML parses them as nested lists and they break. Cross-references go in a `## Related` body section. `source_notes` holds plain relative-path strings (backward links to inputs only).
|
||||
- **Check the HTTP status (or use `echo.sh`).** A `PATCH` to a missing heading returns `400 invalid-target` (40080) and the write is *silently lost*; a transient `503` or an accepted-but-unpersisted `PUT` can do the same. `echo.sh` enforces this (exit ≠ 0, retry, read-back verify); raw `curl` must branch on `-w "%{http_code}"`. Treat ≥ 400 as a failed op — surface it, don't continue.
|
||||
|
||||
---
|
||||
|
||||
@@ -273,12 +312,12 @@ Observations are promoted into Fact / Pattern (dropping the date) once a pattern
|
||||
|
||||
## Bootstrap, repair & migration
|
||||
|
||||
The plugin carries everything needed to stand up a vault in `scaffold/`; there is no dependency on any in-vault control doc.
|
||||
The plugin carries everything needed to stand up a vault in `scaffold/`; there is no dependency on any in-vault control doc. In 0.7 these are scripts (`scripts/bootstrap.sh`, `scripts/migrate.sh`), not hand-run curl loops — the prose in `references/bootstrap.md` documents what they do and serves as the fallback.
|
||||
|
||||
- **Probe** — GET `_agent/echo-vault.md`. `200` → bootstrapped (check `schema_version`); `404` → run a fresh bootstrap.
|
||||
- **Fresh bootstrap** (idempotent, additive — never overwrites): create the folder tree (a one-line README seeds each leaf since Obsidian/git ignore empty dirs) → PUT the 8 templates to their mirrored vault paths → PUT the 3 anchor seeds only if absent (the operator-preferences seed is deliberately empty — no fabricated facts) → PUT the thin vault README → PUT the marker **last** (so the vault is only flagged "set up" once everything is in place) → write a first-run trace (daily note + session log).
|
||||
- **Repair** — same steps, GET-probing each path and creating only what's missing; malformed files are flagged in the session log, never replaced.
|
||||
- **Migrations** — when the marker's `schema_version` is older than the plugin's, apply each intervening migration then PATCH the marker. `0 → 1` removed the old in-vault control docs (`CLAUDE.md` / `BOOTSTRAP.md` / `STRUCTURE.md` / `index.md`) in favor of the plugin-as-source-of-truth model.
|
||||
- **Fresh bootstrap** (`bootstrap.sh`; idempotent, additive — never overwrites): create the folder tree (a one-line README seeds each leaf since Obsidian/git ignore empty dirs) → PUT the 8 templates to their mirrored vault paths → PUT the 3 anchor seeds only if absent (the operator-preferences seed is deliberately empty — no fabricated facts) → PUT the thin vault README → PUT the marker **last** (so the vault is only flagged "set up" once everything is in place) → write a first-run trace (daily note + session log). `--dry-run` previews.
|
||||
- **Repair** — `bootstrap.sh` again: GET-probes each path and creates only what's missing; malformed files are flagged, never replaced.
|
||||
- **Migrations** — `migrate.sh` reads the marker's `schema_version`, applies each intervening migration (dry-run by default; `--apply` to perform moves/deletes), and stamps the marker. `0 → 1` removed the old in-vault control docs (`CLAUDE.md` / `BOOTSTRAP.md` / `STRUCTURE.md` / `index.md`); `1 → 2` folded `reviews/` into `journal/` + `_agent/health/`.
|
||||
|
||||
---
|
||||
|
||||
@@ -313,3 +352,4 @@ If the API returns a connection error, timeout, or `502` (usually Obsidian / the
|
||||
| **0.5.0** | Self-bootstrap + control-logic-in-plugin. Plugin becomes the single source of truth: bundled `scaffold/` (8 templates, 3 anchor seeds, thin vault README, marker) bootstraps an empty vault with no external/local-path dependency. New `operating-contract.md` (principles + safety from the old in-vault `CLAUDE.md`); `bootstrap.md` rewritten as a portable bootstrap/repair/migrate manifest. Cold-start probe moved from `/vault/BOOTSTRAP.md` to `_agent/echo-vault.md` (carries `schema_version`). Live vault migrated to data-only. |
|
||||
| **0.5.1** | Routing-doc consistency pass: decision-mirror heading unified to `## Key Decisions`; stale `Current status` PATCH examples corrected to `Status`; vault-layout inline project example refreshed to the real template. All 17 `projects/active/` notes normalized losslessly to the canonical template heading set; `android-mqtt-shell` moved to `incubating/` (was broken `status: upcoming` in active). Plugin repackaged (21 files). |
|
||||
| **0.6.0** | Schema 2. **#8 Inbox auto-fire:** the Loading procedure adds an inbox-depth GET and a load-time *Reconcile* step (inbox triage + scope-drift), so triage self-fires. **#10 Routing:** `reviews/` retired — weekly/monthly/quarterly/annual rollups fold into `journal/{weekly,monthly,quarterly,annual}/`, vault-health moves to `_agent/health/`; new `references/routing-map.md` is the complete audited endpoint→logic map. **Recs:** heartbeat pointer operationalized (read first at load, written at session end); new `scripts/vault-lint.sh` mechanically checks vault invariants. Dead refs pruned (`archive/`, `_agent/outputs/`, `resources/source-material`). Migration `1 → 2` in `bootstrap.md`. |
|
||||
| **0.7.0** | Schema 2 (unchanged layout). Hardening pass — gave the prose-and-curl skill an executable spine. **S2** `scripts/echo.sh`: one validated client wrapping every verb with auth, HTTP-status checking (failed writes exit non-zero instead of looking like success), one bounded retry on 5xx, read-back-verified PUT, and idempotent `append`. **S3** `scripts/routing.json`: canonical machine-readable route manifest; `vault-lint.sh` enforces it (flags unknown/retired paths). **S4** deterministic `scripts/bootstrap.sh` + `scripts/migrate.sh` (idempotent, dry-run, probe-before-write; fixes the old CWD-relative `@scaffold/...` empty-body bug). **S5** cooperative advisory lock (`_agent/locks/vault.lock`) + documented multi-writer model. **M1/M2/M5** linter rewrite: real YAML parsing, injected clock (`ECHO_TODAY`), exits `3` (not "clean") on an un-bootstrapped vault, plus routing-membership + frontmatter-integrity checks. **M3** status-check guidance throughout. **M4** four slash commands (`/echo-load`, `/echo-save`, `/echo-triage`, `/echo-health`). Added a credential-free A/B `eval/` harness (mock REST API + fault injection): isolates a **−76% generated-token** I/O layer and **4 → 0 silent write failures** vs 0.6. |
|
||||
|
||||
Binary file not shown.
Binary file not shown.
Vendored
BIN
Binary file not shown.
@@ -1,7 +1,7 @@
|
||||
{
|
||||
"name": "echo-memory",
|
||||
"version": "0.6.0",
|
||||
"description": "Persistent memory via the ECHO Obsidian vault over the Local REST API. Self-bootstrapping: the plugin carries the full vault scaffold and all control logic (no in-vault routing docs), so it stands up an empty vault and ports cleanly to any other. Reads and writes notes across Claude/CoWork sessions using direct REST calls \u2014 no MCP server required. Jason's personal memory vault.",
|
||||
"version": "0.7.0",
|
||||
"description": "Persistent memory via the ECHO Obsidian vault over the Local REST API. Self-bootstrapping: the plugin carries the full vault scaffold and all control logic (no in-vault routing docs), so it stands up an empty vault and ports cleanly to any other. Ships a validated API client (echo.sh), a canonical machine-readable routing manifest, deterministic bootstrap/migrate scripts, a read-only vault linter, and /echo-load|save|triage|health commands. Reads and writes notes across Claude/CoWork sessions using direct REST calls \u2014 no MCP server required. Jason's personal memory vault.",
|
||||
"author": {
|
||||
"name": "Jason"
|
||||
},
|
||||
|
||||
@@ -2,7 +2,7 @@
|
||||
|
||||
Persistent memory for Claude via the **ECHO** Obsidian vault, using the [Obsidian Local REST API](https://github.com/coddingtonbear/obsidian-local-rest-api).
|
||||
|
||||
Reads and writes notes across Claude/CoWork sessions using direct REST calls — no MCP server required. Built for **Jason Stedwell** (Director of Technical Services / Systems Engineer, MPM / ALABAMA wISP), who is both the operator and the architect of this vault.
|
||||
Reads and writes notes across Claude/CoWork sessions using direct REST calls — no MCP server required, but routed through a bundled validated client (`scripts/echo.sh`) that status-checks every write. Built for **Jason Stedwell** (Director of Technical Services / Systems Engineer, MPM / ALABAMA wISP), who is both the operator and the architect of this vault.
|
||||
|
||||
**The plugin is the single source of truth.** All control logic — bootstrap/repair, operating contract, taxonomy, frontmatter conventions, and the canonical note templates — ships inside this plugin under `skills/echo-memory/`. The vault itself holds **data only**: there are no `CLAUDE.md` / `BOOTSTRAP.md` / `STRUCTURE.md` / `index.md` control docs in it. This makes ECHO self-bootstrapping (point it at an empty Obsidian vault and it stands up the full structure), easy to update (update the plugin, not the vault), and portable to any other vault.
|
||||
|
||||
@@ -11,7 +11,8 @@ Reads and writes notes across Claude/CoWork sessions using direct REST calls —
|
||||
- Loads operator preferences, current context, and relevant project notes at the start of substantive conversations
|
||||
- Writes facts, preferences, and decisions Jason asks Claude to remember
|
||||
- Logs working sessions so future conversations can pick up where they left off
|
||||
- **Bootstraps an empty vault from the bundled `scaffold/`** (folders, templates, anchor seeds, marker) and repairs/migrates an existing one — see `skills/echo-memory/references/bootstrap.md`
|
||||
- **Bootstraps an empty vault from the bundled `scaffold/`** (folders, templates, anchor seeds, marker) and repairs/migrates an existing one via `scripts/bootstrap.sh` / `scripts/migrate.sh` — see `skills/echo-memory/references/bootstrap.md`
|
||||
- Exposes `/echo-load`, `/echo-save`, `/echo-triage`, `/echo-health` slash commands as explicit entry points, and coordinates concurrent Claude/CoWork sessions via a cooperative advisory lock
|
||||
|
||||
## Configuration
|
||||
|
||||
@@ -41,35 +42,48 @@ The endpoint presents a **valid TLS certificate**, so `-k` is not required. The
|
||||
├── health/ ← YYYY-MM-vault-health.md (monthly self-maintenance audit)
|
||||
├── templates/ ← canonical note templates (seeded from the plugin's scaffold/)
|
||||
├── skills/ ← active / archived
|
||||
└── heartbeat/ ← last-session.md orientation pointer (read at load, written at session end)
|
||||
├── heartbeat/ ← last-session.md orientation pointer (read at load, written at session end)
|
||||
└── locks/ ← vault.lock — cooperative advisory multi-writer lock
|
||||
```
|
||||
|
||||
Control logic and the master scaffold live in the plugin, not the vault:
|
||||
|
||||
```
|
||||
commands/ ← slash commands: echo-load, echo-save, echo-triage, echo-health
|
||||
skills/echo-memory/
|
||||
├── SKILL.md ← operating procedure (authoritative)
|
||||
├── references/
|
||||
│ ├── operating-contract.md ← durable principles + safety
|
||||
│ ├── operating-contract.md ← durable principles + safety + concurrency
|
||||
│ ├── bootstrap.md ← bootstrap / repair / migrate an empty vault
|
||||
│ ├── vault-layout.md ← canonical layout + frontmatter
|
||||
│ ├── routing-map.md ← complete endpoint→logic map (every path, trigger, why distinct)
|
||||
│ ├── routing-map.md ← complete endpoint→logic map (human authority)
|
||||
│ ├── api-reference.md ← REST endpoint patterns + routing map
|
||||
│ └── session-log-template.md
|
||||
├── scripts/
|
||||
│ └── vault-lint.sh ← read-only invariant checker (run by the monthly Vault Health pass)
|
||||
│ ├── echo.sh ← validated API client (auth, status-check, retry, verify, lock)
|
||||
│ ├── routing.json ← canonical machine-readable route manifest (linter enforces it)
|
||||
│ ├── vault-lint.sh ← read-only invariant checker (monthly Vault Health pass)
|
||||
│ ├── bootstrap.sh ← deterministic, idempotent vault setup/repair
|
||||
│ └── migrate.sh ← deterministic schema migration (dry-run by default)
|
||||
└── scaffold/ ← verbatim files the bootstrap writes into the vault
|
||||
├── README.vault.md echo-vault.md
|
||||
├── templates/ (8 note templates)
|
||||
└── anchors/ (operator-preferences, current-context, inbox seeds)
|
||||
```
|
||||
|
||||
## Skills
|
||||
## Skills & commands
|
||||
|
||||
| Skill | Triggers |
|
||||
|-------|----------|
|
||||
| `echo-memory` | "remember that", "save to memory", "what do you know about me", "load my profile", "check my notes", "log this decision", "add to my inbox" — and proactively at the start of substantive conversations |
|
||||
|
||||
| Command | Does |
|
||||
|---------|------|
|
||||
| `/echo-load` | Cold-start context read (profile, scope, latest session, today, inbox) |
|
||||
| `/echo-save <text>` | Route + persist content to its canonical home (search-first, idempotent) |
|
||||
| `/echo-triage` | Drain aging inbox captures to their homes, logging each move |
|
||||
| `/echo-health` | Run `vault-lint.sh` and summarize invariant violations |
|
||||
|
||||
## Requirements
|
||||
|
||||
- Obsidian running on the backend with the [Local REST API plugin](https://github.com/coddingtonbear/obsidian-local-rest-api) enabled
|
||||
|
||||
@@ -0,0 +1,14 @@
|
||||
---
|
||||
description: Run the ECHO vault-health linter and summarize any invariant violations
|
||||
allowed-tools: Bash(*/echo-memory/scripts/vault-lint.sh*)
|
||||
---
|
||||
|
||||
Run the bundled, read-only vault linter and report findings. Pass the conversation's current date so stale/aging math uses the same clock the agent writes with:
|
||||
|
||||
```bash
|
||||
ECHO_TODAY=$(date +%F) "${CLAUDE_PLUGIN_ROOT}/skills/echo-memory/scripts/vault-lint.sh"
|
||||
```
|
||||
|
||||
Exit codes: `0` clean · `1` violations (printed, grouped by check) · `2` vault unreachable · `3` vault not bootstrapped (run `bootstrap.sh`).
|
||||
|
||||
Summarize the violations grouped by category and propose fixes, but **do not auto-fix** without Jason's go-ahead. If this is the first substantive session of a calendar month, offer to write the findings to `_agent/health/YYYY-MM-vault-health.md` (per the skill's **Vault Health** section).
|
||||
@@ -0,0 +1,13 @@
|
||||
---
|
||||
description: Load ECHO memory — cold-start context read (profile, scope, latest session, today, inbox)
|
||||
---
|
||||
|
||||
Use the **echo-memory** skill to load memory now. Run the cold-start **Loading procedure** from `SKILL.md`: issue the 5–6 reads in parallel (marker, operator-preferences, current-context, heartbeat→latest session, today's daily note, inbox), then do the load-time **reconcile** (inbox-depth + scope-drift) and surface it in a single line.
|
||||
|
||||
Prefer the bundled client for each read:
|
||||
|
||||
```bash
|
||||
"${CLAUDE_PLUGIN_ROOT}/skills/echo-memory/scripts/echo.sh" get _agent/memory/semantic/operator-preferences.md
|
||||
```
|
||||
|
||||
Do not narrate the reads. End with a one-line orientation: who/what/where the active scope is, and any reconcile prompt.
|
||||
@@ -0,0 +1,22 @@
|
||||
---
|
||||
description: Save to ECHO memory — route content to its canonical home (search-first, idempotent)
|
||||
argument-hint: "[what to remember]"
|
||||
---
|
||||
|
||||
Use the **echo-memory** skill to persist this to the ECHO vault:
|
||||
|
||||
> $ARGUMENTS
|
||||
|
||||
Follow the skill's write discipline exactly:
|
||||
1. **Route** via `references/routing-map.md` (the canonical map). If no path fits, capture to `inbox/captures/inbox.md`.
|
||||
2. **Search-first** for any new slug-addressed note (slug AND human title) before creating — merge/promote instead of duplicating.
|
||||
3. Write through the bundled client so the call is status-checked and idempotent:
|
||||
|
||||
```bash
|
||||
ECHO="${CLAUDE_PLUGIN_ROOT}/skills/echo-memory/scripts/echo.sh"
|
||||
"$ECHO" append inbox/captures/inbox.md "- $(date +%F): <entry>" # idempotent capture
|
||||
"$ECHO" put <routed/path>.md <bodyfile> # create/overwrite (verifies)
|
||||
"$ECHO" patch <path>.md append heading "<H1::Sub>" <bodyfile> # targeted append
|
||||
```
|
||||
|
||||
4. Write in third person about Jason; set `agent_written: true` + `source_notes`; never put `[[wikilinks]]` in frontmatter; `bump` `updated:` only on substantive changes.
|
||||
@@ -0,0 +1,11 @@
|
||||
---
|
||||
description: Triage the ECHO inbox — route aging captures to their canonical homes and log the moves
|
||||
---
|
||||
|
||||
Use the **echo-memory** skill to run **Inbox Triage**. GET `inbox/captures/inbox.md`, list captures older than ~7 days that were never routed, and offer to route them.
|
||||
|
||||
```bash
|
||||
"${CLAUDE_PLUGIN_ROOT}/skills/echo-memory/scripts/echo.sh" get inbox/captures/inbox.md
|
||||
```
|
||||
|
||||
For each capture Jason accepts: send it to its proper home per the routing map (preference → `operator-preferences.md::Observations`; project idea → `projects/incubating/`; durable fact → `_agent/memory/semantic/`; person → `resources/people/`), then record the move in `inbox/processing-log/YYYY-MM-DD.md` (`- <original> → <destination>`). Do **not** delete the original capture unless Jason explicitly asks — the processing log is the audit trail.
|
||||
BIN
Binary file not shown.
@@ -30,6 +30,49 @@ The endpoint has a **valid TLS certificate**, so `-k` is not needed (add it only
|
||||
- Complete endpoint→logic routing map (every write destination, its trigger, and why it's distinct): `references/routing-map.md`
|
||||
- Full API reference with every endpoint pattern and the memory routing map: `references/api-reference.md`
|
||||
|
||||
Executable logic ships under `scripts/`:
|
||||
|
||||
- `scripts/echo.sh` — the **validated API client**; prefer it over hand-built `curl` (below)
|
||||
- `scripts/routing.json` — the **canonical, machine-readable** route manifest (the routing map's source of truth; the linter enforces it)
|
||||
- `scripts/vault-lint.sh` — read-only invariant checker (Vault Health)
|
||||
- `scripts/bootstrap.sh` / `scripts/migrate.sh` — deterministic vault setup/repair and schema migration
|
||||
|
||||
## Bundled Tooling (prefer over raw curl)
|
||||
|
||||
All paths below are under `${CLAUDE_PLUGIN_ROOT}/skills/echo-memory/`.
|
||||
|
||||
**`scripts/echo.sh` — use this for every read/write.** It centralizes auth, **HTTP-status checking** (a failed write exits non-zero instead of looking like success), one bounded retry on 5xx/connection errors, idempotent append, correct `::` heading targets, and frontmatter patches. The raw `curl` recipes later in this file are the underlying mechanics / fallback — reach for them only if `echo.sh` is unavailable, and if you do, **check the HTTP status yourself** (the PATCH-heading `400 invalid-target` failure silently loses writes otherwise).
|
||||
|
||||
```bash
|
||||
ECHO="${CLAUDE_PLUGIN_ROOT}/skills/echo-memory/scripts/echo.sh"
|
||||
"$ECHO" get <path> # 404 -> exit 44
|
||||
"$ECHO" ls <dir> ; "$ECHO" map <path> # listing / document-map
|
||||
"$ECHO" search <terms...>
|
||||
"$ECHO" put <path> <file> # create/overwrite (read-back verified)
|
||||
"$ECHO" append <path> "<line>" # idempotent: skips if the exact line exists
|
||||
"$ECHO" patch <path> append heading "<H1::Sub>" <file>
|
||||
"$ECHO" fm <path> updated '"2026-06-19"' ; "$ECHO" bump <path> # frontmatter
|
||||
"$ECHO" lock <session-id> ; "$ECHO" unlock <session-id>
|
||||
```
|
||||
|
||||
**Bootstrap / migrate** are scripts now, not hand-run curl loops: `scripts/bootstrap.sh [--dry-run]` (idempotent, probe-before-write, never overwrites) and `scripts/migrate.sh [--apply]` (reads the marker's `schema_version` and applies migrations; dry-run by default). See `references/bootstrap.md`.
|
||||
|
||||
### Concurrency — the vault is shared, so coordinate writes
|
||||
|
||||
ECHO is read/written by multiple clients (Claude Code **and** CoWork sessions). The single-line files (`heartbeat/last-session.md`, `current-context.md::Scope`, `inbox.md`) assume a single writer at a time. Before a burst of writes in a session that may overlap another, take the **advisory lock**, and release it at session end:
|
||||
|
||||
```bash
|
||||
"$ECHO" lock "cc-$(date +%s)" # exit 75 if another session holds a fresh lock
|
||||
# ... do the writes ...
|
||||
"$ECHO" unlock "cc-$(date +%s)"
|
||||
```
|
||||
|
||||
The lock is cooperative (a stale lock past `ECHO_LOCK_TTL`, default 15 min, is reclaimable) and lives at `_agent/locks/vault.lock`. It is a courtesy, not a hard mutex — if you can't take it, tell Jason another session may be active rather than racing it.
|
||||
|
||||
### Slash commands
|
||||
|
||||
`/echo-load` (cold-start read), `/echo-save <text>` (route + persist), `/echo-triage` (drain the inbox), `/echo-health` (run the linter). These are explicit entry points to the procedures below.
|
||||
|
||||
## Operating Contract & Safety
|
||||
|
||||
The vault is the **system of record** for long-term memory, not a scratchpad. Default to **additive updates, explicit status changes, and traceable summaries**. Keep agent-managed content (`agent_written: true` + `source_notes`) separable from human-authored content. Non-negotiable safety rules:
|
||||
@@ -294,20 +337,22 @@ A weekly/monthly rollup is a **light digest** — open threads across `projects/
|
||||
|
||||
On the first substantive session of a calendar month, run a quick health pass and write findings to `_agent/health/YYYY-MM-vault-health.md`. This is **agent self-maintenance, not a journal entry** — it lives under `_agent/` because it's about the vault's mechanical integrity, not Jason's work narrative. Don't auto-fix without asking.
|
||||
|
||||
Run the bundled linter first — it mechanically checks the invariants below so you don't eyeball them:
|
||||
Run the bundled linter first — it mechanically checks the invariants below so you don't eyeball them. **Pass `ECHO_TODAY` = the conversation's `currentDate`** so stale/aging math uses the same clock you write with (not the runner's machine date):
|
||||
|
||||
```bash
|
||||
bash "${CLAUDE_PLUGIN_ROOT}/skills/echo-memory/scripts/vault-lint.sh"
|
||||
ECHO_TODAY=<currentDate> "${CLAUDE_PLUGIN_ROOT}/skills/echo-memory/scripts/vault-lint.sh"
|
||||
```
|
||||
|
||||
Checks (the linter asserts each and prints violations):
|
||||
Exit codes: `0` clean · `1` violations · `2` unreachable · `3` not bootstrapped. Checks (the linter asserts each and prints violations):
|
||||
|
||||
1. **Stale active projects** — for each note in `projects/active/`, check `updated:` >30 days. Likely belongs in `on-hold/`.
|
||||
2. **Unprocessed inbox** — GET `inbox/captures/inbox.md`. List items older than 14 days that never moved through the triage protocol.
|
||||
3. **Duplicate slugs across lifecycle folders** — any slug appearing in more than one of `active/`, `incubating/`, `on-hold/`, `archived/` is broken state.
|
||||
4. **Folder ↔ `status:` mismatch** — any `projects/<lifecycle>/` note whose `status:` frontmatter disagrees with its folder.
|
||||
5. **Wikilinks in frontmatter** — any `[[...]]` inside a YAML frontmatter block (breaks Obsidian reading view).
|
||||
5. **Wikilinks in frontmatter** — any `[[...]]` inside a YAML frontmatter block (breaks Obsidian reading view), swept across all folders.
|
||||
6. **Duplicate `## Agent Log` headings** — any daily note with more than one.
|
||||
7. **Unknown / retired paths** — any vault file that matches no route in `scripts/routing.json` (or sits at a retired path).
|
||||
8. **Frontmatter integrity** — missing required fields, `updated` < `created`, future `updated`, and `[[wikilinks]]` leaking into `source_notes`.
|
||||
|
||||
The pass is cheap and pays for itself by catching drift before it requires a reorg. Write the findings as a digest; act on them only with Jason's go-ahead.
|
||||
|
||||
|
||||
@@ -4,6 +4,8 @@ Server: `https://echoapi.alwisp.com` (reverse proxy → backend Obsidian Local R
|
||||
Auth header: `Authorization: Bearer 241265fbe6830934a9a4ad3e69335f64a42153b663aa5b0017cb1ea1217b2bab`
|
||||
The endpoint has a **valid TLS certificate** — `-k` is not required. Paths address the vault at its **root**.
|
||||
|
||||
> **Prefer `scripts/echo.sh` over the raw recipes below.** It wraps every verb with auth, status checking, retry, idempotent append, and frontmatter patches. The recipes here are the underlying mechanics and the fallback. **If you call `curl` directly, check the HTTP status** — add `-o /dev/null -w "%{http_code}"` and branch on it. A `PATCH` to a non-existent heading returns `400 invalid-target` (errorCode 40080) and the write is *silently lost*; a bare `curl` that ignores status will report success anyway. `GET` returns `404` for a missing file. Treat any `>= 400` as a failed operation, surface it, and do not continue as if it succeeded.
|
||||
|
||||
---
|
||||
|
||||
## Reading Files
|
||||
|
||||
@@ -4,6 +4,20 @@ The **plugin is the single source of truth** for ECHO's structure. Everything ne
|
||||
|
||||
The vault holds **data only**. It carries no `CLAUDE.md` / `BOOTSTRAP.md` / `STRUCTURE.md` / `index.md`. The "is this vault set up?" signal is a small marker file, `_agent/echo-vault.md`.
|
||||
|
||||
## Quick path — run the scripts
|
||||
|
||||
Bootstrap, repair, and migration are deterministic scripts; prefer them over running the curl steps by hand. They resolve the scaffold relative to their own location, so they work regardless of the caller's CWD:
|
||||
|
||||
```bash
|
||||
SCRIPTS="${CLAUDE_PLUGIN_ROOT}/skills/echo-memory/scripts"
|
||||
"$SCRIPTS/bootstrap.sh" --dry-run # preview what would be seeded
|
||||
"$SCRIPTS/bootstrap.sh" # idempotent, additive — fills only what is missing (also the repair path)
|
||||
"$SCRIPTS/migrate.sh" # plan a schema migration (dry-run)
|
||||
"$SCRIPTS/migrate.sh" --apply # perform the migration (moves/deletes, after review)
|
||||
```
|
||||
|
||||
`bootstrap.sh` writes through `echo.sh`, so every step is status-checked and the marker is written last. The manual steps below document what the script does (and serve as a fallback if the script can't run).
|
||||
|
||||
```
|
||||
AUTH="Authorization: Bearer 241265fbe6830934a9a4ad3e69335f64a42153b663aa5b0017cb1ea1217b2bab"
|
||||
BASE="https://echoapi.alwisp.com"
|
||||
@@ -75,9 +89,12 @@ PUT every file under this skill's `scaffold/templates/` to its mirrored vault pa
|
||||
|
||||
Templates keep their Obsidian Templater tokens (`{{date:YYYY-MM-DD}}` etc.) verbatim — those are resolved by Templater / by the skill's daily-note routine, not at seed time.
|
||||
|
||||
Resolve scaffold paths against the skill directory — **never a bare relative `@scaffold/...`**, which assumes the caller's CWD is the skill dir and silently sends an empty body otherwise:
|
||||
|
||||
```bash
|
||||
SCAFFOLD="${CLAUDE_PLUGIN_ROOT}/skills/echo-memory/scaffold"
|
||||
curl -s -X PUT -H "$AUTH" -H "Content-Type: text/markdown" \
|
||||
--data-binary @scaffold/templates/journal/templates/daily-note-template.md \
|
||||
--data-binary @"$SCAFFOLD/templates/journal/templates/daily-note-template.md" \
|
||||
"$BASE/vault/journal/templates/daily-note-template.md"
|
||||
```
|
||||
|
||||
@@ -115,6 +132,8 @@ Run the same steps 1–5, but GET-probe each path first and **only create what i
|
||||
|
||||
## Migrations (`schema_version` mismatch)
|
||||
|
||||
`scripts/migrate.sh` automates this: it reads the marker's `schema_version`, applies each intervening migration (idempotent, additive; destructive steps gated behind `--apply` and printed first), and stamps the marker at the end. Run it dry-run, review the plan, then `--apply`. The steps below are what it encodes — and the manual fallback.
|
||||
|
||||
When the marker's `schema_version` is older than the plugin's, apply the migration steps for each intervening version, then PATCH the marker's `schema_version` frontmatter to the new value.
|
||||
|
||||
- **0 → 1** (control-docs-in-plugin): the vault previously carried root control docs (`CLAUDE.md`, `BOOTSTRAP.md`, `STRUCTURE.md`, `index.md`). Back them up outside the vault, DELETE them, PUT the thin `scaffold/README.vault.md` over the old verbose `README.md`, write the `_agent/echo-vault.md` marker, and scrub now-dangling `[[CLAUDE]]`/`[[BOOTSTRAP]]`/`[[STRUCTURE]]`/`[[index]]` links from the `## Related` sections of `operator-preferences.md` and `current-context.md` (leave historical session logs alone). Confirm with the operator before deleting.
|
||||
|
||||
@@ -34,3 +34,12 @@ You are an agent operating against an Obsidian vault that functions as a shared,
|
||||
## REST/API readiness
|
||||
|
||||
Assume clients may operate without filesystem access, through the Obsidian Local REST API. Keep paths predictable, frontmatter parseable, titles stable, and all state stored in notes rather than hidden conversation memory. Structure must stay portable: a fresh, empty vault is brought fully online by the plugin's bootstrap (`references/bootstrap.md`), so nothing essential should depend on files existing in the vault ahead of time.
|
||||
|
||||
## Concurrency (shared vault)
|
||||
|
||||
The vault is a **shared** substrate — Claude Code, CoWork, and other REST/MCP clients may operate on it concurrently. The REST API offers no transactions, so writers coordinate cooperatively:
|
||||
|
||||
- Single-line, overwrite-style files (`_agent/heartbeat/last-session.md`, `current-context.md::Scope`) and append targets (`inbox.md`, `## Agent Log`) assume **one writer at a time**. Two sessions writing at once can clobber or duplicate.
|
||||
- Before a burst of writes in a session that may overlap another, take the advisory lock (`echo.sh lock <id>` → `_agent/locks/vault.lock`) and release it at session end. The lock is cooperative with a TTL (stale locks are reclaimable); it is a courtesy, not a hard mutex.
|
||||
- Idempotent append (read-before-POST, via `echo.sh append`) is the second line of defense against duplicate lines from retries or overlapping sessions.
|
||||
- Status-check every write. A write that returns `>= 400` did **not** land — surface it rather than assuming success.
|
||||
|
||||
@@ -2,7 +2,7 @@
|
||||
|
||||
**This document is canonical and complete.** Every write destination in the vault appears here exactly once, with the condition that routes content to it, what lands there, and why it is distinct from its neighbours. The rule for the whole system: **if a path is not in this map, nothing is written to it.** A path that cannot justify its separateness from a neighbour is a merge candidate, not a valid destination.
|
||||
|
||||
Three views of the same truth: the `SKILL.md` *Where to Write* table is the quick-reference, this map is the authority, and `vault-layout.md` is the physical tree. When they disagree, this map wins and the others are fixed to match.
|
||||
Views of the same truth: `scripts/routing.json` is the **machine-readable canonical source** (one route per destination, as a regex pattern + trigger + reason); `vault-lint.sh` loads it and flags any vault path that matches no route (and any write to a retired path). This prose map is the human-readable authority and must stay in sync with `routing.json`. The `SKILL.md` *Where to Write* table is the quick-reference and `vault-layout.md` is the physical tree. When they disagree, `routing.json` + this map win and the others are fixed to match.
|
||||
|
||||
## How to read a row
|
||||
|
||||
|
||||
@@ -0,0 +1,97 @@
|
||||
#!/usr/bin/env bash
|
||||
# bootstrap.sh — stand up (or repair) an ECHO vault deterministically.
|
||||
#
|
||||
# Idempotent and additive: every write is probe-before-write and NEVER overwrites an
|
||||
# existing file. The marker (_agent/echo-vault.md) is written LAST, so the vault is
|
||||
# only flagged "set up" once every piece is in place. Safe to re-run any time — that
|
||||
# is also the "repair" path (it fills in only what is missing).
|
||||
#
|
||||
# All scaffold is resolved relative to THIS script's location, so it works regardless
|
||||
# of the caller's CWD (fixes the old `@scaffold/...` relative-path assumption).
|
||||
#
|
||||
# Usage:
|
||||
# bootstrap.sh [--dry-run]
|
||||
#
|
||||
# Env: ECHO_BASE, ECHO_KEY (passed through to echo.sh), ECHO_TODAY (YYYY-MM-DD for {{DATE}}).
|
||||
|
||||
set -euo pipefail
|
||||
|
||||
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
|
||||
SKILL_DIR="$(cd "$SCRIPT_DIR/.." && pwd)"
|
||||
SCAFFOLD="$SKILL_DIR/scaffold"
|
||||
ECHO="$SCRIPT_DIR/echo.sh"
|
||||
TODAY="${ECHO_TODAY:-$(date +%Y-%m-%d)}"
|
||||
DRY=0
|
||||
[ "${1:-}" = "--dry-run" ] || [ "${1:-}" = "-n" ] && DRY=1
|
||||
|
||||
[ -d "$SCAFFOLD" ] || { echo "bootstrap: scaffold not found at $SCAFFOLD" >&2; exit 1; }
|
||||
[ -x "$ECHO" ] || chmod +x "$ECHO" 2>/dev/null || true
|
||||
|
||||
say() { echo "bootstrap: $*"; }
|
||||
exists() { ECHO_VERIFY=0 "$ECHO" get "$1" >/dev/null 2>&1; } # 0 = present(200), nonzero = absent/404
|
||||
|
||||
# seed VAULT_PATH from LOCAL_FILE (with {{DATE}} substitution), only if absent
|
||||
seed() {
|
||||
local vpath="$1" local_file="$2"
|
||||
if exists "$vpath"; then say "skip (exists) $vpath"; return 0; fi
|
||||
if [ "$DRY" = "1" ]; then say "would seed $vpath <- ${local_file#$SKILL_DIR/}"; return 0; fi
|
||||
sed "s/{{DATE}}/$TODAY/g" "$local_file" | ECHO_VERIFY=1 "$ECHO" put "$vpath" - >/dev/null
|
||||
say "seeded $vpath"
|
||||
}
|
||||
|
||||
# write a one-line leaf README only if absent
|
||||
leaf_readme() {
|
||||
local dir="$1" name="${1##*/}"
|
||||
local vpath="$dir/README.md"
|
||||
if exists "$vpath"; then return 0; fi
|
||||
if [ "$DRY" = "1" ]; then say "would readme $vpath"; return 0; fi
|
||||
printf '# %s\n\nMemory vault folder. See the echo-memory plugin for conventions.\n' "$name" \
|
||||
| ECHO_VERIFY=0 "$ECHO" put "$vpath" - >/dev/null
|
||||
say "readme $vpath"
|
||||
}
|
||||
|
||||
# ---- Pre-flight: is the vault already bootstrapped? --------------------------
|
||||
if exists "_agent/echo-vault.md"; then
|
||||
ver="$("$ECHO" get _agent/echo-vault.md 2>/dev/null | sed -n 's/^schema_version:[[:space:]]*//p' | head -1)"
|
||||
say "marker present (schema_version=${ver:-unknown}). Running repair pass (fills only missing files)."
|
||||
CUR_SCHEMA=2
|
||||
if [ -n "$ver" ] && [ "$ver" -lt "$CUR_SCHEMA" ] 2>/dev/null; then
|
||||
say "NOTE: schema_version $ver < $CUR_SCHEMA — run migrate.sh before relying on the vault."
|
||||
fi
|
||||
fi
|
||||
|
||||
# ---- 1. Folder tree (leaf READMEs guarantee non-empty dirs) ------------------
|
||||
LEAVES=(
|
||||
inbox/captures inbox/imports inbox/processing-log
|
||||
journal/daily journal/weekly journal/monthly journal/quarterly journal/annual journal/templates
|
||||
projects/active projects/incubating projects/on-hold projects/archived
|
||||
areas/business areas/personal areas/learning areas/systems
|
||||
resources/concepts resources/references resources/people resources/companies resources/meetings
|
||||
decisions/by-date
|
||||
_agent/context _agent/memory/working _agent/memory/episodic _agent/memory/semantic
|
||||
_agent/sessions _agent/health _agent/templates _agent/heartbeat
|
||||
_agent/skills/active _agent/skills/archived _agent/locks
|
||||
)
|
||||
for d in "${LEAVES[@]}"; do leaf_readme "$d"; done
|
||||
|
||||
# ---- 2. Templates (mirror scaffold/templates/ 1:1 into the vault) ------------
|
||||
if [ -d "$SCAFFOLD/templates" ]; then
|
||||
while IFS= read -r f; do
|
||||
rel="${f#$SCAFFOLD/templates/}"
|
||||
seed "$rel" "$f"
|
||||
done < <(find "$SCAFFOLD/templates" -type f -name '*.md')
|
||||
fi
|
||||
|
||||
# ---- 3. Anchor seeds (only if absent — never fabricate facts) ----------------
|
||||
seed "_agent/memory/semantic/operator-preferences.md" "$SCAFFOLD/anchors/operator-preferences.seed.md"
|
||||
seed "_agent/context/current-context.md" "$SCAFFOLD/anchors/current-context.seed.md"
|
||||
seed "inbox/captures/inbox.md" "$SCAFFOLD/anchors/inbox.seed.md"
|
||||
|
||||
# ---- 4. Vault README (human signpost) ----------------------------------------
|
||||
seed "README.md" "$SCAFFOLD/README.vault.md"
|
||||
|
||||
# ---- 5. Marker (write LAST) --------------------------------------------------
|
||||
seed "_agent/echo-vault.md" "$SCAFFOLD/echo-vault.md"
|
||||
|
||||
say "done (${DRY:+DRY-RUN }$TODAY)."
|
||||
say "Next: create today's daily note + a bootstrap session log + heartbeat (see SKILL.md First-run trace)."
|
||||
+199
@@ -0,0 +1,199 @@
|
||||
#!/usr/bin/env bash
|
||||
# echo.sh — the single validated client for the ECHO Obsidian Local REST API.
|
||||
#
|
||||
# Every read/write to the vault should go through this script instead of hand-built
|
||||
# curl. It centralizes: auth injection, HTTP-status checking (non-zero exit on >=400
|
||||
# so a failed write can never look like success), one bounded retry on 5xx/connection
|
||||
# errors, idempotent append (read-before-POST), correct `::` heading-target handling,
|
||||
# frontmatter field patches, and an advisory multi-writer lock.
|
||||
#
|
||||
# Config (env overrides; defaults match the rest of the plugin):
|
||||
# ECHO_BASE default https://echoapi.alwisp.com
|
||||
# ECHO_KEY default the plugin bearer token
|
||||
# ECHO_VERIFY default 1 — read-back verify after a PUT
|
||||
# ECHO_LOCK_TTL default 900 — seconds before an advisory lock is considered stale
|
||||
#
|
||||
# Usage:
|
||||
# echo.sh get <path> # print file contents (404 -> exit 44)
|
||||
# echo.sh map <path> # document-map JSON (headings/blocks/frontmatter)
|
||||
# echo.sh ls <dir> # directory listing JSON
|
||||
# echo.sh search <query...> # /search/simple
|
||||
# echo.sh put <path> [file] # create/overwrite (body from file or stdin)
|
||||
# echo.sh post <path> [file] # raw append (NON-idempotent; prefer `append`)
|
||||
# echo.sh append <path> <line> # idempotent append: skips if the exact line exists
|
||||
# echo.sh patch <path> <append|prepend|replace> <heading|frontmatter|block> <target> [file]
|
||||
# echo.sh fm <path> <field> <json-value> # PATCH a frontmatter scalar (e.g. fm p.md updated '"2026-06-19"')
|
||||
# echo.sh bump <path> [YYYY-MM-DD] # set frontmatter updated: to today (or given date)
|
||||
# echo.sh delete <path> # DELETE (destructive; explicit use only)
|
||||
# echo.sh lock <owner-id> # acquire advisory lock (exit 75 if held by someone else & fresh)
|
||||
# echo.sh unlock <owner-id> # release advisory lock if owned by <owner-id>
|
||||
#
|
||||
# Exit codes: 0 ok · 44 not-found(404) · 75 lock-held · 2 usage · 1 other HTTP/transport error.
|
||||
|
||||
set -euo pipefail
|
||||
|
||||
ECHO_BASE="${ECHO_BASE:-https://echoapi.alwisp.com}"
|
||||
ECHO_BASE="${ECHO_BASE%/}"
|
||||
ECHO_KEY="${ECHO_KEY:-241265fbe6830934a9a4ad3e69335f64a42153b663aa5b0017cb1ea1217b2bab}"
|
||||
ECHO_VERIFY="${ECHO_VERIFY:-1}"
|
||||
ECHO_LOCK_TTL="${ECHO_LOCK_TTL:-900}"
|
||||
AUTH="Authorization: Bearer ${ECHO_KEY}"
|
||||
|
||||
# response-body scratch file (filled by every _curl; read by callers via $RESP)
|
||||
RESP="$(mktemp)"; BODY=""
|
||||
cleanup() { rm -f "$RESP" "${BODY:-}"; }
|
||||
trap cleanup EXIT
|
||||
|
||||
die() { echo "echo.sh: $*" >&2; exit 1; }
|
||||
usage() { sed -n '2,40p' "$0" >&2; exit 2; }
|
||||
|
||||
_today() { echo "${ECHO_TODAY:-$(date +%Y-%m-%d)}"; }
|
||||
_now_iso() { date -u +%Y-%m-%dT%H:%M:%SZ; }
|
||||
_vault_url() { echo "${ECHO_BASE}/vault/$1"; }
|
||||
|
||||
# _curl METHOD URL [extra curl args...]
|
||||
# Writes the response body to $RESP and the status code to $HTTP — BOTH IN THE PARENT
|
||||
# SHELL (never call this in $(...) or on the right of a pipe, or those globals are lost).
|
||||
# One bounded retry on transport failure (000) or 5xx.
|
||||
HTTP=""
|
||||
_curl() {
|
||||
local method="$1" url="$2"; shift 2
|
||||
local code attempt=0
|
||||
while :; do
|
||||
code="$(curl -sS -X "$method" -H "$AUTH" -o "$RESP" -w '%{http_code}' "$@" "$url" 2>/dev/null || echo 000)"
|
||||
if { [ "$code" = "000" ] || [ "${code:0:1}" = "5" ]; } && [ "$attempt" -lt 1 ]; then
|
||||
attempt=$((attempt+1)); sleep 1; continue
|
||||
fi
|
||||
break
|
||||
done
|
||||
HTTP="$code"
|
||||
}
|
||||
|
||||
# assert $HTTP is acceptable; 404 -> exit 44, other >=400 -> exit 1
|
||||
_check() {
|
||||
local ctx="$1"
|
||||
[ "$HTTP" = "404" ] && exit 44
|
||||
[ "$HTTP" = "000" ] && die "$ctx: vault unreachable (connection failed) [$ECHO_BASE]"
|
||||
[ "${HTTP:-000}" -ge 400 ] && die "$ctx: HTTP $HTTP — $(cat "$RESP")"
|
||||
return 0
|
||||
}
|
||||
|
||||
# capture a body argument (file path or '-'/empty for stdin) into $BODY (a temp file)
|
||||
_capture_body() {
|
||||
BODY="$(mktemp)"
|
||||
if [ "${1:-}" = "" ] || [ "${1:-}" = "-" ]; then cat > "$BODY"; else cat "$1" > "$BODY"; fi
|
||||
}
|
||||
|
||||
cmd="${1:-}"; shift || true
|
||||
case "$cmd" in
|
||||
get)
|
||||
[ $# -ge 1 ] || usage
|
||||
_curl GET "$(_vault_url "$1")"; _check "get $1"; cat "$RESP" ;;
|
||||
|
||||
map)
|
||||
[ $# -ge 1 ] || usage
|
||||
_curl GET "$(_vault_url "$1")" -H 'Accept: application/vnd.olrapi.document-map+json'
|
||||
_check "map $1"; cat "$RESP" ;;
|
||||
|
||||
ls)
|
||||
[ $# -ge 1 ] || usage
|
||||
p="$1"; [ "${p%/}" = "$p" ] && p="$p/"
|
||||
_curl GET "$(_vault_url "$p")"; _check "ls $1"; cat "$RESP" ;;
|
||||
|
||||
search)
|
||||
[ $# -ge 1 ] || usage
|
||||
q="$*"; q="${q// /+}"
|
||||
_curl POST "${ECHO_BASE}/search/simple/?query=${q}"; _check "search"; cat "$RESP" ;;
|
||||
|
||||
put)
|
||||
[ $# -ge 1 ] || usage
|
||||
path="$1"; _capture_body "${2:-}"
|
||||
_curl PUT "$(_vault_url "$path")" -H 'Content-Type: text/markdown' --data-binary @"$BODY"
|
||||
_check "put $path"
|
||||
if [ "$ECHO_VERIFY" = "1" ]; then
|
||||
_curl GET "$(_vault_url "$path")"
|
||||
[ "$HTTP" = "200" ] || die "put $path: write did not verify (GET returned $HTTP)"
|
||||
fi
|
||||
echo "ok: PUT $path" ;;
|
||||
|
||||
post)
|
||||
[ $# -ge 1 ] || usage
|
||||
path="$1"; _capture_body "${2:-}"
|
||||
_curl POST "$(_vault_url "$path")" -H 'Content-Type: text/markdown' --data-binary @"$BODY"
|
||||
_check "post $path"; echo "ok: POST $path" ;;
|
||||
|
||||
append)
|
||||
# idempotent: GET the file, skip the POST if the exact line is already present.
|
||||
[ $# -ge 2 ] || usage
|
||||
path="$1"; line="$2"
|
||||
_curl GET "$(_vault_url "$path")"
|
||||
if [ "$HTTP" = "200" ] && grep -qF -- "$line" "$RESP"; then
|
||||
echo "skip: line already present in $path"; exit 0
|
||||
fi
|
||||
[ "$HTTP" = "200" ] || [ "$HTTP" = "404" ] || _check "append(read) $path"
|
||||
BODY="$(mktemp)"; printf '%s\n' "$line" > "$BODY"
|
||||
_curl POST "$(_vault_url "$path")" -H 'Content-Type: text/markdown' --data-binary @"$BODY"
|
||||
_check "append $path"; echo "ok: APPEND $path" ;;
|
||||
|
||||
patch)
|
||||
[ $# -ge 4 ] || usage
|
||||
path="$1"; op="$2"; ttype="$3"; target="$4"; _capture_body "${5:-}"
|
||||
case "$op" in append|prepend|replace) ;; *) die "patch: op must be append|prepend|replace";; esac
|
||||
case "$ttype" in heading|frontmatter|block) ;; *) die "patch: target-type must be heading|frontmatter|block";; esac
|
||||
_curl PATCH "$(_vault_url "$path")" \
|
||||
-H "Operation: $op" -H "Target-Type: $ttype" -H "Target: $target" \
|
||||
-H 'Content-Type: text/markdown' --data-binary @"$BODY"
|
||||
_check "patch $path ($target)"
|
||||
echo "ok: PATCH $op $ttype '$target' -> $path" ;;
|
||||
|
||||
fm)
|
||||
[ $# -ge 3 ] || usage
|
||||
path="$1"; field="$2"; value="$3"
|
||||
BODY="$(mktemp)"; printf '%s' "$value" > "$BODY"
|
||||
_curl PATCH "$(_vault_url "$path")" \
|
||||
-H 'Operation: replace' -H 'Target-Type: frontmatter' -H "Target: $field" \
|
||||
-H 'Content-Type: application/json' --data-binary @"$BODY"
|
||||
_check "fm $path.$field"; echo "ok: frontmatter $field -> $path" ;;
|
||||
|
||||
bump)
|
||||
[ $# -ge 1 ] || usage
|
||||
path="$1"; d="${2:-$(_today)}"
|
||||
exec "$0" fm "$path" updated "\"$d\"" ;;
|
||||
|
||||
delete)
|
||||
[ $# -ge 1 ] || usage
|
||||
_curl DELETE "$(_vault_url "$1")"; _check "delete $1"; echo "ok: DELETE $1" ;;
|
||||
|
||||
lock)
|
||||
# advisory lock: _agent/locks/vault.lock holds "<owner> @ <iso>". Honored cooperatively.
|
||||
[ $# -ge 1 ] || usage
|
||||
owner="$1"; lockpath="_agent/locks/vault.lock"
|
||||
_curl GET "$(_vault_url "$lockpath")"
|
||||
if [ "$HTTP" = "200" ] && [ -s "$RESP" ]; then
|
||||
cur="$(cat "$RESP")"; held_owner="${cur%% @ *}"; held_iso="${cur##* @ }"; held_iso="${held_iso%$'\n'}"
|
||||
held_epoch="$(date -u -j -f '%Y-%m-%dT%H:%M:%SZ' "$held_iso" +%s 2>/dev/null \
|
||||
|| date -u -d "$held_iso" +%s 2>/dev/null || echo 0)"
|
||||
now_epoch="$(date -u +%s)"
|
||||
if [ "$held_owner" != "$owner" ] && [ $((now_epoch - held_epoch)) -lt "$ECHO_LOCK_TTL" ]; then
|
||||
echo "lock held by '$held_owner' since $held_iso (fresh)" >&2; exit 75
|
||||
fi
|
||||
fi
|
||||
BODY="$(mktemp)"; printf '%s @ %s\n' "$owner" "$(_now_iso)" > "$BODY"
|
||||
_curl PUT "$(_vault_url "$lockpath")" -H 'Content-Type: text/markdown' --data-binary @"$BODY"
|
||||
_check "lock"; echo "ok: locked by $owner" ;;
|
||||
|
||||
unlock)
|
||||
[ $# -ge 1 ] || usage
|
||||
owner="$1"; lockpath="_agent/locks/vault.lock"
|
||||
_curl GET "$(_vault_url "$lockpath")"
|
||||
if [ "$HTTP" = "200" ]; then
|
||||
cur="$(cat "$RESP")"; held_owner="${cur%% @ *}"
|
||||
[ "$held_owner" = "$owner" ] || { echo "lock owned by '$held_owner', not '$owner' — not releasing" >&2; exit 75; }
|
||||
fi
|
||||
_curl DELETE "$(_vault_url "$lockpath")"
|
||||
[ "$HTTP" = "404" ] || _check "unlock"
|
||||
echo "ok: unlocked" ;;
|
||||
|
||||
""|-h|--help|help) usage ;;
|
||||
*) die "unknown command '$cmd' (try: get map ls search put post append patch fm bump delete lock unlock)" ;;
|
||||
esac
|
||||
@@ -0,0 +1,97 @@
|
||||
#!/usr/bin/env bash
|
||||
# migrate.sh — bring an existing ECHO vault up to the plugin's current schema.
|
||||
#
|
||||
# Reads the marker's schema_version and applies each intervening migration in order.
|
||||
# Migrations are idempotent and additive; every destructive step (DELETE) is gated
|
||||
# behind --apply AND prints what it will do first. Default mode is a DRY-RUN plan.
|
||||
#
|
||||
# Usage:
|
||||
# migrate.sh # print the migration plan (no changes)
|
||||
# migrate.sh --apply # perform the migration (moves/deletes included)
|
||||
#
|
||||
# Env: ECHO_BASE, ECHO_KEY (via echo.sh).
|
||||
|
||||
set -euo pipefail
|
||||
|
||||
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
|
||||
ECHO="$SCRIPT_DIR/echo.sh"
|
||||
CURRENT_SCHEMA=2
|
||||
APPLY=0
|
||||
[ "${1:-}" = "--apply" ] && APPLY=1
|
||||
|
||||
[ -x "$ECHO" ] || chmod +x "$ECHO" 2>/dev/null || true
|
||||
say() { echo "migrate: $*"; }
|
||||
do_or_show() { # do_or_show "<human description>" cmd args...
|
||||
local desc="$1"; shift
|
||||
if [ "$APPLY" = "1" ]; then say "APPLY $desc"; "$@"; else say "PLAN $desc"; fi
|
||||
}
|
||||
|
||||
# ---- Read current schema -----------------------------------------------------
|
||||
if ! marker="$("$ECHO" get _agent/echo-vault.md 2>/dev/null)"; then
|
||||
say "marker missing — vault not bootstrapped. Run bootstrap.sh, not migrate.sh."
|
||||
exit 3
|
||||
fi
|
||||
FROM="$(printf '%s' "$marker" | sed -n 's/^schema_version:[[:space:]]*//p' | head -1)"
|
||||
FROM="${FROM:-0}"
|
||||
say "vault schema_version=$FROM, plugin schema=$CURRENT_SCHEMA $([ "$APPLY" = 1 ] && echo '(APPLY)' || echo '(dry-run)')"
|
||||
|
||||
if [ "$FROM" -ge "$CURRENT_SCHEMA" ] 2>/dev/null; then
|
||||
say "up to date — nothing to do."
|
||||
exit 0
|
||||
fi
|
||||
|
||||
ls_files() { "$ECHO" ls "$1" 2>/dev/null | python3 -c 'import sys,json;print("\n".join(json.load(sys.stdin).get("files",[])))' 2>/dev/null || true; }
|
||||
move() { # move SRC DST preserving content (PUT dst <- get src, then delete src)
|
||||
local src="$1" dst="$2"
|
||||
"$ECHO" get "$src" 2>/dev/null | "$ECHO" put "$dst" - >/dev/null
|
||||
"$ECHO" delete "$src" >/dev/null
|
||||
}
|
||||
|
||||
# ---- 0 -> 1 : control docs moved into the plugin -----------------------------
|
||||
mig_0_1() {
|
||||
say "[0->1] retire in-vault control docs (CLAUDE/BOOTSTRAP/STRUCTURE/index.md)"
|
||||
for f in CLAUDE.md BOOTSTRAP.md STRUCTURE.md index.md; do
|
||||
if ECHO_VERIFY=0 "$ECHO" get "$f" >/dev/null 2>&1; then
|
||||
do_or_show "delete vault/$f (back it up outside the vault first)" "$ECHO" delete "$f"
|
||||
fi
|
||||
done
|
||||
say "[0->1] reminder: scrub dangling [[CLAUDE]]/[[BOOTSTRAP]]/[[STRUCTURE]]/[[index]] links from ## Related sections (manual/agent step)."
|
||||
}
|
||||
|
||||
# ---- 1 -> 2 : reviews/ folded into journal/ + _agent/health/ -----------------
|
||||
mig_1_2() {
|
||||
say "[1->2] fold reviews/ into journal/ and _agent/health/"
|
||||
for f in $(ls_files reviews/weekly); do
|
||||
[ "${f%.md}" != "$f" ] || continue
|
||||
dst="journal/weekly/$(printf '%s' "$f" | sed 's/-review\.md$/.md/')"
|
||||
do_or_show "move reviews/weekly/$f -> $dst" move "reviews/weekly/$f" "$dst"
|
||||
done
|
||||
for f in $(ls_files reviews/monthly); do
|
||||
[ "${f%.md}" != "$f" ] || continue
|
||||
case "$f" in
|
||||
*vault-health.md) dst="_agent/health/$f" ;;
|
||||
*) dst="journal/monthly/$f" ;;
|
||||
esac
|
||||
do_or_show "move reviews/monthly/$f -> $dst" move "reviews/monthly/$f" "$dst"
|
||||
done
|
||||
for period in quarterly annual; do
|
||||
for f in $(ls_files "reviews/$period"); do
|
||||
[ "${f%.md}" != "$f" ] || continue
|
||||
do_or_show "move reviews/$period/$f -> journal/$period/$f" move "reviews/$period/$f" "journal/$period/$f"
|
||||
done
|
||||
done
|
||||
say "[1->2] reminder: update inbound [[reviews/...]] wikilinks in ## Related sections (manual/agent step)."
|
||||
}
|
||||
|
||||
[ "$FROM" -lt 1 ] && mig_0_1
|
||||
[ "$FROM" -lt 2 ] && mig_1_2
|
||||
|
||||
# ---- Stamp the marker --------------------------------------------------------
|
||||
do_or_show "set _agent/echo-vault.md schema_version -> $CURRENT_SCHEMA" \
|
||||
"$ECHO" fm _agent/echo-vault.md schema_version "$CURRENT_SCHEMA"
|
||||
|
||||
if [ "$APPLY" = "1" ]; then
|
||||
say "migration complete -> schema $CURRENT_SCHEMA. Run vault-lint.sh to confirm invariants."
|
||||
else
|
||||
say "dry-run only. Re-run with --apply to perform the moves/deletes above."
|
||||
fi
|
||||
@@ -0,0 +1,54 @@
|
||||
{
|
||||
"$comment": "CANONICAL machine-readable routing manifest for the ECHO vault. This is the single source of truth for 'what paths may be written to'. The human-readable tables in SKILL.md, references/routing-map.md, and references/api-reference.md are DERIVED views of this file — when they disagree, this file wins. vault-lint.sh consumes it to enforce the core rule: if a path matches no route here (and is not a retired path), nothing should be written to it. Patterns are Python regexes matched against vault-root-relative paths (no leading slash, no /vault/ prefix).",
|
||||
"schema_version": 2,
|
||||
"routes": [
|
||||
{ "id": "inbox-captures", "pattern": "^inbox/captures/inbox\\.md$", "method": "POST", "trigger": "Destination unknown at capture time", "distinct_because": "Only path whose contract is deferred routing" },
|
||||
{ "id": "inbox-imports", "pattern": "^inbox/imports/[^/]+\\.md$", "method": "PUT", "trigger": "Raw external material dropped wholesale", "distinct_because": "Bulk un-triaged material vs single-line captures" },
|
||||
{ "id": "inbox-processing-log", "pattern": "^inbox/processing-log/\\d{4}-\\d{2}-\\d{2}\\.md$", "method": "POST", "trigger": "An inbox item is routed to its real home", "distinct_because": "Audit trail of moves, not memory itself" },
|
||||
|
||||
{ "id": "journal-daily", "pattern": "^journal/daily/\\d{4}-\\d{2}-\\d{2}\\.md$", "method": "PATCH", "trigger": "First agent activity on a given day", "distinct_because": "Finest grain; PATCHed repeatedly within its period" },
|
||||
{ "id": "journal-weekly", "pattern": "^journal/weekly/\\d{4}-W\\d{2}\\.md$", "method": "PUT", "trigger": "First substantive session of a new ISO week (opt-in)", "distinct_because": "ISO-week grain" },
|
||||
{ "id": "journal-monthly", "pattern": "^journal/monthly/\\d{4}-\\d{2}\\.md$", "method": "PUT", "trigger": "First substantive session of a new month", "distinct_because": "Month grain" },
|
||||
{ "id": "journal-quarterly", "pattern": "^journal/quarterly/\\d{4}-Q[1-4]\\.md$", "method": "PUT", "trigger": "Manual / on request only", "distinct_because": "Strategic grain; never auto-fires" },
|
||||
{ "id": "journal-annual", "pattern": "^journal/annual/\\d{4}\\.md$", "method": "PUT", "trigger": "Manual / on request only", "distinct_because": "Coarsest grain; never auto-fires" },
|
||||
{ "id": "journal-templates", "pattern": "^journal/templates/.+\\.md$", "method": "PUT", "trigger": "Bootstrap seed only", "distinct_because": "Holds templates, not journal content" },
|
||||
|
||||
{ "id": "projects-active", "pattern": "^projects/active/[^/]+\\.md$", "method": "PUT", "trigger": "Work in motion now", "distinct_because": "Default state for anything being worked", "status": "active" },
|
||||
{ "id": "projects-incubating", "pattern": "^projects/incubating/[^/]+\\.md$", "method": "PUT", "trigger": "Idea captured, work not started", "distinct_because": "Pre-work", "status": "incubating" },
|
||||
{ "id": "projects-on-hold", "pattern": "^projects/on-hold/[^/]+\\.md$", "method": "PUT", "trigger": "Paused but still tracked", "distinct_because": "Resumable; not terminal", "status": "on-hold" },
|
||||
{ "id": "projects-archived", "pattern": "^projects/archived/[^/]+\\.md$", "method": "PUT", "trigger": "Done, abandoned, or rolled up", "distinct_because": "Terminal; kept for history", "status": "archived" },
|
||||
{ "id": "projects-template", "pattern": "^projects/project-template\\.md$", "method": "PUT", "trigger": "Bootstrap seed only", "distinct_because": "Template, not a project" },
|
||||
|
||||
{ "id": "areas", "pattern": "^areas/(business|personal|learning|systems)/[^/]+\\.md$", "method": "PUT", "trigger": "Ongoing domain with no finish line", "distinct_because": "No end state — disqualifies it from projects/" },
|
||||
|
||||
{ "id": "resources-people", "pattern": "^resources/people/[^/]+\\.md$", "method": "PUT", "trigger": "A fact about a specific person", "distinct_because": "Keyed to a person" },
|
||||
{ "id": "resources-companies", "pattern": "^resources/companies/[^/]+\\.md$", "method": "PUT", "trigger": "A fact about an organization", "distinct_because": "Keyed to an organization, not an individual" },
|
||||
{ "id": "resources-concepts", "pattern": "^resources/concepts/[^/]+\\.md$", "method": "PUT", "trigger": "A reusable concept/idea", "distinct_because": "An idea vs an external source" },
|
||||
{ "id": "resources-references", "pattern": "^resources/references/[^/]+\\.md$", "method": "PUT", "trigger": "An external source/link worth keeping", "distinct_because": "Points outward" },
|
||||
{ "id": "resources-meetings", "pattern": "^resources/meetings/\\d{4}-\\d{2}-\\d{2}-[^/]+\\.md$", "method": "PUT", "trigger": "Notes tied to a specific meeting", "distinct_because": "Event-anchored to a meeting" },
|
||||
|
||||
{ "id": "decisions-by-date", "pattern": "^decisions/by-date/\\d{4}-\\d{2}-\\d{2}-[^/]+\\.md$", "method": "PUT", "trigger": "A non-obvious decision worth recording", "distinct_because": "Chronological system of record for decisions" },
|
||||
{ "id": "decisions-template", "pattern": "^decisions/decision-template\\.md$", "method": "PUT", "trigger": "Bootstrap seed only", "distinct_because": "Template, not a decision" },
|
||||
|
||||
{ "id": "agent-marker", "pattern": "^_agent/echo-vault\\.md$", "method": "PUT", "trigger": "Bootstrap / schema migration only", "distinct_because": "Plugin-owned probe; never hand-edited" },
|
||||
{ "id": "agent-context", "pattern": "^_agent/context/[^/]+\\.md$", "method": "PATCH", "trigger": "Active scope changes / task bundles", "distinct_because": "Single live scope pointer + bundles" },
|
||||
{ "id": "agent-semantic", "pattern": "^_agent/memory/semantic/[^/]+\\.md$", "method": "PUT", "trigger": "A durable fact/pattern (incl. operator-preferences.md)", "distinct_because": "Timeless fact" },
|
||||
{ "id": "agent-episodic", "pattern": "^_agent/memory/episodic/[^/]+\\.md$", "method": "PUT", "trigger": "A record of what happened, when", "distinct_because": "Anchored to an event in time" },
|
||||
{ "id": "agent-working", "pattern": "^_agent/memory/working/[^/]+\\.md$", "method": "PUT", "trigger": "Short-lived state for the current effort", "distinct_because": "Explicitly transient" },
|
||||
{ "id": "agent-sessions", "pattern": "^_agent/sessions/\\d{4}-\\d{2}-\\d{2}(-\\d{4})?-[^/]+\\.md$", "method": "PUT", "trigger": "A substantive session ends", "distinct_because": "Per-session record (new ones require HHMM)" },
|
||||
{ "id": "agent-health", "pattern": "^_agent/health/\\d{4}-\\d{2}-vault-health\\.md$", "method": "PUT", "trigger": "First substantive session of a month", "distinct_because": "Vault integrity, not work narrative" },
|
||||
{ "id": "agent-heartbeat", "pattern": "^_agent/heartbeat/[^/]+\\.md$", "method": "PUT", "trigger": "End of every session", "distinct_because": "O(1) orientation pointer; overwritten, never grows" },
|
||||
{ "id": "agent-templates", "pattern": "^_agent/templates/.+\\.md$", "method": "PUT", "trigger": "Bootstrap seed only", "distinct_because": "Holds templates, not memory" },
|
||||
{ "id": "agent-skills-active", "pattern": "^_agent/skills/active/[^/]+\\.md$", "method": "PUT", "trigger": "A skill/plugin catalogued as a capability", "distinct_because": "Catalogs a capability vs the build effort" },
|
||||
{ "id": "agent-skills-archived","pattern": "^_agent/skills/archived/[^/]+\\.md$", "method": "PUT", "trigger": "A catalogued skill is retired", "distinct_because": "Terminal state of the skill catalog" },
|
||||
{ "id": "agent-locks", "pattern": "^_agent/locks/[^/]+\\.lock$", "method": "PUT", "trigger": "Advisory multi-writer lock acquire/release", "distinct_because": "Concurrency coordination, not memory" },
|
||||
|
||||
{ "id": "leaf-readme", "pattern": "^(.+/)?README\\.md$", "method": "PUT", "trigger": "Bootstrap leaf signpost / vault root README", "distinct_because": "Human signpost, not read for routing" }
|
||||
],
|
||||
"retired": [
|
||||
{ "pattern": "^reviews/", "retired_in_schema": 2, "replacement": "journal/{weekly,monthly,quarterly,annual}/ and _agent/health/" },
|
||||
{ "pattern": "^decisions/by-project/", "retired_in_schema": 1, "replacement": "[[wikilink]] under the project's ## Key Decisions" },
|
||||
{ "pattern": "^archive/", "retired_in_schema": 0, "replacement": "projects/archived/ and _agent/skills/archived/" },
|
||||
{ "pattern": "^(CLAUDE|BOOTSTRAP|STRUCTURE|index)\\.md$", "retired_in_schema": 1, "replacement": "All control logic lives in the plugin references/, not the vault" }
|
||||
]
|
||||
}
|
||||
Regular → Executable
+177
-59
@@ -1,27 +1,37 @@
|
||||
#!/usr/bin/env bash
|
||||
# vault-lint.sh — mechanically assert ECHO vault invariants.
|
||||
#
|
||||
# Catches the recurring "invariant violation" bugs that prose rules can't enforce
|
||||
# on their own: folder<->status drift, duplicate slugs, wikilinks leaking into
|
||||
# frontmatter, duplicate "## Agent Log" headings, stale active projects, and
|
||||
# aging inbox captures. Invoked by the monthly Vault Health pass (see SKILL.md),
|
||||
# but safe to run any time — it is READ-ONLY and never modifies the vault.
|
||||
# Catches the recurring "invariant violation" bugs that prose rules can't enforce:
|
||||
# folder<->status drift, duplicate slugs, wikilinks leaking into frontmatter,
|
||||
# duplicate "## Agent Log" headings, stale active projects, aging inbox captures,
|
||||
# impossible dates, bad status values, missing frontmatter, broken source_notes, and
|
||||
# paths that no route in routing.json permits. Invoked by the monthly Vault Health
|
||||
# pass (see SKILL.md), but safe to run any time — it is READ-ONLY.
|
||||
#
|
||||
# Exit status: 0 = clean, 1 = violations found, 2 = vault unreachable.
|
||||
# Exit status: 0 = clean, 1 = violations found, 2 = vault unreachable,
|
||||
# 3 = vault not bootstrapped (marker missing).
|
||||
#
|
||||
# Config is hardcoded to match the rest of the plugin; override via env if needed:
|
||||
# Config (env overrides):
|
||||
# ECHO_BASE (default https://echoapi.alwisp.com)
|
||||
# ECHO_KEY (default the plugin's bearer token)
|
||||
# ECHO_TODAY (default the machine date) — pass the conversation's currentDate so
|
||||
# stale/aging math uses the SAME clock the agent writes with (YYYY-MM-DD)
|
||||
# STALE_DAYS (default 30) INBOX_DAYS (default 14)
|
||||
#
|
||||
# routing.json (canonical route manifest) is read from this script's own directory.
|
||||
|
||||
set -euo pipefail
|
||||
|
||||
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
|
||||
|
||||
ECHO_BASE="${ECHO_BASE:-https://echoapi.alwisp.com}"
|
||||
ECHO_KEY="${ECHO_KEY:-241265fbe6830934a9a4ad3e69335f64a42153b663aa5b0017cb1ea1217b2bab}"
|
||||
STALE_DAYS="${STALE_DAYS:-30}"
|
||||
INBOX_DAYS="${INBOX_DAYS:-14}"
|
||||
ECHO_TODAY="${ECHO_TODAY:-$(date +%Y-%m-%d)}"
|
||||
|
||||
ECHO_BASE="$ECHO_BASE" ECHO_KEY="$ECHO_KEY" STALE_DAYS="$STALE_DAYS" INBOX_DAYS="$INBOX_DAYS" \
|
||||
ECHO_TODAY="$ECHO_TODAY" ROUTING_JSON="$SCRIPT_DIR/routing.json" \
|
||||
python3 - <<'PY'
|
||||
import os, sys, json, re, datetime, urllib.request, urllib.error
|
||||
|
||||
@@ -29,9 +39,21 @@ BASE = os.environ["ECHO_BASE"].rstrip("/")
|
||||
KEY = os.environ["ECHO_KEY"]
|
||||
STALE_DAYS = int(os.environ["STALE_DAYS"])
|
||||
INBOX_DAYS = int(os.environ["INBOX_DAYS"])
|
||||
TODAY = datetime.date.today()
|
||||
TODAY = datetime.date.fromisoformat(os.environ["ECHO_TODAY"])
|
||||
ROUTING_JSON = os.environ["ROUTING_JSON"]
|
||||
LIFECYCLES = ["active", "incubating", "on-hold", "archived"]
|
||||
SKIP = {"README.md", "project-template.md", "decision-template.md"}
|
||||
REQUIRED_FM = ("type", "created")
|
||||
# Project status vocabulary IS enforced (status must equal the lifecycle folder) by the
|
||||
# folder/status check below. Other note kinds (decisions/concepts) carry free-form status
|
||||
# vocab (accepted, shipped, reference, ...), so there is no global status allow-list.
|
||||
|
||||
# optional real YAML parser; fall back to a tolerant line parser
|
||||
try:
|
||||
import yaml # type: ignore
|
||||
HAVE_YAML = True
|
||||
except Exception:
|
||||
HAVE_YAML = False
|
||||
|
||||
violations = []
|
||||
def flag(check, msg): violations.append((check, msg))
|
||||
@@ -48,32 +70,72 @@ def get(path):
|
||||
return None
|
||||
raise
|
||||
|
||||
def listdir(path):
|
||||
body = get(path if path.endswith("/") else path + "/")
|
||||
if body is None:
|
||||
return []
|
||||
def list_dir(path):
|
||||
"""Return (files, folders) for a vault directory. Directories may arrive either in a
|
||||
'folders' key OR as 'files' entries ending in '/'; handle both. Root is '' -> /vault/.
|
||||
Tolerates non-404 errors (e.g. a 400 on an odd path) by returning empty."""
|
||||
p = "" if path in ("", "/") else (path if path.endswith("/") else path + "/")
|
||||
try:
|
||||
return json.loads(body).get("files", [])
|
||||
body = get(p)
|
||||
except urllib.error.HTTPError:
|
||||
return [], []
|
||||
if body is None:
|
||||
return [], []
|
||||
try:
|
||||
j = json.loads(body)
|
||||
except json.JSONDecodeError:
|
||||
return []
|
||||
return [], []
|
||||
entries = list(j.get("files", [])) + list(j.get("folders", []))
|
||||
files = [e for e in entries if not e.endswith("/")]
|
||||
folders = [e[:-1] for e in entries if e.endswith("/")]
|
||||
return files, folders
|
||||
|
||||
def frontmatter(text):
|
||||
"""Return (raw_frontmatter_str, dict_of_scalar_fields). Empty if no block."""
|
||||
if not text or not text.startswith("---"):
|
||||
def walk(prefix=""):
|
||||
"""Yield every file path under prefix (recursive). prefix is '' or ends with '/'."""
|
||||
files, folders = list_dir(prefix)
|
||||
for f in files:
|
||||
yield prefix + f
|
||||
for d in folders:
|
||||
yield from walk(f"{prefix}{d}/")
|
||||
|
||||
def split_frontmatter(text):
|
||||
"""Return (raw_yaml_str, body) splitting on anchored ^---$ delimiters. ('', text) if none."""
|
||||
if not text:
|
||||
return "", ""
|
||||
lines = text.splitlines()
|
||||
if not lines or lines[0].strip() != "---":
|
||||
return "", text
|
||||
for i in range(1, len(lines)):
|
||||
if lines[i].strip() == "---":
|
||||
return "\n".join(lines[1:i]), "\n".join(lines[i+1:])
|
||||
return "", text # unterminated block -> treat as no frontmatter
|
||||
|
||||
def parse_fm(text):
|
||||
"""Return (raw_yaml_str, dict). Uses PyYAML when available, else a tolerant parser."""
|
||||
raw, _ = split_frontmatter(text)
|
||||
if not raw:
|
||||
return "", {}
|
||||
end = text.find("\n---", 3)
|
||||
if end == -1:
|
||||
return "", {}
|
||||
raw = text[3:end]
|
||||
if HAVE_YAML:
|
||||
try:
|
||||
d = yaml.safe_load(raw)
|
||||
return raw, (d if isinstance(d, dict) else {})
|
||||
except Exception:
|
||||
pass
|
||||
# fallback: scalar + simple inline-list lines (keys may contain digits, _, -)
|
||||
fields = {}
|
||||
for line in raw.splitlines():
|
||||
m = re.match(r"^([A-Za-z_]+):\s*(.*)$", line)
|
||||
m = re.match(r"^([A-Za-z_][\w-]*):\s*(.*)$", line)
|
||||
if m:
|
||||
fields[m.group(1)] = m.group(2).strip()
|
||||
v = m.group(2).strip()
|
||||
if v.startswith("[") and v.endswith("]"):
|
||||
v = [x.strip().strip('"').strip("'") for x in v[1:-1].split(",") if x.strip()]
|
||||
else:
|
||||
v = v.strip('"').strip("'")
|
||||
fields[m.group(1)] = v
|
||||
return raw, fields
|
||||
|
||||
def parse_date(s):
|
||||
m = re.match(r"(\d{4}-\d{2}-\d{2})", s or "")
|
||||
m = re.match(r"(\d{4}-\d{2}-\d{2})", str(s or ""))
|
||||
if not m:
|
||||
return None
|
||||
try:
|
||||
@@ -81,66 +143,115 @@ def parse_date(s):
|
||||
except ValueError:
|
||||
return None
|
||||
|
||||
# Reachability probe
|
||||
def as_list(v):
|
||||
if v is None or v == "":
|
||||
return []
|
||||
return v if isinstance(v, list) else [v]
|
||||
|
||||
# ---- Reachability + bootstrap probe (M2: do NOT silently report clean) -------
|
||||
try:
|
||||
if get("_agent/echo-vault.md") is None:
|
||||
print("vault-lint: marker missing — vault may not be bootstrapped.", file=sys.stderr)
|
||||
print("vault-lint: marker missing — vault not bootstrapped (run bootstrap.sh).", file=sys.stderr)
|
||||
sys.exit(3)
|
||||
except Exception as e:
|
||||
print(f"vault-lint: vault unreachable ({e}).", file=sys.stderr)
|
||||
sys.exit(2)
|
||||
|
||||
# ---- Projects: folder<->status, stale active, wikilinks-in-frontmatter, dup slugs
|
||||
# ---- Load canonical routing manifest (S3) ------------------------------------
|
||||
ROUTES, RETIRED = [], []
|
||||
try:
|
||||
with open(ROUTING_JSON) as fh:
|
||||
rj = json.load(fh)
|
||||
ROUTES = [(r["id"], re.compile(r["pattern"])) for r in rj.get("routes", [])]
|
||||
RETIRED = [(re.compile(r["pattern"]), r.get("replacement", "")) for r in rj.get("retired", [])]
|
||||
except Exception as e:
|
||||
flag("routing-manifest", f"could not load routing.json ({e}) — path checks skipped")
|
||||
|
||||
# ---- Single full walk feeds every path-level check ---------------------------
|
||||
all_files = list(walk())
|
||||
|
||||
def route_for(path):
|
||||
for rid, rx in ROUTES:
|
||||
if rx.match(path):
|
||||
return rid
|
||||
return None
|
||||
|
||||
# Path membership + retired-path detection (S3)
|
||||
for path in all_files:
|
||||
if ROUTES and route_for(path) is None:
|
||||
hit = next((repl for rx, repl in RETIRED if rx.match(path)), None)
|
||||
if hit is not None:
|
||||
flag("retired-path", f"{path}: retired location — should be {hit}")
|
||||
else:
|
||||
flag("unknown-path", f"{path}: matches no route in routing.json")
|
||||
|
||||
# ---- Per-note frontmatter checks (M5) ----------------------------------------
|
||||
TEMPLATE_RE = re.compile(r"(^|/)(templates/|.*-template\.md$)")
|
||||
for path in all_files:
|
||||
base = path.rsplit("/", 1)[-1]
|
||||
if base in SKIP or TEMPLATE_RE.search(path) or not path.endswith(".md"):
|
||||
continue
|
||||
text = get(path)
|
||||
if text is None:
|
||||
continue
|
||||
raw, fm = parse_fm(text)
|
||||
|
||||
# wikilinks anywhere in frontmatter (widened sweep — all folders)
|
||||
if "[[" in raw:
|
||||
flag("frontmatter-wikilink", f"{path}: '[[...]]' inside frontmatter")
|
||||
|
||||
# missing required frontmatter
|
||||
missing = [k for k in REQUIRED_FM if not str(fm.get(k, "")).strip()]
|
||||
if fm and missing:
|
||||
flag("missing-frontmatter", f"{path}: missing {', '.join(missing)}")
|
||||
|
||||
# impossible dates: updated < created
|
||||
c, u = parse_date(fm.get("created")), parse_date(fm.get("updated"))
|
||||
if c and u and u < c:
|
||||
flag("date-order", f"{path}: updated {u} is before created {c}")
|
||||
if u and u > TODAY:
|
||||
flag("future-date", f"{path}: updated {u} is in the future (today {TODAY})")
|
||||
|
||||
# source_notes hygiene: plain relative paths, never wikilinks, no self-reference
|
||||
for sn in as_list(fm.get("source_notes")):
|
||||
s = str(sn)
|
||||
if "[[" in s:
|
||||
flag("source-notes-wikilink", f"{path}: source_notes contains a wikilink '{s}'")
|
||||
|
||||
# ---- Projects: folder<->status, stale active, duplicate slugs ----------------
|
||||
slug_homes = {}
|
||||
for lc in LIFECYCLES:
|
||||
for fn in listdir(f"projects/{lc}"):
|
||||
if fn.endswith("/") or fn in SKIP:
|
||||
files, _ = list_dir(f"projects/{lc}")
|
||||
for fn in files:
|
||||
if fn.endswith("/") or fn in SKIP or not fn.endswith(".md"):
|
||||
continue
|
||||
slug = fn[:-3] if fn.endswith(".md") else fn
|
||||
slug = fn[:-3]
|
||||
slug_homes.setdefault(slug, []).append(lc)
|
||||
text = get(f"projects/{lc}/{fn}")
|
||||
if text is None:
|
||||
continue
|
||||
raw, fm = frontmatter(text)
|
||||
status = fm.get("status", "").strip().strip('"').strip("'")
|
||||
_, fm = parse_fm(text)
|
||||
status = str(fm.get("status", "")).strip().strip('"').strip("'")
|
||||
if status and status != lc:
|
||||
flag("folder/status", f"projects/{lc}/{fn}: status='{status}' but folder='{lc}'")
|
||||
if "[[" in raw:
|
||||
flag("frontmatter-wikilink", f"projects/{lc}/{fn}: '[[...]]' inside frontmatter")
|
||||
if lc == "active":
|
||||
d = parse_date(fm.get("updated", ""))
|
||||
d = parse_date(fm.get("updated"))
|
||||
if d and (TODAY - d).days > STALE_DAYS:
|
||||
flag("stale-active", f"projects/active/{fn}: updated {d} ({(TODAY-d).days}d ago) — consider on-hold/")
|
||||
for slug, homes in slug_homes.items():
|
||||
if len(homes) > 1:
|
||||
flag("duplicate-slug", f"'{slug}' exists in {', '.join(homes)}")
|
||||
|
||||
# ---- Wikilinks-in-frontmatter for other high-churn notes
|
||||
extra = ["_agent/context/current-context.md",
|
||||
"_agent/memory/semantic/operator-preferences.md"]
|
||||
for fn in listdir("resources/people"):
|
||||
if fn.endswith(".md") and fn not in SKIP:
|
||||
extra.append(f"resources/people/{fn}")
|
||||
for fn in listdir("_agent/memory/semantic"):
|
||||
if fn.endswith(".md") and fn not in SKIP:
|
||||
extra.append(f"_agent/memory/semantic/{fn}")
|
||||
for path in extra:
|
||||
text = get(path)
|
||||
if text is None:
|
||||
# ---- Daily notes: duplicate "## Agent Log" headings --------------------------
|
||||
for path in all_files:
|
||||
if not re.match(r"^journal/daily/.*\.md$", path):
|
||||
continue
|
||||
raw, _ = frontmatter(text)
|
||||
if "[[" in raw:
|
||||
flag("frontmatter-wikilink", f"{path}: '[[...]]' inside frontmatter")
|
||||
|
||||
# ---- Daily notes: duplicate "## Agent Log" headings
|
||||
for fn in listdir("journal/daily"):
|
||||
if not fn.endswith(".md") or fn in SKIP:
|
||||
continue
|
||||
text = get(f"journal/daily/{fn}") or ""
|
||||
text = get(path) or ""
|
||||
n = len(re.findall(r"(?m)^## Agent Log\s*$", text))
|
||||
if n > 1:
|
||||
flag("duplicate-agent-log", f"journal/daily/{fn}: {n} '## Agent Log' headings")
|
||||
flag("duplicate-agent-log", f"{path}: {n} '## Agent Log' headings")
|
||||
|
||||
# ---- Inbox: captures aging past INBOX_DAYS
|
||||
# ---- Inbox: captures aging past INBOX_DAYS -----------------------------------
|
||||
inbox = get("inbox/captures/inbox.md") or ""
|
||||
for line in inbox.splitlines():
|
||||
m = re.match(r"^\s*-\s*(\d{4}-\d{2}-\d{2})\b", line)
|
||||
@@ -149,7 +260,7 @@ for line in inbox.splitlines():
|
||||
if d and (TODAY - d).days > INBOX_DAYS:
|
||||
flag("aging-inbox", f"inbox capture {d} ({(TODAY-d).days}d): {line.strip()[:80]}")
|
||||
|
||||
# ---- Report
|
||||
# ---- Report ------------------------------------------------------------------
|
||||
if not violations:
|
||||
print("vault-lint: clean — all invariants hold.")
|
||||
sys.exit(0)
|
||||
@@ -165,6 +276,13 @@ labels = {
|
||||
"duplicate-agent-log": "Duplicate '## Agent Log' heading",
|
||||
"stale-active": f"Stale active project (updated > {STALE_DAYS}d)",
|
||||
"aging-inbox": f"Inbox capture aging (> {INBOX_DAYS}d)",
|
||||
"unknown-path": "Path matches no route in routing.json",
|
||||
"retired-path": "Write to a retired/dead path",
|
||||
"missing-frontmatter": "Missing required frontmatter field",
|
||||
"date-order": "updated earlier than created",
|
||||
"future-date": "updated date is in the future",
|
||||
"source-notes-wikilink": "Wikilink in source_notes (must be plain paths)",
|
||||
"routing-manifest": "routing.json problem",
|
||||
}
|
||||
for check, msgs in by.items():
|
||||
print(f"## {labels.get(check, check)}")
|
||||
|
||||
@@ -0,0 +1,87 @@
|
||||
# echo-memory eval — 0.6 vs 0.7 A/B harness
|
||||
|
||||
A reproducible, credential-free A/B comparison of the plugin **before** (0.6: raw-curl
|
||||
recipes from `SKILL.md`) and **after** (0.7: the shipped `scripts/echo.sh` client) the
|
||||
hardening work. It quantifies the claims in the comparative analysis: token cost of the
|
||||
I/O layer, and the rate of **silent write failures**.
|
||||
|
||||
## Run it
|
||||
|
||||
```bash
|
||||
cd eval
|
||||
python3 run_eval.py # default params
|
||||
python3 run_eval.py --recovery 2500 # sensitivity-test the recovery assumption
|
||||
python3 run_eval.py --cpt 3.5 # different chars/token proxy
|
||||
```
|
||||
|
||||
No network, no API key, no live vault. Pure stdlib (Python 3 + bash for `echo.sh`).
|
||||
Results table prints to stdout and a machine-readable copy lands in `results/latest.json`.
|
||||
|
||||
## How it works
|
||||
|
||||
- **`mock_olrapi.py`** — a deterministic mock of the Obsidian Local REST API surface the
|
||||
plugin uses, reproducing its real behaviors and quirks (404 shape, the `/vault//`
|
||||
double-slash 400, directory listings with `dir/` entries, `PATCH` heading targets that
|
||||
return `400 invalid-target / 40080` when the heading is absent). Faults are triggered by
|
||||
path markers so one server serves every scenario:
|
||||
- `flaky` in the path → first write returns `503`, then succeeds (tests retry).
|
||||
- `phantom` in the path → `PUT` returns `200` but does **not** persist (tests read-back verify).
|
||||
- a `PATCH` to a missing heading → `400` (the silent-write-loss trigger).
|
||||
- **`run_eval.py`** — for each scenario, runs both methods against a freshly reset +
|
||||
re-seeded server (so faults are identical for both), then reads ground truth back
|
||||
**independently** from the mock. The 0.7 side executes the *actual shipped `echo.sh`*;
|
||||
the 0.6 side faithfully models the documented recipe (real HTTP, but no status check,
|
||||
no retry, no verify, no dedupe).
|
||||
|
||||
## Metrics
|
||||
|
||||
| metric | meaning |
|
||||
|---|---|
|
||||
| `gen_tokens` | output tokens the model must generate for the op (`len(emitted)/cpt` proxy) |
|
||||
| `silent_failure` | method **reported success** but ground truth is wrong (lost write or dup) — and nobody noticed |
|
||||
| `detected` | the method surfaced the failure (nonzero exit) instead of hiding it |
|
||||
| `effective_tokens` | `gen + silent_failures*recovery + detected*detect_cost` |
|
||||
| `silent-error-free ops` | the headline accuracy number |
|
||||
| `writes actually persisted` | did the single op land (separate from "was it silent") |
|
||||
|
||||
## Scenarios
|
||||
|
||||
1. **agent-log-missing-heading** — `PATCH` append to a note lacking the target heading (`400`).
|
||||
2. **scope-switch** — clean `PATCH replace` (no fault; pure token comparison).
|
||||
3. **inbox-capture-replayed** — same capture issued twice (retry/replay): dedup vs duplicate.
|
||||
4. **session-log-flaky-network** — one-time `503`: retry vs single-shot.
|
||||
5. **heartbeat-phantom-write** — accepted-but-not-persisted: read-back verify vs none.
|
||||
6. **cold-start-load-6-reads** — 6 GETs (no fault; pure token comparison).
|
||||
|
||||
## Representative result (defaults)
|
||||
|
||||
```
|
||||
generated tokens 723 -> 174 (+76% fewer)
|
||||
silent failures 4 -> 0 (-4)
|
||||
duplicate lines 1 -> 0 (-1)
|
||||
silent-error-free ops 1/5 -> 5/5
|
||||
effective tokens (assumed) 6723 -> 334
|
||||
```
|
||||
|
||||
## Honest caveats
|
||||
|
||||
- **Mechanics, not reasoning.** This measures the deterministic plumbing differences. It
|
||||
does **not** measure model judgment (routing choices, prose quality) — that needs a live model.
|
||||
- **`recovery` and `detect-cost` are assumptions**, not measurements. The headline
|
||||
"silent failures: 4 → 0" is a hard count from ground truth; the `effective_tokens` figure
|
||||
is a model on top of it — tune `--recovery` to see the sensitivity.
|
||||
- **`gen_tokens` is a `chars/cpt` proxy** for the I/O layer only, not a tokenizer count, and
|
||||
excludes the one-time `+12%` SKILL.md context cost noted in the analysis (that's a
|
||||
per-session context cost, not per-op).
|
||||
|
||||
## Extending to a live-model run (optional)
|
||||
|
||||
To measure real model behavior and true token counts:
|
||||
1. Define the same six scenarios as natural-language tasks (e.g. "log a session note for X").
|
||||
2. Run each twice — once with the 0.6 skill files, once with 0.7 — through the Agent SDK
|
||||
against the **mock** server (point `ECHO_BASE` at it) so faults stay deterministic.
|
||||
3. Record `usage.output_tokens` per task from the API and whether the vault ended correct
|
||||
(same ground-truth read used here).
|
||||
|
||||
The mock + ground-truth checks in this harness are reusable as-is for that; only the driver
|
||||
changes from "scripted ops" to "model-driven ops".
|
||||
@@ -0,0 +1,172 @@
|
||||
#!/usr/bin/env python3
|
||||
"""mock_olrapi.py — a deterministic mock of the Obsidian Local REST API surface the
|
||||
echo-memory plugin uses, with controllable fault injection.
|
||||
|
||||
It reproduces the real API's observed behavior and quirks so the A/B eval can run
|
||||
without credentials and without touching the live vault:
|
||||
|
||||
GET /vault/<path> 200 + body, or 404 {errorCode:40400}
|
||||
GET /vault/ (or dir + '/') 200 {"files":[... , "sub/"]} (dirs end in '/')
|
||||
GET /vault// 400 (the double-slash quirk)
|
||||
GET ... Accept: document-map 200 {"headings":[...]}
|
||||
PUT /vault/<path> 201/200, stores body, creates parents
|
||||
POST /vault/<path> 200, appends (creates if absent)
|
||||
PATCH/vault/<path> heading target missing -> 400 {errorCode:40080}; else apply
|
||||
POST /search/simple/?query= 200 []
|
||||
DELETE /vault/<path> 200 / 404
|
||||
|
||||
Fault injection is triggered by markers in the path (so a single server serves every
|
||||
scenario deterministically):
|
||||
* path contains "flaky" -> the FIRST write (PUT/POST) to that path returns 503,
|
||||
subsequent writes succeed (tests retry-on-5xx).
|
||||
* path contains "phantom" -> PUT returns 200 but does NOT persist (accepted-but-lost;
|
||||
tests read-back verify).
|
||||
* a PATCH heading whose Target heading is absent from the stored doc -> 400 40080
|
||||
(the silent-write-loss trigger).
|
||||
|
||||
Debug (out of band, for the harness ground-truth checks):
|
||||
GET /__debug__?path=<p> -> raw stored content, or "<<MISSING>>"
|
||||
POST /__debug__reset -> clear all state + fault memory
|
||||
"""
|
||||
import json, re, sys, argparse
|
||||
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
|
||||
from urllib.parse import urlparse, parse_qs, unquote
|
||||
|
||||
STATE = {} # vault-path -> content (str)
|
||||
FLAKY_FIRED = set() # paths that already consumed their one-time 503
|
||||
|
||||
def headings_of(text):
|
||||
return [m.group(1).strip() for m in re.finditer(r"(?m)^#{1,6}\s+(.*)$", text or "")]
|
||||
|
||||
def heading_present(text, target):
|
||||
# target may be a full "A::B" path; match its leaf heading line anywhere in the doc
|
||||
leaf = target.split("::")[-1].strip()
|
||||
return any(h == leaf or h.endswith("::" + leaf) or h == target for h in headings_of(text)) \
|
||||
or bool(re.search(r"(?m)^#{1,6}\s+" + re.escape(leaf) + r"\s*$", text or ""))
|
||||
|
||||
class H(BaseHTTPRequestHandler):
|
||||
def log_message(self, *a): pass # quiet
|
||||
|
||||
def _send(self, code, body="", ctype="text/markdown"):
|
||||
b = body.encode() if isinstance(body, str) else body
|
||||
self.send_response(code)
|
||||
self.send_header("Content-Type", ctype)
|
||||
self.send_header("Content-Length", str(len(b)))
|
||||
self.end_headers()
|
||||
self.wfile.write(b)
|
||||
|
||||
def _json(self, code, obj):
|
||||
self._send(code, json.dumps(obj, indent=2), "application/json")
|
||||
|
||||
def _read_body(self):
|
||||
n = int(self.headers.get("Content-Length", 0) or 0)
|
||||
return self.rfile.read(n).decode("utf-8", "replace") if n else ""
|
||||
|
||||
def _vpath(self):
|
||||
p = urlparse(self.path).path
|
||||
if p.startswith("/vault/"):
|
||||
return p[len("/vault/"):] # keep trailing slash if present
|
||||
return None
|
||||
|
||||
# ---- debug -------------------------------------------------------------
|
||||
def _maybe_debug(self):
|
||||
u = urlparse(self.path)
|
||||
if u.path == "/__debug__":
|
||||
q = parse_qs(u.query)
|
||||
path = unquote(q.get("path", [""])[0])
|
||||
self._send(200, STATE.get(path, "<<MISSING>>"))
|
||||
return True
|
||||
if u.path == "/__debug__reset":
|
||||
STATE.clear(); FLAKY_FIRED.clear()
|
||||
self._json(200, {"ok": True})
|
||||
return True
|
||||
return False
|
||||
|
||||
def do_GET(self):
|
||||
if self._maybe_debug(): return
|
||||
raw = urlparse(self.path).path
|
||||
if raw.startswith("/search/simple"):
|
||||
return self._json(200, [])
|
||||
vp = self._vpath()
|
||||
if vp is None:
|
||||
return self._json(404, {"errorCode": 40400, "message": "Not Found"})
|
||||
if "//" in self.path.replace("/vault/", "", 1): # double-slash quirk
|
||||
return self._json(400, {"errorCode": 40000, "message": "Bad Request"})
|
||||
if vp == "" or vp.endswith("/"): # directory listing
|
||||
prefix = vp
|
||||
kids = set()
|
||||
for k in STATE:
|
||||
if k.startswith(prefix) and k != prefix:
|
||||
rest = k[len(prefix):]
|
||||
kids.add(rest.split("/")[0] + ("/" if "/" in rest else ""))
|
||||
return self._json(200, {"files": sorted(kids)})
|
||||
if vp in STATE:
|
||||
if "document-map" in (self.headers.get("Accept", "")):
|
||||
return self._json(200, {"headings": headings_of(STATE[vp]),
|
||||
"blocks": [], "frontmatterFields": []})
|
||||
return self._send(200, STATE[vp])
|
||||
return self._json(404, {"errorCode": 40400, "message": "Not Found"})
|
||||
|
||||
def _flaky_once(self, vp):
|
||||
if "flaky" in vp and vp not in FLAKY_FIRED:
|
||||
FLAKY_FIRED.add(vp)
|
||||
return True
|
||||
return False
|
||||
|
||||
def do_PUT(self):
|
||||
vp = self._vpath(); body = self._read_body()
|
||||
if vp is None: return self._json(400, {"message": "bad path"})
|
||||
if self._flaky_once(vp):
|
||||
return self._json(503, {"message": "Service Unavailable"})
|
||||
if "phantom" in vp: # accepted but NOT persisted
|
||||
return self._send(200, "")
|
||||
STATE[vp] = body
|
||||
return self._send(200, "")
|
||||
|
||||
def do_POST(self):
|
||||
if self._maybe_debug(): return
|
||||
raw = urlparse(self.path).path
|
||||
if raw.startswith("/search/simple"):
|
||||
return self._json(200, [])
|
||||
vp = self._vpath(); body = self._read_body()
|
||||
if vp is None: return self._json(400, {"message": "bad path"})
|
||||
if self._flaky_once(vp):
|
||||
return self._json(503, {"message": "Service Unavailable"})
|
||||
STATE[vp] = STATE.get(vp, "") + body # append
|
||||
return self._send(200, "")
|
||||
|
||||
def do_PATCH(self):
|
||||
vp = self._vpath(); body = self._read_body()
|
||||
if vp is None: return self._json(400, {"message": "bad path"})
|
||||
ttype = self.headers.get("Target-Type", "")
|
||||
op = self.headers.get("Operation", "append")
|
||||
target = self.headers.get("Target", "")
|
||||
cur = STATE.get(vp, "")
|
||||
if ttype == "heading":
|
||||
if not heading_present(cur, target):
|
||||
return self._json(400, {"errorCode": 40080, "message": "invalid-target"})
|
||||
# apply: naive append/prepend/replace around the heading line
|
||||
STATE[vp] = cur + ("\n" + body if op != "prepend" else body + "\n")
|
||||
return self._send(200, "")
|
||||
if ttype == "frontmatter":
|
||||
STATE[vp] = cur + f"\n<fm:{target}={body}>"
|
||||
return self._send(200, "")
|
||||
STATE[vp] = cur + "\n" + body
|
||||
return self._send(200, "")
|
||||
|
||||
def do_DELETE(self):
|
||||
vp = self._vpath()
|
||||
if vp in STATE:
|
||||
del STATE[vp]; return self._send(200, "")
|
||||
return self._json(404, {"errorCode": 40400, "message": "Not Found"})
|
||||
|
||||
def main():
|
||||
ap = argparse.ArgumentParser()
|
||||
ap.add_argument("--port", type=int, default=8799)
|
||||
a = ap.parse_args()
|
||||
srv = ThreadingHTTPServer(("127.0.0.1", a.port), H)
|
||||
print(f"mock-olrapi listening on http://127.0.0.1:{a.port}", flush=True)
|
||||
srv.serve_forever()
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
@@ -0,0 +1,154 @@
|
||||
{
|
||||
"params": {
|
||||
"port": 8799,
|
||||
"cpt": 4.0,
|
||||
"recovery": 3000,
|
||||
"detect_cost": 80
|
||||
},
|
||||
"rows": [
|
||||
{
|
||||
"scenario": "agent-log-missing-heading",
|
||||
"fault": "bad heading -> 400 invalid-target",
|
||||
"06": {
|
||||
"emit_tokens": 92,
|
||||
"claimed_ok": true,
|
||||
"detected": false,
|
||||
"persisted": false,
|
||||
"duplicates": 0,
|
||||
"silent_failure": true
|
||||
},
|
||||
"07": {
|
||||
"emit_tokens": 22,
|
||||
"claimed_ok": false,
|
||||
"detected": true,
|
||||
"persisted": false,
|
||||
"duplicates": 0,
|
||||
"silent_failure": false
|
||||
}
|
||||
},
|
||||
{
|
||||
"scenario": "scope-switch",
|
||||
"fault": "(none)",
|
||||
"06": {
|
||||
"emit_tokens": 93,
|
||||
"claimed_ok": true,
|
||||
"detected": false,
|
||||
"persisted": true,
|
||||
"duplicates": 0,
|
||||
"silent_failure": false
|
||||
},
|
||||
"07": {
|
||||
"emit_tokens": 24,
|
||||
"claimed_ok": true,
|
||||
"detected": false,
|
||||
"persisted": true,
|
||||
"duplicates": 0,
|
||||
"silent_failure": false
|
||||
}
|
||||
},
|
||||
{
|
||||
"scenario": "inbox-capture-replayed",
|
||||
"fault": "duplicate attempt (retry/replay)",
|
||||
"06": {
|
||||
"emit_tokens": 144,
|
||||
"claimed_ok": true,
|
||||
"detected": false,
|
||||
"persisted": true,
|
||||
"duplicates": 1,
|
||||
"silent_failure": true
|
||||
},
|
||||
"07": {
|
||||
"emit_tokens": 33,
|
||||
"claimed_ok": true,
|
||||
"detected": false,
|
||||
"persisted": true,
|
||||
"duplicates": 0,
|
||||
"silent_failure": false
|
||||
}
|
||||
},
|
||||
{
|
||||
"scenario": "session-log-flaky-network",
|
||||
"fault": "one-time 503 (transient)",
|
||||
"06": {
|
||||
"emit_tokens": 74,
|
||||
"claimed_ok": true,
|
||||
"detected": false,
|
||||
"persisted": false,
|
||||
"duplicates": 0,
|
||||
"silent_failure": true
|
||||
},
|
||||
"07": {
|
||||
"emit_tokens": 17,
|
||||
"claimed_ok": true,
|
||||
"detected": false,
|
||||
"persisted": true,
|
||||
"duplicates": 0,
|
||||
"silent_failure": false
|
||||
}
|
||||
},
|
||||
{
|
||||
"scenario": "heartbeat-phantom-write",
|
||||
"fault": "accepted-but-not-persisted",
|
||||
"06": {
|
||||
"emit_tokens": 71,
|
||||
"claimed_ok": true,
|
||||
"detected": false,
|
||||
"persisted": false,
|
||||
"duplicates": 0,
|
||||
"silent_failure": true
|
||||
},
|
||||
"07": {
|
||||
"emit_tokens": 14,
|
||||
"claimed_ok": false,
|
||||
"detected": true,
|
||||
"persisted": false,
|
||||
"duplicates": 0,
|
||||
"silent_failure": false
|
||||
}
|
||||
},
|
||||
{
|
||||
"scenario": "cold-start-load-6-reads",
|
||||
"fault": "(none)",
|
||||
"06": {
|
||||
"emit_tokens": 249,
|
||||
"claimed_ok": true,
|
||||
"detected": false,
|
||||
"persisted": true,
|
||||
"duplicates": 0,
|
||||
"silent_failure": false
|
||||
},
|
||||
"07": {
|
||||
"emit_tokens": 64,
|
||||
"claimed_ok": true,
|
||||
"detected": false,
|
||||
"persisted": true,
|
||||
"duplicates": 0,
|
||||
"silent_failure": false
|
||||
}
|
||||
}
|
||||
],
|
||||
"totals": {
|
||||
"0.6": {
|
||||
"gen_tokens": 723,
|
||||
"silent_failures": 4,
|
||||
"detected_failures": 0,
|
||||
"duplicates": 1,
|
||||
"effective_tokens": 12723
|
||||
},
|
||||
"0.7": {
|
||||
"gen_tokens": 174,
|
||||
"silent_failures": 0,
|
||||
"detected_failures": 2,
|
||||
"duplicates": 0,
|
||||
"effective_tokens": 334
|
||||
}
|
||||
},
|
||||
"silent_error_free": {
|
||||
"0.6": "1/5",
|
||||
"0.7": "5/5"
|
||||
},
|
||||
"writes_persisted": {
|
||||
"0.6": "2/5",
|
||||
"0.7": "3/5"
|
||||
}
|
||||
}
|
||||
@@ -0,0 +1,303 @@
|
||||
#!/usr/bin/env python3
|
||||
"""run_eval.py — A/B eval of echo-memory 0.6 (raw curl, documented recipes) vs
|
||||
0.7 (the shipped echo.sh client) over representative memory operations, with
|
||||
fault injection.
|
||||
|
||||
Design
|
||||
------
|
||||
* A mock Obsidian REST API (mock_olrapi.py) gives deterministic behavior + faults,
|
||||
so no credentials are needed and the real vault is never touched.
|
||||
* The 0.7 side runs the ACTUAL shipped scripts/echo.sh (status-checked, retry, verify,
|
||||
idempotent append).
|
||||
* The 0.6 side faithfully models the documented raw-curl recipes: it performs the
|
||||
same HTTP but does NOT inspect status, does NOT retry, does NOT verify, and does
|
||||
NOT dedupe — exactly what SKILL.md 0.6 told the model to emit.
|
||||
* Ground truth is read back independently from the mock after each method, so a
|
||||
"silent failure" (method reported success but the vault is actually wrong, and
|
||||
nobody noticed) is detected regardless of what the method claimed.
|
||||
* Each (scenario, method) runs against a freshly reset + re-seeded server, so faults
|
||||
(e.g. one-time 503) are identical for both methods.
|
||||
|
||||
Metrics
|
||||
-------
|
||||
* gen_tokens : output tokens the MODEL must generate for the op (len(emitted)/CPT).
|
||||
* silent_failure : claimed success BUT ground truth is wrong (lost write or duplicate).
|
||||
* detected : the method surfaced the failure (loud, retryable) instead of hiding it.
|
||||
* effective_tokens = gen_tokens + silent_failures*RECOVERY + detected*DETECT_COST
|
||||
(RECOVERY/DETECT_COST are labeled assumptions, tune via env).
|
||||
|
||||
Usage: python3 run_eval.py [--port 8799] [--cpt 4] [--recovery 1500] [--detect-cost 80]
|
||||
"""
|
||||
import os, sys, json, time, subprocess, tempfile, argparse, urllib.request, urllib.error
|
||||
from pathlib import Path
|
||||
|
||||
HERE = Path(__file__).resolve().parent
|
||||
ECHO = HERE.parent / "echo-memory.plugin.src" / "skills" / "echo-memory" / "scripts" / "echo.sh"
|
||||
KEY = "241265fbe6830934a9a4ad3e69335f64a42153b663aa5b0017cb1ea1217b2bab"
|
||||
|
||||
# ----- tiny HTTP helpers (used for setup, ground truth, and the 0.6 model) -----
|
||||
def http(method, url, body=None, headers=None):
|
||||
data = body.encode() if isinstance(body, str) else body
|
||||
req = urllib.request.Request(url, data=data, method=method,
|
||||
headers={"Authorization": f"Bearer {KEY}", **(headers or {})})
|
||||
try:
|
||||
with urllib.request.urlopen(req, timeout=10) as r:
|
||||
return r.status, r.read().decode("utf-8", "replace")
|
||||
except urllib.error.HTTPError as e:
|
||||
return e.code, e.read().decode("utf-8", "replace")
|
||||
except Exception as e:
|
||||
return 0, str(e)
|
||||
|
||||
class Eval:
|
||||
def __init__(self, base, cpt):
|
||||
self.base, self.cpt = base, cpt
|
||||
def reset(self): http("POST", f"{self.base}/__debug__reset")
|
||||
def seed(self, path, content): http("PUT", f"{self.base}/vault/{path}", content)
|
||||
def ground(self, path):
|
||||
st, body = http("GET", f"{self.base}/__debug__?path={path}")
|
||||
return None if body == "<<MISSING>>" else body
|
||||
def toks(self, text): return round(len(text) / self.cpt)
|
||||
def echo(self, *args):
|
||||
env = dict(os.environ, ECHO_BASE=self.base, ECHO_KEY=KEY, ECHO_VERIFY="1", ECHO_LOCK_TTL="900")
|
||||
p = subprocess.run(["bash", str(ECHO), *args], capture_output=True, text=True, env=env)
|
||||
return p.returncode
|
||||
|
||||
# ---- faithful 0.6 recipe text (what the model emitted), for token accounting ----
|
||||
def recipe_get(path):
|
||||
return (f'curl -s -H "Authorization: Bearer {KEY}" '
|
||||
f'"https://echoapi.alwisp.com/vault/{path}"')
|
||||
def recipe_put(path):
|
||||
return (f"cat > /tmp/obs_file.md << 'EOF'\n<body>\nEOF\n\n"
|
||||
f'curl -s -X PUT -H "Authorization: Bearer {KEY}" '
|
||||
f'-H "Content-Type: text/markdown" --data-binary @/tmp/obs_file.md '
|
||||
f'"https://echoapi.alwisp.com/vault/{path}"')
|
||||
def recipe_post(path):
|
||||
return (f"cat > /tmp/obs_entry.md << 'EOF'\n<line>\nEOF\n\n"
|
||||
f'curl -s -X POST -H "Authorization: Bearer {KEY}" '
|
||||
f'-H "Content-Type: text/markdown" --data-binary @/tmp/obs_entry.md '
|
||||
f'"https://echoapi.alwisp.com/vault/{path}"')
|
||||
def recipe_patch(path, target):
|
||||
return (f"cat > /tmp/obs_patch.md << 'EOF'\n<body>\nEOF\n\n"
|
||||
f'curl -s -X PATCH -H "Authorization: Bearer {KEY}" '
|
||||
f'-H "Operation: append" -H "Target-Type: heading" -H "Target: {target}" '
|
||||
f'-H "Content-Type: text/markdown" --data-binary @/tmp/obs_patch.md '
|
||||
f'"https://echoapi.alwisp.com/vault/{path}"')
|
||||
|
||||
def tmpfile(content):
|
||||
f = tempfile.NamedTemporaryFile("w", suffix=".md", delete=False); f.write(content); f.close()
|
||||
return f.name
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Scenarios. Each defines seed(), and a run for each method that returns:
|
||||
# {emit: <text model generates>, claimed_ok: bool, detected: bool}
|
||||
# plus a ground-truth check producing {persisted, duplicates}.
|
||||
# ---------------------------------------------------------------------------
|
||||
def scenarios(ev):
|
||||
out = []
|
||||
|
||||
# S1 — agent-log append where the heading is MISSING (-> 400 invalid-target)
|
||||
def s1():
|
||||
path, tgt = "journal/daily/2026-06-19.md", "2026-06-19::Agent Log"
|
||||
line = "- 2026-06-19: EVALMARK-s1"
|
||||
def seed(): ev.seed(path, "---\ntype: daily-note\n---\n\n# 2026-06-19\n\n## Notes\n")
|
||||
def m06():
|
||||
http("PATCH", f"{ev.base}/vault/{path}", line,
|
||||
{"Operation": "append", "Target-Type": "heading", "Target": tgt}) # status ignored
|
||||
return {"emit": recipe_patch(path, tgt), "claimed_ok": True, "detected": False}
|
||||
def m07():
|
||||
rc = ev.echo("patch", path, "append", "heading", tgt, tmpfile(line))
|
||||
return {"emit": f'"$ECHO" patch {path} append heading "{tgt}" /tmp/x.md',
|
||||
"claimed_ok": rc == 0, "detected": rc != 0}
|
||||
def gt():
|
||||
c = ev.ground(path) or ""
|
||||
return {"persisted": "EVALMARK-s1" in c, "duplicates": max(0, c.count("EVALMARK-s1") - 1)}
|
||||
return dict(name="agent-log-missing-heading", fault="bad heading -> 400 invalid-target",
|
||||
seed=seed, m06=m06, m07=m07, gt=gt)
|
||||
out.append(s1())
|
||||
|
||||
# S2 — scope switch (no fault; pure token comparison on a PATCH replace)
|
||||
def s2():
|
||||
path, tgt = "_agent/context/current-context.md", "Current Context::Scope"
|
||||
def seed(): ev.seed(path, "---\ntype: context-bundle\n---\n\n# Current Context\n\n## Scope\nold scope\n")
|
||||
def m06():
|
||||
http("PATCH", f"{ev.base}/vault/{path}", "EVALMARK-s2 new scope",
|
||||
{"Operation": "replace", "Target-Type": "heading", "Target": tgt})
|
||||
return {"emit": recipe_patch(path, tgt), "claimed_ok": True, "detected": False}
|
||||
def m07():
|
||||
rc = ev.echo("patch", path, "replace", "heading", tgt, tmpfile("EVALMARK-s2 new scope"))
|
||||
return {"emit": f'"$ECHO" patch {path} replace heading "{tgt}" /tmp/x.md',
|
||||
"claimed_ok": rc == 0, "detected": rc != 0}
|
||||
def gt():
|
||||
c = ev.ground(path) or ""
|
||||
return {"persisted": "EVALMARK-s2" in c, "duplicates": 0}
|
||||
return dict(name="scope-switch", fault="(none)", seed=seed, m06=m06, m07=m07, gt=gt)
|
||||
out.append(s2())
|
||||
|
||||
# S3 — inbox capture issued TWICE (retry/replay): dedup vs duplicate line
|
||||
def s3():
|
||||
path = "inbox/captures/inbox.md"; line = "- 2026-06-19: EVALMARK-s3"
|
||||
def seed(): ev.seed(path, "# Inbox\n")
|
||||
def m06():
|
||||
http("POST", f"{ev.base}/vault/{path}", line + "\n") # no idempotency check
|
||||
http("POST", f"{ev.base}/vault/{path}", line + "\n")
|
||||
return {"emit": recipe_post(path) + "\n# (repeated on retry)\n" + recipe_post(path),
|
||||
"claimed_ok": True, "detected": False}
|
||||
def m07():
|
||||
rc1 = ev.echo("append", path, line)
|
||||
rc2 = ev.echo("append", path, line) # second is skipped by read-before-POST
|
||||
return {"emit": f'"$ECHO" append {path} "{line}"\n"$ECHO" append {path} "{line}"',
|
||||
"claimed_ok": rc1 == 0 and rc2 == 0, "detected": False}
|
||||
def gt():
|
||||
c = ev.ground(path) or ""
|
||||
return {"persisted": "EVALMARK-s3" in c, "duplicates": max(0, c.count("EVALMARK-s3") - 1)}
|
||||
return dict(name="inbox-capture-replayed", fault="duplicate attempt (retry/replay)",
|
||||
seed=seed, m06=m06, m07=m07, gt=gt)
|
||||
out.append(s3())
|
||||
|
||||
# S4 — session-log PUT under a flaky network (one-time 503 then success)
|
||||
def s4():
|
||||
path = "_agent/sessions/2026-06-19-1430-flaky-eval.md"
|
||||
body = "---\ntype: session-log\n---\n\n# Session\nEVALMARK-s4\n"
|
||||
def seed(): pass # file does not pre-exist
|
||||
def m06():
|
||||
http("PUT", f"{ev.base}/vault/{path}", body) # single shot, no retry, status ignored
|
||||
return {"emit": recipe_put(path), "claimed_ok": True, "detected": False}
|
||||
def m07():
|
||||
rc = ev.echo("put", path, tmpfile(body)) # echo.sh retries the 503 once
|
||||
return {"emit": f'"$ECHO" put {path} /tmp/x.md', "claimed_ok": rc == 0, "detected": rc != 0}
|
||||
def gt():
|
||||
c = ev.ground(path) or ""
|
||||
return {"persisted": "EVALMARK-s4" in c, "duplicates": 0}
|
||||
return dict(name="session-log-flaky-network", fault="one-time 503 (transient)",
|
||||
seed=seed, m06=m06, m07=m07, gt=gt)
|
||||
out.append(s4())
|
||||
|
||||
# S5 — heartbeat PUT accepted-but-not-persisted (proxy hiccup): verify catches it
|
||||
def s5():
|
||||
path = "_agent/heartbeat/phantom-eval.md"; body = "EVALMARK-s5 @ 2026-06-19T14:30:00Z\n"
|
||||
def seed(): pass
|
||||
def m06():
|
||||
http("PUT", f"{ev.base}/vault/{path}", body) # 200 returned; no read-back verify
|
||||
return {"emit": recipe_put(path), "claimed_ok": True, "detected": False}
|
||||
def m07():
|
||||
rc = ev.echo("put", path, tmpfile(body)) # verify GET -> 404 -> die
|
||||
return {"emit": f'"$ECHO" put {path} /tmp/x.md', "claimed_ok": rc == 0, "detected": rc != 0}
|
||||
def gt():
|
||||
c = ev.ground(path) or ""
|
||||
return {"persisted": "EVALMARK-s5" in c, "duplicates": 0}
|
||||
return dict(name="heartbeat-phantom-write", fault="accepted-but-not-persisted",
|
||||
seed=seed, m06=m06, m07=m07, gt=gt)
|
||||
out.append(s5())
|
||||
|
||||
# S6 — cold-start load: 6 reads (no fault; pure token comparison)
|
||||
def s6():
|
||||
paths = ["_agent/echo-vault.md", "_agent/memory/semantic/operator-preferences.md",
|
||||
"_agent/context/current-context.md", "_agent/heartbeat/last-session.md",
|
||||
"journal/daily/2026-06-19.md", "inbox/captures/inbox.md"]
|
||||
def seed():
|
||||
for p in paths: ev.seed(p, f"# {p}\nEVALMARK-s6\n")
|
||||
def m06():
|
||||
for p in paths: http("GET", f"{ev.base}/vault/{p}")
|
||||
return {"emit": "\n".join(recipe_get(p) for p in paths), "claimed_ok": True, "detected": False}
|
||||
def m07():
|
||||
for p in paths: ev.echo("get", p)
|
||||
return {"emit": "\n".join(f'"$ECHO" get {p}' for p in paths), "claimed_ok": True, "detected": False}
|
||||
def gt(): return {"persisted": True, "duplicates": 0}
|
||||
return dict(name="cold-start-load-6-reads", fault="(none)", seed=seed, m06=m06, m07=m07, gt=gt)
|
||||
out.append(s6())
|
||||
|
||||
return out
|
||||
|
||||
def run_method(ev, scn, key):
|
||||
ev.reset(); scn["seed"]()
|
||||
res = scn[key]()
|
||||
g = scn["gt"]()
|
||||
# a write op "should persist"; reads (s6) and replays are judged by their own gt
|
||||
bad = (not g["persisted"] and scn["name"] != "cold-start-load-6-reads") or g["duplicates"] > 0
|
||||
silent = bool(res["claimed_ok"] and bad)
|
||||
return {"emit_tokens": ev.toks(res["emit"]), "claimed_ok": res["claimed_ok"],
|
||||
"detected": res["detected"], "persisted": g["persisted"],
|
||||
"duplicates": g["duplicates"], "silent_failure": silent}
|
||||
|
||||
def main():
|
||||
ap = argparse.ArgumentParser()
|
||||
ap.add_argument("--port", type=int, default=8799)
|
||||
ap.add_argument("--cpt", type=float, default=4.0, help="chars per token")
|
||||
ap.add_argument("--recovery", type=int, default=1500, help="token penalty per silent failure (recovery loop)")
|
||||
ap.add_argument("--detect-cost", type=int, default=80, help="token cost to re-issue a corrected call after a detected failure")
|
||||
a = ap.parse_args()
|
||||
|
||||
base = f"http://127.0.0.1:{a.port}"
|
||||
srv = subprocess.Popen([sys.executable, str(HERE / "mock_olrapi.py"), "--port", str(a.port)],
|
||||
stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True)
|
||||
try:
|
||||
for _ in range(50): # wait for ready
|
||||
try:
|
||||
urllib.request.urlopen(f"{base}/__debug__reset", data=b"", timeout=1); break
|
||||
except Exception: time.sleep(0.1)
|
||||
ev = Eval(base, a.cpt)
|
||||
rows, agg = [], {"06": {}, "07": {}}
|
||||
for scn in scenarios(ev):
|
||||
r06 = run_method(ev, scn, "m06")
|
||||
r07 = run_method(ev, scn, "m07")
|
||||
rows.append({"scenario": scn["name"], "fault": scn["fault"], "06": r06, "07": r07})
|
||||
|
||||
def totals(side):
|
||||
gen = sum(r[side]["emit_tokens"] for r in rows)
|
||||
sil = sum(1 for r in rows if r[side]["silent_failure"])
|
||||
det = sum(1 for r in rows if r[side]["detected"])
|
||||
dup = sum(r[side]["duplicates"] for r in rows)
|
||||
eff = gen + sil * a.recovery + det * a.detect_cost
|
||||
return dict(gen_tokens=gen, silent_failures=sil, detected_failures=det,
|
||||
duplicates=dup, effective_tokens=eff)
|
||||
T06, T07 = totals("06"), totals("07")
|
||||
|
||||
# ---- report ----
|
||||
line = "=" * 78
|
||||
print(f"\n{line}\nECHO MEMORY — 0.6 vs 0.7 A/B EVAL")
|
||||
print(f"(mock OLRAPI; chars/token={a.cpt}, recovery-penalty={a.recovery}, detect-cost={a.detect_cost})\n{line}")
|
||||
hdr = f"{'scenario':28} {'fault':32} {'gen06':>5} {'gen07':>5} {'silent06':>8} {'silent07':>8}"
|
||||
print(hdr); print("-" * len(hdr))
|
||||
for r in rows:
|
||||
print(f"{r['scenario']:28} {r['fault']:32} "
|
||||
f"{r['06']['emit_tokens']:>5} {r['07']['emit_tokens']:>5} "
|
||||
f"{('YES' if r['06']['silent_failure'] else '-'):>8} "
|
||||
f"{('YES' if r['07']['silent_failure'] else '-'):>8}")
|
||||
print("-" * len(hdr))
|
||||
|
||||
def pct(old, new): return f"{(old-new)/old*100:+.0f}%" if old else "n/a"
|
||||
print(f"\n{'metric':30} {'0.6':>10} {'0.7':>10} {'delta':>10}")
|
||||
print("-" * 62)
|
||||
for label, k in [("generated tokens", "gen_tokens"),
|
||||
("silent failures", "silent_failures"),
|
||||
("duplicate lines", "duplicates"),
|
||||
("detected (loud) failures", "detected_failures"),
|
||||
("effective tokens (incl. recovery)", "effective_tokens")]:
|
||||
o, n = T06[k], T07[k]
|
||||
d = pct(o, n) if "token" in k else f"{n-o:+d}"
|
||||
print(f"{label:30} {o:>10} {n:>10} {d:>10}")
|
||||
wrows = [r for r in rows if r['scenario'] != 'cold-start-load-6-reads']
|
||||
denom = len(wrows)
|
||||
sf06 = sum(1 for r in wrows if not r['06']['silent_failure']) # silent-error-free
|
||||
sf07 = sum(1 for r in wrows if not r['07']['silent_failure'])
|
||||
pr06 = sum(1 for r in wrows if r['06']['persisted']) # write actually landed
|
||||
pr07 = sum(1 for r in wrows if r['07']['persisted'])
|
||||
print("-" * 62)
|
||||
print(f"{'silent-error-free ops':30} {str(sf06)+'/'+str(denom):>10} {str(sf07)+'/'+str(denom):>10}")
|
||||
print(f"{'writes actually persisted':30} {str(pr06)+'/'+str(denom):>10} {str(pr07)+'/'+str(denom):>10}")
|
||||
print(f" (the 2 un-persisted 0.7 ops are bad-heading + phantom: not persistable in a single")
|
||||
print(f" call, but 0.7 fails LOUD so the agent's ensure-heading/retry path can recover.)")
|
||||
print(f"\nNOTE: recovery-penalty and detect-cost are tunable ASSUMPTIONS, not measured.")
|
||||
print(f" gen tokens are a chars/{a.cpt} proxy for output tokens of the I/O layer only.")
|
||||
print(f" This harness measures mechanics, not model reasoning quality.\n")
|
||||
|
||||
report = {"params": vars(a), "rows": rows, "totals": {"0.6": T06, "0.7": T07},
|
||||
"silent_error_free": {"0.6": f"{sf06}/{denom}", "0.7": f"{sf07}/{denom}"},
|
||||
"writes_persisted": {"0.6": f"{pr06}/{denom}", "0.7": f"{pr07}/{denom}"}}
|
||||
(HERE / "results" / "latest.json").write_text(json.dumps(report, indent=2))
|
||||
print(f"wrote {HERE/'results'/'latest.json'}")
|
||||
finally:
|
||||
srv.terminate()
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
Reference in New Issue
Block a user