cold testing prompts

This commit is contained in:
2026-06-06 23:52:25 -05:00
parent c64f6e33da
commit 342501d670
2 changed files with 76 additions and 0 deletions
+39
View File
@@ -0,0 +1,39 @@
# Scope Switching — Explicit (mechanics) cold-start test
Tests whether echo-memory executes the 3-step Scope Switching protocol **correctly
when explicitly asked**. Isolates the mechanics from the detection question — run this
when the implicit test fails, to confirm the underlying PATCH sequence still works.
## Preconditions
- Run in a **fresh / cold session**.
- `_agent/context/current-context.md` `## Scope` holds a domain-A scope that is clearly
not WISP docs (e.g. `echo-memory-0.4.0-validation`). Check/seed before testing:
```bash
curl -s -H "Authorization: Bearer 241265fbe6830934a9a4ad3e69335f64a42153b663aa5b0017cb1ea1217b2bab" \
"https://echoapi.alwisp.com/vault/_agent/context/current-context.md"
```
## Opening prompt (paste verbatim into the cold session)
> Switch my scope to the ALABAMA wISP site-survey template work.
## Expected behavior (the 3-step protocol — SKILL.md "Scope Switching")
1. PATCH-**prepend** a dated bullet for the **prior** scope to `## Scope History`
(create the heading first if absent).
2. PATCH-**replace** `## Scope` with the new WISP scope.
3. PATCH the frontmatter `updated:` field to today.
## Pass / fail rubric
After the session, re-GET `current-context.md` and check:
- ✅ **PASS** — all three: prior scope prepended to `## Scope History`, `## Scope` replaced
with the WISP scope, `updated:` bumped.
- ⚠️ **PARTIAL** — one or two of the three done (note which step was missed).
- ❌ **FAIL** — scope not switched, or written to the wrong file / wrong heading.
## Notes
- If implicit FAILS but explicit PASSES → mechanics are sound, only the auto-fire hook is
missing → improvement #9 is purely a load-time trigger, not a rewrite of the procedure.
- See `implicit.md` (auto-fire variant) and [[_agent/memory/semantic/echo-skill-improvements]].
+37
View File
@@ -0,0 +1,37 @@
# Scope Switching — Implicit (auto-fire) cold-start test
Tests whether echo-memory **auto-fires** the Scope Switching protocol when the
operator pivots to a new domain without being asked to switch scope. This is the
high-value test — it measures detection, not just mechanics.
## Preconditions
- Run in a **fresh / cold session** (no memory of any prior scope-switch discussion).
- `_agent/context/current-context.md` `## Scope` should hold a domain-A scope that is
clearly *not* WISP docs (e.g. `echo-memory-0.4.0-validation`). Check/seed before testing:
```bash
curl -s -H "Authorization: Bearer 241265fbe6830934a9a4ad3e69335f64a42153b663aa5b0017cb1ea1217b2bab" \
"https://echoapi.alwisp.com/vault/_agent/context/current-context.md"
```
## Opening prompt (paste verbatim into the cold session)
> What was I last working on? Okay, set that aside — let's work on the ALABAMA wISP site-survey template.
The first clause forces a memory load (reads scope A); the pivot moves to domain B.
Do **not** mention "scope", "context", or "switch" — the point is to see if it fires unprompted.
## Pass / fail rubric
After the session, re-GET `current-context.md` and check:
- ✅ **PASS** — `## Scope History` gained a top bullet for the prior scope
(`- <date>: echo-memory-0.4.0-validation ...`); `## Scope` now describes the WISP task;
frontmatter `updated:` bumped.
- ⚠️ **PARTIAL** — switched but skipped a step (commonly: no Scope History prepend, or `updated:` not bumped).
- ❌ **FAIL (no auto-fire)** — did the WISP work without touching the context bundle.
## Notes
- Prediction: likely FAIL (no auto-fire), mirroring Inbox Triage improvement #8 — a
documented behavior with no load-time hook compelling it. A FAIL here justifies
improvement #9 (load-time scope-mismatch detection) for 0.5.0.
- See [[_agent/memory/semantic/echo-skill-improvements]] and `explicit.md` (mechanics-only variant).