Merges the full hardened stack (#788 closets, #789 entity/BM25/diary,
#790 tunnels) and reimplements the drawer-grep feature in a way that
composes with the chunk-level closet-first search instead of fighting it.
## Background
The original PR added "drawer-grep" on top of the pre-hardening closet
code that returned whole-file blobs. My #788 hardening changed that
path to return *chunk-level* hits by parsing each closet's
``→drawer_id`` pointers and hydrating exactly those drawers. That made
the original drawer-grep grep-over-all-drawers logic redundant — the
closet already points at the relevant chunk.
What remained valuable from the original PR was the *context expansion*
idea: a chunk boundary can clip a thought mid-stride (matched chunk
says "here's a breakdown:" and the breakdown lives in the next chunk),
so callers want ±1 neighbor chunks for free rather than a follow-up
get_drawer call.
## Change
New ``_expand_with_neighbors(drawers_col, doc, meta, radius=1)`` helper
in searcher.py:
* Reads ``source_file`` + ``chunk_index`` from the matched drawer's
metadata.
* Fetches the ±radius sibling chunks in a SINGLE ChromaDB query using
``$and + $in`` — no "fetch all drawers for source" blowup.
* Sorts retrieved chunks by chunk_index, joins with ``\n\n``.
* Does a cheap metadata-only second query to compute ``total_drawers``
so callers know where in the file they landed.
* Graceful fallback to the matched doc alone on any ChromaDB failure or
missing metadata — search never breaks because expansion failed.
``_closet_first_hits`` now calls this helper and tags each hit with
``drawer_index`` + ``total_drawers``. Hit shape stays consistent with
the direct-search path (both still carry ``matched_via``) so callers
can't tell which path produced a given hit except via that field.
## Tests
6 new cases in TestDrawerGrepExpansion:
* neighbors returned in chunk_index order (not hash order)
* edge case: matched chunk at index 0 — only next neighbor surfaces
* edge case: matched chunk at last index — only prev neighbor surfaces
* edge case: 1-drawer file — returns just the matched doc
* missing/non-int chunk_index metadata — graceful fallback
* end-to-end via ``search_memories`` — closet-first hit carries
drawer_index, total_drawers, and includes ±1 neighbors
761/761 suite pass; ruff + format clean on CI-pinned 0.4.x.
Merge resolutions: miner.py kept develop's purge+NORMALIZE_VERSION;
searcher.py dropped the old whole-file-blob block entirely in favor of
rebuilding context expansion on top of ``_closet_first_hits``;
test_closets.py took develop's 47-test baseline and appended
TestDrawerGrepExpansion.
Merges the hardened closet/entity/BM25/diary stack from #789 and fixes
five correctness/durability issues in the tunnels module plus the
directional/symmetric design question.
## Design: tunnels are now symmetric
Per review discussion: a tunnel represents "these two things relate",
not "A causes B". The canonical ID now hashes the *sorted* endpoint
pair, so ``create_tunnel(A, B)`` and ``create_tunnel(B, A)`` resolve to
the same record and the second call updates the label rather than
creating a duplicate. ``follow_tunnels`` can be called from either
endpoint and surfaces the other side consistently.
The returned dict still preserves ``source``/``target`` in the order
the caller supplied, so UIs that want to render the connection
directionally can do so.
## Correctness fixes
* **Atomic write** — ``_save_tunnels`` writes to ``tunnels.json.tmp``
and ``os.replace``s it into place. A crash mid-write can no longer
leave a truncated file that silently reads back as ``[]`` and wipes
every tunnel. Includes ``f.flush() + os.fsync`` before replace on
platforms that support it.
* **Concurrent-write lock** — ``create_tunnel`` and ``delete_tunnel``
wrap the load→mutate→save cycle in ``mine_lock(_TUNNEL_FILE)``.
Without this, two agents creating tunnels simultaneously would both
read the same snapshot and the later writer would drop the earlier
writer's tunnel.
* **Corrupt-file tolerance** — ``_load_tunnels`` now uses a context
manager, validates that the loaded JSON is a list, and returns ``[]``
for any read failure. Subsequent ``create_tunnel`` then overwrites
the corrupt file via atomic write — no manual recovery needed.
* **Input validation** — new ``_require_name`` helper rejects empty or
whitespace-only wing/room names with a clear ``ValueError``. Prevents
phantom tunnels with blank endpoints from ever reaching the JSON
store.
* **Timezone-aware timestamps** — ``created_at`` / ``updated_at`` now
use ``datetime.now(timezone.utc).isoformat()``, matching diary ingest
and other recent modules.
## Tests (12 in TestTunnels)
5 original + 7 regression cases:
* ``test_tunnel_is_symmetric`` — A↔B and B↔A dedupe to one record.
* ``test_follow_tunnels_works_from_either_endpoint`` — symmetric surface.
* ``test_empty_endpoint_fields_rejected`` — validation guard.
* ``test_corrupt_tunnel_file_does_not_lose_new_writes`` — truncated
JSON treated as empty; next create persists cleanly.
* ``test_atomic_write_leaves_no_stray_tmp_file`` — no leftover ``.tmp``.
* ``test_concurrent_creates_preserve_all_tunnels`` — 5 threads each
create a distinct tunnel; all 5 persisted (regression for the
read-modify-write race).
* ``test_created_at_is_timezone_aware`` — ISO8601 has tz suffix.
Merge resolutions: tests/test_closets.py combined develop's hardened
closet/entity/BM25/diary tests with this PR's TestTunnels class.
755/755 tests pass. ruff + format clean under CI-pinned 0.4.x.
Merges develop (closet hardening #826, strip_noise #785, lock #784) and
replaces every sub-feature in this PR with a correct, tested
implementation. Shippable now.
## 1. Real Okapi-BM25 (searcher.py)
The prior `_bm25_score()` hardcoded `idf = log(2.0)` for every term — it
was really a scaled TF, not BM25, and couldn't tell a discriminative
term from a generic one. Replaced with `_bm25_scores(query, documents)`
that computes proper IDF over the provided candidate corpus using the
Lucene smoothed formula `log((N - df + 0.5) / (df + 0.5) + 1)`. Well-
defined for re-ranking vector-retrieval candidates — IDF there measures
how discriminative each term is *within the candidate set*, exactly the
signal we want.
`_hybrid_rank` also fixed:
- Vector normalization is now absolute `max(0, 1 - dist)`, not
`1 - dist/max_dist` — adding/removing a candidate no longer reshuffles
the others.
- BM25 is min-max normalized within candidates (bounded [0, 1]).
- Closet path now re-ranks too (was previously returning closet-order
hits without hybrid scoring).
- `_hybrid_score` internal field stripped from output; `bm25_score`
exposed for debugging.
## 2. Entity metadata (miner.py)
- Reuses `_ENTITY_STOPLIST` from palace.py so sentence-starters like
"When", "After", "The" no longer land as entities (regression test
covers this).
- Known-entity registry is cached at module level, keyed by the
registry file's mtime — no more disk read per drawer.
- File handle now uses a context manager.
- Truncates the entity LIST (to 25) before joining — never splits a
name in the middle.
## 3. Diary ingest (diary_ingest.py)
- State file now lives at `~/.mempalace/state/diary_ingest_<hash>.json`,
keyed by (palace_path, diary_dir). No more pollution of the user's
content directory.
- Drawer IDs now hash `(wing, date_str)` — a user with personal + work
diaries on the same day no longer silently clobbers.
- Each day's upsert runs inside `mine_lock(source_file)` so concurrent
ingest from two terminals can't race.
- `force=True` now calls `purge_file_closets` before rebuild so
leftover numbered closets from a longer prior day don't orphan.
## 4. Tests (tests/test_closets.py)
Merged this PR's MineLock/Entity/BM25/Diary tests with develop's
hardened Build/Upsert/Purge/Rebuild/SearchClosetFirst tests. Added
specific regression tests for every fix above:
- entity stoplist applies (no "When/After/The")
- entity list capped before join (no partial tokens)
- registry cached by mtime (mock-verified zero re-reads)
- BM25 IDF downweights terms present in every doc (real BM25 evidence)
- hybrid rank absolute normalization stable against outliers
- diary state file outside user's diary dir
- diary wing-prefixed IDs prevent cross-wing date collisions
35/35 closet tests pass; full suite 743/743. ruff + format clean under
CI-pinned 0.4.x.
Merges develop (#820 version sync, #785 strip_noise + NORMALIZE_VERSION,
#784 file locking) and addresses six concerns surfaced during PR review
of the closet feature:
1. Closet append-on-rebuild bug — upsert_closet_lines used to APPEND to
existing closets (mismatched the doc's "fully replaced" promise). With
NORMALIZE_VERSION rebuilds on develop, this would have stacked stale
v1 topics on top of fresh v2 content forever. Fix:
- Drop the read-and-append branch from upsert_closet_lines (now a pure
numbered-id overwrite).
- Add purge_file_closets(closets_col, source_file) helper that wipes
every closet for a source file by where-filter.
- process_file calls purge_file_closets before upsert on every mine,
mirroring the existing drawer purge.
2. Searcher returned whole-file blobs from the closet path while the
direct path returned chunk-level drawers. Refactored:
- _extract_drawer_ids_from_closet parses the `→drawer_a,drawer_b`
pointers out of closet documents.
- _closet_first_hits hydrates exactly those drawer IDs (chunk-level),
not collection.get(where=source_file) (which returned everything).
- Same hit shape as direct-search path; both now carry matched_via.
3. max_distance was bypassed on the closet path. Now applied per-hit;
when every closet candidate gets filtered, _closet_first_hits returns
None and the caller falls through to direct drawer search.
4. Entity extraction caught sentence-starters like "When", "The",
"After" as proper nouns. Added _ENTITY_STOPLIST (~40 common false
positives + day/month names + role words). Real names like Igor /
Milla still survive — covered by tests.
5. CLOSETS.md drifted from the code (claimed "replaced via upsert" but
code appended; claimed BM25 hybrid that doesn't exist; claimed a
10K char hydration cap that wasn't enforced). Rewritten to describe
what actually ships, with explicit notes on the BM25 / convo-closet
follow-ups.
6. Zero tests for ~250 lines. Added tests/test_closets.py with 17 cases:
- build_closet_lines: pointer shape, header extraction, stoplist
filtering (with regression case for "When/After/The"), real-name
survival, fallback-line guarantee, drawer-ref slicing.
- upsert_closet_lines: pure overwrite semantics (regression for the
append bug), char-limit packing without splitting lines.
- purge_file_closets: scoped to source_file, doesn't touch others.
- End-to-end miner rebuild: re-mining a file with fewer topics fully
purges leftover numbered closets from the larger first run.
- _extract_drawer_ids_from_closet: parsing + dedup edge cases.
- search_memories closet-first: fallback when empty, chunk-level
hits with matched_via, no whole-file glue, max_distance enforced.
Merge resolutions: miner.py imports combined NORMALIZE_VERSION/mine_lock
from develop with the closet helpers from this branch. process_file
auto-merged cleanly (closet block sits inside develop's lock body).
724/724 tests pass. ruff + format clean under CI-pinned 0.4.x.
Without this, the strip_noise improvement only helps new mines. Every
user who had already mined Claude Code JSONL sessions would keep their
noise-polluted drawers forever, because convo_miner's file_already_mined
skip short-circuits before re-processing.
Adds a versioned schema gate so upgrades propagate silently:
- palace.NORMALIZE_VERSION=2 — bumped when the normalization pipeline
changes shape (this PR's strip_noise is the v1→v2 bump).
- file_already_mined now returns False if the stored normalize_version
is missing or less than current, triggering a rebuild on next mine.
- Both miners stamp drawers with the current normalize_version.
- convo_miner now purges stale drawers before inserting fresh chunks
(mirrors miner.py's existing delete+insert), extracted into
_file_convo_chunks helper to keep mine_convos under ruff's C901 limit.
User experience: upgrade mempalace, run `mempalace mine` as usual, old
noisy drawers get silently replaced with clean ones. No erase needed,
no "you need to rebuild" changelog footgun.
Tests:
- test_file_already_mined_returns_false_for_stale_normalize_version —
pins the version gate contract for missing/v1/current.
- test_add_drawer_stamps_normalize_version — fresh project-miner drawers
carry the field.
- test_mine_convos_rebuilds_stale_drawers_after_schema_bump — end-to-end
proof that a pre-v2 palace gets silently cleaned on next mine, with
orphan drawers purged and NOT skipped.
Existing test_file_already_mined_check_mtime updated to include the
new field; all other tests unaffected.
The initial strip_noise() regressed on three fronts when audited against
adversarial user content — each verified with executable repros against
the cherry-picked code:
1. `<tag>.*?</tag>` with re.DOTALL span-ate across messages: one
stray unclosed <system-reminder> anywhere in a session merged with
the next closing tag, silently deleting everything between them
(including full assistant replies).
2. `.*\(ctrl\+o to expand\).*\n?` nuked entire lines of user prose
whenever a user happened to document the TUI shortcut.
3. `Ran \d+ (?:stop|pre|post)\s*hook.*` with IGNORECASE ate the
second sentence from "our CI has a stop hook ... Ran 2 stop hooks
last week" — legitimate user commentary.
These are unambiguous violations of the project's "Verbatim always"
design principle.
Fixes:
- All tag patterns are now line-anchored (`(?m)^(?:> )?<tag>`) and their
body forbids crossing a blank line (`(?:(?!\n\s*\n)[\s\S])*?`), so a
dangling open tag cannot eat neighboring messages.
- `_NOISE_LINE_PREFIXES` are line-anchored and case-sensitive — user
prose mentioning "CURRENT TIME:" mid-sentence is preserved.
- Hook-run chrome requires `(?m)^`, explicit hook names (Stop,
PreCompact, PreToolUse, etc.), and no IGNORECASE.
- "… +N lines" is line-anchored.
- "(ctrl+o to expand)" only matches Claude Code's actual collapsed-
output chrome shape `[N tokens] (ctrl+o to expand)`; a bare
parenthetical in user prose stays intact.
Scope:
- `strip_noise()` is no longer called on every normalization path.
Only `_try_claude_code_jsonl` invokes it, per-extracted-message — so
Claude.ai exports, ChatGPT exports, Slack JSON, Codex JSONL, and
plain text with `>` markers pass through fully verbatim. Per-message
application also makes span-eating structurally impossible.
Tests:
- 15 new tests in test_normalize.py pin the boundary: 6 guard user
content that must survive (each of the adversarial repros), 9 assert
real system chrome is still stripped. All pass; full suite 702 pass
(2 failures are the unrelated pre-existing version.py bug, cleared
by #820).
Known limitation (not fixed here): convo_miner.py does not delete
drawers on re-mine, so transcripts mined before this PR keep noise-
filled drawers until the user manually erases + re-mines. Proper fix
needs a schema-version field on drawer metadata + re-mine trigger —
out of scope for this PR.
Appended from Milla's omnibus test_closets.py — covers create,
list, delete, dedup, and follow_tunnels behavior. 21/21 pass.
Co-Authored-By: MSL <232237854+milla-jovovich@users.noreply.github.com>
Trimmed version of Milla's omnibus test_closets.py to only cover
features present in this PR stack (#784 lock, #788 closets, this
PR's entity/BM25/diary). Strip-noise tests will land with #785;
tunnel tests will land with the tunnels PR.
16/16 pass.
Co-Authored-By: MSL <232237854+milla-jovovich@users.noreply.github.com>
* fix: parse Claude.ai privacy export with messages key and sender field (#677)
The privacy-export branch in _try_claude_ai_json only checked for the
"chat_messages" key, missing exports that use "messages" instead. It
also only read the "role" field while real privacy exports use "sender".
Both gaps caused the file to fall through to plain-text, producing a
single giant drawer.
Changes:
- Accept "messages" alongside "chat_messages" in the conversation-object
guard and inner extraction.
- Accept "sender" alongside "role" as the author field.
- Fall back to a top-level "text" key when content blocks are empty.
- Produce one transcript per conversation instead of concatenating all
conversations into a single blob.
- Extract shared logic into _collect_claude_messages helper.
- Add 6 regression tests covering each variant.
* style: apply ruff format to normalize.py
* fix: guard against null text field in Claude.ai export parsing
item.get("text", "").strip() crashes when "text" is explicitly null
in the JSON (legal and observed in some exports). Use
(item.get("text") or "").strip() and add a regression test.
---------
Co-authored-by: Igor Lins e Silva <4753812+igorls@users.noreply.github.com>
When external tools write to the palace database (CLI mining, scripts), the MCP server's cached ChromaDB collection becomes stale — its HNSW index doesn't know about new vectors. Develop already invalidates on inode changes (catches rebuilds) but not on mtime changes (misses in-place writes).
This PR:
- Adds st_mtime tracking alongside st_ino in _get_client; invalidates the cached client on either change.
- Adds the mempalace_reconnect MCP tool for explicit cache flush.
Original author: @jphein (#663). Original approval: @Ari4ka.
Skips test_missing_db_invalidates_cache on Windows (ChromaDB holds chroma.sqlite3 open).
* fix: register 0-chunk files to prevent re-processing on every mine (#654)
mine_convos() has three early-exit paths (OSError, content too short,
zero chunks) that skip writing anything to ChromaDB. Since
file_already_mined() checks for the presence of a document with a
matching source_file, these files are re-read and re-processed on
every subsequent run.
Add _register_file() that upserts a lightweight sentinel document
(room="_registry", ingest_mode="registry") so file_already_mined()
returns True on future runs.
Note: Bug 2 from the issue (drawers_added counter always 0) was
already resolved upstream via the switch from collection.add() to
collection.upsert().
* fix: resolve macOS path symlink in test + remove unused variable
* fix: return "general" room from process_file error paths (#586)
process_file() returned (0, None) for already-mined, unreadable, and
too-short files. In --dry-run mode the caller always enters the
room_counts branch, so None ended up as a dict key and crashed the
summary printer with "unsupported format string passed to
NoneType.__format__".
Returning "general" instead of None makes the function contract
explicit: it always yields (int, str). This matches the consensus
fix discussed in the issue thread.
* style: apply ruff format to test_miner.py
* fix: allow Unicode in sanitize_name() — Latvian, CJK, Cyrillic names (#637)
_SAFE_NAME_RE was ASCII-only ([a-zA-Z0-9]), rejecting valid Unicode
names like "Jānis" or "太郎". Changed to \w which matches Unicode
word characters (letters, digits, underscore) in Python 3.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: tighten Unicode regex, add sanitize_name tests
Use [^\W_] for first/last char to allow Unicode letters/digits but
reject leading/trailing underscores (Copilot feedback). Add 7 tests
covering Latvian, CJK, Cyrillic, path traversal, and edge cases.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Note from code review: (1) silent exception swallow on migration failure means caller proceeds with potentially corrupt DB — consider returning a boolean or re-raising in a follow-up. (2) No blob length validation before int.from_bytes — malformed rows could produce wrong seq_id values. Both are edge cases; the fix is still valuable for the common chromadb 0.6→1.5 migration path.
* refactor: add stage-1 backend abstraction seam
Introduce the first upstreamable storage seam for MemPalace without
bringing in the PostgreSQL spike or any benchmark artifacts.
This change adds a small backend package with:
- BaseCollection as the minimal collection contract
- ChromaBackend/ChromaCollection as the default implementation
It then routes the main runtime collection consumers through that seam:
- palace.py
- searcher.py
- layers.py
- palace_graph.py
- mcp_server.py
- miner.status()
Behavioral constraints kept for stage 1:
- ChromaDB remains the only backend and the default path
- no config/env backend selection yet
- no PostgreSQL code
- no benchmark or research files
- existing tests stay unchanged
Important compatibility details:
- read paths now call the seam with create=False so they still surface
the existing 'no palace found' behavior instead of silently creating
empty collections
- write paths keep create=True semantics through palace.get_collection()
- layers/searcher retain a chromadb module attribute so the existing
mock-based tests can keep patching PersistentClient unchanged
- ChromaBackend only creates palace directories on create=True, which
preserves mocked read-path tests that use fake read-only paths
Verification:
- python3 -m py_compile mempalace/backends/__init__.py mempalace/backends/base.py mempalace/backends/chroma.py mempalace/palace.py mempalace/searcher.py mempalace/layers.py mempalace/palace_graph.py mempalace/mcp_server.py mempalace/miner.py
- pytest -q # 529 passed, 106 deselected
* refactor: clean up stage-1 seam compatibility shims
Tighten the stage-1 backend abstraction branch after review.
This follow-up does three small things:
- keep the chromadb compatibility hook in searcher.py and layers.py,
but express it through the backends.chroma module so it no longer
reads like an accidental unused import
- fix the palace_graph.py helper alias to avoid the local name collision
flagged by ruff (imported helper vs local _get_collection wrapper)
- preserve the existing mock-based test patch points unchanged while
keeping the new backend seam intact
Why this matters:
- the direct form looked like a
dead import in review, even though it was intentionally preserving the
existing test seam ( and
)
- palace_graph.py had a real lint issue ( redefinition) that was
small but worth fixing before a public PR
Verification:
- /opt/homebrew/bin/ruff check mempalace/backends/__init__.py mempalace/backends/base.py mempalace/backends/chroma.py mempalace/palace.py mempalace/searcher.py mempalace/layers.py mempalace/palace_graph.py mempalace/mcp_server.py mempalace/miner.py
- pytest -q tests/test_layers.py tests/test_searcher.py
- pytest -q # 529 passed, 106 deselected
* docs: explain backend shim imports in search paths
Add short code comments in searcher.py and layers.py explaining why the
module-level `chromadb` alias remains after the stage-1 backend seam
refactor.
The alias is intentional: it preserves the existing mock patch points used
by the current test suite (`mempalace.searcher.chromadb.PersistentClient`
and `mempalace.layers.chromadb.PersistentClient`) while the runtime logic
now flows through the backend abstraction.
This keeps the public PR easier to review because the apparent "unused
import" now has an explicit reason next to it.
Verification:
- /opt/homebrew/bin/ruff check mempalace/searcher.py mempalace/layers.py
- pytest -q tests/test_layers.py tests/test_searcher.py
* refactor: reuse a default backend instance in palace helper
Tighten the stage-1 backend seam by promoting the default Chroma backend
adapter to a module-level singleton in `mempalace/palace.py`.
This keeps the stage-1 scope unchanged — Chroma is still the only backend
wired in this branch — but avoids constructing a fresh `ChromaBackend()`
object on every `get_collection()` call. The backend is stateless today,
so this is a readability/cleanup change rather than a behavioral one.
Why this helps:
- makes `palace.get_collection()` read like a real default factory instead
of an inline constructor call
- keeps the stage-1 branch a little cleaner before opening the public PR
- does not widen the backend surface or change any config/runtime behavior
Verification:
- python3 -m py_compile mempalace/palace.py
- pytest -q tests/test_miner.py tests/test_layers.py tests/test_searcher.py
- pytest -q # 529 passed, 106 deselected
* fix: harden read-only seam behavior and update seam tests
Preserve the stage-1 backend abstraction while closing the real read-path
regression surfaced in PR review.
What changed:
- make ChromaBackend.get_collection(create=False) fail fast when the palace
directory does not exist instead of letting PersistentClient create it as a
side effect
- update miner.status() to call get_collection(..., create=False) so status
keeps the historical 'No palace found' behavior
- remove the temporary chromadb shim aliases from layers.py and searcher.py
now that the tests patch the seam directly
- add focused tests for the new backends package, including ChromaCollection
delegation and ChromaBackend create=True/create=False behavior
- retarget layer/searcher tests to patch the backend seam instead of patching
chromadb.PersistentClient inside production modules
- add a regression test that status() does not create an empty palace when the
target path is missing
Verification:
- ruff check .
- uv run pytest -q
- uv run pytest -q tests/test_backends.py tests/test_cli.py tests/test_mcp_server.py tests/test_layers.py tests/test_searcher.py tests/test_miner.py
Notes:
- the separate benchmark/slow/stress layer was started as a soak but not used
as the merge gate for this PR branch
* refactor: drop duplicate mcp collection cache declaration
Remove a redundant `_collection_cache = None` assignment in
`mempalace/mcp_server.py` left over after the stage-1 backend seam refactor.
This does not change behavior; it only trims review noise in the MCP server
module after the read-path hardening pass.
Verification:
- ruff check mempalace/mcp_server.py
- uv run pytest -q tests/test_mcp_server.py
---------
Co-authored-by: Sergey Kuznetsov <sergey@iterudit.com>
On Windows, projects containing git-submodule junctions or dev-drive
reparse points cause iterdir() to list the entry successfully but
Path.is_dir() to raise OSError when it calls stat() internally.
Reproducer: any Windows project with a submodule checked out as a
junction (e.g. skills/pr-perfect) crashes mempalace init with:
OSError: [WinError 448] The path cannot be traversed because it
contains an untrusted mount point
Fix: wrap every is_dir() call in detect_rooms_from_folders with
try/except OSError so the scanner skips inaccessible entries and
continues rather than aborting.
Covers both the top-level pass and the one-level-deep nested pass.
Two new tests mock the OSError on specific paths and verify the
function returns correct rooms from the remaining accessible entries.
Addresses Issue #333: AI agents prepending system prompts to search queries
causes embedding retrieval to collapse (89.8% → 1.0% R@10).
Mitigation approach (減災):
- New query_sanitizer.py with 4-stage pipeline:
Step 1: passthrough for short queries (≤200 chars)
Step 2: question extraction (finds ? sentences) → ~85-89% recovery
Step 3: tail sentence extraction → ~80-89% recovery
Step 4: tail truncation fallback → ~70-80% recovery
Worst case without sanitizer: 1.0% (catastrophic)
Worst case with sanitizer: ~70-80% (survivable)
- mcp_server.py: tool_search applies sanitizer before ChromaDB query
- MCP schema: query description warns agents not to include prompts
- New 'context' parameter separates background info from search intent
- Sanitizer metadata included in response when triggered
22 new tests covering all pipeline stages and real-world scenarios.
Made-with: Cursor
The initialize handler hardcoded protocolVersion "2024-11-05", which
causes newer MCP clients (e.g. Claude Code) to reject the connection
when they negotiate "2025-11-25" or later.
Echo the client's requested version if it is in the supported set,
otherwise fall back to the latest supported version. This keeps
backwards compatibility with older clients while allowing newer ones
to connect.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Merged both the PR's benchmark suite additions (psutil dep, pytest
markers, --ignore=tests/benchmarks) and upstream's coverage changes
(pytest-cov, --cov-fail-under=30, coverage config) so both coexist.
Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
_patch_mcp_server had palace_path removed from its signature but the
assertion body still referenced it, causing NameError at runtime and
F821 from ruff.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add tests for config, convo_miner, spellcheck, knowledge_graph
- Fix Windows PermissionError in test cleanup (chromadb file locks)
- Add UTF-8 encoding to split_mega_files, entity_registry, hooks_cli
- Fix mcp_server parse_known_args logging for unknown args
- Set coverage threshold to 85 in pyproject.toml and CI
- Reset all version files to 3.0.11
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>