mempalace

Author	SHA1	Message	Date
Igor Lins e Silva	1dc20e307b	test: verify mine_lock via disjoint critical-section intervals The previous revision used multiprocessing but still relied on timing ("second process waited at least N seconds") which flakes on CI where spawn overhead eats into the hold window. Linux CI observed the second process report a 0.088s wait — below the 0.1s threshold — even though the lock behavior was correct; spawn was just slow enough that the first process had nearly finished holding when the second got past its own spawn. Switch to effect-based verification: each worker logs its [enter_time, exit_time] inside the critical section, and the test asserts the two intervals are disjoint after sorting. A broken lock would produce overlapping intervals regardless of spawn latency; a working lock cannot. Also removed the mp.Queue since we no longer pass timing data back.	2026-04-13 19:08:57 -03:00
Igor Lins e Silva	e052074624	test: serialize mine_lock concurrency test with multiprocessing The macOS CI job failed ``test_lock_blocks_concurrent_access`` because ``fcntl.flock`` on BSD/macOS is per-process, not per-FD: two threads in the same process both acquire even when they open their own file descriptors. The test passed on Linux (per-FD flock) and Windows (per-FD ``msvcrt.locking``) but was never actually exercising the lock's real contract. ``mine_lock`` is designed to serialize multi-agent access — i.e., separate processes, not threads. Switch the test to ``multiprocessing.get_context('spawn')`` with a module-level worker (so the spawn pickles cleanly) so it: 1. reflects the actual use case (one lock per mining process); 2. passes on all three OSes without flock-semantics branching; 3. catches real regressions (a broken lock would now let both processes through, exactly what we care about). Hold time bumped to 0.3s and the "wait until p1 acquires" delay to 0.2s to tolerate spawn's higher startup latency on macOS/Windows.	2026-04-13 19:02:51 -03:00
Igor Lins e Silva	7192552624	test: make diary state path assertion platform-neutral The Windows CI job failed on: assert '/.mempalace/state/' in str(state_path) because Windows uses ``\`` as the path separator, so the substring never matches. The behavior under test (state file lives outside the diary dir, under ``~/.mempalace/state/``) is already correct on both platforms — only the assertion was Unix-only. Switch to ``state_path.parent`` comparisons that work on any OS.	2026-04-13 18:55:36 -03:00
Igor Lins e Silva	6b7dcc53d4	merge: pr/closet-llm-generic + harden LLM regen path for production Brings in PR #793 (optional LLM-based closet regeneration via user-configured OpenAI-compatible endpoint) and PR #795 (hybrid closet+drawer search — closets boost, never gate). Stack: #784 → #788 → #789 → #790 → #791 → #792 → #793 (+ #795). Findings hardened on our side ───────────────────────────── 1) closet_llm.regenerate_closets didn't use the blessed palace helpers. Before: * manual closets_col.get(where=...) + .delete(ids=...) with a silent ``except Exception: pass`` around both — if the purge failed, pre-existing regex closets survived alongside fresh LLM closets, giving the searcher double hits for the same source. * ``source.split('/')[-1][:30]`` to build the closet_id — quietly wrong on Windows paths (``C:\\proj\\a.md`` has no ``/``, so the whole string ends up in the ID). * no mine_lock around purge+upsert — a concurrent regex rebuild of the same source could interleave with our purge and leave a mix of regex and LLM pointers. * no ``normalize_version`` stamp on the LLM closets — the miner's stale-version gate would treat them as leftovers from an older schema and rebuild over them on the next mine. After: routes through ``purge_file_closets`` + ``mine_lock`` + ``os.path.basename`` + ``NORMALIZE_VERSION`` stamp. Regression tests cover each. 2) searcher.search_memories was still closet-first. PR #795 merged into #793's head to fix the recall regression documented in that PR (R@1 0.25 on narrative content vs. 0.42 baseline). The hybrid design makes closets a ranking boost rather than a gate: drawers are always queried at the floor, and matching closet hits (rank 0-4 within CLOSET_DISTANCE_CAP=1.5) add a boost of 0.40/0.25/0.15/0.08/0.04 to the effective distance. Merged to take the incoming hybrid design, with two cleanups: * kept the ``_expand_with_neighbors`` / ``_extract_drawer_ids_from_closet`` helpers as separately-tested utilities (still imported by tests and future callers); * replaced the fragile ``source_file.endswith(basename)`` reverse- lookup in the enrichment step with internal ``_source_file_full`` / ``_chunk_index`` fields stripped before return, so enrichment doesn't silently pick the wrong path when two sources share a basename across directories; * drawer-grep enrichment now sorts by ``chunk_index`` before neighbor expansion, so ``best_idx ± 1`` corresponds to actual document order rather than whatever order Chroma returned. 3) Closet-first tests in test_closets.py (``TestSearchMemoriesClosetFirst``, end-to-end ``test_closet_first_search_includes_drawer_index_and_total``) pinned contracts that the hybrid path now violates (``matched_via`` went from ``"closet"`` to ``"drawer+closet"``). Rewrote them around the new invariant: direct drawers are always the floor, closet agreement flips the hit's matched_via and exposes closet_preview. Verification ──────────── * 805/805 pass under ``uv run pytest tests/ -v --ignore=tests/benchmarks`` (13 new tests from PR #793 + 5 from PR #795 + 2 new regressions for the closet_llm hardening + the rewritten hybrid assertions in test_closets.py). * CI-pinned ruff 0.4.x clean on ``mempalace/`` + ``tests/`` (check + format both pass). * No new deps — closet_llm.py still uses stdlib ``urllib.request`` per the PR's "zero new dependencies" promise. Co-Authored-By: MSL <232237854+milla-jovovich@users.noreply.github.com>	2026-04-13 18:40:36 -03:00
Igor Lins e Silva	1263c3c91e	merge: full hardened stack + rewrite fact_checker around actual KG API Merges the full hardened stack (up through #791 drawer-grep) and turns fact_checker from "dead code hidden behind bare except" into an actually-working offline contradiction detector with tests. ## Dead paths the PR body advertised but the code never executed Both buried by a single outer ``except Exception: pass``: * ``kg.query(subject)`` — ``KnowledgeGraph`` has no ``query()`` method; it has ``query_entity()``. The attribute error was silently swallowed and the entire KG branch always returned ``[]``. Now using ``kg.query_entity(subject, direction="outgoing")`` with proper handling of the ``predicate``/``object``/``current``/``valid_to`` fields the real API returns. * ``KnowledgeGraph(palace_path=palace_path)`` — the constructor's only kwarg is ``db_path``. Passing ``palace_path`` raised TypeError, silently swallowed. Now computing the db_path correctly from ``<palace>/knowledge_graph.sqlite3``, matching the convention the MCP server already uses. ## Contradiction logic rewritten The previous ``if kg_pred in claim and fact.object not in claim`` only fired when text used the SAME predicate word as the KG fact — the exact opposite of the stated use case ("Bob is Alice's brother" when KG says husband" would NOT have fired). Replaced with a proper parse → lookup → compare pipeline: * ``_extract_claims`` parses two surface forms ("X is Y's Z" and "X's Z is Y") into ``(subject, predicate, object)`` triples. * ``_check_kg_contradictions`` pulls the subject's outgoing facts and flags two classes: - ``relationship_mismatch`` when a current KG fact matches the same ``(subject, object)`` pair but with a different predicate. - ``stale_fact`` when the exact triple exists but is ``valid_to``-closed in the past. * Stale-fact detection is now implemented (the PR body claimed it; the old code silently didn't implement it). ## Performance fix — O(n²) → O(mentioned × n) ``_check_entity_confusion`` previously computed Levenshtein for every pair of registered names on every ``check_text`` call. For 1,000 registered names that's ~500K edit-distance calls per hook invocation. Now we first identify which registry names actually appear in the text (single regex scan), then only compute edit distance between mentioned and unmentioned names. Pinned by a test that asserts <200ms on a 500- name registry with zero mentions. Also: when both similar names are mentioned in the text, we no longer flag them — the user clearly knows they're different people. ## Shared entity-registry loader ``mempalace/miner.py`` already had an mtime-cached loader for ``~/.mempalace/known_entities.json``. fact_checker had a duplicate implementation that leaked file handles and ignored caching. Extended miner's cache to expose both the flat set (``_load_known_entities``) and the raw category dict (``_load_known_entities_raw``); fact_checker now imports the latter. No more double disk reads, no more handle leak. ## Tests — 24 cases in tests/test_fact_checker.py All three detection paths + both dead-code regressions: * ``test_kg_init_uses_db_path_not_palace_path_kwarg`` — pins the correct KG constructor signature so the ``palace_path=`` bug can't come back. * ``test_relationship_mismatch_detected`` — the headline example from the PR body now actually fires. * ``test_stale_fact_detected`` — valid_to-closed triple is flagged. * ``test_current_fact_same_triple_is_not_flagged`` — no false positive on a still-valid match. * ``test_performance_bounded_by_mentioned_names`` — 500-name registry, zero mentions, <200ms. Regression for the O(n²) blowup. * ``test_no_false_positive_when_both_names_mentioned`` — Mila and Milla in the same text is fine. * Plus claim extraction, flatten_names shapes, CLI exit code, empty text handling, missing-palace graceful fallback, registry-dict shape support. 785/785 suite pass. ruff + format clean on CI-pinned 0.4.x.	2026-04-13 18:20:11 -03:00
Igor Lins e Silva	e9201fb617	merge: pr/cross-wing-tunnels + rebuild drawer-grep on hardened closet path Merges the full hardened stack (#788 closets, #789 entity/BM25/diary, #790 tunnels) and reimplements the drawer-grep feature in a way that composes with the chunk-level closet-first search instead of fighting it. ## Background The original PR added "drawer-grep" on top of the pre-hardening closet code that returned whole-file blobs. My #788 hardening changed that path to return chunk-level hits by parsing each closet's ``→drawer_id`` pointers and hydrating exactly those drawers. That made the original drawer-grep grep-over-all-drawers logic redundant — the closet already points at the relevant chunk. What remained valuable from the original PR was the context expansion idea: a chunk boundary can clip a thought mid-stride (matched chunk says "here's a breakdown:" and the breakdown lives in the next chunk), so callers want ±1 neighbor chunks for free rather than a follow-up get_drawer call. ## Change New ``_expand_with_neighbors(drawers_col, doc, meta, radius=1)`` helper in searcher.py: * Reads ``source_file`` + ``chunk_index`` from the matched drawer's metadata. * Fetches the ±radius sibling chunks in a SINGLE ChromaDB query using ``$and + $in`` — no "fetch all drawers for source" blowup. * Sorts retrieved chunks by chunk_index, joins with ``\n\n``. * Does a cheap metadata-only second query to compute ``total_drawers`` so callers know where in the file they landed. * Graceful fallback to the matched doc alone on any ChromaDB failure or missing metadata — search never breaks because expansion failed. ``_closet_first_hits`` now calls this helper and tags each hit with ``drawer_index`` + ``total_drawers``. Hit shape stays consistent with the direct-search path (both still carry ``matched_via``) so callers can't tell which path produced a given hit except via that field. ## Tests 6 new cases in TestDrawerGrepExpansion: * neighbors returned in chunk_index order (not hash order) * edge case: matched chunk at index 0 — only next neighbor surfaces * edge case: matched chunk at last index — only prev neighbor surfaces * edge case: 1-drawer file — returns just the matched doc * missing/non-int chunk_index metadata — graceful fallback * end-to-end via ``search_memories`` — closet-first hit carries drawer_index, total_drawers, and includes ±1 neighbors 761/761 suite pass; ruff + format clean on CI-pinned 0.4.x. Merge resolutions: miner.py kept develop's purge+NORMALIZE_VERSION; searcher.py dropped the old whole-file-blob block entirely in favor of rebuilding context expansion on top of ``_closet_first_hits``; test_closets.py took develop's 47-test baseline and appended TestDrawerGrepExpansion.	2026-04-13 18:08:01 -03:00
Igor Lins e Silva	20255b05be	merge: develop + harden cross-wing tunnels for production Merges the hardened closet/entity/BM25/diary stack from #789 and fixes five correctness/durability issues in the tunnels module plus the directional/symmetric design question. ## Design: tunnels are now symmetric Per review discussion: a tunnel represents "these two things relate", not "A causes B". The canonical ID now hashes the sorted endpoint pair, so ``create_tunnel(A, B)`` and ``create_tunnel(B, A)`` resolve to the same record and the second call updates the label rather than creating a duplicate. ``follow_tunnels`` can be called from either endpoint and surfaces the other side consistently. The returned dict still preserves ``source``/``target`` in the order the caller supplied, so UIs that want to render the connection directionally can do so. ## Correctness fixes * Atomic write — ``_save_tunnels`` writes to ``tunnels.json.tmp`` and ``os.replace``s it into place. A crash mid-write can no longer leave a truncated file that silently reads back as ``[]`` and wipes every tunnel. Includes ``f.flush() + os.fsync`` before replace on platforms that support it. * Concurrent-write lock — ``create_tunnel`` and ``delete_tunnel`` wrap the load→mutate→save cycle in ``mine_lock(_TUNNEL_FILE)``. Without this, two agents creating tunnels simultaneously would both read the same snapshot and the later writer would drop the earlier writer's tunnel. * Corrupt-file tolerance — ``_load_tunnels`` now uses a context manager, validates that the loaded JSON is a list, and returns ``[]`` for any read failure. Subsequent ``create_tunnel`` then overwrites the corrupt file via atomic write — no manual recovery needed. * Input validation — new ``_require_name`` helper rejects empty or whitespace-only wing/room names with a clear ``ValueError``. Prevents phantom tunnels with blank endpoints from ever reaching the JSON store. * Timezone-aware timestamps — ``created_at`` / ``updated_at`` now use ``datetime.now(timezone.utc).isoformat()``, matching diary ingest and other recent modules. ## Tests (12 in TestTunnels) 5 original + 7 regression cases: * ``test_tunnel_is_symmetric`` — A↔B and B↔A dedupe to one record. * ``test_follow_tunnels_works_from_either_endpoint`` — symmetric surface. * ``test_empty_endpoint_fields_rejected`` — validation guard. * ``test_corrupt_tunnel_file_does_not_lose_new_writes`` — truncated JSON treated as empty; next create persists cleanly. * ``test_atomic_write_leaves_no_stray_tmp_file`` — no leftover ``.tmp``. * ``test_concurrent_creates_preserve_all_tunnels`` — 5 threads each create a distinct tunnel; all 5 persisted (regression for the read-modify-write race). * ``test_created_at_is_timezone_aware`` — ISO8601 has tz suffix. Merge resolutions: tests/test_closets.py combined develop's hardened closet/entity/BM25/diary tests with this PR's TestTunnels class. 755/755 tests pass. ruff + format clean under CI-pinned 0.4.x.	2026-04-13 17:50:43 -03:00
Igor Lins e Silva	32d7f4376b	merge: develop + harden entity metadata, BM25, and diary ingest for production Merges develop (closet hardening #826, strip_noise #785, lock #784) and replaces every sub-feature in this PR with a correct, tested implementation. Shippable now. ## 1. Real Okapi-BM25 (searcher.py) The prior `_bm25_score()` hardcoded `idf = log(2.0)` for every term — it was really a scaled TF, not BM25, and couldn't tell a discriminative term from a generic one. Replaced with `_bm25_scores(query, documents)` that computes proper IDF over the provided candidate corpus using the Lucene smoothed formula `log((N - df + 0.5) / (df + 0.5) + 1)`. Well- defined for re-ranking vector-retrieval candidates — IDF there measures how discriminative each term is within the candidate set, exactly the signal we want. `_hybrid_rank` also fixed: - Vector normalization is now absolute `max(0, 1 - dist)`, not `1 - dist/max_dist` — adding/removing a candidate no longer reshuffles the others. - BM25 is min-max normalized within candidates (bounded [0, 1]). - Closet path now re-ranks too (was previously returning closet-order hits without hybrid scoring). - `_hybrid_score` internal field stripped from output; `bm25_score` exposed for debugging. ## 2. Entity metadata (miner.py) - Reuses `_ENTITY_STOPLIST` from palace.py so sentence-starters like "When", "After", "The" no longer land as entities (regression test covers this). - Known-entity registry is cached at module level, keyed by the registry file's mtime — no more disk read per drawer. - File handle now uses a context manager. - Truncates the entity LIST (to 25) before joining — never splits a name in the middle. ## 3. Diary ingest (diary_ingest.py) - State file now lives at `~/.mempalace/state/diary_ingest_<hash>.json`, keyed by (palace_path, diary_dir). No more pollution of the user's content directory. - Drawer IDs now hash `(wing, date_str)` — a user with personal + work diaries on the same day no longer silently clobbers. - Each day's upsert runs inside `mine_lock(source_file)` so concurrent ingest from two terminals can't race. - `force=True` now calls `purge_file_closets` before rebuild so leftover numbered closets from a longer prior day don't orphan. ## 4. Tests (tests/test_closets.py) Merged this PR's MineLock/Entity/BM25/Diary tests with develop's hardened Build/Upsert/Purge/Rebuild/SearchClosetFirst tests. Added specific regression tests for every fix above: - entity stoplist applies (no "When/After/The") - entity list capped before join (no partial tokens) - registry cached by mtime (mock-verified zero re-reads) - BM25 IDF downweights terms present in every doc (real BM25 evidence) - hybrid rank absolute normalization stable against outliers - diary state file outside user's diary dir - diary wing-prefixed IDs prevent cross-wing date collisions 35/35 closet tests pass; full suite 743/743. ruff + format clean under CI-pinned 0.4.x.	2026-04-13 17:37:45 -03:00
Igor Lins e Silva	95a8d7176a	Merge pull request #826 from MemPalace/pr/multi-agent-lock chore: forward closet layer (#788) into develop	2026-04-13 17:21:38 -03:00
shafdev	5db651a543	fix: use microsecond timestamp and full content hash in diary entry ID (#819 )	2026-04-13 13:06:04 -07:00
Igor Lins e Silva	21d4a23430	merge: develop + harden closet layer for production Merges develop (#820 version sync, #785 strip_noise + NORMALIZE_VERSION, #784 file locking) and addresses six concerns surfaced during PR review of the closet feature: 1. Closet append-on-rebuild bug — upsert_closet_lines used to APPEND to existing closets (mismatched the doc's "fully replaced" promise). With NORMALIZE_VERSION rebuilds on develop, this would have stacked stale v1 topics on top of fresh v2 content forever. Fix: - Drop the read-and-append branch from upsert_closet_lines (now a pure numbered-id overwrite). - Add purge_file_closets(closets_col, source_file) helper that wipes every closet for a source file by where-filter. - process_file calls purge_file_closets before upsert on every mine, mirroring the existing drawer purge. 2. Searcher returned whole-file blobs from the closet path while the direct path returned chunk-level drawers. Refactored: - _extract_drawer_ids_from_closet parses the `→drawer_a,drawer_b` pointers out of closet documents. - _closet_first_hits hydrates exactly those drawer IDs (chunk-level), not collection.get(where=source_file) (which returned everything). - Same hit shape as direct-search path; both now carry matched_via. 3. max_distance was bypassed on the closet path. Now applied per-hit; when every closet candidate gets filtered, _closet_first_hits returns None and the caller falls through to direct drawer search. 4. Entity extraction caught sentence-starters like "When", "The", "After" as proper nouns. Added _ENTITY_STOPLIST (~40 common false positives + day/month names + role words). Real names like Igor / Milla still survive — covered by tests. 5. CLOSETS.md drifted from the code (claimed "replaced via upsert" but code appended; claimed BM25 hybrid that doesn't exist; claimed a 10K char hydration cap that wasn't enforced). Rewritten to describe what actually ships, with explicit notes on the BM25 / convo-closet follow-ups. 6. Zero tests for ~250 lines. Added tests/test_closets.py with 17 cases: - build_closet_lines: pointer shape, header extraction, stoplist filtering (with regression case for "When/After/The"), real-name survival, fallback-line guarantee, drawer-ref slicing. - upsert_closet_lines: pure overwrite semantics (regression for the append bug), char-limit packing without splitting lines. - purge_file_closets: scoped to source_file, doesn't touch others. - End-to-end miner rebuild: re-mining a file with fewer topics fully purges leftover numbered closets from the larger first run. - _extract_drawer_ids_from_closet: parsing + dedup edge cases. - search_memories closet-first: fallback when empty, chunk-level hits with matched_via, no whole-file glue, max_distance enforced. Merge resolutions: miner.py imports combined NORMALIZE_VERSION/mine_lock from develop with the closet helpers from this branch. process_file auto-merged cleanly (closet block sits inside develop's lock body). 724/724 tests pass. ruff + format clean under CI-pinned 0.4.x.	2026-04-13 17:00:55 -03:00
Igor Lins e Silva	7e5eeda9a5	feat(normalize): auto-rebuild stale drawers via NORMALIZE_VERSION schema gate Without this, the strip_noise improvement only helps new mines. Every user who had already mined Claude Code JSONL sessions would keep their noise-polluted drawers forever, because convo_miner's file_already_mined skip short-circuits before re-processing. Adds a versioned schema gate so upgrades propagate silently: - palace.NORMALIZE_VERSION=2 — bumped when the normalization pipeline changes shape (this PR's strip_noise is the v1→v2 bump). - file_already_mined now returns False if the stored normalize_version is missing or less than current, triggering a rebuild on next mine. - Both miners stamp drawers with the current normalize_version. - convo_miner now purges stale drawers before inserting fresh chunks (mirrors miner.py's existing delete+insert), extracted into _file_convo_chunks helper to keep mine_convos under ruff's C901 limit. User experience: upgrade mempalace, run `mempalace mine` as usual, old noisy drawers get silently replaced with clean ones. No erase needed, no "you need to rebuild" changelog footgun. Tests: - test_file_already_mined_returns_false_for_stale_normalize_version — pins the version gate contract for missing/v1/current. - test_add_drawer_stamps_normalize_version — fresh project-miner drawers carry the field. - test_mine_convos_rebuilds_stale_drawers_after_schema_bump — end-to-end proof that a pre-v2 palace gets silently cleaned on next mine, with orphan drawers purged and NOT skipped. Existing test_file_already_mined_check_mtime updated to include the new field; all other tests unaffected.	2026-04-13 16:20:55 -03:00
Igor Lins e Silva	ca2598a9f6	fix(normalize): make strip_noise verbatim-safe and scope it to Claude Code JSONL The initial strip_noise() regressed on three fronts when audited against adversarial user content — each verified with executable repros against the cherry-picked code: 1. `<tag>.?</tag>` with re.DOTALL span-ate across messages: one stray unclosed <system-reminder> anywhere in a session merged with the next closing tag, silently deleting everything between them (including full assistant replies). 2. `.$ctrl\+o to expand$.\n?` nuked entire lines of user prose whenever a user happened to document the TUI shortcut. 3. `Ran \d+ (?:stop\|pre\|post)\shook.` with IGNORECASE ate the second sentence from "our CI has a stop hook ... Ran 2 stop hooks last week" — legitimate user commentary. These are unambiguous violations of the project's "Verbatim always" design principle. Fixes: - All tag patterns are now line-anchored (`(?m)^(?:> )?<tag>`) and their body forbids crossing a blank line (`(?:(?!\n\s\n)[\s\S])*?`), so a dangling open tag cannot eat neighboring messages. - `_NOISE_LINE_PREFIXES` are line-anchored and case-sensitive — user prose mentioning "CURRENT TIME:" mid-sentence is preserved. - Hook-run chrome requires `(?m)^`, explicit hook names (Stop, PreCompact, PreToolUse, etc.), and no IGNORECASE. - "… +N lines" is line-anchored. - "(ctrl+o to expand)" only matches Claude Code's actual collapsed- output chrome shape `[N tokens] (ctrl+o to expand)`; a bare parenthetical in user prose stays intact. Scope: - `strip_noise()` is no longer called on every normalization path. Only `_try_claude_code_jsonl` invokes it, per-extracted-message — so Claude.ai exports, ChatGPT exports, Slack JSON, Codex JSONL, and plain text with `>` markers pass through fully verbatim. Per-message application also makes span-eating structurally impossible. Tests: - 15 new tests in test_normalize.py pin the boundary: 6 guard user content that must survive (each of the adversarial repros), 9 assert real system chrome is still stripped. All pass; full suite 702 pass (2 failures are the unrelated pre-existing version.py bug, cleared by #820). Known limitation (not fixed here): convo_miner.py does not delete drawers on re-mine, so transcripts mined before this PR keep noise- filled drawers until the user manually erases + re-mines. Proper fix needs a schema-version field on drawer metadata + re-mine trigger — out of scope for this PR.	2026-04-13 16:11:03 -03:00
Igor Lins e Silva	8e446f904c	fix(search): hybrid closet+drawer retrieval — closets boost, never gate (#795 )	2026-04-13 04:43:54 -07:00
Igor Lins e Silva	4d581cbb73	feat: optional LLM-based closet regeneration — bring-your-own endpoint Adds mempalace/closet_llm.py as an OPTIONAL path for richer closet generation. Regex closets remain the default and cover the local-first promise; users who want LLM-quality topics can bring their own endpoint. Configuration (env or CLI flag): LLM_ENDPOINT — OpenAI-compatible base URL (required) LLM_KEY — bearer token (optional; local inference skips this) LLM_MODEL — model name (required) Works with Ollama, vLLM, llama.cpp servers, OpenAI, OpenRouter, and any other provider that speaks OpenAI-compatible /chat/completions. Zero new dependencies — uses stdlib urllib. Replaces the original Anthropic-SDK-hardcoded version of this module from Milla's branch (commit 935f657). Same prompt, same parsing, same regenerate_closets flow; only the transport was generalised so the feature doesn't lock users into a specific vendor or require API keys for core memory operations (CLAUDE.md, "Local-first, zero API"). Includes 13 unit tests covering config resolution, request shape, auth-header omission when no key is set, code-fence stripping, and missing-config error path. All mocked — zero network calls in tests. Co-Authored-By: MSL <232237854+milla-jovovich@users.noreply.github.com>	2026-04-13 07:51:46 -03:00
Igor Lins e Silva	e2a9bb05d3	test: add TestTunnels for cross-wing tunnel operations Appended from Milla's omnibus test_closets.py — covers create, list, delete, dedup, and follow_tunnels behavior. 21/21 pass. Co-Authored-By: MSL <232237854+milla-jovovich@users.noreply.github.com>	2026-04-13 07:44:32 -03:00
Igor Lins e Silva	f72ffbbcb2	test: add tests for mine_lock, closets, entity metadata, BM25, diary Trimmed version of Milla's omnibus test_closets.py to only cover features present in this PR stack (#784 lock, #788 closets, this PR's entity/BM25/diary). Strip-noise tests will land with #785; tunnel tests will land with the tunnels PR. 16/16 pass. Co-Authored-By: MSL <232237854+milla-jovovich@users.noreply.github.com>	2026-04-13 07:42:25 -03:00
Mikhail Valentsev	a2432a3245	fix: parse Claude.ai privacy export with messages key and sender field (#677 ) (#685 ) * fix: parse Claude.ai privacy export with messages key and sender field (#677) The privacy-export branch in _try_claude_ai_json only checked for the "chat_messages" key, missing exports that use "messages" instead. It also only read the "role" field while real privacy exports use "sender". Both gaps caused the file to fall through to plain-text, producing a single giant drawer. Changes: - Accept "messages" alongside "chat_messages" in the conversation-object guard and inner extraction. - Accept "sender" alongside "role" as the author field. - Fall back to a top-level "text" key when content blocks are empty. - Produce one transcript per conversation instead of concatenating all conversations into a single blob. - Extract shared logic into _collect_claude_messages helper. - Add 6 regression tests covering each variant. * style: apply ruff format to normalize.py * fix: guard against null text field in Claude.ai export parsing item.get("text", "").strip() crashes when "text" is explicitly null in the JSON (legal and observed in some exports). Use (item.get("text") or "").strip() and add a regression test. --------- Co-authored-by: Igor Lins e Silva <4753812+igorls@users.noreply.github.com>	2026-04-13 02:11:03 -03:00
Igor Lins e Silva	e200ce2c8a	fix: detect mtime changes in _get_client to prevent stale HNSW index (#757 ) When external tools write to the palace database (CLI mining, scripts), the MCP server's cached ChromaDB collection becomes stale — its HNSW index doesn't know about new vectors. Develop already invalidates on inode changes (catches rebuilds) but not on mtime changes (misses in-place writes). This PR: - Adds st_mtime tracking alongside st_ino in _get_client; invalidates the cached client on either change. - Adds the mempalace_reconnect MCP tool for explicit cache flush. Original author: @jphein (#663). Original approval: @Ari4ka. Skips test_missing_db_invalidates_cache on Windows (ChromaDB holds chroma.sqlite3 open).	2026-04-13 01:53:13 -03:00
shafdev	f4226047cb	fix: hash full content in tool_add_drawer drawer ID (#716 ) * fix: hash full content in tool_add_drawer drawer ID * style: apply ruff format * style: fix ruff format for CI ruff 0.4.x	2026-04-13 01:40:46 -03:00
copilot-swe-agent[bot]	c383523768	chore: clarify security guardrails Agent-Logs-Url: https://github.com/MemPalace/mempalace/sessions/775f2fc4-3051-462e-8586-6d694b55da0d Co-authored-by: igorls <4753812+igorls@users.noreply.github.com>	2026-04-12 22:19:58 -03:00
copilot-swe-agent[bot]	b1a676fa24	fix: make quote trimming explicit Agent-Logs-Url: https://github.com/MemPalace/mempalace/sessions/775f2fc4-3051-462e-8586-6d694b55da0d Co-authored-by: igorls <4753812+igorls@users.noreply.github.com>	2026-04-12 22:19:58 -03:00
copilot-swe-agent[bot]	248ecd98f1	fix: polish sanitizer and repair messaging Agent-Logs-Url: https://github.com/MemPalace/mempalace/sessions/775f2fc4-3051-462e-8586-6d694b55da0d Co-authored-by: igorls <4753812+igorls@users.noreply.github.com>	2026-04-12 22:19:58 -03:00
copilot-swe-agent[bot]	d2d4e62543	test: expand security regression coverage Agent-Logs-Url: https://github.com/MemPalace/mempalace/sessions/775f2fc4-3051-462e-8586-6d694b55da0d Co-authored-by: igorls <4753812+igorls@users.noreply.github.com>	2026-04-12 22:19:58 -03:00
copilot-swe-agent[bot]	c478dfa173	fix: harden palace security checks Agent-Logs-Url: https://github.com/MemPalace/mempalace/sessions/775f2fc4-3051-462e-8586-6d694b55da0d Co-authored-by: igorls <4753812+igorls@users.noreply.github.com>	2026-04-12 22:19:58 -03:00
Mikhail Valentsev	87e8bafad8	fix: prevent convo_miner from re-processing 0-chunk files on every run (#654 ) (#732 ) * fix: register 0-chunk files to prevent re-processing on every mine (#654) mine_convos() has three early-exit paths (OSError, content too short, zero chunks) that skip writing anything to ChromaDB. Since file_already_mined() checks for the presence of a document with a matching source_file, these files are re-read and re-processed on every subsequent run. Add _register_file() that upserts a lightweight sentinel document (room="_registry", ingest_mode="registry") so file_already_mined() returns True on future runs. Note: Bug 2 from the issue (drawers_added counter always 0) was already resolved upstream via the switch from collection.add() to collection.upsert(). * fix: resolve macOS path symlink in test + remove unused variable	2026-04-12 14:25:34 -07:00
shafdev	d52d6c9622	fix: store full AI response in convo_miner exchange chunking (#695 )	2026-04-12 14:23:52 -07:00
Mikhail Valentsev	091c2fe1c6	fix: mine --dry-run TypeError on files with room=None (#586 ) (#687 ) * fix: return "general" room from process_file error paths (#586) process_file() returned (0, None) for already-mined, unreadable, and too-short files. In --dry-run mode the caller always enters the room_counts branch, so None ended up as a dict key and crashed the summary printer with "unsupported format string passed to NoneType.__format__". Returning "general" instead of None makes the function contract explicit: it always yields (int, str). This matches the consensus fix discussed in the issue thread. * style: apply ruff format to test_miner.py	2026-04-12 14:23:44 -07:00
Jeffrey Hein	6e2ced3287	fix: allow Unicode in sanitize_name() — Latvian, CJK, Cyrillic (#637 ) (#683 ) * fix: allow Unicode in sanitize_name() — Latvian, CJK, Cyrillic names (#637) _SAFE_NAME_RE was ASCII-only ([a-zA-Z0-9]), rejecting valid Unicode names like "Jānis" or "太郎". Changed to \w which matches Unicode word characters (letters, digits, underscore) in Python 3. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: tighten Unicode regex, add sanitize_name tests Use [^\W_] for first/last char to allow Unicode letters/digits but reject leading/trailing underscores (Copilot feedback). Add 7 tests covering Latvian, CJK, Cyrillic, path traversal, and edge cases. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-12 14:23:34 -07:00
7. Sun	15d9ee1b51	fix: close KnowledgeGraph SQLite connections in test fixtures (#450 ) Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-11 23:14:23 -07:00
Jeffrey Hein	abc99f4154	fix: auto-repair BLOB seq_ids from chromadb 0.6→1.5 migration (#664 ) Note from code review: (1) silent exception swallow on migration failure means caller proceeds with potentially corrupt DB — consider returning a boolean or re-raising in a follow-up. (2) No blob length validation before int.from_bytes — malformed rows could produce wrong seq_id values. Both are edge cases; the fix is still valuable for the common chromadb 0.6→1.5 migration path.	2026-04-11 23:06:01 -07:00
Ben Sigman	4621f85d7c	style: ruff format all Python files (#675 )	2026-04-11 22:59:34 -07:00
Ben Sigman	20c8f8e57b	feat: new MCP tools — get/list/update drawer, hook settings, export (resolves #635 ) (#667 ) * feat: MCP reliability — inode detection, WAL rotation, metadata cache, search limits Infrastructure hardening for the MCP server: - Detect palace DB replacement via inode tracking (repair command support) - WAL rotation to prevent unbounded WAL growth - _fetch_all_metadata() + _get_cached_metadata() with 60s TTL for taxonomy/status - _MAX_RESULTS cap (100) with limit clamping [1, _MAX_RESULTS] - max_distance parameter for similarity threshold in search - Handle all notifications/* methods, null arguments, method=None - Remove duplicate _client_cache = None declarations - searcher.py max_distance parameter passthrough Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: new MCP tools (get/list/update drawer, hook settings, memories filed), export, normalize New MCP tools: - mempalace_get_drawer: fetch single drawer by ID with full content - mempalace_list_drawers: paginated listing with wing/room filter - mempalace_update_drawer: update content/wing/room on existing drawers - mempalace_hook_settings: get/set hook behavior (silent_save, desktop_toast) - mempalace_memories_filed_away: check latest checkpoint status Also includes: - exporter.py: export palace as browsable markdown files - normalize.py: tool_use/tool_result capture for richer transcript mining - layers.py: updated for new tool integration - config.py: hook settings properties (hook_silent_save, hook_desktop_toast) Depends on PR 3 (reliability) for _MAX_RESULTS, _metadata_cache, WAL logging. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: normalize.py handles string messages and Read offset type mismatch Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: params null guard, L2→cosine docs, empty tool_use_map key guard - Handle explicit null in MCP params (request.get("params") or {}) - Fix search tool description: L2 → cosine distance (collection uses hnsw:space=cosine) - Guard against empty string key in tool_use_map from malformed JSONL entries Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: rename ambiguous var 'l' to 'line' (E741 lint) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address code review findings (5 issues) 1. min_similarity backwards-compat: convert similarity to distance scale (1.0 - similarity) instead of passing raw value as max_distance 2. Restore structured error reporting (error + partial fields) in tool_status, tool_list_wings, tool_list_rooms, tool_get_taxonomy — reverts silent except:pass that dropped #647 security hardening 3. inode cache: remove falsy-zero short-circuit so missing DB file triggers reconnect instead of reusing stale client 4. _fetch_all_metadata: check for empty batch before extending/advancing offset to prevent infinite loop on concurrent deletion 5. KG initialization: only override path when --palace is explicit; default runs use KnowledgeGraph's built-in default path Co-authored-by: jphein <jphein@users.noreply.github.com> --------- Co-authored-by: jp <jp@jphein.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: jphein <jphein@users.noreply.github.com>	2026-04-11 21:25:04 -07:00
Sergey Kuznetsov	ae5196bc8d	Мempalace backend seam (#413 ) * refactor: add stage-1 backend abstraction seam Introduce the first upstreamable storage seam for MemPalace without bringing in the PostgreSQL spike or any benchmark artifacts. This change adds a small backend package with: - BaseCollection as the minimal collection contract - ChromaBackend/ChromaCollection as the default implementation It then routes the main runtime collection consumers through that seam: - palace.py - searcher.py - layers.py - palace_graph.py - mcp_server.py - miner.status() Behavioral constraints kept for stage 1: - ChromaDB remains the only backend and the default path - no config/env backend selection yet - no PostgreSQL code - no benchmark or research files - existing tests stay unchanged Important compatibility details: - read paths now call the seam with create=False so they still surface the existing 'no palace found' behavior instead of silently creating empty collections - write paths keep create=True semantics through palace.get_collection() - layers/searcher retain a chromadb module attribute so the existing mock-based tests can keep patching PersistentClient unchanged - ChromaBackend only creates palace directories on create=True, which preserves mocked read-path tests that use fake read-only paths Verification: - python3 -m py_compile mempalace/backends/__init__.py mempalace/backends/base.py mempalace/backends/chroma.py mempalace/palace.py mempalace/searcher.py mempalace/layers.py mempalace/palace_graph.py mempalace/mcp_server.py mempalace/miner.py - pytest -q # 529 passed, 106 deselected * refactor: clean up stage-1 seam compatibility shims Tighten the stage-1 backend abstraction branch after review. This follow-up does three small things: - keep the chromadb compatibility hook in searcher.py and layers.py, but express it through the backends.chroma module so it no longer reads like an accidental unused import - fix the palace_graph.py helper alias to avoid the local name collision flagged by ruff (imported helper vs local _get_collection wrapper) - preserve the existing mock-based test patch points unchanged while keeping the new backend seam intact Why this matters: - the direct form looked like a dead import in review, even though it was intentionally preserving the existing test seam ( and ) - palace_graph.py had a real lint issue ( redefinition) that was small but worth fixing before a public PR Verification: - /opt/homebrew/bin/ruff check mempalace/backends/__init__.py mempalace/backends/base.py mempalace/backends/chroma.py mempalace/palace.py mempalace/searcher.py mempalace/layers.py mempalace/palace_graph.py mempalace/mcp_server.py mempalace/miner.py - pytest -q tests/test_layers.py tests/test_searcher.py - pytest -q # 529 passed, 106 deselected * docs: explain backend shim imports in search paths Add short code comments in searcher.py and layers.py explaining why the module-level `chromadb` alias remains after the stage-1 backend seam refactor. The alias is intentional: it preserves the existing mock patch points used by the current test suite (`mempalace.searcher.chromadb.PersistentClient` and `mempalace.layers.chromadb.PersistentClient`) while the runtime logic now flows through the backend abstraction. This keeps the public PR easier to review because the apparent "unused import" now has an explicit reason next to it. Verification: - /opt/homebrew/bin/ruff check mempalace/searcher.py mempalace/layers.py - pytest -q tests/test_layers.py tests/test_searcher.py * refactor: reuse a default backend instance in palace helper Tighten the stage-1 backend seam by promoting the default Chroma backend adapter to a module-level singleton in `mempalace/palace.py`. This keeps the stage-1 scope unchanged — Chroma is still the only backend wired in this branch — but avoids constructing a fresh `ChromaBackend()` object on every `get_collection()` call. The backend is stateless today, so this is a readability/cleanup change rather than a behavioral one. Why this helps: - makes `palace.get_collection()` read like a real default factory instead of an inline constructor call - keeps the stage-1 branch a little cleaner before opening the public PR - does not widen the backend surface or change any config/runtime behavior Verification: - python3 -m py_compile mempalace/palace.py - pytest -q tests/test_miner.py tests/test_layers.py tests/test_searcher.py - pytest -q # 529 passed, 106 deselected * fix: harden read-only seam behavior and update seam tests Preserve the stage-1 backend abstraction while closing the real read-path regression surfaced in PR review. What changed: - make ChromaBackend.get_collection(create=False) fail fast when the palace directory does not exist instead of letting PersistentClient create it as a side effect - update miner.status() to call get_collection(..., create=False) so status keeps the historical 'No palace found' behavior - remove the temporary chromadb shim aliases from layers.py and searcher.py now that the tests patch the seam directly - add focused tests for the new backends package, including ChromaCollection delegation and ChromaBackend create=True/create=False behavior - retarget layer/searcher tests to patch the backend seam instead of patching chromadb.PersistentClient inside production modules - add a regression test that status() does not create an empty palace when the target path is missing Verification: - ruff check . - uv run pytest -q - uv run pytest -q tests/test_backends.py tests/test_cli.py tests/test_mcp_server.py tests/test_layers.py tests/test_searcher.py tests/test_miner.py Notes: - the separate benchmark/slow/stress layer was started as a soak but not used as the merge gate for this PR branch * refactor: drop duplicate mcp collection cache declaration Remove a redundant `_collection_cache = None` assignment in `mempalace/mcp_server.py` left over after the stage-1 backend seam refactor. This does not change behavior; it only trims review noise in the MCP server module after the read-path hardening pass. Verification: - ruff check mempalace/mcp_server.py - uv run pytest -q tests/test_mcp_server.py --------- Co-authored-by: Sergey Kuznetsov <sergey@iterudit.com>	2026-04-11 16:16:49 -07:00
grtninja	154e8a78ec	fix: implement MCP ping health checks (#600 )	2026-04-11 16:16:37 -07:00
Arnold Wender	89c0a58271	fix: align cmd_compress dict keys with compression_stats() return values (#569 ) * fix: align cmd_compress dict keys with compression_stats() return values * test: align compress test mocks with actual compression_stats() keys * fix: address review — add Total: assertion, move stats key test to test_dialect.py	2026-04-11 16:16:31 -07:00
Ahmad Othman Ammar Adi.	9c4b7302cc	fix: skip unreachable reparse points in detect_rooms_from_folders (#558 ) On Windows, projects containing git-submodule junctions or dev-drive reparse points cause iterdir() to list the entry successfully but Path.is_dir() to raise OSError when it calls stat() internally. Reproducer: any Windows project with a submodule checked out as a junction (e.g. skills/pr-perfect) crashes mempalace init with: OSError: [WinError 448] The path cannot be traversed because it contains an untrusted mount point Fix: wrap every is_dir() call in detect_rooms_from_folders with try/except OSError so the scanner skips inaccessible entries and continues rather than aborting. Covers both the top-level pass and the one-level-deep nested pass. Two new tests mock the OSError on specific paths and verify the function returns correct rooms from the remaining accessible entries.	2026-04-11 16:16:06 -07:00
Ben Sigman	ad806cf3f8	Merge branch 'main' into fix/query-sanitizer-prompt-contamination	2026-04-10 22:39:31 -07:00
MSL	e30c283fd8	style: ruff format Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-10 08:49:35 -07:00
MSL	15c5a528ed	test: add 33 tests for repair.py and dedup.py - 18 tests for repair (scan, prune, rebuild, edge cases) - 15 tests for dedup (grouping, dedup logic, wing filter, stats) - Fixes coverage drop from adding new modules Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-10 08:45:27 -07:00
Kevin Pulikkottil	2981433535	fix: add mcp command with setup guidance (#315 ) * fix: add mcp command with setup guidance * fix: include --palace guidance in mcp command output * fix: make mcp guidance commands copy-pastable --------- Co-authored-by: Milla J <millaj1217@gmail.com>	2026-04-09 11:21:18 -07:00
bensig	b1adc047e6	fix: address Octocode review — move size check, add tests for all 3 fixes - Move file size check before try block so IOError propagates cleanly (not caught by the except OSError handler below it) - Wrap os.path.getsize in its own try/except to preserve existing test_normalize_io_error behavior on missing files - Add test_normalize_rejects_large_file (mocked getsize) - Add test_null_arguments_does_not_hang (#394) - Add test_cmd_repair_trailing_slash_does_not_recurse (#395) 532 tests pass locally, 0 regressions.	2026-04-09 10:40:53 -07:00
bensig	58b8d5b198	fix: release ChromaDB handles before rmtree on Windows	2026-04-09 09:31:55 -07:00
bensig	1c48f4d2c3	fix: use os.utime in mtime test for Windows compatibility	2026-04-09 09:23:08 -07:00
Ben Sigman	e293e290d5	Merge branch 'main' into fix/mcp-protocol-version-negotiation	2026-04-09 09:15:06 -07:00
bensig	2448ac0026	test: add coverage for file_already_mined mtime check Covers the check_mtime=True path in palace.py to meet 85% coverage threshold.	2026-04-09 08:56:28 -07:00
Ben Sigman	725fa2b6f1	Merge branch 'main' into fix/query-sanitizer-prompt-contamination	2026-04-09 08:11:39 -07:00
Ben Sigman	70f2160bd6	Merge branch 'main' into fix/mcp-protocol-version-negotiation	2026-04-09 08:09:57 -07:00
matrix9neonebuchadnezzar2199-sketch	7509a72502	fix: mitigate system prompt contamination in search queries (#333 ) Addresses Issue #333: AI agents prepending system prompts to search queries causes embedding retrieval to collapse (89.8% → 1.0% R@10). Mitigation approach (減災): - New query_sanitizer.py with 4-stage pipeline: Step 1: passthrough for short queries (≤200 chars) Step 2: question extraction (finds ? sentences) → ~85-89% recovery Step 3: tail sentence extraction → ~80-89% recovery Step 4: tail truncation fallback → ~70-80% recovery Worst case without sanitizer: 1.0% (catastrophic) Worst case with sanitizer: ~70-80% (survivable) - mcp_server.py: tool_search applies sanitizer before ChromaDB query - MCP schema: query description warns agents not to include prompts - New 'context' parameter separates background info from search intent - Sanitizer metadata included in response when triggered 22 new tests covering all pipeline stages and real-world scenarios. Made-with: Cursor	2026-04-09 23:28:59 +09:00
Tal Muskal	da64016a94	fix: format test_layers_bench.py with ruff to pass CI lint Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-09 08:24:51 +03:00

1 2

87 Commits