mempalace

Author	SHA1	Message	Date
Igor Lins e Silva	8e446f904c	fix(search): hybrid closet+drawer retrieval — closets boost, never gate (#795 )	2026-04-13 04:43:54 -07:00
MSL	971b92da5d	feat(search): drawer-grep returns best-matching chunk + neighbors When a closet hit leads to a source file with many drawers, grep each chunk for query terms and return the BEST-MATCHING chunk + 1 neighbor on each side, instead of dumping the whole file truncated at MAX_HYDRATION_CHARS. Result now includes drawer_index and total_drawers so callers can request adjacent drawers explicitly. Extracted from Milla's commit 935f657 which bundled drawer-grep with closet_llm (deferred pending LLM_ENDPOINT refactor) and fact_checker (separate PR). Ported only the searcher.py change. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-13 07:46:07 -03:00
MSL	f935e85ead	feat: entity metadata + diary ingest + BM25 hybrid search Three features that close the gap between the architecture docs and the actual codebase: 1. Entity metadata on drawers and closets - _extract_entities_for_metadata() pulls names from known_entities.json + proper nouns appearing 2+ times - Stamped as "entities" field in ChromaDB metadata - Enables filterable search by person/project name 2. Day-based diary ingest (diary_ingest.py) - ONE drawer per day, upserted as the day grows - Closets pack topics atomically, never split mid-topic - Tracks entry count in state file, only processes new entries - Usage: python -m mempalace.diary_ingest --dir ~/summaries 3. BM25 hybrid search in searcher.py - _bm25_score() keyword matching complements vector similarity - _hybrid_rank() combines both signals (60% vector, 40% BM25) - Catches exact name/term matches that embeddings miss - Applied to both closet-first and direct drawer search paths 689/689 tests pass. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-13 07:40:32 -03:00
MSL	d3d7184f4e	feat: add closet layer — searchable index pointing to drawers The closet architecture was always part of MemPalace's design but never shipped in the public codebase. This adds it. Palace now has TWO collections: - mempalace_drawers — full verbatim content (unchanged) - mempalace_closets — compact AAAK-style index entries How it works: - When mining, each file gets a closet alongside its drawers - Closet contains extracted topics, entities, quotes as pointers - Closets pack up to 1500 chars, topics never split mid-entry - Search hits closets first (fast, small), then hydrates the full drawer content for matching files - Falls back to direct drawer search if no closets exist yet Files changed: - palace.py: get_closets_collection(), build_closet_text(), upsert_closet(), CLOSET_CHAR_LIMIT - miner.py: process_file() now creates closets after drawers - searcher.py: search_memories() tries closet-first search, hydrates drawers, falls back to direct search Backwards compatible — existing palaces without closets continue to work via the fallback path. Closets are created on next mine. 689/689 tests pass. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-13 07:37:17 -03:00
Ben Sigman	20c8f8e57b	feat: new MCP tools — get/list/update drawer, hook settings, export (resolves #635 ) (#667 ) * feat: MCP reliability — inode detection, WAL rotation, metadata cache, search limits Infrastructure hardening for the MCP server: - Detect palace DB replacement via inode tracking (repair command support) - WAL rotation to prevent unbounded WAL growth - _fetch_all_metadata() + _get_cached_metadata() with 60s TTL for taxonomy/status - _MAX_RESULTS cap (100) with limit clamping [1, _MAX_RESULTS] - max_distance parameter for similarity threshold in search - Handle all notifications/* methods, null arguments, method=None - Remove duplicate _client_cache = None declarations - searcher.py max_distance parameter passthrough Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: new MCP tools (get/list/update drawer, hook settings, memories filed), export, normalize New MCP tools: - mempalace_get_drawer: fetch single drawer by ID with full content - mempalace_list_drawers: paginated listing with wing/room filter - mempalace_update_drawer: update content/wing/room on existing drawers - mempalace_hook_settings: get/set hook behavior (silent_save, desktop_toast) - mempalace_memories_filed_away: check latest checkpoint status Also includes: - exporter.py: export palace as browsable markdown files - normalize.py: tool_use/tool_result capture for richer transcript mining - layers.py: updated for new tool integration - config.py: hook settings properties (hook_silent_save, hook_desktop_toast) Depends on PR 3 (reliability) for _MAX_RESULTS, _metadata_cache, WAL logging. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: normalize.py handles string messages and Read offset type mismatch Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: params null guard, L2→cosine docs, empty tool_use_map key guard - Handle explicit null in MCP params (request.get("params") or {}) - Fix search tool description: L2 → cosine distance (collection uses hnsw:space=cosine) - Guard against empty string key in tool_use_map from malformed JSONL entries Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: rename ambiguous var 'l' to 'line' (E741 lint) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address code review findings (5 issues) 1. min_similarity backwards-compat: convert similarity to distance scale (1.0 - similarity) instead of passing raw value as max_distance 2. Restore structured error reporting (error + partial fields) in tool_status, tool_list_wings, tool_list_rooms, tool_get_taxonomy — reverts silent except:pass that dropped #647 security hardening 3. inode cache: remove falsy-zero short-circuit so missing DB file triggers reconnect instead of reusing stale client 4. _fetch_all_metadata: check for empty batch before extending/advancing offset to prevent infinite loop on concurrent deletion 5. KG initialization: only override path when --palace is explicit; default runs use KnowledgeGraph's built-in default path Co-authored-by: jphein <jphein@users.noreply.github.com> --------- Co-authored-by: jp <jp@jphein.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: jphein <jphein@users.noreply.github.com>	2026-04-11 21:25:04 -07:00
Sergey Kuznetsov	ae5196bc8d	Мempalace backend seam (#413 ) * refactor: add stage-1 backend abstraction seam Introduce the first upstreamable storage seam for MemPalace without bringing in the PostgreSQL spike or any benchmark artifacts. This change adds a small backend package with: - BaseCollection as the minimal collection contract - ChromaBackend/ChromaCollection as the default implementation It then routes the main runtime collection consumers through that seam: - palace.py - searcher.py - layers.py - palace_graph.py - mcp_server.py - miner.status() Behavioral constraints kept for stage 1: - ChromaDB remains the only backend and the default path - no config/env backend selection yet - no PostgreSQL code - no benchmark or research files - existing tests stay unchanged Important compatibility details: - read paths now call the seam with create=False so they still surface the existing 'no palace found' behavior instead of silently creating empty collections - write paths keep create=True semantics through palace.get_collection() - layers/searcher retain a chromadb module attribute so the existing mock-based tests can keep patching PersistentClient unchanged - ChromaBackend only creates palace directories on create=True, which preserves mocked read-path tests that use fake read-only paths Verification: - python3 -m py_compile mempalace/backends/__init__.py mempalace/backends/base.py mempalace/backends/chroma.py mempalace/palace.py mempalace/searcher.py mempalace/layers.py mempalace/palace_graph.py mempalace/mcp_server.py mempalace/miner.py - pytest -q # 529 passed, 106 deselected * refactor: clean up stage-1 seam compatibility shims Tighten the stage-1 backend abstraction branch after review. This follow-up does three small things: - keep the chromadb compatibility hook in searcher.py and layers.py, but express it through the backends.chroma module so it no longer reads like an accidental unused import - fix the palace_graph.py helper alias to avoid the local name collision flagged by ruff (imported helper vs local _get_collection wrapper) - preserve the existing mock-based test patch points unchanged while keeping the new backend seam intact Why this matters: - the direct form looked like a dead import in review, even though it was intentionally preserving the existing test seam ( and ) - palace_graph.py had a real lint issue ( redefinition) that was small but worth fixing before a public PR Verification: - /opt/homebrew/bin/ruff check mempalace/backends/__init__.py mempalace/backends/base.py mempalace/backends/chroma.py mempalace/palace.py mempalace/searcher.py mempalace/layers.py mempalace/palace_graph.py mempalace/mcp_server.py mempalace/miner.py - pytest -q tests/test_layers.py tests/test_searcher.py - pytest -q # 529 passed, 106 deselected * docs: explain backend shim imports in search paths Add short code comments in searcher.py and layers.py explaining why the module-level `chromadb` alias remains after the stage-1 backend seam refactor. The alias is intentional: it preserves the existing mock patch points used by the current test suite (`mempalace.searcher.chromadb.PersistentClient` and `mempalace.layers.chromadb.PersistentClient`) while the runtime logic now flows through the backend abstraction. This keeps the public PR easier to review because the apparent "unused import" now has an explicit reason next to it. Verification: - /opt/homebrew/bin/ruff check mempalace/searcher.py mempalace/layers.py - pytest -q tests/test_layers.py tests/test_searcher.py * refactor: reuse a default backend instance in palace helper Tighten the stage-1 backend seam by promoting the default Chroma backend adapter to a module-level singleton in `mempalace/palace.py`. This keeps the stage-1 scope unchanged — Chroma is still the only backend wired in this branch — but avoids constructing a fresh `ChromaBackend()` object on every `get_collection()` call. The backend is stateless today, so this is a readability/cleanup change rather than a behavioral one. Why this helps: - makes `palace.get_collection()` read like a real default factory instead of an inline constructor call - keeps the stage-1 branch a little cleaner before opening the public PR - does not widen the backend surface or change any config/runtime behavior Verification: - python3 -m py_compile mempalace/palace.py - pytest -q tests/test_miner.py tests/test_layers.py tests/test_searcher.py - pytest -q # 529 passed, 106 deselected * fix: harden read-only seam behavior and update seam tests Preserve the stage-1 backend abstraction while closing the real read-path regression surfaced in PR review. What changed: - make ChromaBackend.get_collection(create=False) fail fast when the palace directory does not exist instead of letting PersistentClient create it as a side effect - update miner.status() to call get_collection(..., create=False) so status keeps the historical 'No palace found' behavior - remove the temporary chromadb shim aliases from layers.py and searcher.py now that the tests patch the seam directly - add focused tests for the new backends package, including ChromaCollection delegation and ChromaBackend create=True/create=False behavior - retarget layer/searcher tests to patch the backend seam instead of patching chromadb.PersistentClient inside production modules - add a regression test that status() does not create an empty palace when the target path is missing Verification: - ruff check . - uv run pytest -q - uv run pytest -q tests/test_backends.py tests/test_cli.py tests/test_mcp_server.py tests/test_layers.py tests/test_searcher.py tests/test_miner.py Notes: - the separate benchmark/slow/stress layer was started as a soak but not used as the merge gate for this PR branch * refactor: drop duplicate mcp collection cache declaration Remove a redundant `_collection_cache = None` assignment in `mempalace/mcp_server.py` left over after the stage-1 backend seam refactor. This does not change behavior; it only trims review noise in the MCP server module after the read-path hardening pass. Verification: - ruff check mempalace/mcp_server.py - uv run pytest -q tests/test_mcp_server.py --------- Co-authored-by: Sergey Kuznetsov <sergey@iterudit.com>	2026-04-11 16:16:49 -07:00
Igor Lins e Silva	5ac4947d02	fix: preserve CLI exit codes, log tracebacks, sanitize search errors, validate fixture	2026-04-07 18:26:39 -03:00
Igor Lins e Silva	c9135aad67	fix: sanitize error responses and remove sys.exit from library code - Remove palace_path from _no_palace() error response (prevents leaking filesystem paths to the LLM) - Replace str(e) with generic 'Internal tool error' in MCP dispatch catch block (full error is still logged server-side via stderr) - Replace sys.exit(1) with return in searcher.search() CLI function (prevents process termination if called from library context) - Remove unused sys import from searcher.py Findings: #12 (HIGH), #5 (MEDIUM), #15 (LOW) Includes test infrastructure from PR #131. 92 tests pass.	2026-04-07 18:24:10 -03:00
Milla Jovovich	068dbd9a7b	MemPalace: palace architecture, AAAK compression, knowledge graph The memory system: - Palace structure: Wings (people/projects) → Rooms (topics) → Closets (AAAK compressed) → Drawers (verbatim transcripts) - Halls connect related rooms within a wing - Tunnels cross-reference rooms across wings - AAAK: 30x lossless compression dialect for AI agents - Knowledge graph: temporal entity-relationship triples (SQLite) - Palace graph: room-based navigation with tunnel detection - MCP server: 19 tools — search, graph traversal, agent diary, AAAK auto-teach - Onboarding: guided setup generates wing config + AAAK entity registry - Contradiction detection: catches wrong pronouns, names, ages - Auto-save hooks for Claude Code 96.6% Recall@5 on LongMemEval — highest zero-API score published. 100% with optional Haiku rerank (500/500). Local. Free. No API key required.	2026-04-04 18:16:04 -07:00

9 Commits