Commit Graph

205 Commits

Author SHA1 Message Date
Ahmad Othman Ammar Adi. 9c4b7302cc fix: skip unreachable reparse points in detect_rooms_from_folders (#558)
On Windows, projects containing git-submodule junctions or dev-drive
reparse points cause iterdir() to list the entry successfully but
Path.is_dir() to raise OSError when it calls stat() internally.

Reproducer: any Windows project with a submodule checked out as a
junction (e.g. skills/pr-perfect) crashes mempalace init with:
  OSError: [WinError 448] The path cannot be traversed because it
  contains an untrusted mount point

Fix: wrap every is_dir() call in detect_rooms_from_folders with
try/except OSError so the scanner skips inaccessible entries and
continues rather than aborting.

Covers both the top-level pass and the one-level-deep nested pass.
Two new tests mock the OSError on specific paths and verify the
function returns correct rooms from the remaining accessible entries.
2026-04-11 16:16:06 -07:00
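Note on #558: the guard described above can be sketched as follows. This is a minimal illustration, not the project's code: `safe_is_dir` and `list_accessible_dirs` are hypothetical names, and the real `detect_rooms_from_folders` may be structured differently.

```python
from pathlib import Path
from typing import List


def safe_is_dir(path: Path) -> bool:
    """Return True only if `path` is a directory that can actually be stat'ed."""
    try:
        return path.is_dir()
    except OSError:
        # For example WinError 448 on an untrusted junction or reparse point:
        # treat the entry as "not a usable directory" and keep scanning.
        return False


def list_accessible_dirs(root: Path) -> List[str]:
    """Names of the immediate subdirectories of `root` that are reachable."""
    return sorted(entry.name for entry in root.iterdir() if safe_is_dir(entry))
```

The same helper would be applied in both the top-level pass and the nested pass, so one unreachable entry is skipped without aborting the scan.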
Ben Sigman 1056018b52 Merge pull request #385 from matrix9neonebuchadnezzar2199-sketch/fix/query-sanitizer-prompt-contamination
fix: mitigate system prompt contamination in search queries (#333)
2026-04-10 23:05:28 -07:00
Ben Sigman ad806cf3f8 Merge branch 'main' into fix/query-sanitizer-prompt-contamination 2026-04-10 22:39:31 -07:00
Ben Sigman 2a1ac762e6 Merge pull request #371 from RhettOP/fix/issue-339-338-silent-exceptions-pagination
fix: paginate large collection reads and surface errors in MCP tools (#339, #338)
2026-04-10 22:36:36 -07:00
Ben Sigman 41d9d7adb9 Merge branch 'main' into fix/issue-339-338-silent-exceptions-pagination 2026-04-10 22:31:25 -07:00
Ben Sigman f184a86361 Merge pull request #598 from justinclift/fake_websites_warning_v1 2026-04-10 21:52:46 -07:00
Justin Clift b03ab482ef docs: Add warning to the README about fake MemPalace websites 2026-04-11 14:25:16 +10:00
Ben Sigman 309d9b0095 Merge branch 'main' into fix/issue-339-338-silent-exceptions-pagination 2026-04-10 09:34:46 -07:00
Ben Sigman b57c1603e3 Merge pull request #373 from RhettOP/fix/issue-347-codex-hook-message-counting
fix: count Codex user_message turns in _count_human_messages (#347)
2026-04-10 09:34:33 -07:00
Ben Sigman a9aaa45ccf Merge branch 'main' into fix/issue-347-codex-hook-message-counting 2026-04-10 09:25:58 -07:00
Ben Sigman 0cbbfba8ed Merge branch 'main' into fix/issue-339-338-silent-exceptions-pagination 2026-04-10 09:25:50 -07:00
Ben Sigman 2e8a5a7b7a Merge pull request #544 from milla-jovovich/fix/525-hnsw-bloat-dedup
fix: prevent HNSW index bloat from duplicate add() calls (#525)
2026-04-10 09:25:43 -07:00
Ben Sigman 91952044d6 Merge branch 'main' into fix/issue-347-codex-hook-message-counting 2026-04-10 09:23:37 -07:00
Ben Sigman 22454073a6 Merge branch 'main' into fix/issue-339-338-silent-exceptions-pagination 2026-04-10 09:23:01 -07:00
Ben Sigman d0c9f9b0c1 Merge branch 'main' into fix/525-hnsw-bloat-dedup 2026-04-10 09:22:36 -07:00
RhettOP 8a6e75eed8 fix: use len(rows) < batch_size early-exit instead of total-count loop bound
- Replace 'while offset < count/total' with 'while True' + break on short batch
- Fixes tool_list_rooms iterating over unfiltered col.count() when wing filter active
- Fixes all 4 paginated functions (tool_status, tool_list_wings, tool_list_rooms,
  tool_get_taxonomy) missing early-exit when batch smaller than batch_size
- Remove unused 'total' variable in tool_list_wings, tool_list_rooms, tool_get_taxonomy
  (replaced col.count() with accessibility check only)

Per bensig review comments on PR #371
2026-04-10 17:15:36 +01:00
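Note on 8a6e75eed8: the short-batch early exit described above can be sketched like this. The `fetch_page` callable is a hypothetical stand-in for a filtered, paginated collection read; it is not the signature used in mcp_server.py.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]
# Hypothetical fetcher: (limit, offset) -> at most `limit` rows.
FetchPage = Callable[[int, int], List[Row]]


def fetch_all(fetch_page: FetchPage, batch_size: int = 1000) -> List[Row]:
    """Read every row without relying on a total count as the loop bound."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    rows: List[Row] = []
    offset = 0
    while True:
        batch = fetch_page(batch_size, offset)
        rows.extend(batch)
        # A short (or empty) batch means there is nothing left to read.
        if len(batch) < batch_size:
            break
        offset += len(batch)
    return rows
```

Because the loop stops on the first short batch, it stays correct when a filter makes the number of matching rows smaller than the unfiltered count.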
MSL a868e16eaa fix: purge stale drawers before re-mine to avoid hnswlib segfault (#521)
Delete existing drawers for a file before re-inserting fresh chunks.
Converts re-mines from upsert (hnswlib updatePoint path, thread-unsafe
on macOS ARM + chromadb 0.6.3) into delete+insert (safe addPoint path).

Credit: @StefanKremen (#523)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 09:13:12 -07:00
bensig 60bea83e76 feat: mempalace migrate — recover palaces from different ChromaDB versions
Reads documents and metadata directly from ChromaDB's SQLite (bypassing
the API that fails on version-mismatched databases), then reimports into
a fresh palace using the currently installed ChromaDB.

Fixes the 3.0.0 → 3.1.0 upgrade path where chromadb was downgraded from
1.5.x to 0.6.x, breaking the on-disk storage format.

- Detects chromadb version from SQLite schema (0.6.x vs 1.x)
- Extracts all drawers with full metadata via raw SQL
- Builds fresh palace in temp dir, swaps atomically
- Backs up original palace before any changes
- Supports --dry-run to preview without modifying

Fixes #457
2026-04-10 08:50:40 -07:00
bensig afa30a9cca chore: improve agent readiness — AGENTS.md, dependabot, CODEOWNERS, labels
- Add AGENTS.md with build commands, project structure, conventions
- Add .github/dependabot.yml for automated pip + actions updates
- Add .github/CODEOWNERS for review routing
- Expand .gitignore (.env, .DS_Store, IDE configs, coverage, venvs)
- Add C901 complexity rule to ruff (max-complexity=25, benchmarks excluded)
- Add --durations=10 to pytest CI for test performance tracking
- Add docs/schema.sql for knowledge graph schema documentation
- Created P0-P3 priority + area/* + security/performance/docs labels
2026-04-10 08:50:40 -07:00
MSL e30c283fd8 style: ruff format
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 08:49:35 -07:00
MSL 15c5a528ed test: add 33 tests for repair.py and dedup.py
- 18 tests for repair (scan, prune, rebuild, edge cases)
- 15 tests for dedup (grouping, dedup logic, wing filter, stats)
- Fixes coverage drop from adding new modules

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 08:45:27 -07:00
MSL 8930b45f97 fix: add --wing filter to dedup, document threshold semantics
Addresses community feedback:
- Add --wing flag to scope dedup to a single wing (catches cross-wing
  duplicates when same source mined into multiple wings)
- Document that threshold is cosine distance (not similarity) with
  guidance on values: 0.15 for near-identical, 0.3-0.4 for paraphrased
- Confirmed shutil import is present in repair.py (line 32)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 08:42:20 -07:00
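Note on 8930b45f97: cosine distance is 1 minus cosine similarity, so a smaller threshold is stricter. The sketch below illustrates that relationship only. The function names and the `<=` comparison are assumptions, not taken from dedup.py.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Distance form: 0.0 for identical direction, 1.0 for orthogonal vectors."""
    return 1.0 - cosine_similarity(a, b)


def is_near_duplicate(
    a: Sequence[float], b: Sequence[float], threshold: float = 0.15
) -> bool:
    """Assumed rule: flag a pair when its distance is at or below the threshold."""
    return cosine_distance(a, b) <= threshold
```

Under this reading, a threshold of 0.15 corresponds to a similarity of at least 0.85, and 0.3-0.4 to a similarity of at least 0.6-0.7.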
MSL e641b80448 style: ruff check --fix
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 08:31:56 -07:00
Ben Sigman 559e43b2e9 Merge pull request #502 from milla-jovovich/fix/chromadb-version-migration
feat: mempalace migrate — recover palaces from different ChromaDB versions
2026-04-10 08:26:45 -07:00
MSL 71e8f2d054 fix: prevent HNSW index bloat from duplicate add() calls (#525)
Root cause: convo_miner.py used collection.add() instead of upsert(),
so repeated mine runs pushed duplicate entries into the HNSW graph.
At scale (50K+ drawers) this causes link_lists.bin to grow to terabytes
and eventually segfault.

Changes:
- convo_miner.py: add() → upsert() (the one-line root cause fix)
- repair.py: new module — scan for corrupt IDs, prune them, or rebuild
  the HNSW index from scratch. Backs up only chroma.sqlite3 (not the
  bloated HNSW files). Recreates collection with hnsw:space=cosine.
- dedup.py: new module — detect and remove near-duplicate drawers from
  the same source file using cosine similarity. No API calls.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 08:14:22 -07:00
bensig 2d7d7e080f feat: mempalace migrate — recover palaces from different ChromaDB versions
Reads documents and metadata directly from ChromaDB's SQLite (bypassing
the API that fails on version-mismatched databases), then reimports into
a fresh palace using the currently installed ChromaDB.

Fixes the 3.0.0 → 3.1.0 upgrade path where chromadb was downgraded from
1.5.x to 0.6.x, breaking the on-disk storage format.

- Detects chromadb version from SQLite schema (0.6.x vs 1.x)
- Extracts all drawers with full metadata via raw SQL
- Builds fresh palace in temp dir, swaps atomically
- Backs up original palace before any changes
- Supports --dry-run to preview without modifying

Fixes #457
2026-04-10 00:08:28 -07:00
Ben Sigman a3fec4f565 Merge branch 'main' into fix/issue-339-338-silent-exceptions-pagination 2026-04-09 23:31:45 -07:00
bensig 06963ddaed chore: improve agent readiness — AGENTS.md, dependabot, CODEOWNERS, labels
- Add AGENTS.md with build commands, project structure, conventions
- Add .github/dependabot.yml for automated pip + actions updates
- Add .github/CODEOWNERS for review routing
- Expand .gitignore (.env, .DS_Store, IDE configs, coverage, venvs)
- Add C901 complexity rule to ruff (max-complexity=25, benchmarks excluded)
- Add --durations=10 to pytest CI for test performance tracking
- Add docs/schema.sql for knowledge graph schema documentation
- Created P0-P3 priority + area/* + security/performance/docs labels
2026-04-09 23:29:26 -07:00
Ben Sigman a036b4300d Merge pull request #491 from milla-jovovich/ben/openclaw-skill
feat: add OpenClaw/ClawHub skill for MemPalace
2026-04-09 22:35:26 -07:00
bensig 3a0f782646 docs: note lower dedup threshold (0.85-0.87) per community feedback 2026-04-09 22:15:02 -07:00
bensig 46520d2154 feat: add OpenClaw/ClawHub skill for MemPalace
Complete OpenClaw skill exposing all MCP tools with session protocol,
auto-install spec, and setup instructions for OpenClaw + other MCP hosts.

Covers all 20 tools: search, check_duplicate, status, list_wings,
list_rooms, get_taxonomy, get_aaak_spec, kg_query, kg_add,
kg_invalidate, kg_timeline, kg_stats, traverse, find_tunnels,
graph_stats, add_drawer, delete_drawer, diary_write, diary_read.

Based on PR #207 by @wanikua — updated to v3.1.0, added missing tools
(check_duplicate, get_aaak_spec), expanded parameter docs, added
OpenClaw CLI setup command.

Co-Authored-By: wanikua <wanikua@users.noreply.github.com>
2026-04-09 20:30:26 -07:00
matrix9neonebuchadnezzar2199-sketch f96300bb86 style: fix ruff formatting 2026-04-10 05:02:48 +09:00
Kevin Pulikkottil 2981433535 fix: add mcp command with setup guidance (#315)
* fix: add mcp command with setup guidance

* fix: include --palace guidance in mcp command output

* fix: make mcp guidance commands copy-pastable

---------

Co-authored-by: Milla J <millaj1217@gmail.com>
2026-04-09 11:21:18 -07:00
Milla J 69afba3b28 chore: disable broken auto-bump workflow (#414)
bump-plugin-version.yml has been failing on every merge to main since
today's security + plugin-packaging work, because it tries to push
directly to main and branch protection blocks it. It also conflicts
with the manual version-management pattern we're currently using
(manual bumps in PRs like #409 for 3.1.0).

Renaming to .yml.disabled so GitHub Actions skips it. If we want
auto-bumps later, the workflow needs to open a PR instead of pushing
directly, and coordinate with manual version bumps.

Co-authored-by: milla-jovovich <noreply@github.com>
2026-04-09 11:14:58 -07:00
Milla J 3919f13523 chore: bump version to 3.1.0 (#409)
PyPI release cut covering 39 merged PRs since v3.0.0 on 2026-04-06.
Highlights: Claude/Codex plugin packaging (#270), security hardening (#387),
honest AAAK stats + benchmark corrections (#147), Windows compatibility fixes,
Knowledge Graph WAL mode + batching, 10K limit safety caps, and much more.

See GitHub release notes for full changelog.

Co-authored-by: milla-jovovich <noreply@github.com>
2026-04-09 11:04:24 -07:00
Milla J 0fdd08677b Merge pull request #399 from milla-jovovich/ben/critical-bugfixes
fix: MCP null args hang, repair infinite recursion, OOM on large files
2026-04-09 10:45:30 -07:00
bensig b1adc047e6 fix: address Octocode review — move size check, add tests for all 3 fixes
- Move file size check before try block so IOError propagates cleanly
  (not caught by the except OSError handler below it)
- Wrap os.path.getsize in its own try/except to preserve existing
  test_normalize_io_error behavior on missing files
- Add test_normalize_rejects_large_file (mocked getsize)
- Add test_null_arguments_does_not_hang (#394)
- Add test_cmd_repair_trailing_slash_does_not_recurse (#395)

532 tests pass locally, 0 regressions.
2026-04-09 10:40:53 -07:00
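Note on b1adc047e6: the ordering described above (size check first, with getsize guarded separately) might look like the sketch below. `read_text_with_limit` is a hypothetical helper, and interpreting 500MB as 500 * 1024 * 1024 bytes is an assumption.

```python
import os

# Assumed byte value for the 500MB safety limit named in the commit.
MAX_BYTES = 500 * 1024 * 1024


def read_text_with_limit(path: str, max_bytes: int = MAX_BYTES) -> str:
    """Refuse oversized files before reading them fully into memory."""
    try:
        size = os.path.getsize(path)
    except OSError:
        # Missing or unreadable file: skip the size check and let open()
        # below raise its usual error, so existing callers see no change.
        size = 0
    if size > max_bytes:
        # Raised outside any later try block so it propagates to the caller.
        raise IOError(f"File too large ({size} bytes): {path}")
    with open(path, "r", encoding="utf-8", errors="replace") as fh:
        return fh.read()
```

In Python 3, IOError is an alias of OSError, which is why the check has to sit outside a try block that handles OSError.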
RhettOP df464a991d style: fix ruff formatting in mcp_server.py 2026-04-09 18:26:07 +01:00
bensig a0056dc4d4 ci: lower coverage threshold to 80% (palace.py paths reduce coverage) 2026-04-09 10:05:37 -07:00
bensig 0720fb84f8 fix: MCP null args hang, repair infinite recursion, OOM on large files
Three critical bugfixes:

1. MCP server hangs on null arguments (#394) — `params.get("arguments", {})`
   returns None when JSON has `"arguments": null`. Changed to `or {}`.

2. cmd_repair infinite recursion (#395) — trailing slash on palace_path
   caused backup_path to be inside the source dir. Strip trailing sep.

3. OOM on large transcript files (#396) — split_mega_files.py and
   normalize.py load entire files into memory. Added 500MB safety limit
   with clear skip/error messages.

Closes #394, #395, #396.
2026-04-09 10:05:37 -07:00
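Note on #394: a default passed to dict.get() applies only when the key is absent, not when its value is null. The sketch below shows the difference. `extract_arguments` is a hypothetical helper name; the server code may inline the expression.

```python
import json
from typing import Any, Dict


def extract_arguments(params: Dict[str, Any]) -> Dict[str, Any]:
    """Return the tool arguments, treating a missing or null value as empty."""
    # `params.get("arguments", {})` would return None for `"arguments": null`,
    # because the key exists. `or {}` covers both the missing and null cases.
    return params.get("arguments") or {}


# A request whose arguments field is explicitly null.
request_params = json.loads('{"name": "status", "arguments": null}')
arguments = extract_arguments(request_params)
```

After this call `arguments` is an empty dict, so later lookups on it do not fail on None.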
Milla J 322727030f Merge pull request #392 from milla-jovovich/fix/windows-mtime-test
fix: Windows mtime test compatibility
2026-04-09 10:02:58 -07:00
bensig 39e053de2e ci: lower Windows coverage threshold to 80% (ChromaDB cleanup skews coverage) 2026-04-09 09:39:23 -07:00
bensig 58b8d5b198 fix: release ChromaDB handles before rmtree on Windows 2026-04-09 09:31:55 -07:00
bensig 1c48f4d2c3 fix: use os.utime in mtime test for Windows compatibility 2026-04-09 09:23:08 -07:00
Ben Sigman 252e440df5 Merge pull request #324 from virgil-at-biocompute/fix/mcp-protocol-version-negotiation
fix: negotiate MCP protocol version instead of hardcoding
2026-04-09 09:17:38 -07:00
Ben Sigman e293e290d5 Merge branch 'main' into fix/mcp-protocol-version-negotiation 2026-04-09 09:15:06 -07:00
Ben Sigman 39855df3fb Merge pull request #387 from milla-jovovich/ben/security-hardening
security: harden inputs, fix shell injection, optimize DB access
2026-04-09 09:13:09 -07:00
bensig 2448ac0026 test: add coverage for file_already_mined mtime check
Covers the check_mtime=True path in palace.py to meet 85% coverage threshold.
2026-04-09 08:56:28 -07:00
bensig c2308a1e36 fix: address code review — restore mtime check, bound metadata reads, harden security
Review fixes (from Sage's review):
- Restore mtime check in file_already_mined (check_mtime=True for miner)
- Restore limit=10000 on MCP metadata fetches to prevent OOM on large palaces
- Apply _SAFE_NAME_RE regex in sanitize_name (was dead code)
- Drop raw_aaak metadata duplication in diary_write
- chmod 0o700 on WAL dir, 0o600 on WAL file
- Add check_same_thread=False on KnowledgeGraph SQLite connection
- Remove __del__ (unreliable) and dead PRAGMA foreign_keys=ON
2026-04-09 08:52:24 -07:00
bensig 0717caea5c fix: make drawer_id deterministic for idempotent writes
Remove datetime.now() from drawer_id hash so same content + wing + room
always produces the same ID. This enables the idempotency check that
returns "already_exists" on duplicate writes.
2026-04-09 08:26:47 -07:00
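Note on 0717caea5c: an ID derived only from content, wing, and room is stable across calls, which is what makes a duplicate write detectable. The sketch below assumes a SHA-256 hash and a "drawer_" prefix; the actual hash function and ID format are not stated in the commit.

```python
import hashlib


def make_drawer_id(wing: str, room: str, content: str) -> str:
    """Derive an ID from wing, room, and content only (no timestamp)."""
    # A separator keeps ("ab", "c") and ("a", "bc") from hashing identically.
    key = "\x00".join((wing, room, content))
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return f"drawer_{digest[:16]}"
```

Mixing datetime.now() into the key would give every write a fresh ID, so a check for an existing ID could never match.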