Commit Graph

72 Commits

Author SHA1 Message Date
marerem df33550945 fix: silence ChromaDB telemetry warnings and CoreML segfault on Apple Silicon
ChromaDB 0.6.x bundles a Posthog telemetry client whose capture()
signature is incompatible with the installed posthog library, producing
noisy "Failed to send telemetry event" stderr warnings on every
operation. Silence by raising the logger threshold to CRITICAL.

ONNX Runtime's CoreML execution provider segfaults during vector
queries on macOS ARM64 (issue #74). Auto-set ORT_DISABLE_COREML=1
on Apple Silicon to force CPU execution, while respecting any
user-provided override via os.environ.setdefault().

Made-with: Cursor
2026-04-08 12:43:09 +02:00
Ben Sigman 71736a3f4f Merge pull request #142 from igorls/chore/packaging-cleanup
chore: tighten chromadb version range and add py.typed marker
2026-04-07 16:05:20 -07:00
Ben Sigman c30bc9e71e Merge pull request #137 from igorls/fix/bounded-queries
fix: add limit=10000 safety cap to all unbounded ChromaDB .get() calls
2026-04-07 16:05:17 -07:00
Ben Sigman a8de2911e5 Merge pull request #136 from igorls/fix/kg-hardening
fix: enable SQLite WAL mode and add consistent LIMIT to KG timeline
2026-04-07 16:05:13 -07:00
Ben Sigman 168c3bc22e Merge pull request #139 from igorls/fix/error-handling
fix: sanitize error responses and remove sys.exit from library code
2026-04-07 16:05:11 -07:00
Ben Sigman a74061f739 Merge pull request #141 from igorls/fix/hook-security
fix: sanitize SESSION_ID in save hook to prevent path traversal
2026-04-07 16:05:09 -07:00
Igor Lins e Silva e5b3434e9b fix: update dialect tests for PR #147 stats API and remove unused fixture param 2026-04-07 18:58:28 -03:00
Igor Lins e Silva d3145e9a7b fix: update dialect tests for PR #147 stats API and remove unused fixture param 2026-04-07 18:58:25 -03:00
Igor Lins e Silva b5a58557e3 fix: update dialect tests for PR #147 stats API and remove unused fixture param 2026-04-07 18:58:22 -03:00
Igor Lins e Silva 6fa985eac2 fix: update dialect tests for PR #147 stats API and remove unused fixture param 2026-04-07 18:58:20 -03:00
Igor Lins e Silva 50239d4b49 fix: sanitize SESSION_ID in save hook to prevent path traversal
The save hook uses SESSION_ID in file paths (state_dir/).
A crafted session_id value like '../../etc/cron.d/evil' could write
state files outside the intended directory.

Strip everything except [a-zA-Z0-9_-] from SESSION_ID, defaulting
to 'unknown' if empty after sanitization.

Finding: #4 (HIGH — path traversal via SESSION_ID)

Includes test infrastructure from PR #131.
92 tests pass.
2026-04-07 18:53:31 -03:00
Igor Lins e Silva 541e9bd1ee chore: tighten chromadb version range and add py.typed marker
- Tighten chromadb dependency from >=0.4.0,<1 to >=0.5.0,<0.7
  (the collection API changed significantly across majors; this
  pins to the tested range)
- Add optional 'spellcheck' extras for the undeclared autocorrect
  dependency used in spellcheck.py
- Add PEP 561 py.typed marker for type checker support

Findings: #10 (HIGH — chromadb range too wide), #30 (LOW — undeclared
          autocorrect), #32 (LOW — missing py.typed)

Includes test infrastructure from PR #131.
92 tests pass.
2026-04-07 18:51:42 -03:00
Igor Lins e Silva b45bff9db1 test: add WAL mode and entity timeline limit assertions 2026-04-07 18:27:19 -03:00
Igor Lins e Silva 5ac4947d02 fix: preserve CLI exit codes, log tracebacks, sanitize search errors, validate fixture 2026-04-07 18:26:39 -03:00
Igor Lins e Silva 21f2248a3c fix: enable SQLite WAL mode and add consistent LIMIT to KG timeline
- Enable WAL journal mode in _conn() for better concurrent read
  performance and reduced SQLITE_BUSY risk
- Add LIMIT 100 to entity-filtered timeline query (was unbounded,
  while global timeline already had LIMIT 100)

Findings: #8 (HIGH — no WAL mode), #22 (LOW — inconsistent limits)

Includes test infrastructure from PR #131.
92 tests pass.
2026-04-07 18:25:57 -03:00
Igor Lins e Silva c9135aad67 fix: sanitize error responses and remove sys.exit from library code
- Remove palace_path from _no_palace() error response (prevents
  leaking filesystem paths to the LLM)
- Replace str(e) with generic 'Internal tool error' in MCP dispatch
  catch block (full error is still logged server-side via stderr)
- Replace sys.exit(1) with return in searcher.search() CLI function
  (prevents process termination if called from library context)
- Remove unused sys import from searcher.py

Findings: #12 (HIGH), #5 (MEDIUM), #15 (LOW)

Includes test infrastructure from PR #131.
92 tests pass.
2026-04-07 18:24:10 -03:00
Igor Lins e Silva 161a0d12a2 fix: cap diary_read query and update stale comment 2026-04-07 18:23:30 -03:00
Igor Lins e Silva 9491ffa92b fix: add limit=10000 safety cap to all unbounded ChromaDB .get() calls
Prevents OOM when the palace grows large. The following unbounded
metadata fetches now have a safety cap:

- tool_status: col.get(include=['metadatas'], limit=10000)
- tool_list_wings: same
- tool_list_rooms: same (including wing-filtered variant)
- tool_get_taxonomy: same
- Layer1.generate: col.get(include=['documents','metadatas'], limit=10000)

Layer2 already had a limit parameter — no change needed.

Finding: #3 (CRITICAL — unbounded data fetching causes OOM)

Includes test infrastructure from PR #131.
92 tests pass.
2026-04-07 18:23:12 -03:00
Ben Sigman 68e3414ed5 Merge pull request #147 from milla-jovovich/fix/aaak-honest-stats
fix: honest AAAK stats — word-based token estimator, lossy labels
2026-04-07 14:15:50 -07:00
Ben Sigman 27623a3b17 Merge pull request #131 from igorls/test/expand-coverage-and-uv-migration
test: expand coverage from 20 to 92 tests, migrate to uv
2026-04-07 14:15:01 -07:00
bensig 39c14be113 fix: honest AAAK stats — word-based token estimator, lossy labels
- Replace len(text)//3 token heuristic with word-based estimate (~1.3 tokens/word)
- Old heuristic inflated compression ratios by ~3-5x
- Update docstrings: "compression" → "lossy summarization"
- Update module docstring to clarify AAAK is NOT lossless
- compression_stats() now returns honest field names and a note
- CLI output labels ratios as lossy

Fixes #43
2026-04-07 14:14:31 -07:00
Ben Sigman 559417e8ac Merge pull request #145 from milla-jovovich/fix/room-keyword-path-matching
fix: room detection checks keywords against folder paths
2026-04-07 14:08:14 -07:00
bensig 71fb66d687 fix: room detection checks keywords against folder paths
detect_room() now matches folder path parts against room keywords,
not just the room name. Fixes docs/ files routing to general instead
of documentation room — "docs" wasn't a substring of "documentation"
but is now matched via the persisted keywords list.

Found during end-to-end testing after merging #108 keyword persistence.
2026-04-07 14:06:56 -07:00
Igor Lins e Silva 96de23cd97 fix: CI failures — update workflow for uv migration, fix lint and format
- Switch CI install step from `pip install -r requirements.txt` to
  `pip install -e ".[dev]"` since requirements.txt was removed
- Add noqa: E402 to intentionally-late imports in conftest.py
  (HOME must be isolated before mempalace imports)
- Remove unused KnowledgeGraph import in test_knowledge_graph.py
- Apply ruff formatting to test files
2026-04-07 17:59:21 -03:00
Ben Sigman 5f49282c85 Merge pull request #106 from JayadityaGit/docs/gemini-cli-integration
docs: add Gemini CLI setup guide
2026-04-07 13:58:57 -07:00
Ben Sigman 3068f75c2d Merge pull request #22 from sheetsync/bugfix/split-known-names-loading
refactor: consolidate split known-names config loading
2026-04-07 13:58:54 -07:00
Ben Sigman a59df81611 Merge pull request #66 from MARUCIE/fix/sqlite-batch-reads
fix: batch ChromaDB reads to avoid SQLite variable limit
2026-04-07 13:58:52 -07:00
Ben Sigman cea34366f5 Merge pull request #84 from AlexeySamosadov/fix/mcp-integer-coercion
fix: coerce MCP integer arguments to native Python int
2026-04-07 13:58:49 -07:00
Igor Lins e Silva cd8b245fdc fix: address Copilot review — remove unused imports, isolate HOME in tests, restore dev extra 2026-04-07 17:55:10 -03:00
Igor Lins e Silva 72c548b729 test: expand coverage from 20 to 92 tests, migrate to uv
- Migrate from setuptools to hatchling build backend
- Add dependency-groups (PEP 735) for dev tooling (pytest, ruff)
- Remove redundant requirements.txt in favor of uv.lock
- Fix __version__ mismatch (2.0.0 -> 3.0.0 to match pyproject.toml)

New test files:
- conftest.py: shared fixtures (isolated palace, KG, ChromaDB collection)
- test_knowledge_graph.py: 17 tests (entity CRUD, temporal queries, timeline)
- test_mcp_server.py: 25 tests (protocol dispatch, read/write/KG/diary tools)
- test_searcher.py: 7 tests (search_memories API, filters, error handling)
- test_dialect.py: 13 tests (AAAK compression, entity/emotion detection, zettel encoding)

All 92 tests pass on Python 3.13 with chromadb 0.6.3.
2026-04-07 17:55:10 -03:00
Ben Sigman 21462c6813 Merge pull request #103 from oussamalembarki/main
docs: add beginner-friendly hooks tutorial for issue #20
2026-04-07 13:54:42 -07:00
Ben Sigman 0b0e123f42 Merge pull request #61 from adv3nt3/feat/codex-cli-normalizer
feat: add OpenAI Codex CLI JSONL normalizer
2026-04-07 13:54:08 -07:00
Ben Sigman 6af6fe3dda Merge pull request #54 from adv3nt3/fix/narrow-exception-handling
fix: narrow bare except Exception to specific types where safe
2026-04-07 13:54:05 -07:00
Ben Sigman e8f9b47e31 Merge pull request #16 from sheetsync/bugfix/version-consistency
fix: unify package and MCP version reporting
2026-04-07 13:54:03 -07:00
Ben Sigman 1c62045b22 Merge pull request #21 from sheetsync/bugfix/mcp-docs-alignment
docs: align MCP setup examples with shipped server
2026-04-07 13:54:00 -07:00
Ben Sigman f1f0a4c966 Merge pull request #129 from minimexat/fix/windows-unicode-encoding
fix: replace Unicode separator character for Windows compatibility
2026-04-07 13:51:58 -07:00
Ben Sigman 30699b85a9 Merge pull request #42 from adv3nt3/fix/entity-registry-dead-code
fix: remove dead code and duplicate set items in entity_registry.py
2026-04-07 13:51:55 -07:00
Ben Sigman 6aa4272b65 Merge pull request #53 from adv3nt3/fix/md5-usedforsecurity-miners
fix: mark MD5 as non-security in miner drawer ID generation
2026-04-07 13:51:52 -07:00
Ben Sigman 3e6fc6ed9f Merge pull request #83 from renatoliveira/main
fix: update input prompt for entity confirmation in entity_detector.py
2026-04-07 13:51:50 -07:00
Ben Sigman c312ff539c Merge pull request #55 from salmanmkc/upgrade-github-actions-node24
Upgrade GitHub Actions for Node 24 compatibility
2026-04-07 13:46:49 -07:00
f-hoedl d214f6a854 fix: replace Unicode separator in convo_miner.py for Windows compatibility
Replace the ─ (U+2500) separator character with - in convo_miner.py.
Windows terminals using cp1252 encoding raise UnicodeEncodeError when
printing this character unless PYTHONUTF8=1 is set explicitly.

Fixes crash on Windows: UnicodeEncodeError: 'charmap' codec can't encode
character '\u2500'
2026-04-07 21:55:34 +02:00
Ben Sigman 45c2c92c4a Merge pull request #123 from milla-jovovich/fix/non-interactive-init
fix: --yes flag skips all interactive prompts in init
2026-04-07 12:18:03 -07:00
bensig caa1169f04 fix: --yes flag now skips room confirmation in init
Pass yes flag through to detect_rooms_local so init --yes
skips both entity detection AND room approval prompts.
Agents and CI can now run init without interactive input.

Fixes #8
2026-04-07 12:16:46 -07:00
Ben Sigman 01a21dd60f Merge pull request #78 from ac-opensource/feature/respect-gitignore-mining
Respect nested .gitignore rules when mining project files
2026-04-07 12:15:23 -07:00
Ben Sigman b49cfbf164 Merge pull request #119 from milla-jovovich/fix/repair-split-rooms
fix: repair command, split args, Claude export, room keywords
2026-04-07 12:03:49 -07:00
bensig 5e8a039e7c fix: repair command, split args, Claude export, room keywords
- Add `mempalace repair` command to rebuild vector index from SQLite
  when HNSW files are corrupted after crash/interrupt (fixes #74, #72, #96)
- Fix split command passing dir as positional instead of --source
  flag to split_mega_files (fixes #63)
- Handle Claude privacy export format (array of conversation objects
  with chat_messages inside each) in normalize.py (fixes #63)
- Persist room keywords in mempalace.yaml so mine can match files
  in docs/ to room "documentation" (fixes #108)
2026-04-07 12:02:34 -07:00
Ben Sigman d1afecc478 Merge pull request #114 from milla-jovovich/fix/security-mining-chromadb
fix: shell injection in hooks, Claude Code mining, chromadb pin
2026-04-07 11:53:12 -07:00
bensig 186bb2e3d1 fix: shell injection in hooks, Claude Code mining, chromadb pin
- hooks/mempal_save_hook.sh: pass $TRANSCRIPT_PATH as sys.argv
  instead of interpolating into python -c string (fixes #110)
- normalize.py: accept type "user" in addition to "human" for
  Claude Code JSONL sessions (fixes #111)
- convo_miner.py: skip tool-results/, memory/ dirs and .meta.json
  files when scanning for conversations (fixes #111)
- pyproject.toml: pin chromadb>=0.4.0,<1 to avoid crashing 1.x
  builds on macOS ARM64 (fixes #100)
2026-04-07 11:45:51 -07:00
Milla Jovovich aa10f8fbf1 README: honest update from Milla & Ben — own the mistakes, fix the claims
The community caught real problems within hours of launch. Addressing them
directly:

- Added prominent "A Note from Milla & Ben" section at top owning the issues
- Fixed AAAK section: removed "lossless" claim, removed bogus token example,
  honest about lossy nature and 84.2% regression on LongMemEval
- Headline benchmark table: clearly labeled as "raw mode" (the 96.6% number)
- Removed misleading "100%" headline (still real but rerank pipeline not
  in public scripts yet — addressing)
- Removed misleading "+34% palace boost" headline (it's metadata filtering,
  real but not a novel mechanism)
- Marked Contradiction Detection as "experimental, not yet wired into KG ops"
- Closet legend now notes plain-text summaries in v3.0.0, AAAK closets coming
- Intro pillars rewritten honestly — raw verbatim is the win, AAAK is
  experimental compression layer

Thank you to @panuhorsmalahti (#43), @lhl (#27), @gizmax (#39) and everyone
who filed issues in the first 48 hours. Brutal honest criticism is exactly
what makes open source work.
2026-04-07 11:08:53 -07:00
JayadityaGit ca2549da5a docs: add Gemini to MCP header in README 2026-04-07 22:12:20 +05:30