Commit Graph

11 Commits

Author SHA1 Message Date
copilot-swe-agent[bot] e7fe6cae14 fix(normalize): discard user/gemini turns before session_metadata sentinel
Agent-Logs-Url: https://github.com/MemPalace/mempalace/sessions/4511e9aa-38e7-440e-a6f8-eda91e576f0f

Co-authored-by: igorls <4753812+igorls@users.noreply.github.com>
2026-04-27 21:41:48 +00:00
MSL f4440f1ce0 feat(normalize): Gemini CLI session JSONL adapter
Adds a sixth format adapter to mempalace.normalize alongside the
existing Claude Code, Codex, Claude.ai, ChatGPT, and Slack parsers.
After this lands, mempalace mine --mode convos ingests Gemini CLI
session history without manual export.

Why now: Claude Code and Codex CLI are already supported by convo_miner;
adding Gemini closes the major-CLI-tool coverage gap. After this lands,
the README's "verbatim conversation history" promise is honestly
delivered for all three top-tier API-keyed coding CLIs (Claude Code,
Codex CLI, Gemini CLI), not just two of them. This is the third leg
of the trio Aya pushed for so the public claim matches the actual
ingest pipeline.

Gemini CLI stores sessions at ~/.gemini/tmp/<project_hash>/chats/ as
JSONL. The on-disk schema (per google-gemini/gemini-cli#15292):

    {"type":"session_metadata","sessionId":"...","projectHash":"...",...}
    {"type":"user","id":"msg1","content":[{"text":"Hello"}]}
    {"type":"gemini","id":"msg2","content":[{"text":"Hi"}]}
    {"type":"message_update","id":"msg2","tokens":{"input":10,"output":5}}

The new _try_gemini_jsonl parser:

  - requires a session_metadata record so it does not false-positive
    against Claude Code or Codex JSONL passing through the dispatch
    chain in _try_normalize_json
  - extracts user/gemini message text from each entry's content array
    of {"text": "..."} blocks, joining multiple blocks per message
    in order
  - skips message_update entries (token-count deltas with no message
    text) and any other unknown record types
  - returns None when fewer than two conversational messages are
    present, mirroring the codex parser's >=2-message guard
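
The parsing rules above can be sketched as follows. This is a minimal
illustration under stated assumptions, not the code from this commit: the
function name try_gemini_jsonl_sketch, the (role, text) return shape, the
"gemini" -> "assistant" role mapping, and the newline used to join text
blocks are all hypothetical.

```python
import json
from typing import List, Optional, Tuple


def try_gemini_jsonl_sketch(raw: str) -> Optional[List[Tuple[str, str]]]:
    """Hypothetical sketch of the rules listed above (not the real parser)."""
    saw_metadata = False
    messages: List[Tuple[str, str]] = []
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate malformed lines
        if not isinstance(entry, dict):
            continue
        kind = entry.get("type")
        if kind == "session_metadata":
            saw_metadata = True
            continue
        if kind not in ("user", "gemini"):
            continue  # message_update and any unknown record types
        content = entry.get("content")
        if not isinstance(content, list):
            continue
        blocks = [
            block["text"]
            for block in content
            if isinstance(block, dict) and isinstance(block.get("text"), str)
        ]
        # Assumption: blocks are joined with a newline, in order.
        text = "\n".join(blocks).strip()
        if not text:
            continue  # skip messages with empty content
        role = "user" if kind == "user" else "assistant"
        messages.append((role, text))
    # No metadata record or fewer than two messages: not a Gemini session.
    if not saw_metadata or len(messages) < 2:
        return None
    return messages
```

Returning None rather than an empty list lets a dispatch chain fall through
to the next adapter, which is the behavior the bullets describe.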

Test coverage: 9 new unit tests in tests/test_normalize.py mirroring
the codex test pattern: happy path, multi-turn, missing session
metadata, message_update skip, single-message rejection, multi-block
content concatenation, empty content skip, malformed-line resilience,
and explicit no-match against codex JSONL fixtures. Schema-level only;
real Gemini CLI session fixtures are a follow-up once a real user file
is available.

Closes part of #59 (the Gemini CLI portion of the umbrella request).
2026-04-27 01:25:03 -07:00
Marcio E. Heiderscheidt e61dc2adf8 fix: add provenance header and speaker IDs to Slack transcript imports (#815)
* fix: add provenance header and speaker IDs to Slack transcript imports

Slack exports are multi-party chats where no speaker is inherently
the "user" or "assistant". The parser previously assigned these roles
purely by position, allowing a crafted export to place attacker text
in the "user" role — making it appear as the memory owner's words
in all future retrieval (data poisoning via stored memory).

Changes:
- Add provenance header marking Slack transcripts as multi-party
  with positional (unverified) role assignment
- Prefix each message with the original speaker ID ([U1], [U2], etc.)
  so downstream consumers can distinguish authors
- Keep user/assistant role alternation for exchange-pair chunking
  compatibility with convo_miner.py

Tests:
- Provenance header presence and content
- Speaker ID preservation in output
- Attacker-first-message attribution verification

Refs: MemPalace/mempalace#809

* fix: move Slack provenance to footer, sanitize speaker IDs, extract constant

- Move provenance notice from header to footer to prevent it becoming
  a standalone ChromaDB drawer via paragraph chunking on exports
  with fewer than 3 exchange pairs (violates verbatim-always principle)
- Sanitize speaker user_id/username: strip brackets, newlines, and
  control characters to prevent chunk-boundary injection via crafted
  Slack exports
- Extract provenance string to _SLACK_PROVENANCE_FOOTER module constant,
  consistent with _TOOL_RESULT_* constants pattern; tests import it
  instead of duplicating the literal
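
A minimal sketch of the speaker-ID sanitization described above, assuming a
single character-class filter is enough. The names sanitize_speaker_id and
label_message, the exact character set, and the "unknown" fallback are
hypothetical and not taken from the patch.

```python
import re

# Hypothetical filter: square brackets plus ASCII control characters,
# which covers newlines, carriage returns, and tabs.
_UNSAFE_SPEAKER_CHARS = re.compile(r"[\[\]\x00-\x1f\x7f]")


def sanitize_speaker_id(raw: str) -> str:
    """Remove characters that could forge a speaker prefix or a chunk boundary."""
    cleaned = _UNSAFE_SPEAKER_CHARS.sub("", raw).strip()
    # Assumption: fall back to a placeholder when nothing survives.
    return cleaned or "unknown"


def label_message(speaker: str, text: str) -> str:
    """Prefix a message with its sanitized speaker ID, e.g. "[U1] hello"."""
    return f"[{sanitize_speaker_id(speaker)}] {text}"
```

With the brackets and newlines removed, a crafted username can no longer
close the [ID] prefix early or start a new paragraph inside it.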

Refs: MemPalace/mempalace#809
2026-04-15 00:27:01 -07:00
Igor Lins e Silva ca2598a9f6 fix(normalize): make strip_noise verbatim-safe and scope it to Claude Code JSONL
The initial strip_noise() regressed on three fronts when audited against
adversarial user content — each verified with executable repros against
the cherry-picked code:

  1. `<tag>.*?</tag>` with re.DOTALL span-ate across messages: one
     stray unclosed <system-reminder> anywhere in a session merged with
     the next closing tag, silently deleting everything between them
     (including full assistant replies).
  2. `.*\(ctrl\+o to expand\).*\n?` nuked entire lines of user prose
     whenever a user happened to document the TUI shortcut.
  3. `Ran \d+ (?:stop|pre|post)\s*hook.*` with IGNORECASE ate the
     second sentence from "our CI has a stop hook ... Ran 2 stop hooks
     last week" — legitimate user commentary.

These are unambiguous violations of the project's "Verbatim always"
design principle.

Fixes:

- All tag patterns are now line-anchored (`(?m)^(?:> )?<tag>`) and their
  body forbids crossing a blank line (`(?:(?!\n\s*\n)[\s\S])*?`), so a
  dangling open tag cannot eat neighboring messages.
- `_NOISE_LINE_PREFIXES` are line-anchored and case-sensitive — user
  prose mentioning "CURRENT TIME:" mid-sentence is preserved.
- Hook-run chrome requires `(?m)^`, explicit hook names (Stop,
  PreCompact, PreToolUse, etc.), and no IGNORECASE.
- "… +N lines" is line-anchored.
- "(ctrl+o to expand)" only matches Claude Code's actual collapsed-
  output chrome shape `[N tokens] (ctrl+o to expand)`; a bare
  parenthetical in user prose stays intact.
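
The difference between the original tag pattern and the line-anchored form
can be reproduced with a small sketch. The ANCHORED pattern below is
assembled from the two fragments quoted in this message; the real patterns
in normalize.py may differ in detail, and the sample session text is made up
for illustration.

```python
import re

# Naive form: with DOTALL the lazy body still runs across message
# boundaries until the first closing tag anywhere in the session.
NAIVE = re.compile(r"<system-reminder>.*?</system-reminder>", re.DOTALL)

# Line-anchored form; the body may not cross a blank line.
ANCHORED = re.compile(
    r"(?m)^(?:> )?<system-reminder>"
    r"(?:(?!\n\s*\n)[\s\S])*?"
    r"</system-reminder>"
)

# Made-up session: a stray mid-line open tag, a reply, then real chrome.
session = (
    "> note: <system-reminder> was left unclosed here\n"
    "\n"
    "The assistant reply that must survive.\n"
    "\n"
    "<system-reminder>real chrome</system-reminder>\n"
)

naive_result = NAIVE.sub("", session)        # deletes the reply as well
anchored_result = ANCHORED.sub("", session)  # deletes only the chrome line
```

The stray tag in the first line is left alone because it does not start the
line, and a dangling tag that does start a line still cannot match past the
next blank line.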

Scope:

- `strip_noise()` is no longer called on every normalization path.
  Only `_try_claude_code_jsonl` invokes it, per-extracted-message — so
  Claude.ai exports, ChatGPT exports, Slack JSON, Codex JSONL, and
  plain text with `>` markers pass through fully verbatim. Per-message
  application also makes span-eating structurally impossible.

Tests:

- 15 new tests in test_normalize.py pin the boundary: 6 guard user
  content that must survive (each of the adversarial repros), 9 assert
  real system chrome is still stripped. All pass; full suite 702 pass
  (2 failures are the unrelated pre-existing version.py bug, cleared
  by #820).

Known limitation (not fixed here): convo_miner.py does not delete
drawers on re-mine, so transcripts mined before this PR keep noise-
filled drawers until the user manually erases + re-mines. Proper fix
needs a schema-version field on drawer metadata + re-mine trigger —
out of scope for this PR.
2026-04-13 16:11:03 -03:00
Mikhail Valentsev a2432a3245 fix: parse Claude.ai privacy export with messages key and sender field (#677) (#685)
* fix: parse Claude.ai privacy export with messages key and sender field (#677)

The privacy-export branch in _try_claude_ai_json only checked for the
"chat_messages" key, missing exports that use "messages" instead.  It
also only read the "role" field while real privacy exports use "sender".
Both gaps caused the file to fall through to plain-text, producing a
single giant drawer.

Changes:
- Accept "messages" alongside "chat_messages" in the conversation-object
  guard and inner extraction.
- Accept "sender" alongside "role" as the author field.
- Fall back to a top-level "text" key when content blocks are empty.
- Produce one transcript per conversation instead of concatenating all
  conversations into a single blob.
- Extract shared logic into _collect_claude_messages helper.
- Add 6 regression tests covering each variant.

* style: apply ruff format to normalize.py

* fix: guard against null text field in Claude.ai export parsing

item.get("text", "").strip() crashes when "text" is explicitly null
in the JSON (legal and observed in some exports). Use
(item.get("text") or "").strip() and add a regression test.
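
A minimal sketch of the extraction rules from this PR, including the null
guard. It is illustrative only: collect_messages_sketch is a hypothetical
stand-in for _collect_claude_messages, and the (author, text) return shape,
the precedence of "sender" over "role", and the space used to join blocks
are assumptions.

```python
from typing import List, Tuple


def collect_messages_sketch(conversation: dict) -> List[Tuple[str, str]]:
    """Hypothetical sketch: gather (author, text) pairs from one conversation."""
    raw = conversation.get("chat_messages") or conversation.get("messages") or []
    collected: List[Tuple[str, str]] = []
    for item in raw:
        if not isinstance(item, dict):
            continue
        author = item.get("sender") or item.get("role") or ""
        blocks = item.get("content")
        parts: List[str] = []
        if isinstance(blocks, list):
            # `or ""` handles "text": null; a .get default covers only a missing key.
            parts = [
                (block.get("text") or "").strip()
                for block in blocks
                if isinstance(block, dict)
            ]
        text = " ".join(part for part in parts if part)
        if not text:
            # Fall back to the top-level "text" key, again guarding against null.
            text = (item.get("text") or "").strip()
        if text:
            collected.append((author, text))
    return collected
```

The guard matters because dict.get returns its default only when the key is
absent; an explicit null in the JSON arrives as None, and None has no strip
method.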

---------

Co-authored-by: Igor Lins e Silva <4753812+igorls@users.noreply.github.com>
2026-04-13 02:11:03 -03:00
Ben Sigman 4621f85d7c style: ruff format all Python files (#675)
2026-04-11 22:59:34 -07:00
Ben Sigman 20c8f8e57b feat: new MCP tools — get/list/update drawer, hook settings, export (resolves #635) (#667)
* feat: MCP reliability — inode detection, WAL rotation, metadata cache, search limits

Infrastructure hardening for the MCP server:
- Detect palace DB replacement via inode tracking (repair command support)
- WAL rotation to prevent unbounded WAL growth
- _fetch_all_metadata() + _get_cached_metadata() with 60s TTL for taxonomy/status
- _MAX_RESULTS cap (100) with limit clamping [1, _MAX_RESULTS]
- max_distance parameter for similarity threshold in search
- Handle all notifications/* methods, null arguments, method=None
- Remove duplicate _client_cache = None declarations
- searcher.py max_distance parameter passthrough
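
The limit clamping mentioned above amounts to a one-line bound check. A
minimal sketch, assuming a helper named clamp_limit (hypothetical; only
_MAX_RESULTS and its value of 100 come from this message):

```python
_MAX_RESULTS = 100  # cap named in the commit message


def clamp_limit(limit: int) -> int:
    """Clamp a requested result count into the range [1, _MAX_RESULTS]."""
    return max(1, min(limit, _MAX_RESULTS))
```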

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: new MCP tools (get/list/update drawer, hook settings, memories filed), export, normalize

New MCP tools:
- mempalace_get_drawer: fetch single drawer by ID with full content
- mempalace_list_drawers: paginated listing with wing/room filter
- mempalace_update_drawer: update content/wing/room on existing drawers
- mempalace_hook_settings: get/set hook behavior (silent_save, desktop_toast)
- mempalace_memories_filed_away: check latest checkpoint status

Also includes:
- exporter.py: export palace as browsable markdown files
- normalize.py: tool_use/tool_result capture for richer transcript mining
- layers.py: updated for new tool integration
- config.py: hook settings properties (hook_silent_save, hook_desktop_toast)

Depends on PR 3 (reliability) for _MAX_RESULTS, _metadata_cache, WAL logging.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: normalize.py handles string messages and Read offset type mismatch

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: params null guard, L2→cosine docs, empty tool_use_map key guard

- Handle explicit null in MCP params (request.get("params") or {})
- Fix search tool description: L2 → cosine distance (collection uses hnsw:space=cosine)
- Guard against empty string key in tool_use_map from malformed JSONL entries

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: rename ambiguous var 'l' to 'line' (E741 lint)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address code review findings (5 issues)

1. min_similarity backwards-compat: convert similarity to distance scale
   (1.0 - similarity) instead of passing raw value as max_distance
2. Restore structured error reporting (error + partial fields) in
   tool_status, tool_list_wings, tool_list_rooms, tool_get_taxonomy
   — reverts silent except:pass that dropped #647 security hardening
3. inode cache: remove falsy-zero short-circuit so missing DB file
   triggers reconnect instead of reusing stale client
4. _fetch_all_metadata: check for empty batch before extending/advancing
   offset to prevent infinite loop on concurrent deletion
5. KG initialization: only override path when --palace is explicit;
   default runs use KnowledgeGraph's built-in default path
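
Items 1 and 4 can be sketched in a few lines. This is an illustration under
assumptions, not the code from mcp_server.py: fetch_all_sketch, its
get_batch callback, and similarity_to_max_distance are hypothetical names.

```python
from typing import Callable, List


def similarity_to_max_distance(min_similarity: float) -> float:
    """Item 1: convert a similarity threshold to the distance scale."""
    return 1.0 - min_similarity


def fetch_all_sketch(
    get_batch: Callable[[int, int], List[dict]],
    total: int,
    batch_size: int = 2,
) -> List[dict]:
    """Item 4: page through rows, stopping if a batch comes back empty."""
    results: List[dict] = []
    offset = 0
    while offset < total:
        batch = get_batch(offset, batch_size)
        if not batch:
            # Rows were deleted after `total` was counted. Without this
            # check the offset never advances and the loop never ends.
            break
        results.extend(batch)
        offset += len(batch)
    return results
```

The empty-batch check comes before extending and advancing, so a stale row
count ends the loop early instead of spinning.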

Co-authored-by: jphein <jphein@users.noreply.github.com>

---------

Co-authored-by: jp <jp@jphein.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: jphein <jphein@users.noreply.github.com>
2026-04-11 21:25:04 -07:00
bensig b1adc047e6 fix: address Octocode review — move size check, add tests for all 3 fixes
- Move file size check before try block so IOError propagates cleanly
  (not caught by the except OSError handler below it)
- Wrap os.path.getsize in its own try/except to preserve existing
  test_normalize_io_error behavior on missing files
- Add test_normalize_rejects_large_file (mocked getsize)
- Add test_null_arguments_does_not_hang (#394)
- Add test_cmd_repair_trailing_slash_does_not_recurse (#395)

532 tests pass locally, 0 regressions.
2026-04-09 10:40:53 -07:00
Tal Muskal 9ca70264f3 style: format test files with ruff
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-08 21:08:49 +03:00
Tal Muskal e24d8ca733 test: expand coverage to 70%, fix mcp_server CI crash (threshold 60%)
Add/expand tests for normalize (39%→97%), searcher (39%→100%),
layers (28%→97%), split_mega_files (34%→72%).

Fix mcp_server.py parse_args→parse_known_args to prevent SystemExit
when imported during pytest (CI was crashing on all test jobs).
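
The parse_args to parse_known_args change can be illustrated with a small
sketch. The parser and the pytest-style argument list are hypothetical
stand-ins for what mcp_server.py sees when it is imported under pytest.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical stand-in for the server's argument parser."""
    parser = argparse.ArgumentParser(prog="mcp_server_sketch")
    parser.add_argument("--palace", default=None)
    return parser


# Arguments meant for pytest, not for the server.
pytest_like_argv = ["tests/", "-q", "--cov=mempalace"]

# parse_args treats unrecognized arguments as fatal and raises SystemExit.
try:
    build_parser().parse_args(pytest_like_argv)
    strict_exited = False
except SystemExit:
    strict_exited = True

# parse_known_args returns the leftovers instead of exiting.
args, unknown = build_parser().parse_known_args(pytest_like_argv)
```

Here strict_exited ends up True, while parse_known_args leaves args.palace at
its default and hands the pytest arguments back in unknown.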

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-08 21:07:03 +03:00
bensig 0f8fa8c7d5 bench: add benchmark runners, results docs, and test suite
Benchmarks: LongMemEval, LoCoMo, ConvoMem, MemBench runners with
methodology docs and hybrid retrieval analysis.

Tests: config, miner, convo_miner, normalize — 9 tests, all passing.
2026-04-04 18:33:42 -07:00