Merge pull request #1183 from MemPalace/feat/init-mine-ux

feat(cli): init prompts to mine, mine handles Ctrl-C gracefully (#1181, #1182)
This commit is contained in:
Igor Lins e Silva
2026-04-25 01:24:23 -03:00
committed by GitHub
5 changed files with 610 additions and 49 deletions
+7 -4
View File
@@ -8,14 +8,17 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
## [3.3.4] — unreleased ## [3.3.4] — unreleased
### Added
- **`mempalace init` now prompts to mine the same directory.** After entity confirmation, room detection, and gitignore guard, `init` shows a one-line scope estimate (e.g. `~423 files (~12 MB) would be mined into this palace.`) computed from its existing corpus walk, then asks `Mine this directory now? [Y/n]` (default yes) and runs `mine()` in-process if accepted. The estimate fires before the prompt so users on a real corpus aren't surprised by a minutes-long ChromaDB write. Declining prints the exact `mempalace mine <dir>` command for later. (#1181)
- **New `--auto-mine` flag on `mempalace init`** for the non-interactive path (`mempalace init --auto-mine <dir>` skips the mine prompt and runs mine directly). `--yes` retains its existing scope of entity auto-accept only and still prompts for the mine step, so existing scripted callers see no behaviour change; combining `--yes --auto-mine` gives a fully non-interactive setup. (#1181)
- **Cross-wing topic tunnels.** When two wings have confirmed `TOPIC` labels in common (the LLM-refine bucket from `mempalace init --llm`), the miner now drops a symmetric tunnel between them at mine time so the palace graph reflects shared themes (frameworks, vendors, recurring concepts). Tunnels are routed through the existing `create_tunnel` storage so they share dedup and persistence with explicit tunnels. Topic tunnels are stored under a synthetic `topic:<name>` room and tagged with `kind: "topic"` on the stored dict — this keeps them distinct from literal folder-derived rooms of the same name (a wing with both an `Angular` folder room and an `Angular` topic tunnel no longer collides at `follow_tunnels` read time) and gives LLMs scanning `list_tunnels` a visible discriminator. Threshold is configurable via `MEMPALACE_TOPIC_TUNNEL_MIN_COUNT` env var or `topic_tunnel_min_count` in `~/.mempalace/config.json` (default `1`). Manifest-dependency overlap and per-topic allow/deny lists remain out of scope. (#1180)
### Bug Fixes ### Bug Fixes
- **CLI `mempalace search` retrieval quality.** The CLI was using pure ChromaDB cosine distance with no BM25 rerank, so drawers containing every query term but embedding as noise (directory listings, diff output, shell logs) scored `Match: 0.0` alongside genuinely irrelevant results with no way to tell them apart. Wired the CLI through the same `_hybrid_rank` the `mempalace_search` MCP tool already used, and surfaced both `cosine=` and `bm25=` scores in the output so users see which component of the match is firing. MCP search was unaffected; this fixes the human-facing CLI parity gap. - **CLI `mempalace search` retrieval quality.** The CLI was using pure ChromaDB cosine distance with no BM25 rerank, so drawers containing every query term but embedding as noise (directory listings, diff output, shell logs) scored `Match: 0.0` alongside genuinely irrelevant results with no way to tell them apart. Wired the CLI through the same `_hybrid_rank` the `mempalace_search` MCP tool already used, and surfaced both `cosine=` and `bm25=` scores in the output so users see which component of the match is firing. MCP search was unaffected; this fixes the human-facing CLI parity gap.
- **Legacy-palace distance-metric warning.** CLI search now detects palaces created before `hnsw:space=cosine` was consistently set and prints a one-line notice pointing at `mempalace repair`. Without the warning such palaces silently used L2 distance, under which the similarity display floored every result to `Match: 0.0`. New palaces mined today already set cosine correctly and now have invariant tests pinning that behavior so future refactors can't silently regress it. (#1179) - **Legacy-palace distance-metric warning.** CLI search now detects palaces created before `hnsw:space=cosine` was consistently set and prints a one-line notice pointing at `mempalace repair`. Without the warning such palaces silently used L2 distance, under which the similarity display floored every result to `Match: 0.0`. New palaces mined today already set cosine correctly and now have invariant tests pinning that behavior so future refactors can't silently regress it. (#1179)
- **Graceful Ctrl-C during `mempalace mine`.** Interrupting a long mine no longer dumps a multi-frame `KeyboardInterrupt` traceback. The main file-processing loop now catches the signal, prints `files_processed: N/M`, `drawers_filed: K`, and `last_file:` so the user knows what landed, then exits with code 130 (standard SIGINT). Already-filed drawers are upserted idempotently on re-mine via deterministic IDs, so resuming is safe. The hooks PID lock at `~/.mempalace/hook_state/mine.pid` is now also actively cleaned up in a `finally` when its entry points at us — clean exit, error, or interrupt — preventing the next hook fire from briefly waiting on a stale PID. (#1182)
### Added
- **Cross-wing topic tunnels.** When two wings have confirmed `TOPIC` labels in common (the LLM-refine bucket from `mempalace init --llm`), the miner now drops a symmetric tunnel between them at mine time so the palace graph reflects shared themes (frameworks, vendors, recurring concepts). Tunnels are routed through the existing `create_tunnel` storage so they share dedup and persistence with explicit tunnels. Topic tunnels are stored under a synthetic `topic:<name>` room and tagged with `kind: "topic"` on the stored dict — this keeps them distinct from literal folder-derived rooms of the same name (a wing with both an `Angular` folder room and an `Angular` topic tunnel no longer collides at `follow_tunnels` read time) and gives LLMs scanning `list_tunnels` a visible discriminator. Threshold is configurable via `MEMPALACE_TOPIC_TUNNEL_MIN_COUNT` env var or `topic_tunnel_min_count` in `~/.mempalace/config.json` (default `1`). Manifest-dependency overlap and per-topic allow/deny lists remain out of scope. (#1180)
--- ---
+109
View File
@@ -156,6 +156,107 @@ def cmd_init(args):
# Pass 3: protect git repos from accidentally committing per-project files # Pass 3: protect git repos from accidentally committing per-project files
_ensure_mempalace_files_gitignored(args.dir) _ensure_mempalace_files_gitignored(args.dir)
# Pass 4: offer to run mine immediately. The directory just had its
# rooms + entities set up, so 99% of users will mine next anyway —
# asking here removes the "remember to type the next command" friction.
# `--auto-mine` skips the prompt and mines automatically; `--yes` is
# SCOPED to entity auto-accept and does NOT imply mining.
_maybe_run_mine_after_init(args, cfg)
def _format_size_mb(num_bytes: int) -> str:
"""Render a byte count as a human-readable size for the mine estimate.
< 1 MB rounds up to ``<1 MB`` so users never see a misleading ``0 MB``
on small projects. Otherwise reports an integer megabyte count.
"""
if num_bytes <= 0:
return "<1 MB"
mb = num_bytes / (1024 * 1024)
if mb < 1:
return "<1 MB"
return f"{mb:.0f} MB"
def _maybe_run_mine_after_init(args, cfg) -> None:
"""Prompt the user to mine the directory just initialised, or auto-mine
when ``--auto-mine`` was passed. Extracted so the prompt path is
unit-testable.
Behaviour matrix:
- default (no flags) — prompt, default Yes, mine in-process if accepted
- ``--yes`` — entity auto-accept only; STILL prompts for the mine step
- ``--auto-mine`` — skip the mine prompt and mine directly
- ``--yes --auto-mine`` — fully non-interactive
Mine errors are surfaced (not swallowed): a failing mine exits with a
non-zero status via :func:`sys.exit` so downstream scripts can see it.
The pre-scan that produces the file-count estimate is reused as the
mine input so we never walk the corpus twice.
"""
from .miner import mine, scan_project
project_dir = args.dir
auto_mine = bool(getattr(args, "auto_mine", False))
# Single corpus walk: this scan feeds BOTH the "what would be mined"
# estimate the user sees in the prompt AND the file list mine() will
# process. We pass the result into mine() via the `files` kwarg so it
# doesn't re-walk the tree.
try:
scanned_files = scan_project(project_dir)
file_count = len(scanned_files)
total_bytes = 0
for fp in scanned_files:
try:
total_bytes += fp.stat().st_size
except OSError:
# Skip files that vanished between scan and stat — mine()
# will skip them too.
continue
size_str = _format_size_mb(total_bytes)
except Exception:
scanned_files = None
file_count = None
size_str = None
# Show the scope estimate BEFORE the prompt so the user knows what
# they are agreeing to. On a real corpus mine takes minutes; hitting
# Enter on a default-Y prompt with no size cue is a footgun.
if isinstance(file_count, int):
if size_str:
print(f" ~{file_count} files (~{size_str}) would be mined into this palace.\n")
else:
print(f" ~{file_count} files would be mined into this palace.\n")
if not auto_mine:
try:
answer = input(" Mine this directory now? [Y/n] ").strip().lower()
except EOFError:
# Non-interactive stdin (e.g. piped) — treat like decline so
# we don't block. User can re-run with --auto-mine to opt in.
answer = "n"
if answer not in ("", "y", "yes"):
print(f"\n Skipped. Run `mempalace mine {shlex.quote(project_dir)}` when ready.")
return
palace_path = cfg.palace_path
try:
mine(
project_dir=project_dir,
palace_path=palace_path,
files=scanned_files,
)
except KeyboardInterrupt:
# mine() handles its own SIGINT summary + sys.exit(130); re-raise
# any KeyboardInterrupt that escapes (shouldn't happen) so the
# shell still sees a clean interrupt rather than a swallowed one.
raise
except Exception as e:
print(f"\n ERROR: mine failed: {e}", file=sys.stderr)
sys.exit(1)
def cmd_mine(args): def cmd_mine(args):
palace_path = os.path.expanduser(args.palace) if args.palace else MempalaceConfig().palace_path palace_path = os.path.expanduser(args.palace) if args.palace else MempalaceConfig().palace_path
@@ -585,6 +686,14 @@ def main():
action="store_true", action="store_true",
help="Auto-accept all detected entities (non-interactive)", help="Auto-accept all detected entities (non-interactive)",
) )
p_init.add_argument(
"--auto-mine",
action="store_true",
help=(
"Skip the post-init mine prompt and run mine automatically. "
"Combine with --yes for a fully non-interactive setup."
),
)
p_init.add_argument( p_init.add_argument(
"--lang", "--lang",
default=None, default=None,
+122 -45
View File
@@ -9,6 +9,7 @@ Stores verbatim chunks as drawers. No summaries. Ever.
import os import os
import sys import sys
import shlex
import hashlib import hashlib
import fnmatch import fnmatch
from pathlib import Path from pathlib import Path
@@ -981,8 +982,16 @@ def mine(
dry_run: bool = False, dry_run: bool = False,
respect_gitignore: bool = True, respect_gitignore: bool = True,
include_ignored: list = None, include_ignored: list = None,
files: list = None,
): ):
"""Mine a project directory into the palace.""" """Mine a project directory into the palace.
``files`` may optionally be a pre-scanned list of file paths from
:func:`scan_project`. When provided, the corpus walk is skipped — the
caller (e.g. ``init`` showing a file-count estimate before the mine
prompt) avoids walking the tree twice. When ``None`` (the default),
``mine`` walks the tree itself just like before.
"""
project_path = Path(project_dir).expanduser().resolve() project_path = Path(project_dir).expanduser().resolve()
config = load_config(project_dir) config = load_config(project_dir)
@@ -990,11 +999,12 @@ def mine(
wing = wing_override or config["wing"] wing = wing_override or config["wing"]
rooms = config.get("rooms", [{"name": "general", "description": "All project files"}]) rooms = config.get("rooms", [{"name": "general", "description": "All project files"}])
files = scan_project( if files is None:
project_dir, files = scan_project(
respect_gitignore=respect_gitignore, project_dir,
include_ignored=include_ignored, respect_gitignore=respect_gitignore,
) include_ignored=include_ignored,
)
if limit > 0: if limit > 0:
files = files[:limit] files = files[:limit]
@@ -1025,50 +1035,117 @@ def mine(
total_drawers = 0 total_drawers = 0
files_skipped = 0 files_skipped = 0
files_processed = 0
last_file = None
room_counts = defaultdict(int) room_counts = defaultdict(int)
for i, filepath in enumerate(files, 1): try:
drawers, room = process_file( for i, filepath in enumerate(files, 1):
filepath=filepath, try:
project_path=project_path, drawers, room = process_file(
collection=collection, filepath=filepath,
wing=wing, project_path=project_path,
rooms=rooms, collection=collection,
agent=agent, wing=wing,
dry_run=dry_run, rooms=rooms,
closets_col=closets_col, agent=agent,
dry_run=dry_run,
closets_col=closets_col,
)
except KeyboardInterrupt:
# Re-raise so the outer handler prints the summary; we
# capture the last-attempted file via last_file below.
last_file = filepath.name
raise
files_processed = i
last_file = filepath.name
if drawers == 0 and not dry_run:
files_skipped += 1
else:
total_drawers += drawers
room_counts[room] += 1
if not dry_run:
print(f" + [{i:4}/{len(files)}] {filepath.name[:50]:50} +{drawers}")
if not dry_run:
# Cross-wing topic tunnels: after every file in this wing has been
# processed, link this wing to any other wing that shares a
# confirmed TOPIC label. Out of scope for v1: manifest-dependency
# overlap, per-topic allow/deny lists, search-result surfacing.
try:
tunnels_added = _compute_topic_tunnels_for_wing(wing)
if tunnels_added:
print(f"\n Topic tunnels: +{tunnels_added} cross-wing link(s)")
except Exception as e:
# Tunnel computation must never fail a mine — degrade quietly.
print(
f"\n WARNING: topic tunnel computation skipped — {e}",
file=sys.stderr,
)
print(f"\n{'=' * 55}")
print(" Done.")
print(f" Files processed: {len(files) - files_skipped}")
print(f" Files skipped (already filed): {files_skipped}")
print(f" Drawers filed: {total_drawers}")
print("\n By room:")
for room, count in sorted(room_counts.items(), key=lambda x: x[1], reverse=True):
print(f" {room:20} {count} files")
print('\n Next: mempalace search "what you\'re looking for"')
print(f"{'=' * 55}\n")
except KeyboardInterrupt:
# Idempotent re-mine: deterministic drawer IDs mean already-filed
# drawers upsert to the same row on next run, so partial progress
# is safe to leave in place. A second Ctrl-C during this print
# propagates to the default handler — we don't try to catch
# everything.
print("\n\n Mine interrupted.")
print(f" files_processed: {files_processed}/{len(files)}")
print(f" drawers_filed: {total_drawers}")
print(f" last_file: {last_file or '<none>'}")
print(
f"\n Re-run `mempalace mine {shlex.quote(project_dir)}` to resume — "
"already-filed drawers are\n upserted idempotently and will not duplicate.\n"
) )
if drawers == 0 and not dry_run: sys.exit(130)
files_skipped += 1 finally:
else: # Clean up the hooks-side PID lock if it points at us. Stale
total_drawers += drawers # entries already pass _pid_alive() == False on POSIX, but
room_counts[room] += 1 # actively removing the file makes the state observable
if not dry_run: # (callers can stat it) and avoids accidental PID reuse on
print(f" + [{i:4}/{len(files)}] {filepath.name[:50]:50} +{drawers}") # short-lived test runs. Only remove if the file claims our
# own PID — never another process's.
_cleanup_mine_pid_file()
if not dry_run:
# Cross-wing topic tunnels: after every file in this wing has been
# processed, link this wing to any other wing that shares a
# confirmed TOPIC label. Out of scope for v1: manifest-dependency
# overlap, per-topic allow/deny lists, search-result surfacing.
try:
tunnels_added = _compute_topic_tunnels_for_wing(wing)
if tunnels_added:
print(f"\n Topic tunnels: +{tunnels_added} cross-wing link(s)")
except Exception as e:
# Tunnel computation must never fail a mine — degrade quietly.
print(f"\n WARNING: topic tunnel computation skipped — {e}", file=sys.stderr)
print(f"\n{'=' * 55}") def _cleanup_mine_pid_file() -> None:
print(" Done.") """Remove the global mine PID file if it currently points at us.
print(f" Files processed: {len(files) - files_skipped}")
print(f" Files skipped (already filed): {files_skipped}") The PID file (``~/.mempalace/hook_state/mine.pid``, written by the
print(f" Drawers filed: {total_drawers}") hook in :func:`mempalace.hooks_cli._spawn_mine`) tracks the PID of
print("\n By room:") the most recently spawned mine subprocess so the hook can dedup
for room, count in sorted(room_counts.items(), key=lambda x: x[1], reverse=True): concurrent auto-ingest fires. When that subprocess exits — cleanly,
print(f" {room:20} {count} files") on error, or via Ctrl-C — it should remove its own entry so the
print('\n Next: mempalace search "what you\'re looking for"') next hook fire isn't briefly fooled by a stale PID before
print(f"{'=' * 55}\n") ``_pid_alive`` returns False.
We only delete the file if it claims our own PID; any other PID is
left alone (could be an unrelated mine running concurrently from
a different worktree / session).
"""
try:
from .hooks_cli import _MINE_PID_FILE
except Exception:
return
try:
if not _MINE_PID_FILE.exists():
return
recorded = _MINE_PID_FILE.read_text().strip()
if recorded and recorded.isdigit() and int(recorded) == os.getpid():
_MINE_PID_FILE.unlink()
except OSError:
# Best-effort cleanup; never fail the mine over PID bookkeeping.
pass
def _compute_topic_tunnels_for_wing(wing: str) -> int: def _compute_topic_tunnels_for_wing(wing: str) -> int:
+229
View File
@@ -1,6 +1,7 @@
"""Tests for mempalace.cli — the main CLI dispatcher.""" """Tests for mempalace.cli — the main CLI dispatcher."""
import argparse import argparse
import shlex
import sys import sys
from pathlib import Path from pathlib import Path
from unittest.mock import MagicMock, patch from unittest.mock import MagicMock, patch
@@ -108,6 +109,7 @@ def test_cmd_init_no_entities(mock_config_cls, tmp_path):
with ( with (
patch("mempalace.entity_detector.scan_for_detection", return_value=[]), patch("mempalace.entity_detector.scan_for_detection", return_value=[]),
patch("mempalace.room_detector_local.detect_rooms_local") as mock_rooms, patch("mempalace.room_detector_local.detect_rooms_local") as mock_rooms,
patch("mempalace.cli._maybe_run_mine_after_init"),
): ):
cmd_init(args) cmd_init(args)
mock_rooms.assert_called_once_with(project_dir=str(tmp_path), yes=True) mock_rooms.assert_called_once_with(project_dir=str(tmp_path), yes=True)
@@ -126,6 +128,7 @@ def test_cmd_init_with_entities(mock_config_cls, tmp_path):
patch("mempalace.entity_detector.confirm_entities", return_value=confirmed), patch("mempalace.entity_detector.confirm_entities", return_value=confirmed),
patch("mempalace.room_detector_local.detect_rooms_local"), patch("mempalace.room_detector_local.detect_rooms_local"),
patch("builtins.open", MagicMock()), patch("builtins.open", MagicMock()),
patch("mempalace.cli._maybe_run_mine_after_init"),
): ):
cmd_init(args) cmd_init(args)
@@ -140,12 +143,238 @@ def test_cmd_init_with_entities_zero_total(mock_config_cls, tmp_path, capsys):
patch("mempalace.entity_detector.scan_for_detection", return_value=fake_files), patch("mempalace.entity_detector.scan_for_detection", return_value=fake_files),
patch("mempalace.entity_detector.detect_entities", return_value=detected), patch("mempalace.entity_detector.detect_entities", return_value=detected),
patch("mempalace.room_detector_local.detect_rooms_local"), patch("mempalace.room_detector_local.detect_rooms_local"),
patch("mempalace.cli._maybe_run_mine_after_init"),
): ):
cmd_init(args) cmd_init(args)
out = capsys.readouterr().out out = capsys.readouterr().out
assert "No entities detected" in out assert "No entities detected" in out
# ── _maybe_run_mine_after_init (init → mine prompt, #1181) ─────────────
def _init_args(tmp_path, *, yes=False, auto_mine=False):
return argparse.Namespace(dir=str(tmp_path), yes=yes, auto_mine=auto_mine)
def _fake_cfg(tmp_path):
cfg = MagicMock()
cfg.palace_path = str(tmp_path / "palace")
return cfg
def _fake_scanned(tmp_path, n=3):
"""Build n real Path objects with stat()-able sizes for the scan estimate."""
paths = []
for i in range(n):
p = tmp_path / f"f{i}.txt"
p.write_text("x" * 1024) # 1 KB each
paths.append(p)
return paths
def test_maybe_run_mine_prompt_accepted_runs_mine(tmp_path):
"""Empty / 'y' / 'yes' on the prompt triggers mine() in-process."""
from mempalace.cli import _maybe_run_mine_after_init
args = _init_args(tmp_path, yes=False, auto_mine=False)
cfg = _fake_cfg(tmp_path)
scanned = _fake_scanned(tmp_path, n=3)
with (
patch("mempalace.miner.mine") as mock_mine,
patch("mempalace.miner.scan_project", return_value=scanned),
patch("builtins.input", return_value=""),
):
_maybe_run_mine_after_init(args, cfg)
mock_mine.assert_called_once_with(
project_dir=str(tmp_path),
palace_path=cfg.palace_path,
files=scanned,
)
def test_maybe_run_mine_prompt_yes_accepted_runs_mine(tmp_path):
"""Explicit 'y' answer also runs mine()."""
from mempalace.cli import _maybe_run_mine_after_init
args = _init_args(tmp_path, yes=False, auto_mine=False)
cfg = _fake_cfg(tmp_path)
with (
patch("mempalace.miner.mine") as mock_mine,
patch("mempalace.miner.scan_project", return_value=[]),
patch("builtins.input", return_value="Y"),
):
_maybe_run_mine_after_init(args, cfg)
mock_mine.assert_called_once()
def test_maybe_run_mine_prompt_declined_prints_hint(tmp_path, capsys):
"""'n' answer skips mine() and prints the resume hint."""
from mempalace.cli import _maybe_run_mine_after_init
args = _init_args(tmp_path, yes=False, auto_mine=False)
cfg = _fake_cfg(tmp_path)
with (
patch("mempalace.miner.mine") as mock_mine,
patch("mempalace.miner.scan_project", return_value=[]),
patch("builtins.input", return_value="n"),
):
_maybe_run_mine_after_init(args, cfg)
mock_mine.assert_not_called()
out = capsys.readouterr().out
# shlex.quote is a no-op on POSIX-safe paths but wraps Windows paths
# (which contain backslashes) in single quotes, so the assertion has
# to mirror what the production code actually emits.
assert f"mempalace mine {shlex.quote(str(tmp_path))}" in out
assert "Skipped" in out
def test_maybe_run_mine_yes_alone_still_prompts(tmp_path):
"""`--yes` is scoped to entity auto-accept and MUST still prompt for mine.
Regression guard for the flag-overload review feedback on #1183: extending
`--yes` to also auto-mine would silently change behaviour for scripted
callers and turn a fast command into a minutes-long ChromaDB write.
"""
from mempalace.cli import _maybe_run_mine_after_init
args = _init_args(tmp_path, yes=True, auto_mine=False)
cfg = _fake_cfg(tmp_path)
with (
patch("mempalace.miner.mine") as mock_mine,
patch("mempalace.miner.scan_project", return_value=[]),
patch("builtins.input", return_value="n") as mock_input,
):
_maybe_run_mine_after_init(args, cfg)
mock_input.assert_called_once() # the prompt MUST fire
mock_mine.assert_not_called()
def test_maybe_run_mine_auto_mine_skips_prompt(tmp_path):
"""`--auto-mine` runs mine() automatically without calling input()."""
from mempalace.cli import _maybe_run_mine_after_init
args = _init_args(tmp_path, yes=False, auto_mine=True)
cfg = _fake_cfg(tmp_path)
scanned = _fake_scanned(tmp_path, n=2)
with (
patch("mempalace.miner.mine") as mock_mine,
patch("mempalace.miner.scan_project", return_value=scanned),
patch("builtins.input", side_effect=AssertionError("input() must not be called")),
):
_maybe_run_mine_after_init(args, cfg)
mock_mine.assert_called_once_with(
project_dir=str(tmp_path),
palace_path=cfg.palace_path,
files=scanned,
)
def test_maybe_run_mine_yes_and_auto_mine_fully_noninteractive(tmp_path):
"""`--yes --auto-mine` together: never call input(), always mine."""
from mempalace.cli import _maybe_run_mine_after_init
args = _init_args(tmp_path, yes=True, auto_mine=True)
cfg = _fake_cfg(tmp_path)
with (
patch("mempalace.miner.mine") as mock_mine,
patch("mempalace.miner.scan_project", return_value=[]),
patch("builtins.input", side_effect=AssertionError("input() must not be called")),
):
_maybe_run_mine_after_init(args, cfg)
mock_mine.assert_called_once()
def test_maybe_run_mine_decline_quotes_path_with_spaces(tmp_path, capsys):
"""The resume hint must shell-quote the project dir so paths with
spaces / metacharacters produce a copy-paste-safe command."""
from mempalace.cli import _maybe_run_mine_after_init
spaced_dir = tmp_path / "my project dir"
spaced_dir.mkdir()
args = argparse.Namespace(dir=str(spaced_dir), yes=False, auto_mine=False)
cfg = _fake_cfg(tmp_path)
with (
patch("mempalace.miner.mine"),
patch("mempalace.miner.scan_project", return_value=[]),
patch("builtins.input", return_value="n"),
):
_maybe_run_mine_after_init(args, cfg)
out = capsys.readouterr().out
# shlex.quote wraps paths with spaces (and Windows backslashes) in
# single quotes — the assertion must use the same shlex form so the
# test passes on every platform's tmp_path layout.
assert f"mempalace mine {shlex.quote(str(spaced_dir))}" in out
# Bare unquoted form must NOT appear — that's the bug we're guarding.
assert f"mempalace mine {spaced_dir} " not in out
assert f"mempalace mine {spaced_dir}`" not in out
def test_maybe_run_mine_eof_on_stdin_treated_as_decline(tmp_path, capsys):
"""Piped / non-interactive stdin (EOFError) declines without crashing."""
from mempalace.cli import _maybe_run_mine_after_init
args = _init_args(tmp_path, yes=False, auto_mine=False)
cfg = _fake_cfg(tmp_path)
with (
patch("mempalace.miner.mine") as mock_mine,
patch("mempalace.miner.scan_project", return_value=[]),
patch("builtins.input", side_effect=EOFError),
):
_maybe_run_mine_after_init(args, cfg)
mock_mine.assert_not_called()
assert "Skipped" in capsys.readouterr().out
def test_maybe_run_mine_failure_surfaces_via_exit(tmp_path, capsys):
"""Mine errors are not swallowed — they exit non-zero with an error line."""
from mempalace.cli import _maybe_run_mine_after_init
args = _init_args(tmp_path, yes=False, auto_mine=True)
cfg = _fake_cfg(tmp_path)
with (
patch("mempalace.miner.mine", side_effect=RuntimeError("boom")),
patch("mempalace.miner.scan_project", return_value=[]),
):
with pytest.raises(SystemExit) as exc_info:
_maybe_run_mine_after_init(args, cfg)
assert exc_info.value.code == 1
err = capsys.readouterr().err
assert "boom" in err
def test_maybe_run_mine_estimate_appears_before_prompt(tmp_path, capsys):
"""The file-count + size estimate line MUST render BEFORE the prompt.
Required by the spec: hitting Enter on a default-Y prompt with no size
info is a footgun on a real corpus where mine takes minutes. The user
must see scope before being asked to confirm.
"""
from mempalace.cli import _maybe_run_mine_after_init
args = _init_args(tmp_path, yes=False, auto_mine=False)
cfg = _fake_cfg(tmp_path)
scanned = _fake_scanned(tmp_path, n=4) # 4 files * 1 KB each
captured_when_prompted = {}
def fake_input(prompt):
# Snapshot what stdout looked like at the moment the prompt fires.
captured_when_prompted["stdout"] = capsys.readouterr().out
return "n"
with (
patch("mempalace.miner.mine"),
patch("mempalace.miner.scan_project", return_value=scanned),
patch("builtins.input", side_effect=fake_input),
):
_maybe_run_mine_after_init(args, cfg)
pre_prompt = captured_when_prompted["stdout"]
assert "4 files" in pre_prompt, f"file count missing from pre-prompt output: {pre_prompt!r}"
assert "MB" in pre_prompt, f"size estimate missing from pre-prompt output: {pre_prompt!r}"
assert "would be mined" in pre_prompt
# ── cmd_mine ─────────────────────────────────────────────────────────── # ── cmd_mine ───────────────────────────────────────────────────────────
+143
View File
@@ -1,4 +1,5 @@
import os import os
import shlex
import shutil import shutil
import tempfile import tempfile
from pathlib import Path from pathlib import Path
@@ -600,3 +601,145 @@ def test_mine_no_tunnel_when_only_one_wing_has_topics(tmp_path, monkeypatch):
mine(str(project_root), str(palace_path)) mine(str(project_root), str(palace_path))
assert palace_graph.list_tunnels() == [] assert palace_graph.list_tunnels() == []
# ── graceful Ctrl-C handling (#1182) ────────────────────────────────────
def _make_minable_project(project_root: Path, n_files: int = 3) -> None:
"""Create a tiny project with N readable files + a config so mine() runs."""
for idx in range(n_files):
write_file(
project_root / f"f{idx}.py",
f"def fn_{idx}():\n print('hi {idx}')\n" * 20,
)
with open(project_root / "mempalace.yaml", "w") as f:
yaml.dump(
{
"wing": "interrupt_test",
"rooms": [{"name": "general", "description": "General"}],
},
f,
)
def test_mine_keyboard_interrupt_prints_summary_and_exits_130(tmp_path, capsys):
"""A KeyboardInterrupt mid-loop produces the clean summary + exit 130."""
import pytest
from unittest.mock import patch
project_root = tmp_path / "proj"
project_root.mkdir()
_make_minable_project(project_root, n_files=4)
palace_path = project_root / "palace"
call_count = {"n": 0}
def fake_process_file(*args, **kwargs):
call_count["n"] += 1
if call_count["n"] == 2:
raise KeyboardInterrupt
return (1, "general")
with patch("mempalace.miner.process_file", side_effect=fake_process_file):
with pytest.raises(SystemExit) as exc_info:
mine(str(project_root), str(palace_path))
assert exc_info.value.code == 130
out = capsys.readouterr().out
assert "Mine interrupted." in out
assert "files_processed: 1/" in out
assert "drawers_filed:" in out
assert "last_file:" in out
assert "upserted idempotently" in out
def test_mine_keyboard_interrupt_quotes_path_with_spaces_in_resume_hint(tmp_path, capsys):
"""Resume hint must shell-quote the project dir so a path containing
spaces / metacharacters yields a copy-paste-safe `mempalace mine ...`
command. Otherwise users on a path like "My Project" hit a broken
invocation when they re-run after Ctrl-C."""
import pytest
from unittest.mock import patch
project_root = tmp_path / "my project"
project_root.mkdir()
_make_minable_project(project_root, n_files=2)
palace_path = project_root / "palace"
def fake_process_file(*args, **kwargs):
raise KeyboardInterrupt
with patch("mempalace.miner.process_file", side_effect=fake_process_file):
with pytest.raises(SystemExit):
mine(str(project_root), str(palace_path))
out = capsys.readouterr().out
# Use shlex.quote so the assertion matches whatever the production
# code emits on this platform (POSIX paths with spaces vs Windows
# paths with backslashes both end up wrapped in single quotes).
assert f"mempalace mine {shlex.quote(str(project_root))}" in out
def test_mine_cleans_up_pid_file_on_interrupt(tmp_path):
"""Our own PID entry in mine.pid is removed in the finally clause."""
import pytest
from unittest.mock import patch
project_root = tmp_path / "proj"
project_root.mkdir()
_make_minable_project(project_root, n_files=2)
palace_path = project_root / "palace"
pid_file = tmp_path / "mine.pid"
pid_file.write_text(str(os.getpid()))
def fake_process_file(*args, **kwargs):
raise KeyboardInterrupt
with (
patch("mempalace.hooks_cli._MINE_PID_FILE", pid_file),
patch("mempalace.miner.process_file", side_effect=fake_process_file),
):
with pytest.raises(SystemExit):
mine(str(project_root), str(palace_path))
assert not pid_file.exists(), "Our PID entry should be cleaned up on interrupt"
def test_mine_cleans_up_pid_file_on_clean_exit(tmp_path):
"""Successful mine also removes its own PID entry in the finally clause."""
from unittest.mock import patch
project_root = tmp_path / "proj"
project_root.mkdir()
_make_minable_project(project_root, n_files=1)
palace_path = project_root / "palace"
pid_file = tmp_path / "mine.pid"
pid_file.write_text(str(os.getpid()))
with patch("mempalace.hooks_cli._MINE_PID_FILE", pid_file):
mine(str(project_root), str(palace_path))
assert not pid_file.exists()
def test_mine_does_not_remove_other_processes_pid_file(tmp_path):
"""A PID file pointing at someone else's PID is left untouched."""
from unittest.mock import patch
project_root = tmp_path / "proj"
project_root.mkdir()
_make_minable_project(project_root, n_files=1)
palace_path = project_root / "palace"
other_pid = os.getpid() + 999_999 # a PID that isn't us
pid_file = tmp_path / "mine.pid"
pid_file.write_text(str(other_pid))
with patch("mempalace.hooks_cli._MINE_PID_FILE", pid_file):
mine(str(project_root), str(palace_path))
assert pid_file.exists(), "Foreign PID entries must not be removed"
assert pid_file.read_text().strip() == str(other_pid)