mempalace/tests/test_searcher.py
Sergey Kuznetsov ae5196bc8d Mempalace backend seam (#413)
* refactor: add stage-1 backend abstraction seam

Introduce the first upstreamable storage seam for MemPalace without
bringing in the PostgreSQL spike or any benchmark artifacts.

This change adds a small backend package with:
- BaseCollection as the minimal collection contract
- ChromaBackend/ChromaCollection as the default implementation
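
A minimal sketch of what such a seam could look like. The method set (`add`, `query`, `count`) and the `DelegatingCollection` name are assumptions for illustration only; the commit message does not list the actual `BaseCollection` methods, and the real `ChromaCollection` wraps a ChromaDB collection rather than an arbitrary object.

```python
from abc import ABC, abstractmethod


class BaseCollection(ABC):
    """Hypothetical minimal collection contract (method set is assumed)."""

    @abstractmethod
    def add(self, ids, documents, metadatas=None): ...

    @abstractmethod
    def query(self, query_texts, n_results=5, where=None): ...

    @abstractmethod
    def count(self): ...


class DelegatingCollection(BaseCollection):
    """Adapter that forwards every call to a wrapped collection object."""

    def __init__(self, inner):
        self._inner = inner

    def add(self, ids, documents, metadatas=None):
        return self._inner.add(ids=ids, documents=documents, metadatas=metadatas)

    def query(self, query_texts, n_results=5, where=None):
        return self._inner.query(
            query_texts=query_texts, n_results=n_results, where=where
        )

    def count(self):
        return self._inner.count()
```

Callers that depend only on the abstract contract never import the storage library directly, which is what would let a second backend slot in later without touching them.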

It then routes the main runtime collection consumers through that seam:
- palace.py
- searcher.py
- layers.py
- palace_graph.py
- mcp_server.py
- miner.status()

Behavioral constraints kept for stage 1:
- ChromaDB remains the only backend and the default path
- no config/env backend selection yet
- no PostgreSQL code
- no benchmark or research files
- existing tests stay unchanged

Important compatibility details:
- read paths now call the seam with create=False so they still surface
  the existing 'no palace found' behavior instead of silently creating
  empty collections
- write paths keep create=True semantics through palace.get_collection()
- layers/searcher retain a chromadb module attribute so the existing
  mock-based tests can keep patching PersistentClient unchanged
- ChromaBackend only creates palace directories on create=True, which
  preserves mocked read-path tests that use fake read-only paths

Verification:
- python3 -m py_compile mempalace/backends/__init__.py mempalace/backends/base.py mempalace/backends/chroma.py mempalace/palace.py mempalace/searcher.py mempalace/layers.py mempalace/palace_graph.py mempalace/mcp_server.py mempalace/miner.py
- pytest -q  # 529 passed, 106 deselected

* refactor: clean up stage-1 seam compatibility shims

Tighten the stage-1 backend abstraction branch after review.

This follow-up does three small things:
- keep the chromadb compatibility hook in searcher.py and layers.py,
  but express it through the backends.chroma module so it no longer
  reads like an accidental unused import
- fix the palace_graph.py helper alias to avoid the local name collision
  flagged by ruff (imported helper vs local _get_collection wrapper)
- preserve the existing mock-based test patch points unchanged while
  keeping the new backend seam intact

Why this matters:
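- the direct chromadb import form looked like a
  dead import in review, even though it was intentionally preserving the
  existing test seam (`mempalace.searcher.chromadb.PersistentClient` and
  `mempalace.layers.chromadb.PersistentClient`)
- palace_graph.py had a real lint issue (`_get_collection` redefinition) that was
  small but worth fixing before a public PR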

Verification:
- /opt/homebrew/bin/ruff check mempalace/backends/__init__.py mempalace/backends/base.py mempalace/backends/chroma.py mempalace/palace.py mempalace/searcher.py mempalace/layers.py mempalace/palace_graph.py mempalace/mcp_server.py mempalace/miner.py
- pytest -q tests/test_layers.py tests/test_searcher.py
- pytest -q  # 529 passed, 106 deselected

* docs: explain backend shim imports in search paths

Add short code comments in searcher.py and layers.py explaining why the
module-level `chromadb` alias remains after the stage-1 backend seam
refactor.

The alias is intentional: it preserves the existing mock patch points used
by the current test suite (`mempalace.searcher.chromadb.PersistentClient`
and `mempalace.layers.chromadb.PersistentClient`) while the runtime logic
now flows through the backend abstraction.

This keeps the public PR easier to review because the apparent "unused
import" now has an explicit reason next to it.

Verification:
- /opt/homebrew/bin/ruff check mempalace/searcher.py mempalace/layers.py
- pytest -q tests/test_layers.py tests/test_searcher.py

* refactor: reuse a default backend instance in palace helper

Tighten the stage-1 backend seam by promoting the default Chroma backend
adapter to a module-level singleton in `mempalace/palace.py`.

This keeps the stage-1 scope unchanged — Chroma is still the only backend
wired in this branch — but avoids constructing a fresh `ChromaBackend()`
object on every `get_collection()` call. The backend is stateless today,
so this is a readability/cleanup change rather than a behavioral one.

Why this helps:
- makes `palace.get_collection()` read like a real default factory instead
  of an inline constructor call
- keeps the stage-1 branch a little cleaner before opening the public PR
- does not widen the backend surface or change any config/runtime behavior

Verification:
- python3 -m py_compile mempalace/palace.py
- pytest -q tests/test_miner.py tests/test_layers.py tests/test_searcher.py
- pytest -q  # 529 passed, 106 deselected

* fix: harden read-only seam behavior and update seam tests

Preserve the stage-1 backend abstraction while closing the real read-path
regression surfaced in PR review.

What changed:
- make ChromaBackend.get_collection(create=False) fail fast when the palace
  directory does not exist instead of letting PersistentClient create it as a
  side effect
- update miner.status() to call get_collection(..., create=False) so status
  keeps the historical 'No palace found' behavior
- remove the temporary chromadb shim aliases from layers.py and searcher.py
  now that the tests patch the seam directly
- add focused tests for the new backends package, including ChromaCollection
  delegation and ChromaBackend create=True/create=False behavior
- retarget layer/searcher tests to patch the backend seam instead of patching
  chromadb.PersistentClient inside production modules
- add a regression test that status() does not create an empty palace when the
  target path is missing
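
A rough sketch of the fail-fast guard and the kind of regression test it enables. Everything here is a stand-in: `open_store` imitates a client that creates its directory on open (the side effect the commit attributes to `PersistentClient`), and `get_collection`, `status`, and the use of `FileNotFoundError` are simplified assumptions rather than the project's real code.

```python
import os
import tempfile


def open_store(path):
    """Stand-in client that creates its directory as a side effect."""
    os.makedirs(path, exist_ok=True)
    return {"path": path}


def get_collection(path, create=False):
    # Guard first: without it, open_store() would silently create
    # the missing directory even on a read-only call.
    if not create and not os.path.isdir(path):
        raise FileNotFoundError(f"No palace found at {path}")
    return open_store(path)


def status(path):
    """Hypothetical status helper that reports instead of creating."""
    try:
        get_collection(path, create=False)
    except FileNotFoundError:
        return "No palace found"
    return "ok"


def test_status_does_not_create_palace():
    with tempfile.TemporaryDirectory() as tmp:
        missing = os.path.join(tmp, "missing")
        assert status(missing) == "No palace found"
        # The read path must leave the filesystem untouched.
        assert not os.path.exists(missing)
```

The second assertion is the regression check: it fails if the guard is removed, because the stand-in client would then create the directory.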

Verification:
- ruff check .
- uv run pytest -q
- uv run pytest -q tests/test_backends.py tests/test_cli.py tests/test_mcp_server.py tests/test_layers.py tests/test_searcher.py tests/test_miner.py

Notes:
- the separate benchmark/slow/stress layer was started as a soak but not used
  as the merge gate for this PR branch

* refactor: drop duplicate mcp collection cache declaration

Remove a redundant `_collection_cache = None` assignment in
`mempalace/mcp_server.py` left over after the stage-1 backend seam refactor.

This does not change behavior; it only trims review noise in the MCP server
module after the read-path hardening pass.

Verification:
- ruff check mempalace/mcp_server.py
- uv run pytest -q tests/test_mcp_server.py

---------

Co-authored-by: Sergey Kuznetsov <sergey@iterudit.com>
2026-04-11 16:16:49 -07:00

122 lines
5.1 KiB
Python

"""
test_searcher.py -- Tests for both search() (CLI) and search_memories() (API).

Uses the real ChromaDB fixtures from conftest.py for integration tests,
plus mock-based tests for error paths.
"""

from unittest.mock import MagicMock, patch

import pytest

from mempalace.searcher import SearchError, search, search_memories


# ── search_memories (API) ──────────────────────────────────────────────


class TestSearchMemories:
    def test_basic_search(self, palace_path, seeded_collection):
        result = search_memories("JWT authentication", palace_path)
        assert "results" in result
        assert len(result["results"]) > 0
        assert result["query"] == "JWT authentication"

    def test_wing_filter(self, palace_path, seeded_collection):
        result = search_memories("planning", palace_path, wing="notes")
        assert all(r["wing"] == "notes" for r in result["results"])

    def test_room_filter(self, palace_path, seeded_collection):
        result = search_memories("database", palace_path, room="backend")
        assert all(r["room"] == "backend" for r in result["results"])

    def test_wing_and_room_filter(self, palace_path, seeded_collection):
        result = search_memories("code", palace_path, wing="project", room="frontend")
        assert all(r["wing"] == "project" and r["room"] == "frontend" for r in result["results"])

    def test_n_results_limit(self, palace_path, seeded_collection):
        result = search_memories("code", palace_path, n_results=2)
        assert len(result["results"]) <= 2

    def test_no_palace_returns_error(self, tmp_path):
        result = search_memories("anything", str(tmp_path / "missing"))
        assert "error" in result

    def test_result_fields(self, palace_path, seeded_collection):
        result = search_memories("authentication", palace_path)
        hit = result["results"][0]
        assert "text" in hit
        assert "wing" in hit
        assert "room" in hit
        assert "source_file" in hit
        assert "similarity" in hit
        assert isinstance(hit["similarity"], float)

    def test_search_memories_query_error(self):
        """search_memories returns error dict when query raises."""
        mock_col = MagicMock()
        mock_col.query.side_effect = RuntimeError("query failed")

        with patch("mempalace.searcher.get_collection", return_value=mock_col):
            result = search_memories("test", "/fake/path")
        assert "error" in result
        assert "query failed" in result["error"]

    def test_search_memories_filters_in_result(self, palace_path, seeded_collection):
        result = search_memories("test", palace_path, wing="project", room="backend")
        assert result["filters"]["wing"] == "project"
        assert result["filters"]["room"] == "backend"


# ── search() (CLI print function) ─────────────────────────────────────


class TestSearchCLI:
    def test_search_prints_results(self, palace_path, seeded_collection, capsys):
        search("JWT authentication", palace_path)
        captured = capsys.readouterr()
        assert "JWT" in captured.out or "authentication" in captured.out

    def test_search_with_wing_filter(self, palace_path, seeded_collection, capsys):
        search("planning", palace_path, wing="notes")
        captured = capsys.readouterr()
        assert "Results for" in captured.out

    def test_search_with_room_filter(self, palace_path, seeded_collection, capsys):
        search("database", palace_path, room="backend")
        captured = capsys.readouterr()
        assert "Room:" in captured.out

    def test_search_with_wing_and_room(self, palace_path, seeded_collection, capsys):
        search("code", palace_path, wing="project", room="frontend")
        captured = capsys.readouterr()
        assert "Wing:" in captured.out
        assert "Room:" in captured.out

    def test_search_no_palace_raises(self, tmp_path):
        with pytest.raises(SearchError, match="No palace found"):
            search("anything", str(tmp_path / "missing"))

    def test_search_no_results(self, palace_path, collection, capsys):
        """Empty collection returns no results message."""
        # collection is empty (no seeded data)
        result = search("xyzzy_nonexistent_query", palace_path, n_results=1)
        captured = capsys.readouterr()
        # Either prints "No results" or returns None
        assert result is None or "No results" in captured.out

    def test_search_query_error_raises(self):
        """search raises SearchError when query fails."""
        mock_col = MagicMock()
        mock_col.query.side_effect = RuntimeError("boom")

        with patch("mempalace.searcher.get_collection", return_value=mock_col):
            with pytest.raises(SearchError, match="Search error"):
                search("test", "/fake/path")

    def test_search_n_results(self, palace_path, seeded_collection, capsys):
        search("code", palace_path, n_results=1)
        captured = capsys.readouterr()
        # Should have output with at least one result block
        assert "[1]" in captured.out