feat(graph): cross-wing tunnels by shared topics (#1180)

When two wings have one or more confirmed TOPIC labels in common, the
miner now creates a symmetric tunnel between them at mine time so the
palace graph reflects shared themes (frameworks, vendors, recurring
concepts).

- llm_refine: TOPIC label routes to a dedicated `topics` bucket so the
  signal survives confirmation instead of getting collapsed into
  `uncertain` and dropped.
- entity_detector / project_scanner: bucket plumbed through the
  detection pipeline; `confirm_entities` returns confirmed topics
  alongside people/projects.
- miner.add_to_known_entities: optional `wing` parameter records the
  confirmed topics under `topics_by_wing` in
  `~/.mempalace/known_entities.json`. Wing names do NOT leak into the
  flat known-name set used by drawer-tagging.
- palace_graph: `compute_topic_tunnels` and `topic_tunnels_for_wing`
  create symmetric tunnels via the existing `create_tunnel` API so they
  share dedup and persistence with explicit tunnels.
- miner.mine: post-file-loop pass calls `topic_tunnels_for_wing` for
  the freshly-mined wing. Failures are logged but never abort the mine.
- config: `topic_tunnel_min_count` knob (env
  `MEMPALACE_TOPIC_TUNNEL_MIN_COUNT` or `~/.mempalace/config.json`),
  default 1.
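
A minimal sketch of how the `topic_tunnel_min_count` knob might be resolved. The function name, the precedence (env var over config file), and the clamping of values below 1 are assumptions for illustration; the commit message states only the two sources and the default of 1.

```python
import json
import os
from pathlib import Path

DEFAULT_MIN_COUNT = 1  # default stated in the commit message


def resolve_topic_tunnel_min_count(env=None, config_path=None):
    """Hypothetical resolver: env var first, then config file, then default."""
    env = os.environ if env is None else env
    raw = env.get("MEMPALACE_TOPIC_TUNNEL_MIN_COUNT")
    if raw is None:
        path = Path(config_path) if config_path else Path.home() / ".mempalace" / "config.json"
        try:
            data = json.loads(path.read_text(encoding="utf-8"))
            raw = data.get("topic_tunnel_min_count")
        except (OSError, ValueError, AttributeError):
            raw = None  # missing or unreadable config falls back to the default
    try:
        value = int(raw)
    except (TypeError, ValueError):
        return DEFAULT_MIN_COUNT
    return max(value, 1)  # assumption: values below 1 are clamped to 1
```

With this shape, a malformed value such as `MEMPALACE_TOPIC_TUNNEL_MIN_COUNT=abc` falls back to the default rather than aborting the mine, in keeping with the "failures are logged but never abort" rule above.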

Tests cover topic persistence through init->mine, tunnel creation when
wings share a topic, no tunnel below threshold, cross-wing tunnel
retrieval via `list_tunnels`, dedup on recompute, case-insensitive
overlap, and the end-to-end mine-time wiring.
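
The behaviours those tests cover (threshold, case-insensitive overlap, no duplicates on recompute) can be sketched as a pure function. This is an illustration under stated assumptions, not the code in `palace_graph`: the name `shared_topic_tunnels`, the `{wing: [topic, ...]}` input shape, and the returned mapping are hypothetical, and the real implementation goes through the existing `create_tunnel` API rather than returning a dict.

```python
from itertools import combinations


def shared_topic_tunnels(topics_by_wing, min_count=1):
    """Return {(wing_a, wing_b): sorted shared topics} for qualifying pairs."""
    # Normalise case so "OpenAPI" and "openapi" count as the same topic.
    normalised = {
        wing: {t.strip().casefold() for t in topics if t.strip()}
        for wing, topics in topics_by_wing.items()
    }
    tunnels = {}
    # Sorting the wing names gives each unordered pair one canonical key,
    # so recomputing never produces a duplicate (b, a) entry.
    for a, b in combinations(sorted(normalised), 2):
        shared = normalised[a] & normalised[b]
        if len(shared) >= min_count:
            tunnels[(a, b)] = sorted(shared)
    return tunnels


# Hypothetical data: only "backend" and "docs" share a topic.
topics_by_wing = {
    "backend": ["OpenAPI", "Postgres"],
    "docs": ["openapi", "Paris"],
    "mobile": ["Swift"],
}
print(shared_topic_tunnels(topics_by_wing))
print(shared_topic_tunnels(topics_by_wing, min_count=2))
```

Keying each pair by its sorted wing names is one simple way to get symmetric, idempotent results; the commit instead relies on the dedup already provided by `create_tunnel`.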

Out of scope for this PR (called out in the PR body): manifest-
dependency overlap, per-topic allow/deny lists, search-result surfacing.
Igor Lins e Silva
2026-04-24 19:19:58 -03:00
parent ed2ba726c9
commit fe051adc73
14 changed files with 678 additions and 28 deletions
+31 -3
@@ -272,7 +272,9 @@ def test_apply_classifications_appends_reason_signal():
     assert any("spoken of by name" in s for s in new["people"][0]["signals"])


-def test_apply_classifications_topic_goes_to_uncertain():
+def test_apply_classifications_topic_goes_to_topics_bucket():
+    """TOPIC classifications now route to a dedicated ``topics`` bucket so the
+    miner can use them as cross-wing tunnel signal (issue #1180)."""
     detected = {
         "people": [],
         "projects": [
@@ -289,8 +291,32 @@ def test_apply_classifications_topic_goes_to_uncertain():
     decisions = {"Paris": ("TOPIC", "city, not a project")}
     new, reclass, _ = _apply_classifications(detected, decisions)
     assert len(new["projects"]) == 0
+    assert len(new["uncertain"]) == 0
+    assert len(new["topics"]) == 1
+    assert new["topics"][0]["name"] == "Paris"
+    assert new["topics"][0]["type"] == "topic"
+    assert reclass == 1
+
+
+def test_apply_classifications_ambiguous_still_goes_to_uncertain():
+    detected = {
+        "people": [],
+        "projects": [
+            {
+                "name": "Foo",
+                "type": "project",
+                "confidence": 0.7,
+                "frequency": 5,
+                "signals": ["regex"],
+            }
+        ],
+        "uncertain": [],
+    }
+    decisions = {"Foo": ("AMBIGUOUS", "context insufficient")}
+    new, reclass, _ = _apply_classifications(detected, decisions)
+    assert len(new["projects"]) == 0
     assert len(new["uncertain"]) == 1
-    assert new["uncertain"][0]["name"] == "Paris"
+    assert new["uncertain"][0]["name"] == "Foo"
     assert reclass == 1
@@ -469,7 +495,9 @@ def test_refine_entities_refines_high_confidence_regex_projects():
     assert provider.call_count == 1
     assert result.reclassified == 1
     assert result.merged["projects"] == []
-    assert result.merged["uncertain"][0]["name"] == "OpenAPI"
+    # TOPIC labels go to the dedicated ``topics`` bucket so the miner can
+    # use them for cross-wing tunnel computation (issue #1180).
+    assert result.merged["topics"][0]["name"] == "OpenAPI"


 def test_refine_entities_refines_regex_people_but_skips_git_people():