docs(website): align mempalaceofficial.com with honest benchmarks

Part of #875. Bring the VitePress site into line with the new README and the reproducibility scorecard: drop category-error comparisons, drop retracted claims, retain only metrics and caveats that survive audit. website/index.md - New tagline matches README (local-first, verbatim, pluggable backend, 96.6% R@5 raw, zero API calls). - Replace the "MemPalace hybrid 100% / Supermemory ~99% / Mastra 94.87% / Mem0 ~85%" comparison table with a single honest table showing MemPalace's own retrieval-recall numbers (raw 96.6%, hybrid v4 held-out 98.4%). Add an explicit sentence explaining why we no longer publish a cross-system table on the landing page (retrieval recall vs QA accuracy are different metrics). - Soften the "ChromaDB-powered vector search" feature blurb to be backend-agnostic, since the retrieval layer is pluggable. website/reference/benchmarks.md - Full rewrite of the retrieval-recall tables. No more "100%" headline; honest held-out 98.4% R@5 replaces it. Added the model-agnostic rerank result (99.2% R@5 / 100% R@10 with minimax-m2.7 via Ollama) to show the pipeline is not Haiku-specific. - Drop the LoCoMo "Hybrid v5 + Sonnet rerank (top-50) 100%" row. With per-conversation session counts of 19-32 and top_k=50, the retrieval stage returns every session by construction — the number measures an LLM's reading comprehension, not retrieval. - Drop the cross-system comparison tables. Link out to each project's own research page (Mastra, Mem0, Supermemory) for their published numbers and metric definitions. - Rewrite reproduction commands to use the correct repository and demonstrate the new --llm-backend ollama flag. website/concepts/the-palace.md - Remove the "+34%" row / paragraph. Wing/room filtering is standard metadata filtering in the vector store, not a novel retrieval mechanism — the April-7 note already retracted that framing; this finishes the retraction on the website where it had remained. website/guide/searching.md - Same treatment for "34% retrieval improvement". Reframe as operational scoping, not a novel boost. website/reference/contributing.md - Update the "palace structure matters" bullet to reflect the same framing: scoping-not-magic. website/concepts/knowledge-graph.md - Replace the MemPalace-vs-Zep feature matrix with a short "related work" note that links to Zep's own documentation for authoritative details on their deployment model. Avoids claims we cannot verify at source.
2026-04-14 21:37:45 -03:00
parent 65bf1ebda3
commit f20a1a30fe
6 changed files with 133 additions and 95 deletions
@@ -23,23 +23,16 @@ mempalace search "deploy process" --results 10

 ## How Search Works

-1. Your query is embedded using ChromaDB's default model (`all-MiniLM-L6-v2`)
-2. The embedding is compared against all drawers using cosine similarity
-3. Optional wing/room filters narrow the search scope
-4. Results are returned with similarity scores and source metadata
+1. Your query is embedded using the vector store's default model (`all-MiniLM-L6-v2` with the default ChromaDB backend).
+2. The embedding is compared against all drawers using cosine similarity.
+3. Optional wing/room filters narrow the search scope — standard metadata filtering in the underlying vector store.
+4. Results are returned with similarity scores and source metadata.

-### Why Structure Matters
+### Why Scoping Matters

-Tested on 22,000+ real conversation memories:
+Wing/room filtering is useful when a single palace contains many unrelated projects or people. Narrowing the search to a specific wing (or wing + room) means the vector store only scores candidates inside that scope, which keeps retrieval predictable as the palace grows.

-```
-Search all closets:          60.9%  R@10
-Search within wing:          73.1%  (+12%)
-Search wing + hall:          84.8%  (+24%)
-Search wing + room:          94.8%  (+34%)
-```
-
-Wings and rooms aren't cosmetic — they're a **34% retrieval improvement**.
+This is a metadata-filter feature of the vector store, not a novel retrieval mechanism. Treat it as an operational convenience: clear scoping rules that a human or an agent can apply predictably.

 ## Programmatic Search