Files

T

Igor Lins e Silva ee60cad652 docs: add CLOSETS.md — closet layer overview

Cherry-picked the docs portion of 67e4ac6 to accompany the closet
feature. Test coverage for closets is omnibus with tests for entity
metadata and BM25 (see PR targeting those features) and will land
together in a follow-up.

Co-Authored-By: MSL <232237854+milla-jovovich@users.noreply.github.com>

2026-04-13 07:38:43 -03:00

3.1 KiB

Raw Blame History

Closets — The Searchable Index Layer

What closets are

Drawers hold your verbatim content. Closets are the index — compact pointers that tell the searcher which drawers to open.

CLOSET: "built auth system|Ben;Igor|→drawer_api_auth_a1b2c3"
         ↑ topic           ↑ entities  ↑ points to this drawer

An agent searching "who built the auth?" hits the closet first (fast scan of short text), then opens the referenced drawer to get the full verbatim content.

Lifecycle

When are closets created?

Closets are created during mempalace mine. For each file mined:

Content is chunked into drawers (verbatim, ~800 chars each)
Topics, entities, and quotes are extracted from the content
A closet is created with pointer lines to those drawers

What's inside a closet?

Each line is one atomic topic pointer:

topic description|entity1;entity2|→drawer_id_1,drawer_id_2
"verbatim quote from the content"|entity1|→drawer_id_3

Topics are never split across closets. If adding a topic would exceed 1,500 characters, a new closet is created.

When do closets update?

When a file is re-mined (content changed), its drawers are replaced and new closets are built from the fresh content. The old closet content is replaced via upsert.

What about stale topics?

If a file's content changes and a topic no longer exists, the closet is rebuilt entirely from the new content — stale topics are gone. Closets are tied to source files, not to individual topics.

If you add content to an existing file (e.g., a daily diary growing throughout the day), new topics are appended to the existing closet until the 1,500-char limit, then a new closet is created.

Do closets survive palace rebuilds?

Closets are stored in the mempalace_closets ChromaDB collection alongside mempalace_drawers. If you delete and rebuild the palace, closets are recreated during the next mempalace mine.

How search uses closets

Query → search mempalace_closets (fast, small documents)
         ↓
    top closet hits → extract drawer IDs from pointer lines
         ↓
    fetch drawers from mempalace_drawers (full verbatim content)
         ↓
    BM25 hybrid re-rank (keyword match + vector similarity)
         ↓
    return results to user

If no closets exist (palace created before this feature), search falls back to direct drawer search. Closets are created on next mine.

Limits

Setting	Value	Reason
Max closet size	1,500 chars	Leaves buffer under ChromaDB's working limit
Max topics per file	12	Keeps closets focused
Max quotes per file	3	Most relevant only
Max entities per pointer	5	Top names by frequency
Max response chars	10,000	Prevents hydration blowup on large files

For developers

Closet functions live in mempalace/palace.py:

get_closets_collection() — get the closets ChromaDB collection
build_closet_lines() — extract topics/entities/quotes into pointer lines
upsert_closet_lines() — write lines to closets respecting the char limit
CLOSET_CHAR_LIMIT — the 1,500 char limit constant

3.1 KiB Raw Blame History