Cherry-picked the docs portion of 67e4ac6 to accompany the closet feature. Test coverage for closets is omnibus with tests for entity metadata and BM25 (see PR targeting those features) and will land together in a follow-up. Co-Authored-By: MSL <232237854+milla-jovovich@users.noreply.github.com>
Closets — The Searchable Index Layer
What closets are
Drawers hold your verbatim content. Closets are the index — compact pointers that tell the searcher which drawers to open.
CLOSET: "built auth system|Ben;Igor|→drawer_api_auth_a1b2c3"
         ↑ topic           ↑ entities ↑ points to this drawer
An agent searching "who built the auth?" hits the closet first (fast scan of short text), then opens the referenced drawer to get the full verbatim content.
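To make the pointer format concrete, here is a minimal parsing sketch. The names `Pointer` and `parse_pointer_line` are hypothetical and not part of mempalace; the sketch also assumes that entity names and drawer IDs never contain a `|` character.

```python
from dataclasses import dataclass


@dataclass
class Pointer:
    label: str             # topic description (or quoted text)
    entities: list[str]    # names attached to the topic
    drawer_ids: list[str]  # drawers to open for the verbatim content


def parse_pointer_line(line: str) -> Pointer:
    """Split one closet line of the form 'label|ent1;ent2|→id1,id2'."""
    # Split from the right so a '|' inside the label does not break parsing.
    label, entities, target = line.rsplit("|", 2)
    if not target.startswith("→"):
        raise ValueError(f"missing arrow in pointer line: {line!r}")
    return Pointer(
        label=label.strip(),
        entities=[e for e in entities.split(";") if e],
        drawer_ids=[d for d in target[1:].split(",") if d],
    )


p = parse_pointer_line("built auth system|Ben;Igor|→drawer_api_auth_a1b2c3")
print(p.label, p.entities, p.drawer_ids)
```

The searcher only needs `drawer_ids` to know which drawers to fetch; the label and entities are what the closet search itself matches against.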
Lifecycle
When are closets created?
Closets are created during mempalace mine. For each file mined:
- Content is chunked into drawers (verbatim, ~800 chars each)
- Topics, entities, and quotes are extracted from the content
- A closet is created with pointer lines to those drawers
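As a rough illustration of the first step, the sketch below cuts text into fixed-width chunks of at most 800 characters. This is an assumption made for illustration: the real miner may break at paragraph or sentence boundaries instead, and `chunk_into_drawers` is a hypothetical name.

```python
def chunk_into_drawers(text: str, size: int = 800) -> list[str]:
    """Split text into verbatim chunks of at most `size` characters."""
    if size <= 0:
        raise ValueError("size must be positive")
    # Slicing keeps the content verbatim: joining the chunks restores the input.
    return [text[i:i + size] for i in range(0, len(text), size)]


drawers = chunk_into_drawers("x" * 2000)
print([len(d) for d in drawers])  # [800, 800, 400]
```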
What's inside a closet?
Each line is one atomic topic pointer:
topic description|entity1;entity2|→drawer_id_1,drawer_id_2
"verbatim quote from the content"|entity1|→drawer_id_3
Topics are never split across closets. If adding a topic's pointer line would push a closet past the 1,500-character limit, a new closet is started instead.
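The no-split rule amounts to a greedy packing of whole lines. The following is a sketch under stated assumptions, not the actual `upsert_closet_lines()` logic: it assumes lines are joined with a newline that counts toward the limit, and `pack_closets` is a hypothetical name.

```python
CLOSET_CHAR_LIMIT = 1500  # value taken from the Limits table


def pack_closets(lines: list[str], limit: int = CLOSET_CHAR_LIMIT) -> list[str]:
    """Group whole pointer lines into closets without exceeding `limit`."""
    closets: list[str] = []
    current: list[str] = []
    used = 0
    for line in lines:
        extra = len(line) + (1 if current else 0)  # +1 for the joining newline
        if current and used + extra > limit:
            # Close the current closet; the line moves to a fresh one intact.
            closets.append("\n".join(current))
            current, used = [], 0
            extra = len(line)
        current.append(line)
        used += extra
    if current:
        closets.append("\n".join(current))
    return closets


lines = ["t" * 590 + f"|Ben|→d{i}" for i in range(3)]
print([len(c) for c in pack_closets(lines)])
```

In this sketch a single line longer than the limit still gets a closet of its own rather than being cut, which keeps every pointer atomic.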
When do closets update?
When a file is re-mined (content changed), its drawers are replaced and new closets are built from the fresh content. The old closet content is replaced via upsert.
What about stale topics?
If a file's content changes and a topic no longer exists, the closet is rebuilt entirely from the new content — stale topics are gone. Closets are tied to source files, not to individual topics.
If you add content to an existing file (e.g., a daily diary growing throughout the day), new topics are appended to the existing closet until the 1,500-char limit, then a new closet is created.
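The append behaviour for growing files can be pictured as follows. This is only a sketch: `append_lines` is a hypothetical helper, and counting the joining newline toward the 1,500-character limit is an assumption.

```python
def append_lines(closets: list[str], new_lines: list[str], limit: int = 1500) -> list[str]:
    """Append pointer lines to the last closet, starting a new one when full."""
    result = list(closets)  # leave the caller's list untouched
    for line in new_lines:
        if result and len(result[-1]) + 1 + len(line) <= limit:
            result[-1] = result[-1] + "\n" + line
        else:
            result.append(line)
    return result


existing = ["a" * 1400]
updated = append_lines(existing, ["b" * 50, "c" * 100])
print([len(c) for c in updated])  # [1451, 100]
```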
Do closets survive palace rebuilds?
Closets are stored in the mempalace_closets ChromaDB collection alongside mempalace_drawers. If you delete and rebuild the palace, closets are recreated during the next mempalace mine.
How search uses closets
Query → search mempalace_closets (fast, small documents)
↓
top closet hits → extract drawer IDs from pointer lines
↓
fetch drawers from mempalace_drawers (full verbatim content)
↓
BM25 hybrid re-rank (keyword match + vector similarity)
↓
return results to user
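The re-rank step in the flow above could look roughly like this. The source does not say how keyword and vector scores are combined, so everything here is an assumption for illustration: the tokenizer, the BM25 parameters `k1` and `b`, the max-normalization, and the blend weight `alpha`. The names `bm25_scores` and `hybrid_rerank` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Okapi BM25 score of each document against the query."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    avg_len = sum(len(t) for t in tokenized) / n if n else 0.0
    # Document frequency: in how many documents each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))
    query_terms = set(tokenize(query))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def hybrid_rerank(query: str, docs: list[str], vector_sims: list[float], alpha: float = 0.5) -> list[int]:
    """Return document indices ordered by a blend of BM25 and vector similarity."""
    kw = bm25_scores(query, docs)
    top = max(kw, default=0.0) or 1.0  # avoid dividing by zero when nothing matches
    blended = [alpha * (k / top) + (1 - alpha) * v for k, v in zip(kw, vector_sims)]
    return sorted(range(len(docs)), key=lambda i: blended[i], reverse=True)


docs = [
    "Ben and Igor built the auth system",
    "notes about the garden",
    "auth tokens expire daily",
]
order = hybrid_rerank("who built the auth?", docs, vector_sims=[0.5, 0.5, 0.5])
print(docs[order[0]])
```

With equal vector similarities the keyword score decides the order, so the drawer that mentions "built" and "auth" comes first. Setting `alpha=0` reduces the ranking to pure vector similarity.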
If no closets exist (for example, in a palace created before this feature), search falls back to searching drawers directly. Closets are created the next time mempalace mine runs.
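A closets-first lookup with this fallback can be sketched as below. The three callables stand in for the real ChromaDB queries; they and the function names are hypothetical, and the sketch leaves out the re-rank step.

```python
from typing import Callable


def extract_drawer_ids(closet_docs: list[str]) -> list[str]:
    """Collect drawer IDs from pointer lines, keeping first-seen order."""
    seen: dict[str, None] = {}
    for doc in closet_docs:
        for line in doc.splitlines():
            _, arrow, ids = line.rpartition("→")
            if not arrow:
                continue  # not a pointer line
            for drawer_id in ids.split(","):
                if drawer_id.strip():
                    seen.setdefault(drawer_id.strip(), None)
    return list(seen)


def search(
    query: str,
    search_closets: Callable[[str], list[str]],
    fetch_drawers: Callable[[list[str]], list[str]],
    search_drawers: Callable[[str], list[str]],
) -> list[str]:
    hits = search_closets(query)
    if not hits:
        # No closets yet (older palace): query the drawers directly.
        return search_drawers(query)
    return fetch_drawers(extract_drawer_ids(hits))


# In-memory stand-ins for the two collections.
drawer_store = {"d1": "full text 1", "d2": "full text 2", "d3": "full text 3"}
closet_hit = 'built auth system|Ben;Igor|→d1,d2\n"we shipped it"|Ben|→d2,d3'

results = search(
    "who built the auth?",
    search_closets=lambda q: [closet_hit],
    fetch_drawers=lambda ids: [drawer_store[i] for i in ids],
    search_drawers=lambda q: ["direct drawer hit"],
)
print(results)
```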
Limits
| Setting | Value | Reason |
|---|---|---|
| Max closet size | 1,500 chars | Leaves buffer under ChromaDB's working limit |
| Max topics per file | 12 | Keeps closets focused |
| Max quotes per file | 3 | Most relevant only |
| Max entities per pointer | 5 | Top names by frequency |
| Max response chars | 10,000 | Prevents hydration blowup on large files |
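Two of these limits are easy to picture in code. The sketch below is illustrative only: the constant and function names are hypothetical, and stopping at whole-drawer boundaries when the response budget runs out is an assumption, since the source does not say whether the last drawer is truncated or dropped.

```python
from collections import Counter

MAX_ENTITIES_PER_POINTER = 5  # from the table above
MAX_RESPONSE_CHARS = 10_000   # from the table above


def top_entities(mentions: list[str], limit: int = MAX_ENTITIES_PER_POINTER) -> list[str]:
    """Keep the most frequently mentioned names, most frequent first."""
    return [name for name, _ in Counter(mentions).most_common(limit)]


def cap_response(drawers: list[str], budget: int = MAX_RESPONSE_CHARS) -> list[str]:
    """Return leading drawers whose combined length fits within the budget."""
    kept: list[str] = []
    used = 0
    for text in drawers:
        if used + len(text) > budget:
            break
        kept.append(text)
        used += len(text)
    return kept


mentions = ["Ben"] * 3 + ["Igor"] * 2 + ["Ana", "Bo", "Cy", "Di"]
print(top_entities(mentions))  # ['Ben', 'Igor', 'Ana', 'Bo', 'Cy']
print(len(cap_response(["x" * 800] * 20)))  # 12
```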
For developers
Closet functions live in mempalace/palace.py:
- get_closets_collection(): get the closets ChromaDB collection
- build_closet_lines(): extract topics/entities/quotes into pointer lines
- upsert_closet_lines(): write lines to closets, respecting the char limit
- CLOSET_CHAR_LIMIT: the 1,500-char limit constant