Files
mempalace/tests
jp ee12c07c54 fix(searcher): tolerate None documents in BM25 reranker
`_tokenize` calls `text.lower()` unconditionally; when ChromaDB returns a
drawer with `documents` containing `None`, the hybrid-rerank path raises
`AttributeError: 'NoneType' object has no attribute 'lower'`.

Observed in production daemon log (2026-04-24 21:07:05) during a search
that triggered `_hybrid_rank → _bm25_scores → _tokenize`:

    File "mempalace/searcher.py", line 81, in _bm25_scores
        tokenized = [_tokenize(d) for d in documents]
    File "mempalace/searcher.py", line 52, in _tokenize
        return _TOKEN_RE.findall(text.lower())
    AttributeError: 'NoneType' object has no attribute 'lower'

Closes the gap left by the upstream None-metadata audit (#999), which
covered metadata loops but not BM25 helpers. Returns `[]` for falsy input
so a None doc gets score 0.0 while the rest of the corpus reranks normally.

Three regression tests in TestBM25NoneSafety lock the behavior and reference
the production trace.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 06:27:14 -07:00
..
2026-04-11 16:16:49 -07:00