fix: guard None metadata/doc in tool_check_duplicate and Layer1/Layer2
Chroma 1.5.x can return ``None`` inside the ``metadatas`` / ``documents`` lists of a query/get result for partially-flushed rows. The codebase already has a systemic None-guard pattern (merged #999, #1013, #1019) but three call sites were still unguarded: * ``mcp_server.tool_check_duplicate`` (``mcp_server.py:487-488``) — ``meta = results["metadatas"][0][i]`` followed by ``meta.get(...)`` raises ``AttributeError: 'NoneType' object has no attribute 'get'``. The broad ``except Exception`` wrapper (line 504) swallows it and returns an uninformative ``"Duplicate check failed"``. * ``layers.Layer1.generate`` (``layers.py:126``) — iterates ``zip(docs, metas)`` and calls ``meta.get(key)`` in the importance loop. A single None metadata blows up the entire wake-up render. * ``layers.Layer2.retrieve`` (``layers.py:224``) — same pattern, same crash path for the on-demand render. Apply the same ``meta = meta or {}`` / ``doc = doc or ""`` idiom used by the merged guards in the search path. Three-line additions, no behaviour change on well-formed results. Tests added: * ``test_check_duplicate_handles_none_metadata`` — mocks the collection query to return ``None`` for one metadata and document, asserts the call does not crash and the sentinel-rendered entry has wing/room "?" and empty content. * ``test_layer1_handles_none_metadata`` / ``_handles_none_document`` * ``test_layer2_handles_none_metadata`` Relationship to other open PRs: * **#1019** guarded ``searcher.py`` loops. This PR extends the same guard to the three call sites #1019 did not touch. * **#979** fixed ``tool_check_duplicate`` negative similarity but left the None-metadata path unguarded. * Does not overlap **#1013** (``Layer3.search_raw``) or **#999**.
This commit is contained in:
@@ -484,8 +484,10 @@ def tool_check_duplicate(content: str, threshold: float = 0.9):
|
||||
dist = results["distances"][0][i]
|
||||
similarity = round(1 - dist, 3)
|
||||
if similarity >= threshold:
|
||||
meta = results["metadatas"][0][i]
|
||||
doc = results["documents"][0][i]
|
||||
# Chroma 1.5.x can return None for partially-flushed rows;
|
||||
# coerce to empty sentinels so downstream .get() is safe.
|
||||
meta = results["metadatas"][0][i] or {}
|
||||
doc = results["documents"][0][i] or ""
|
||||
duplicates.append(
|
||||
{
|
||||
"id": drawer_id,
|
||||
|
||||
Reference in New Issue
Block a user