perf(mining): batch per-chunk upserts and add optional GPU acceleration
The miner upserted one drawer per ChromaDB call, paying tokenizer + ONNX
session setup per chunk. The embedding device was CPU-only because no
EmbeddingFunction was ever wired through the backend.

Two changes, each a speedup in its own right; stacked they give ~10x
end-to-end on a medium corpus (20 files, 568 drawers):

1. Batched upsert. `process_file` and `_file_chunks_locked` now collect
   all chunks of a file into a single `collection.upsert(...)` so the
   embedding model runs one forward pass per file instead of N.

2. Hardware-accelerated embedding function. New `mempalace/embedding.py`
   wraps `ONNXMiniLM_L6_V2` with configurable `preferred_providers`.
   `MEMPALACE_EMBEDDING_DEVICE` (or `embedding_device` in config.json)
   selects auto / cpu / cuda / coreml / dml. Unavailable accelerators
   log a warning and fall back to CPU.

The factory subclasses `ONNXMiniLM_L6_V2` and spoofs its `name()` to
`"default"` so the persisted EF identity matches existing palaces created
with ChromaDB's bare `DefaultEmbeddingFunction` -- same model, same
384-dim vectors, no rebuild needed when turning GPU on.

`ChromaBackend.get_collection` / `create_collection` now pass the resolved
EF on every call so miner writes and searcher reads agree.

Benchmarks (i9-12900KF + RTX 3090, medium scenario, 568 drawers):

    per-chunk + CPU   19.77s ·  29 drw/s  (baseline)
    batched + CPU      8.07s ·  70 drw/s  (2.4x)
    batched + CUDA     2.15s · 264 drw/s  (9.2x)

Reproducible via `benchmarks/mine_bench.py`.

Install paths:

    pip install mempalace[gpu]     # NVIDIA CUDA
    pip install mempalace[dml]     # DirectML (Windows)
    pip install mempalace[coreml]  # macOS Neural Engine

Mine header now prints `Device: cpu|cuda|...` so users can confirm the
accelerator engaged.
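The miner changes themselves are not shown in this diff, so the following is only a minimal sketch of the batching idea from point 1. `CountingCollection`, `upsert_per_chunk`, and `upsert_batched` are hypothetical names; the stand-in collection does no embedding and only counts calls, to show that a file with N chunks goes from N upsert calls to one.

```python
from dataclasses import dataclass, field


@dataclass
class CountingCollection:
    """Hypothetical stand-in for a ChromaDB collection; it only counts upsert calls."""

    calls: int = 0
    stored: dict = field(default_factory=dict)

    def upsert(self, ids, documents, metadatas=None):
        # One call here corresponds to one embedding pass in the real backend.
        self.calls += 1
        metadatas = metadatas or [{} for _ in ids]
        for drawer_id, doc, meta in zip(ids, documents, metadatas):
            self.stored[drawer_id] = (doc, meta)


def upsert_per_chunk(collection, source, chunks):
    """Old shape: one upsert call per chunk."""
    for n, chunk in enumerate(chunks):
        collection.upsert(
            ids=[f"{source}:{n}"],
            documents=[chunk],
            metadatas=[{"source": source}],
        )


def upsert_batched(collection, source, chunks):
    """New shape: collect every chunk of the file, then upsert once."""
    if not chunks:
        return  # nothing to write for an empty file
    ids = [f"{source}:{n}" for n in range(len(chunks))]
    metadatas = [{"source": source} for _ in chunks]
    collection.upsert(ids=ids, documents=list(chunks), metadatas=metadatas)
```

Both functions store the same drawers; only the number of calls differs, which is where the per-chunk setup cost was being paid.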
@@ -405,6 +405,23 @@ class ChromaBackend(BaseBackend):
         self._freshness: dict[str, tuple[int, float]] = {}
         self._closed = False
 
+    @staticmethod
+    def _resolve_embedding_function():
+        """Return the EF for the user's ``embedding_device`` setting.
+
+        Both ``get_collection`` and ``get_or_create_collection`` must receive
+        the EF explicitly — ChromaDB 1.x does not persist it with the
+        collection, so a reader that omits the argument silently gets the
+        library default and its queries won't match the writer's vectors.
+        """
+        try:
+            from ..embedding import get_embedding_function
+
+            return get_embedding_function()
+        except Exception:
+            logger.exception("Failed to build embedding function; using chromadb default")
+            return None
+
     # ------------------------------------------------------------------
     # Internal helpers
     # ------------------------------------------------------------------
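`mempalace/embedding.py` is not part of the hunks shown here. As a rough sketch of the device selection the commit message describes (auto / cpu / cuda / coreml / dml, with a warning and CPU fallback), assuming the setting maps onto ONNX Runtime execution provider names: `resolve_providers` and its `available` parameter are hypothetical, and in practice the list of available providers would presumably come from `onnxruntime.get_available_providers()`.

```python
import logging

logger = logging.getLogger(__name__)

# ONNX Runtime execution provider names for each accelerator setting.
_DEVICE_TO_PROVIDER = {
    "cuda": "CUDAExecutionProvider",
    "coreml": "CoreMLExecutionProvider",
    "dml": "DmlExecutionProvider",
}
CPU = "CPUExecutionProvider"


def resolve_providers(device, available):
    """Map a device setting to an ordered provider list (hypothetical helper).

    `available` is the list of providers the installed runtime reports.
    CPU is always kept last so the session can still be created.
    """
    device = (device or "auto").strip().lower()
    if device == "cpu":
        return [CPU]
    if device == "auto":
        # Take the first accelerator that is actually installed, if any.
        picked = [p for p in _DEVICE_TO_PROVIDER.values() if p in available]
        return picked[:1] + [CPU]
    provider = _DEVICE_TO_PROVIDER.get(device)
    if provider is None or provider not in available:
        logger.warning("Embedding device %r unavailable; falling back to CPU", device)
        return [CPU]
    return [provider, CPU]
```

Keeping the CPU provider at the end of the list means a missing accelerator degrades performance rather than failing the mine.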
@@ -532,12 +549,15 @@ class ChromaBackend(BaseBackend):
         if options and isinstance(options, dict):
             hnsw_space = options.get("hnsw_space", hnsw_space)
 
+        ef = self._resolve_embedding_function()
+        ef_kwargs = {"embedding_function": ef} if ef is not None else {}
+
         if create:
             collection = client.get_or_create_collection(
-                collection_name, metadata={"hnsw:space": hnsw_space}
+                collection_name, metadata={"hnsw:space": hnsw_space}, **ef_kwargs
             )
         else:
-            collection = client.get_collection(collection_name)
+            collection = client.get_collection(collection_name, **ef_kwargs)
         return ChromaCollection(collection)
 
     def close_palace(self, palace) -> None:
@@ -578,8 +598,10 @@ class ChromaBackend(BaseBackend):
         self, palace_path: str, collection_name: str, hnsw_space: str = "cosine"
     ) -> ChromaCollection:
         """Create (not get-or-create) ``collection_name`` with the given HNSW space."""
+        ef = self._resolve_embedding_function()
+        ef_kwargs = {"embedding_function": ef} if ef is not None else {}
         collection = self._client(palace_path).create_collection(
-            collection_name, metadata={"hnsw:space": hnsw_space}
+            collection_name, metadata={"hnsw:space": hnsw_space}, **ef_kwargs
         )
         return ChromaCollection(collection)
 
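A note on the `ef_kwargs` pattern used in both methods: the keyword is omitted rather than passed as `None` when resolution fails. The sketch below illustrates why that distinction can matter, assuming the client method declares a non-None default for `embedding_function` (this is an assumption about the library signature, not something this diff shows). `DEFAULT_EF`, `get_collection`, and `open_collection` are hypothetical stand-ins.

```python
DEFAULT_EF = "library-default"  # stand-in for the library's default embedding function


def get_collection(name, embedding_function=DEFAULT_EF):
    """Hypothetical client method whose keyword has a non-None default."""
    return {"name": name, "embedding_function": embedding_function}


def open_collection(name, ef):
    """Forward the EF only when one was resolved, as the backend does."""
    ef_kwargs = {"embedding_function": ef} if ef is not None else {}
    return get_collection(name, **ef_kwargs)
```

With this shape, a failed resolution leaves the library default in place, whereas passing `embedding_function=None` explicitly would override that default with no embedding function at all.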