Changelog
All notable changes to Recallm will be documented in this file.
The format is based on Keep a Changelog.
[0.2.0] — 2026-03-10
Added
- `SemanticCache.stats()` returns a `CacheStats` dataclass with `hits`, `misses`, `hit_rate`, `avg_similarity`, and `namespace_sizes` for development-time visibility into cache behaviour
- `ThreadSafeInMemoryStorage`: RLock-protected drop-in replacement for `InMemoryStorage`, safe for multi-threaded and async-framework use
- `recallm` top-level package: `pip install recallm` now maps directly to `from recallm import ...`, eliminating the install/import name mismatch
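
For readers unfamiliar with the pattern, an RLock-protected in-memory store can be pictured roughly as below. This is a minimal standalone sketch, not Recallm's implementation: the class `LockedDictStorage` and its `get`, `set`, and `get_or_set` methods are hypothetical names chosen for illustration, and the real storage interface may differ.

```python
import threading
from typing import Any, Dict, Optional


class LockedDictStorage:
    """Hypothetical illustration only; not Recallm's ThreadSafeInMemoryStorage."""

    def __init__(self) -> None:
        self._data: Dict[str, Any] = {}
        # An RLock can be re-acquired by the thread that already holds it,
        # so one locked method can safely call another locked method.
        self._lock = threading.RLock()

    def get(self, key: str) -> Optional[Any]:
        with self._lock:
            return self._data.get(key)

    def set(self, key: str, value: Any) -> None:
        with self._lock:
            self._data[key] = value

    def get_or_set(self, key: str, default: Any) -> Any:
        # The lock is held across the check and the write, and is re-entered
        # inside get() and set(); a plain Lock would deadlock at this point.
        with self._lock:
            existing = self.get(key)
            if existing is None:
                self.set(key, default)
                return default
            return existing
```

The re-entrant lock is what makes compound operations such as `get_or_set` atomic without restructuring the simpler methods.
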
Changed
- Default embedding model changed from `all-MiniLM-L6-v2` to `BAAI/bge-small-en-v1.5` for improved first-run reliability
- Default `cache_timeout_seconds` raised from `0.05` to `0.2` (200 ms) to reduce silent cold-start bypasses
- Cache timeout now emits a structlog `warning` (`cache.timeout_exceeded`) with `elapsed_ms`, `timeout_ms`, and `action=bypass` instead of a silent error
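
The timeout-and-bypass behaviour described in this release can be approximated as follows. This is a rough sketch under stated assumptions, not Recallm's code: the function `lookup_with_timeout` and the `lookup` callable are hypothetical, the standard-library `logging` module stands in for structlog, and returning `None` is assumed here to mean "skip the cache and call the underlying function".

```python
import logging
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout
from typing import Callable, Optional

logger = logging.getLogger("cache_sketch")
_executor = ThreadPoolExecutor(max_workers=4)


def lookup_with_timeout(
    lookup: Callable[[], Optional[str]],
    timeout_seconds: float = 0.2,  # mirrors the new 200 ms default
) -> Optional[str]:
    """Run a cache lookup, giving up (bypassing) if it takes too long."""
    start = time.perf_counter()
    future = _executor.submit(lookup)
    try:
        return future.result(timeout=timeout_seconds)
    except FutureTimeout:
        elapsed_ms = (time.perf_counter() - start) * 1000
        # Field names follow the changelog entry; the log format is illustrative.
        logger.warning(
            "cache.timeout_exceeded elapsed_ms=%.1f timeout_ms=%.1f action=bypass",
            elapsed_ms,
            timeout_seconds * 1000,
        )
        return None  # assumed convention: caller treats None as a bypass
```

A caller would treat a `None` result the same way as a miss, so a slow cold start costs at most the timeout rather than blocking the request.
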
[0.1.0] — 2026-03-06
Added
- `SemanticCache.wrap()` with sync and async support
- `InMemoryStorage`: zero-dependency in-process backend
- `RedisStorage`: persistent backend with lazy tombstone cleanup
- `FastEmbedEmbedder`: ONNX-based, ~20 MB default embedder
- `SentenceTransformerEmbedder`: optional torch-based embedder
- Three similarity threshold profiles: `strict` (0.97), `balanced` (0.92), `loose` (0.85)
- Namespace-based cache invalidation
- TTL support on cache entries
- Prometheus metrics: hits, misses, errors, embedding latency, similarity scores, stream bypass
- Structlog structured logging on all cache events with rich operational fields
- `stream=True` bypass with per-namespace counter
- Fail-open on all cache operation failures
- `CacheContext` type alias for type-checking convenience
- `SemanticCache.async_warmup()` for non-blocking model load in async frameworks
- Grafana dashboard (8 panels, `$namespace` variable)
- Benchmark suite with four realistic prompt distributions
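
To make the three threshold profiles listed in this release concrete, the sketch below shows how such a profile might gate a cache hit. The threshold values come from this changelog; everything else is an assumption for illustration: the helper names `cosine_similarity` and `is_cache_hit` are hypothetical, and the use of cosine similarity with a `>=` comparison is a guess at a typical design, not a description of Recallm's internals.

```python
import math
from typing import Sequence

# Values taken from the changelog entry above.
THRESHOLDS = {"strict": 0.97, "balanced": 0.92, "loose": 0.85}


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def is_cache_hit(
    query_vec: Sequence[float],
    cached_vec: Sequence[float],
    profile: str = "balanced",
) -> bool:
    """Assumed rule: a hit requires similarity at or above the profile threshold."""
    return cosine_similarity(query_vec, cached_vec) >= THRESHOLDS[profile]
```

Under these assumptions, a stricter profile trades hit rate for a lower chance of serving a cached answer to a prompt that only loosely resembles the original.
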