Retrieval Internals¶

deep dive into the 8-stage hybrid retrieval pipeline. see Retrieval Pipeline for the user-facing guide.

intent classification¶

regex-based, zero cost:

INTENT_PATTERNS = {
    "why": r'\b(why|because|reason|cause|led to|resulted in)\b',
    "when": r'\b(when|date|time|before|after|during|timeline|history)\b',
    "who": r'\b(who|person|people|team|built|created|wrote)\b',
    "how": r'\b(how to|steps|procedure|process|workflow|debug|fix)\b',
}
# default: "what" (balanced weights)

each intent adjusts channel weights:

intent	dense	BM25	graph
why	1.0	0.8	1.5
when	0.8	1.2	0.8
who	0.8	0.8	1.8
how	1.2	1.0	0.8
what	1.0	1.0	1.0

RRF fusion¶

reciprocal rank fusion from Cormack et al. 2009:

score(doc) = Σ weight / (k + rank + 1)

k=60 is the standard constant. each channel contributes independently — a document ranked #1 in dense and #5 in BM25 gets a higher fused score than a document ranked #2 in both.

temporal boost¶

date detection via regex, then boost based on proximity to the query's temporal signal:

matched date in memory: 2x boost
episodic memories get recency decay: exp(-0.693 * age_days / half_life), floored at 50%
access frequency: 1.0 + 0.1 * log(1 + access_count)

cross-encoder¶

top-20 candidates from RRF get jointly scored by a cross-encoder. the cross-encoder sees (query, document) together, not just embeddings — much more accurate for nuanced relevance.

local: ms-marco-MiniLM-L-6-v2 (~300ms for 20 docs) cloud: rerank-2.5 via Voyage API (better quality, 32k context)

deep MLP reranker¶

optional 7th stage. 2-layer MLP trained on access patterns:

input: 10 features (cosine sim, importance, access count, age, layer one-hot, retention)
output: relevance prediction
persisted to ~/.local/share/engram/reranker.npz
trains on which memories actually get accessed after being returned in search

ACT-R noise + threshold¶

gaussian noise (σ=0.02) for beneficial retrieval variation — prevents the same top results every time. minimum score threshold gates out garbage (only applies when cross-encoder is active, since RRF scores are on a different scale).

key files¶

engram/retrieval.py — the full pipeline
engram/embeddings.py — dense search + cross-encoder
engram/ann_index.py — HNSW wrapper
engram/hopfield.py — associative channel
engram/deep_retrieval.py — learned reranker