Most LLM apps keep only the last N tokens once a conversation exceeds the context window, so if the user said the important thing 50 turns ago, it's gone and the model can't answer.
This repo adds episodic memory: it stores experiences as vectors and retrieves them by similarity at answer time, so what the model sees is chosen by relevance rather than recency.
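To make the difference concrete, here is a minimal sketch of the idea, not this repo's API. `EpisodicMemory`, `embed`, and `truncate_history` are hypothetical names made up for illustration, and the bag-of-words "embedding" is a toy stand-in for a real embedding model.

```python
from __future__ import annotations

import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: bag-of-words counts (assumption: stands in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class EpisodicMemory:
    """Hypothetical in-memory store for illustration; not this repo's API."""

    def __init__(self) -> None:
        self._episodes: list[tuple[str, Counter]] = []

    def add(self, text: str) -> None:
        # Store the raw text next to its vector.
        self._episodes.append((text, embed(text)))

    def retrieve(self, query: str, k: int = 1) -> list[str]:
        # Rank every stored episode by similarity to the query, most similar first.
        q = embed(query)
        ranked = sorted(self._episodes, key=lambda e: cosine(q, e[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


def truncate_history(turns: list[str], keep_last: int) -> list[str]:
    """Baseline: keep only the most recent turns."""
    return turns[-keep_last:]


# The important fact comes first, followed by 49 turns of filler.
turns = ["I am allergic to peanuts, please remember that."]
turns += [f"Small talk message number {i}." for i in range(1, 50)]

memory = EpisodicMemory()
for turn in turns:
    memory.add(turn)

query = "Which foods am I allergic to?"
recent = truncate_history(turns, keep_last=5)   # recency-based context
recalled = memory.retrieve(query, k=1)          # relevance-based context

print("truncated history contains the fact:", turns[0] in recent)
print("episodic memory recalled:", recalled[0])
```

In this toy setup the truncated window holds only filler, while similarity search returns the first turn because it shares words with the query. A real system would presumably replace the word counts with learned embeddings and the linear scan with a vector index.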
On 40 preference/recall tasks, we see 97.5% success with episodic memory vs 85% for truncated history. On long tasks where the key info is at the start, the gap widens to 88.9% vs 33.3%.
It has a Rust core, bindings for Python, Node, and Go, and integrations for LangChain and LangGraph. Fully open source.