Long-term memory for any agent
A self-improving memory layer any agent can use. Send conversations in, get structured memories out, with contradictions resolved and accuracy improving over time.

Most agents are stateless by default. Every session starts from zero, whatever was said last time is gone, and the agent meets a returning user as a stranger. This is the single most common gap between an agent demo that impresses and an agent product that frustrates, and it shows up regardless of what the agent is for. Support, sales, coding, tutoring, research: all of them run into the same wall the moment continuity matters.
Exabase Memory is the primitive that closes the gap: a self-improving memory layer that takes conversations in and gives structured memories back, resolving contradictions and refreshing stale context automatically. This page makes the canonical case for the memory layer on its own, before it's specialised into any particular use.
The problem
Statelessness is less a property of language models than a gap in the infrastructure around them. The model can reason; it just has nowhere to keep what it learns, so by default it keeps nothing. Leaving that gap open is a design choice, and for most products it's the wrong one, because agents that forget can't personalise, can't build on past interactions, and can't be trusted to know things they were told an hour ago.
The obvious patches don't hold. Stuffing conversation history into the prompt works until it overflows the context window, and then the agent starts dropping context with no regard for what mattered. Storing transcripts in a database gives you persistence without understanding: a pile of text with no notion of what's current, what's been superseded, or what matters. A vector database gets you closer but still isn't a memory system. It can find similar passages, but it can't tell you that a newer fact replaced an older one, can't resolve a contradiction, and can't maintain an evolving picture rather than an ever-growing heap.
That maintenance is the hard part, and it's why building memory yourself is more work than it first appears. Keeping a coherent, current picture across many interactions means recognising when two facts concern the same thing and conflict (entity resolution), preventing the slow corruption of the picture as facts change (memory drift), and holding retrieval quality as the store grows (semantic collapse). Each of these is a real problem on its own. A memory layer is the system that handles all of them together.
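The vector-database limitation is easy to see in miniature. The sketch below is not Exabase code: it uses a toy bag-of-words similarity in place of a real embedding model, and the `embed`, `cosine`, and `search` helpers are hypothetical names chosen for illustration. It shows only that a pure similarity search hands back an outdated fact and its replacement side by side, with nothing to say which one is current.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words vector; a real system would call an embedding model.
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def search(passages: list[str], query: str, k: int = 2) -> list[str]:
    # Rank purely by similarity: no notion of time or of which fact is current.
    q = embed(query)
    return sorted(passages, key=lambda p: cosine(embed(p), q), reverse=True)[:k]

passages = [
    "The user lives in Berlin.",   # stated first
    "The user likes hiking.",
    "The user lives in Lisbon.",   # stated later, after a move
]

# Both "lives in" passages match the query equally well, so both come back.
hits = search(passages, "user lives in which city")
```

Whatever consumes `hits` now has to work out for itself that Lisbon replaced Berlin, which is exactly the maintenance a similarity index does not do.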
What Exabase unlocks
With a memory layer underneath, an agent stops resetting and starts accumulating, which changes what kind of product it can be.
You send it conversations and it gives you back structured memories. You don't have to decide in advance what's worth keeping or define a schema for it; the layer extracts the facts, preferences, and events that matter from raw interaction text. The agent's knowledge of a user builds up as a natural byproduct of talking to them.
It keeps the picture current rather than letting it rot. When a user's circumstances change, the new fact supersedes the old one instead of sitting beside it, so the agent reasons from where things actually stand. The store gets more accurate as it grows, because contradictions are resolved as they arrive rather than accumulating into noise.
And it gives the agent recall worth having. Ask what the agent knows about a user and it returns a coherent, current picture, not a dump of every message they ever sent. This is the difference between an agent that has memory and an agent that merely has logs.
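One way to picture supersession is a store keyed by what a fact is about, so a newer value replaces the older one instead of sitting beside it. This is a minimal sketch under strong assumptions, not a description of how Exabase implements it: it assumes facts have already been extracted into a hypothetical `(subject, attribute, value)` shape with a timestamp, and `Fact` and `SupersedingStore` are illustrative names.

```python
from dataclasses import dataclass, field

@dataclass
class Fact:
    subject: str
    attribute: str
    value: str
    observed_at: int  # any increasing timestamp; an assumption of this sketch

@dataclass
class SupersedingStore:
    current: dict[tuple[str, str], Fact] = field(default_factory=dict)
    superseded: list[Fact] = field(default_factory=list)

    def write(self, fact: Fact) -> None:
        key = (fact.subject, fact.attribute)
        existing = self.current.get(key)
        if existing is None:
            self.current[key] = fact
        elif fact.observed_at >= existing.observed_at:
            # The newer fact wins; the older value is kept only as history.
            if existing.value != fact.value:
                self.superseded.append(existing)
            self.current[key] = fact
        else:
            # A late-arriving older fact never overwrites the current one.
            self.superseded.append(fact)

    def picture(self, subject: str) -> dict[str, str]:
        # The current picture: one value per attribute, never both old and new.
        return {f.attribute: f.value for (s, _), f in self.current.items() if s == subject}

store = SupersedingStore()
store.write(Fact("user", "city", "Berlin", observed_at=1))
store.write(Fact("user", "city", "Lisbon", observed_at=2))
picture = store.picture("user")
```

Here `picture` holds only the Lisbon value, while the Berlin fact moves to `superseded`. The hard part a real layer takes on is everything this sketch assumes away: extracting such facts from raw conversation and deciding that two statements are about the same thing.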
How it works
The memory layer is the whole story here, with Bases as the scoping mechanism that decides whose memory is whose.
Memory
Exabase Memory operates in the three phases that distinguish a memory layer from simpler retrieval. On the way in, it extracts what matters from interactions, builds relationships between facts, and resolves contradictions against what it already knows, so what's stored is a maintained picture rather than raw text. At query time, it retrieves against multiple signals rather than running a single similarity search, assembling a clean, current context rather than a bag of similar-looking chunks. And it keeps the whole store coherent over time, collapsing redundancy and refreshing stale context, which is what lets accuracy improve with use instead of degrading.
The interface is deliberately simple given what's happening underneath: you create memories by sending interactions, and you retrieve memories to get back a current picture. Simple retrieval runs around 200ms, fast enough to sit inside a live agent loop. The quality of this approach is measurable: Exabase reaches state-of-the-art on the LongMemEval benchmark, and does it with a smaller model, because precise memory beats brute-force context. Reliable memory has also been shown to cut hallucinations by around 28%.
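To make "multiple signals" concrete, here is a hedged sketch of ranking that blends more than similarity. The source does not say which signals Exabase uses or how they are combined, so the three signals here (similarity, recency, importance), the weights, the 30-day half-life, and the `Memory`, `score`, and `retrieve` names are all assumptions made purely for illustration.

```python
from dataclasses import dataclass

@dataclass
class Memory:
    text: str
    similarity: float  # 0..1 match against the query, assumed precomputed
    age_days: float    # time since the memory was last confirmed
    importance: float  # 0..1, assumed to be assigned at write time

def score(m: Memory, half_life_days: float = 30.0,
          weights: tuple[float, float, float] = (0.6, 0.3, 0.1)) -> float:
    # Recency decays exponentially: a memory loses half its recency per half-life.
    recency = 0.5 ** (m.age_days / half_life_days)
    w_sim, w_rec, w_imp = weights
    return w_sim * m.similarity + w_rec * recency + w_imp * m.importance

def retrieve(memories: list[Memory], k: int = 1) -> list[Memory]:
    # Rank by the blended score rather than by similarity alone.
    return sorted(memories, key=score, reverse=True)[:k]

old = Memory("Lives in Berlin", similarity=0.82, age_days=180, importance=0.5)
new = Memory("Lives in Lisbon", similarity=0.80, age_days=5, importance=0.5)
top = retrieve([old, new])[0]
```

On similarity alone the older memory would rank first; with recency in the blend, the newer one does. The specific numbers matter less than the design choice of ranking on more than one axis.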
Bases
A Base is how you scope memory: an isolated environment, so the memory for one user, agent, or tenant stays separate from another's. For a single-user agent you might use one workspace; for anything multi-user you give each user or tenant their own Base, which keeps one user's memory from ever surfacing in another's session. Memory scoping is a single API concept rather than a partitioning scheme you build yourself, and the full multi-tenant pattern is covered in multi-tenant memory for SaaS.
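The isolation property can be sketched with an in-memory stand-in. This is not the Exabase API: `ScopedMemory`, its methods, and the `base_id` strings are hypothetical, and the only thing the sketch shows is that every read and write carries a scope, so one user's memories are never candidates for another user's query.

```python
from collections import defaultdict

class ScopedMemory:
    """Toy stand-in for per-Base isolation; names here are illustrative only."""

    def __init__(self) -> None:
        self._bases: dict[str, list[str]] = defaultdict(list)

    def write(self, base_id: str, memory: str) -> None:
        # Every write names the Base it belongs to.
        self._bases[base_id].append(memory)

    def read(self, base_id: str) -> list[str]:
        # A read sees only its own Base; other Bases are never consulted.
        return list(self._bases.get(base_id, []))

memory = ScopedMemory()
memory.write("base-alice", "Prefers email over phone calls")
memory.write("base-bob", "Is evaluating the enterprise plan")

alice_view = memory.read("base-alice")
bob_view = memory.read("base-bob")
```

Because the scope is part of every call, there is no shared pool to filter after the fact, which is the difference between isolation by construction and isolation by query discipline.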
Example architecture
The loop is as simple as the value is large, which is the point of having it as a primitive.
Scope the memory. Decide whose memory is whose: one Base per user or tenant for a multi-user product, or a single workspace for a single-user agent.
Write on the way out. After an interaction, send it to Memory. The layer extracts and stores what matters and resolves it against what's already there. You don't curate; you just send the conversation.
Read on the way in. Before the agent responds, retrieve the current picture relevant to the request and put it in the prompt.
That's the entire integration: retrieve before responding, write after. Two calls around your existing agent loop turn a stateless agent into one that remembers, and everything hard (extraction, contradiction resolution, coherence over time) happens inside the layer.
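The shape of that loop can be sketched as follows. The code does not use the real Exabase client: `MemoryClient`, its `retrieve` and `create` methods, the `InMemoryStub`, and the `respond` function are hypothetical names, and the stub simply replays stored text where a real layer would return extracted, reconciled memories. Check the getting started guide for the actual calls.

```python
from typing import Callable, Protocol

class MemoryClient(Protocol):
    # Hypothetical interface; the real client's method names may differ.
    def retrieve(self, base_id: str, query: str) -> list[str]: ...
    def create(self, base_id: str, interaction: str) -> None: ...

class InMemoryStub:
    """Stand-in that stores raw interactions so the loop can run locally."""

    def __init__(self) -> None:
        self._log: dict[str, list[str]] = {}

    def retrieve(self, base_id: str, query: str) -> list[str]:
        return list(self._log.get(base_id, []))

    def create(self, base_id: str, interaction: str) -> None:
        self._log.setdefault(base_id, []).append(interaction)

def respond(memory: MemoryClient, llm: Callable[[str], str],
            base_id: str, user_message: str) -> str:
    # Read on the way in: fetch what is known and place it in the prompt.
    context = memory.retrieve(base_id, user_message)
    prompt = "Known about this user:\n" + "\n".join(context) + f"\n\nUser: {user_message}"
    reply = llm(prompt)
    # Write on the way out: send the whole interaction, without curating it.
    memory.create(base_id, f"User: {user_message}\nAgent: {reply}")
    return reply

prompts: list[str] = []

def fake_llm(prompt: str) -> str:
    # Records each prompt so the effect of memory on later turns is visible.
    prompts.append(prompt)
    return "ok"

mem = InMemoryStub()
respond(mem, fake_llm, "user-1", "I moved to Lisbon.")
respond(mem, fake_llm, "user-1", "Where do I live?")
```

The second prompt contains the first exchange, while the first prompt had nothing to draw on. Keeping the memory calls behind a small interface like this also lets the agent loop be tested without a network dependency.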
What compounds over time
This is the use case where compounding is the whole proposition, because a memory layer is valuable precisely in proportion to how much it has accumulated.
An agent on its first interaction with a user has nothing to draw on. The same agent after months of interactions holds a rich, current picture, and because the layer self-organises, that picture gets sharper rather than noisier as it fills: contradictions are resolved, redundancy is collapsed, and stale facts are refreshed. The 28% reduction in hallucinations and the state-of-the-art recall both reflect this: good memory makes the agent more accurate, and the effect grows with use.
Building this yourself means building and maintaining the extraction, the contradiction resolution, the multi-signal retrieval, and the coherence machinery, and keeping all of it working as the store grows into the range where naive approaches break. That's a serious, ongoing engineering investment in something that isn't your product. Using a memory layer as a primitive means the part that compounds is handled, and you spend your effort on the agent instead. For where a framework's built-in memory stops being enough and a real layer becomes worth it, see when to outgrow your framework's built-in memory.
Who's building this
Every agent that talks to the same users more than once is a candidate, which is why this is the canonical case rather than a niche one. The specialised versions live across the other use cases: customer support agents, sales copilots, coding assistants, personal AI assistants, and learning copilots are all this same memory layer pointed at a particular kind of user. This page is the general primitive underneath them.
If you're choosing between memory platforms, the comparison of agent memory platforms and the head-to-heads with Mem0 and Supermemory lay out the differences. For a hands-on build, the memory-powered personal assistant example implements the core loop end to end.
Get started
Start with the getting started guide, then about memories, creating memories, and retrieving memories for the full loop. Memory scoping covers isolation, and there's a free tier to build against.
FAQs
What exactly do I send, and what do I get back?
You send interactions, typically raw conversation text. Exabase extracts the facts, preferences, and events worth keeping and stores them as structured memories. When you retrieve, you get back a coherent, current picture relevant to your query, rather than a transcript or a list of raw chunks.
Do I have to decide what the agent should remember?
No. The layer extracts what matters from what you send, so you don't define a schema or curate in advance. If you want deliberate control, you can also write specific memories directly.
How is this different from a vector database?
A vector database finds similar passages; it has no concept of one fact superseding another, no contradiction resolution, and no mechanism to keep an evolving picture current. A memory layer maintains that picture. The distinction is set out in why a vector database is not a memory system.
How does it handle contradictory information?
Through entity resolution: the layer recognises when a new fact concerns the same thing as an existing one. When the two conflict, the newer state supersedes the older rather than coexisting with it, which is what prevents memory drift and keeps the picture current.
Is it fast enough to use in a live agent?
Simple memory retrieval runs around 200ms, fast enough to sit inside a real-time agent loop without a noticeable pause.
How do I keep different users' memories separate?
Scope each user to their own Base, which fully isolates their memory from every other user's. The multi-tenant SaaS use case covers the pattern in depth.
How good is the recall, really?
Exabase reaches state-of-the-art on LongMemEval, a benchmark for long-term memory, and does so with a smaller model, because precise retrieval outperforms loading more context. Reliable memory has also been shown to reduce hallucinations by around 28%.